[consulting] From Highly Formatted Print to Web -- Best Solution?

SHEN Liang shenzhuxi at gmail.com
Mon May 16 10:43:20 UTC 2011


Check out my module:
http://drupal.org/project/fileviewer

On Fri, Apr 8, 2011 at 2:01 AM, Ted <ted-drupalists at webfirst.com> wrote:
>
> On 4/7/2011 1:06 PM, Shai Gluskin wrote:
> > Here is my idea that I want feedback on: They should take a screen
> > shot of the whole page (the pages are small, like 5in. x 7in.) for
> > each day and upload the image. In order to get search-engine traffic
> > for those pages, I thought they should just copy and paste the text
> > from the pdf into a text field, losing all the formatting. Using css,
> > I'll make sure the text field displays off-screen. This way they get
> > the cleanest/easiest data entry and best-looking presentation while
> > still allowing search engines to drive traffic to the page based on
> > the text contents of a field that will be offscreen.
> >
> > Does that make sense?
> >
> > Any other ideas?
> >
>
> We've used a PDF-based toolchain to emulate the Google Docs PDF
> quick-view function, including "highlighting" and copying text.
>
> http://pdftoxml.sourceforge.net/ - Gives you the coordinates and
> dimensions of each piece of text
> pdftoppm - Exports each page to a PNG image (with appropriate arguments)
> convert - ImageMagick tools to convert PNGs to indexed format, make
> thumbnails, etc.
>
> You'll most likely want to use some XSLT to cut down the size of the XML
> file from pdftoxml. These tools can be used to automate your idea above
> (with or without support for highlighting/copying the book text).
>
> Ted
>
> _______________________________________________
> consulting mailing list
> consulting at drupal.org
> http://lists.drupal.org/mailman/listinfo/consulting
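
For anyone wanting to try the toolchain Ted describes, here is a rough
sketch in Python rather than the XSLT he suggests. It only builds the
command lines and trims the XML; nothing is executed. The function names,
the 100 dpi default, the 200 px thumbnail width, and the assumption that
pdftoxml emits TOKEN elements with x, y, width and height attributes are
mine, not from the thread, so check them against your installed versions.

```python
import xml.etree.ElementTree as ET

# Attributes assumed to carry each token's position in pdftoxml output.
KEEP = ("x", "y", "width", "height")


def render_commands(pdf, prefix, dpi=100):
    """Build argv lists for the extraction tools (nothing is run here)."""
    return [
        # Assumed invocation: pdftoxml <pdf> <xml> writes text with coordinates.
        ["pdftoxml", pdf, prefix + ".xml"],
        # pdftoppm -png writes one PNG per page, named from the prefix.
        ["pdftoppm", "-png", "-r", str(dpi), pdf, prefix],
    ]


def thumbnail_command(page_png, out_png, width=200):
    """Build an ImageMagick command for a small indexed-colour thumbnail."""
    return [
        "convert", page_png,
        "-thumbnail", f"{width}x",   # scale to this width, keep aspect ratio
        "-colors", "256",            # reduce the palette
        "PNG8:" + out_png,           # write an 8-bit indexed PNG
    ]


def trim_tokens(xml_text):
    """Keep only token text and coordinates, dropping font and style data."""
    root = ET.fromstring(xml_text)
    out = ET.Element("tokens")
    for tok in root.iter("TOKEN"):
        attrs = {k: tok.get(k) for k in KEEP if tok.get(k) is not None}
        slim = ET.SubElement(out, "t", attrs)
        slim.text = tok.text
    return ET.tostring(out, encoding="unicode")
```

Each list can be handed to subprocess.run(cmd, check=True) once the tools
are installed. The trimmed XML gives you what is needed to position
invisible text over the page image for highlighting and copying.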

