[drupal-docs] generating Drupal Handbook pdf's manually

Djun Kim puregin at puregin.org
Wed Apr 20 23:52:32 UTC 2005


Quoting Moshe Weitzman <weitzman at tejasa.com>:

> bryan on the web wrote:
>> It seems to me that it might make more sense in the long run to 
>> build Server-Side PDF generation into the book module using PHP.  
>> There is a costly PDF library for PHP called PDFlib
>

I have set up a script which automatically generates a PDF from the HTML
'printer friendly' version. The preliminary results may be found here:

      http://www.puregin.org/files/Drupal_handbooks.pdf

A good deal more tweaking is possible (and required, I think).  The point is
that this is generated entirely automatically, so we could run the job from
cron once per day (or more frequently if the rate of change of the
documentation picks up).

   All of this is done using freely available open-source tools (wget,
perl, html2ps, ps2pdf, ImageMagick).  I'm happy to contribute scripts,
configuration files and installation help if someone wants to set this up on
the documentation site.
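
   As a rough sketch of how such a pipeline might be chained together (this
is not the actual script; the URL and the file names are hypothetical
placeholders, and the perl and ImageMagick steps are left out because their
role is not described here):

```python
import subprocess
from pathlib import Path

# Hypothetical placeholder: the real printer-friendly URL is site-specific.
PRINT_URL = "http://example.org/book/print/1"


def build_commands(url, workdir):
    """Return the three pipeline steps as argument lists, in order."""
    workdir = Path(workdir)
    html = workdir / "handbook.html"
    ps = workdir / "handbook.ps"
    pdf = workdir / "handbook.pdf"
    return [
        # Fetch the printer-friendly HTML into a local file.
        ["wget", "-q", "-O", str(html), url],
        # Convert the HTML to PostScript.
        ["html2ps", "-o", str(ps), str(html)],
        # Convert the PostScript to PDF.
        ["ps2pdf", str(ps), str(pdf)],
    ]


def run_pipeline(url, workdir):
    """Run each step in turn; check=True stops at the first failure."""
    for cmd in build_commands(url, workdir):
        subprocess.run(cmd, check=True)
```

   Calling run_pipeline(PRINT_URL, ".") from a small wrapper would make this
easy to schedule from cron, provided wget, html2ps and ps2pdf are installed.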

> IMO, the fundamental issues here are not about generating the .pdf. 
> They are around empowering a group of authors to contribute content 
> which is later transformed into a *pretty* book. Pretty implies 
> consistency, graphics, sidebars, footnotes, etc.

   I strongly agree.

   Transforming great content into a beautiful book is non-trivial. Typesetting
and print production are difficult and delicate arts, at their best when
readability figures first among aesthetic factors.  Long before one gets to
that point, however, much can be done to make a better book.

   That having been said, I'm strongly in favour of 'good enough is better than
nothing at all'.  Having a printable copy will help with the task of editing.
The process of getting something that looks good on paper will require debate
and decision-making which will lead to a better document.

   Moshe mentions consistency, and I agree that this is something we should try
to improve at the level of form and content.  A style guide would be useful
here, as well as some high-level editorial policy decisions.

   An example of a useful style rule would be 'no use of heading tags
in handbook pages (e.g., no <H1>, <H2>, etc.)'
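
   A rule like this could be checked mechanically.  The following is a
minimal sketch using only the Python standard library; the names
HeadingFinder and find_headings are made up for illustration, and how the
page HTML is obtained is left open:

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingFinder(HTMLParser):
    """Collect every heading start tag together with its line number."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (tag, line) pairs

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <H1> and <h1> both match.
        if tag in HEADING_TAGS:
            line, _column = self.getpos()
            self.found.append((tag, line))


def find_headings(html):
    """Return the heading tags used in a page body, in document order."""
    finder = HeadingFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

   A page that passes the rule yields an empty list; anything else tells an
editor which tag to remove and on which line.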

   An example of a useful policy decision would be splitting the handbook into
several documents, targeted at different groups of users.   It would also be
useful to have this document be versioned, and track Drupal Core releases.
This would let us archive sections such as 'how to convert from Drupal 2.0 to
Drupal 3.0' and hopefully never look at them again :)

>
> Charlie contributed a process which takes a bit of work, and produces 
> decent results (I suppose - I haven't actually seen the output). How 
> can we do better?

   The automatic generation of PDF would be greatly helped by consistency in
the authored HTML.  Structural tags such as <H1> should not be gratuitously
embedded in sections.   There's also a certain amount of noise (dubious HTML,
inconsistent treatment of characters and entities, source formatting, line-end
conventions) which it would be nice to clean up from time to time.

   Could we build a process for documentation which follows a software release
model?  People contribute, contributions are discussed, filtered, moderated,
and applied to the documentation; at a release point, there is a 'freeze',
existing problems get fixed, and the HTML gets converted (mostly automatically,
but with experienced human interaction where required) into a 'publication'
format such as TeX, DocBook, MIF....  An 'official' print document is released,
in PDF, and the publication format is used to automatically generate HTML which
gets imported into a Drupal Book for the next round of community editing.



-- 
puregin at puregin.org
http://www.puregin.org



