[drupal-docs] generating Drupal Handbook pdf's manually

Djun Kim puregin at puregin.org
Fri Apr 22 02:27:35 UTC 2005


    A more conventional approach would be to encourage authors
to tag individual words or <span class='index-entry'
key='phrase'>phrases</span>.  This might be something we could
ask authors to do.
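
    To make the tagging idea concrete, here is a minimal sketch of how
tagged phrases might be harvested into a node-keyed index. It assumes
the span markup shown above; the class name IndexEntryCollector, the
function build_index, and the node ids are all hypothetical and are not
part of any existing Drupal tool.

```python
from collections import defaultdict
from html.parser import HTMLParser


class IndexEntryCollector(HTMLParser):
    """Collect (key, node_id) pairs from <span class='index-entry' key='...'> tags."""

    def __init__(self, node_id):
        super().__init__()
        self.node_id = node_id
        self.entries = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Only spans explicitly marked as index entries with a non-empty key count.
        if tag == "span" and attrs.get("class") == "index-entry" and attrs.get("key"):
            self.entries.append((attrs["key"], self.node_id))


def build_index(pages):
    """Map each index key to the sorted list of nodes that tag it.

    `pages` maps a (hypothetical) node id to that page's HTML source.
    """
    index = defaultdict(set)
    for node_id, html in pages.items():
        collector = IndexEntryCollector(node_id)
        collector.feed(html)
        for key, where in collector.entries:
            index[key].add(where)
    return {key: sorted(nodes) for key, nodes in sorted(index.items())}


# Made-up sample pages, for illustration only.
pages = {
    "node/12": "<p>Notes on <span class='index-entry' key='logging in'>logging in</span>.</p>",
    "node/34": "<p>The <span class='index-entry' key='watchdog'>error log</span> page.</p>",
}
index = build_index(pages)
```

Note that the key attribute, not the visible text, becomes the index
term, so an author can file 'error log' under a different heading.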

Your idea is interesting, though. I'm trying to envision how this
might work.  This might be a useful way to deal with some of
the 'remaining 30%' of indexing which is difficult to do automatically.
This tends to be the subtle and challenging portion of the work.
There are two main activities:

  1) eliminating, simplifying, or correcting automatically
     generated entries; and
  2) generating new entries which were missed by the automatic
     process.

    This approach could be quite useful in deciding whether a given
set of references is consistent, e.g., that 'logging in' doesn't
accidentally reference a paragraph about error logging.

There are some practical challenges to adding new tags this way:
I think a book of 350 to 400 pages would eventually have an index of
25 - 30 pages, in two columns, with about 35 entries per column,
so that gives roughly 1750-2100 entries.  We'd have to find
some clever means to present that many taxonomy items
in an accessible way.  This ties into the issue raised by
Anisa in

http://lists.drupal.org/archives/drupal-support/2005-04/msg00060.html
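
    As a quick check of the size estimate, the straight multiplication
can be sketched as below. The constant and function names are mine, and
the figures are only the rough ones given above; entries that wrap onto
a second line or carry sub-entries would lower the real count.

```python
ENTRIES_PER_COLUMN = 35   # rough figure from the estimate above
COLUMNS_PER_PAGE = 2


def index_entries(index_pages):
    """Upper-bound entry count for an index of the given length in pages."""
    return index_pages * COLUMNS_PER_PAGE * ENTRIES_PER_COLUMN


low, high = index_entries(25), index_entries(30)
print(low, high)
```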

    Lots of questions occurring to me...

    How can we specify how a given index term is intended to be
used?

    If we have a source text which is used to create both
paper copy and web based documents, can we use indexing
information to generate a page-oriented index, as well as
a 'node' based index, i.e., to provide hints to a search engine?
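
    One way this might work, purely as a sketch: keep the index
entries keyed to nodes, and project them onto pages only once the
layout pass has told us where each node landed. The function
project_index and the node-to-page table are hypothetical; nothing in
the current PDF generation produces such a table as far as I know.

```python
def project_index(entries, node_to_page):
    """Build a node-based and a page-based index from the same entries.

    `entries` is a list of (term, node_id) pairs; `node_to_page` maps a
    node id to the page it starts on in the paper copy (assumed to come
    from the layout step).
    """
    node_index, page_index = {}, {}
    for term, node in entries:
        node_index.setdefault(term, set()).add(node)
        # Nodes that did not make it into the paper copy have no page.
        if node in node_to_page:
            page_index.setdefault(term, set()).add(node_to_page[node])
    return (
        {term: sorted(nodes) for term, nodes in node_index.items()},
        {term: sorted(page_nums) for term, page_nums in page_index.items()},
    )


# Made-up sample data, for illustration only.
entries = [("logging in", "node/12"), ("watchdog", "node/34"), ("logging in", "node/56")]
node_to_page = {"node/12": 17, "node/34": 203, "node/56": 41}
node_index, page_index = project_index(entries, node_to_page)
```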

    Another approximate indexing approach would be to try to
extract indexing information from search logs.
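
    A rough sketch of what I have in mind, assuming the search log
can be reduced to a plain list of query strings (the log format, the
function name candidate_terms, and the min_count threshold are all
assumptions): normalize the queries, count them, and treat anything
searched for repeatedly as a candidate index term for a human to
review.

```python
from collections import Counter


def candidate_terms(queries, min_count=2):
    """Return (term, count) pairs for queries seen at least `min_count` times."""
    # Lowercase and collapse whitespace so trivial variants count together.
    normalized = (" ".join(q.lower().split()) for q in queries)
    counts = Counter(q for q in normalized if q)
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


# Made-up sample queries, for illustration only.
queries = ["Logging in", "logging  in", "clean urls", "Clean URLs", "cron", "clean urls"]
candidates = candidate_terms(queries)
```

This only suggests terms; deciding where each one should point is
still part of the manual 30%.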

Quoting Charlie Lowe <cel4145 at cyberdash.com>:

> I should clarify. The index page in the handbook on drupal.org would
> point to the specific page. It's the pdf where it would end up
> pointing to the title page of the section it refers to.
>
> Charlie Lowe wrote:
>> Somewhat crazy idea: What if when we get to Drupal 4.6 we could use 
>> the folksonomy module to tag individual doc pages? Handbook 
>> maintainers/creators could assign the tags. Then use the folksonomy 
>> tags to produce an index section of tags used in the book? Granted, 
>> it wouldn't point to the exact page, but rather that subsection, but 
>> it's an interesting application of soon to be existing Drupal 
>> technology to solve the problem. And a nice way to provide an onsite 
>> index to the book, too.
>>
>> Djun Kim wrote:
>>
>>>
>>>    It is not possible to do this using existing tools and methods.
>>> The generated Postscript is DSC compliant, so in principle, some
>>> kind of automatic indexing could be attempted from the Postscript
>>> file.  However, experience and the wisdom of my elders caution me
>>> against much faith in automatic indexing.  Careful automatic
>>> indexing of clean source files should be able to generate about 70%
>>> of index entries, in my experience.  The remaining 30% requires
>>> human attention, and of course takes much longer than 30% of
>>> the time required for indexing :)
>>>



-- 
puregin at puregin.org
http://www.puregin.org



