[drupal-docs] generating Drupal Handbook pdf's manually

Charlie Lowe cel4145 at cyberdash.com
Fri Apr 22 05:48:32 UTC 2005


I don't know that I have a good answer to most of your questions. But
I'm thinking more along the lines of the DMOZ Open Directory Project way
of indexing websites versus the Google way--human-determined instead of
automatic/algorithmic. I suspect that in this instance, cleaning up an
automatically generated index by hand might take as much work as
building the index manually to begin with.

Similarly, one of the problems with automatic index generation is
that it would be difficult to make the result hierarchical in a useful
way--most good indexes have entries and sub-entries. So, for the sake of
thinking in new directions, suppose that the folksonomy module could
include (maybe it does already?) an operator of some kind that denoted
hierarchy. An index entry like

blocks
   - paypal

would be entered as something like

blocks < paypal
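
To make that concrete, here is a rough sketch in Python of how such tags
might be turned into a nested index. Everything in it is an assumption
for illustration only: the "<" separator is just the operator floated
above, and parse_tag, build_index, and render are made-up names, not
anything the folksonomy module provides.

```python
def parse_tag(tag, sep="<"):
    """Split a tag such as 'blocks < paypal' into ['blocks', 'paypal']."""
    return [part.strip() for part in tag.split(sep) if part.strip()]


def build_index(tags, sep="<"):
    """Merge hierarchical tags into a nested dict of index terms."""
    index = {}
    for tag in tags:
        node = index
        for term in parse_tag(tag, sep):
            # Reuse an existing entry, or create an empty sub-index.
            node = node.setdefault(term, {})
    return index


def render(index, depth=0):
    """Return the index as indented lines, sorted alphabetically."""
    lines = []
    for term in sorted(index):
        prefix = "   " * depth + ("- " if depth else "")
        lines.append(prefix + term)
        lines.extend(render(index[term], depth + 1))
    return lines


if __name__ == "__main__":
    # Hypothetical tags as authors might enter them.
    tags = ["blocks < paypal", "blocks", "menus < primary links"]
    print("\n".join(render(build_index(tags))))
```

Tags sharing a parent end up under one entry, so "blocks" and
"blocks < paypal" collapse into a single "blocks" heading with "paypal"
beneath it.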



Djun Kim wrote:
>    A more conventional approach would be to encourage authors
> to tag individual words or <span class='index-entry'
> key='phrase'>phrases</span>.  This might be something we could
> ask authors to do.
> 
> Your idea is interesting, though. I'm trying to envision how this
> might work.  This might be a useful way to deal with some of
> the 'remaining 30%' of indexing which is difficult to do automatically.
> This tends to be the subtle and challenging portion of the work.
> There are two main activities:
> 
>  1) eliminating, simplifying, or correcting automatically
>     generated entries; and
>  2) generating new entries which were missed by the automatic
>     process.
> 
>    This approach could be quite useful in deciding if a given
> set of references is consistent, e.g., that 'logging in' doesn't
> accidentally reference a paragraph about error logging.
> 
> There are some practical challenges to adding new tags via this:
> I think a book of 350 to 400 pages would eventually have an index of
> 25 - 30 pages, in two columns, with about 35 entries per column,
> so that gives about 1500-2000 entries.  We'd have to find
> some clever means to present that many taxonomy items
> in an accessible way.  This ties into the issue raised by
> Anisa in
> 
> http://lists.drupal.org/archives/drupal-support/2005-04/msg00060.html
> 
>    Lots of questions occurring to me...
> 
>    How can we specify how a given index term is intended to be
> used?
> 
>    If we have a source text which is used to create both
> paper copy and web based documents, can we use indexing
> information to generate a page-oriented index, as well as
> a 'node' based index, i.e., to provide hints to a search engine?
> 
>    Another approximate indexing approach would be to try to
> extract indexing information from search logs.
> 
