[drupal-docs] generating Drupal Handbook pdf's manually
Charlie Lowe
cel4145 at cyberdash.com
Fri Apr 22 05:48:32 UTC 2005
I don't know that I have a good answer to most of your questions. But,
I'm thinking more along the lines of the DMOZ Open Directory Project way
of indexing websites versus the Google way--human determined instead of
automatic/algorithmic. I suspect that in this instance, cleaning up an
automatically generated index by hand might take as much work as
building it manually to begin with.
Similarly, one of the problems with the automatic generation of indices
is that it would be difficult to make them hierarchical in a useful
way--most good indexes have entries and sub-entries. So, for the sake of
thinking in new directions, suppose that the folksonomy module could
include (maybe it does already?) an operator of some kind that denoted
hierarchy. An index entry like
blocks
- paypal
would be entered as something like
blocks < paypal
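
As a rough sketch of how such tags might be turned into a nested index,
assuming '<' is the hierarchy operator proposed above (not a known
feature of the folksonomy module) and that tags arrive as plain strings.
The names build_index and render are hypothetical:

```python
def build_index(tags):
    """Turn tags such as 'blocks < paypal' into a nested dict.

    Assumption: '<' separates a parent term from its child term.
    """
    index = {}
    for tag in tags:
        # Split on the operator and drop empty or whitespace-only parts.
        parts = [p.strip() for p in tag.split("<") if p.strip()]
        node = index
        for part in parts:
            # Reuse an existing entry or create it on first sight.
            node = node.setdefault(part, {})
    return index


def render(index, depth=0):
    """Return the index as lines, with sub-entries marked by '- '."""
    lines = []
    for term in sorted(index):
        prefix = "  " * max(depth - 1, 0) + ("- " if depth else "")
        lines.append(prefix + term)
        lines.extend(render(index[term], depth + 1))
    return lines


if __name__ == "__main__":
    tags = ["blocks < paypal", "blocks", "modules < folksonomy"]
    print("\n".join(render(build_index(tags))))
```

Because each level is just a dict keyed by term, the same tag typed by
several people collapses into one entry, and deeper chains such as
'a < b < c' need no extra handling.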
Djun Kim wrote:
> A more conventional approach would be to encourage authors
> to tag individual words or <span class='index-entry'
> key='phrase'>phrases</span>. This might be something we could
> ask authors to do.
>
> Your idea is interesting, though. I'm trying to envision how this
> might work. This might be a useful way to deal with some of
> the 'remaining 30%' of indexing which is difficult to do automatically.
> This tends to be the subtle and challenging portion of the work.
> There are two main activities:
>
> 1) eliminating, simplifying, or correcting automatically
> generated entries; and
> 2) generating new entries which were missed by the automatic
> process.
>
> This approach could be quite useful in deciding if a given
> set of references are consistent, e.g., that 'logging in' doesn't
> accidentally reference a paragraph about error logging.
>
> There are some practical challenges to adding new tags via this:
> I think a book of 350 to 400 pages would eventually have an index of
> 25 - 30 pages, in two columns, with about 35 entries per column,
> so that gives about 1500-2000 entries. We'd have to find
> some clever means to present that many taxonomy items
> in an accessible way. This ties into the issue raised by
> Anisa in
>
> http://lists.drupal.org/archives/drupal-support/2005-04/msg00060.html
>
> Lots of questions occurring to me...
>
> How can we specify how a given index term is intended to be
> used?
>
> If we have a source text which is used to create both
> paper copy and web based documents, can we use indexing
> information to generate a page-oriented index, as well as
> a 'node' based index, i.e., to provide hints to a search engine?
>
> Another approximate indexing approach would be to try to
> extract indexing information from search logs.
>
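
The span-tagging idea quoted above could feed a 'node' based index
along these lines. This is only a sketch: it assumes authors mark
phrases exactly as <span class='index-entry' key='...'>, that each
node's body is available as an HTML string, and that spans are not
nested. IndexEntryCollector and collect_entries are hypothetical names:

```python
from html.parser import HTMLParser


class IndexEntryCollector(HTMLParser):
    """Collect (key, text) pairs from <span class='index-entry'> tags."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._in_entry = False
        self._key = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "index-entry":
            self._in_entry = True
            self._key = attrs.get("key")
            self._text = []

    def handle_data(self, data):
        if self._in_entry:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._in_entry:
            text = "".join(self._text).strip()
            # Fall back to the tagged phrase when no key is given.
            self.entries.append((self._key or text, text))
            self._in_entry = False


def collect_entries(nodes):
    """Map each index key to the sorted node ids that reference it."""
    index = {}
    for node_id, html in nodes.items():
        collector = IndexEntryCollector()
        collector.feed(html)
        collector.close()
        for key, _text in collector.entries:
            index.setdefault(key, set()).add(node_id)
    return {key: sorted(ids) for key, ids in sorted(index.items())}
```

Keying on the 'key' attribute rather than the visible phrase is what
lets differently worded passages share one index term, which is also
where a human could check that 'logging in' does not pick up a
paragraph about error logging.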