[support] HTML Import
Fred Jones
fredthejonester at gmail.com
Tue Nov 13 17:45:56 UTC 2007
I have a project in which I am being given ~10K RTF files, each
representing one page of a dead-tree book that was scanned and then
manually edited. I want to turn these into nodes in Drupal. I have a
converter that converts them into HTML. The result is very close to
valid HTML and in fact is not too bad.
So the task now is to import these files into Drupal. They are
stored hierarchically as VolumeX/PageY.html, where VolumeX is a folder
and PageY.html is the HTML for that page. I have found two options for
this import:
wgHTML: http://drupal.org/project/wgHTML
Import HTML: http://drupal.org/project/import_html
The first apparently doesn't actually import anything; it just runs a
real-time search for each page request, which seems non-ideal to me at
first glance. The second seems reasonable, but a bit complex to
configure and run.
My inclination is to try to work with the second module and see if I can
get it to work and do the import once on my local Windows PHP5 machine.
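
Whichever module ends up doing the import, a first step could be to
enumerate the files in volume/page order and check that nothing is
missing. Below is a rough sketch in Python, not tied to either module.
It assumes the folders are literally named like Volume1 and the files
like Page12.html with numeric suffixes; that naming, the list_pages
function, and the example path are my assumptions for illustration.

```python
import re
from pathlib import Path

# Assumed naming scheme: Volume<number>/Page<number>.html
VOLUME_RE = re.compile(r"^Volume(\d+)$")
PAGE_RE = re.compile(r"^Page(\d+)\.html$")


def list_pages(root):
    """Return (volume, page, path) tuples sorted by volume, then page."""
    pages = []
    for volume_dir in Path(root).iterdir():
        vol = VOLUME_RE.match(volume_dir.name)
        if not (vol and volume_dir.is_dir()):
            continue  # skip anything that is not a VolumeX folder
        for page_file in volume_dir.iterdir():
            page = PAGE_RE.match(page_file.name)
            if page and page_file.is_file():
                pages.append((int(vol.group(1)), int(page.group(1)), page_file))
    # Sort numerically so that Page10 comes after Page2
    return sorted(pages, key=lambda entry: (entry[0], entry[1]))


if __name__ == "__main__":
    # Hypothetical location of the converted files
    for volume, page, path in list_pages("converted_html"):
        print(volume, page, path)
```

Sorting on the parsed numbers rather than on the file names keeps the
pages in book order, which matters if the node hierarchy is supposed to
follow the volumes.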
Any thoughts or advice on this topic? :)
Thanks!