[support] Indexing / Searching file contents

Austin Einter austin.einter at gmail.com
Sat Sep 22 13:59:24 UTC 2012


Hi All,
I have been doing a bit of R&D on how we can search within file contents.
To search for a particular token or word in a text/doc/pdf file, we can
probably use the Apache Solr / Tika combination.
With the help of the Apache Solr Attachments module, we can probably search
for a specific word in an attached Word document.
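
To make that assumption concrete, here is a minimal sketch (plain Python,
not Drupal code) of how I understand a file gets handed to Solr for Tika
extraction. It assumes a Solr core with the extracting request handler
enabled at /update/extract; the core URL, the core name "drupal" and the
function names are placeholders of mine, not anything taken from the module.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: a local Solr core named "drupal" with the extracting
# request handler (Solr Cell, which wraps Tika) enabled.
SOLR_CORE_URL = "http://localhost:8983/solr/drupal"


def build_extract_url(core_url, doc_id, commit=True):
    """Build the /update/extract URL that indexes one file under doc_id."""
    params = {"literal.id": doc_id}  # literal.* sets a field value directly
    if commit:
        params["commit"] = "true"    # make the document searchable right away
    return core_url.rstrip("/") + "/update/extract?" + urlencode(params)


def index_file(core_url, doc_id, path, content_type="application/octet-stream"):
    """POST the raw file bytes to Solr; text extraction happens server-side."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(core_url, doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status  # HTTP status code returned by Solr
```

Something like index_file(SOLR_CORE_URL, "file-42", "report.pdf",
"application/pdf") would then send one document, if my understanding of
the handler is right.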

Please correct me if my assumptions so far are wrong.

On further study, I learned that when a node is created and a document is
attached to it, the attached document is indexed only on the next cron run,
so one can search for a specific word in that document only after a certain
delay (I believe the default delay is 2 minutes).

Now the question: if somebody just uploads a document, and there is no need
to attach that document to any node, is there any way to get the document
indexed on the next cron run? I can programmatically create a node and
attach the document to it so that it gets indexed. But I do not want to
create a node programmatically just to index a document, because the number
of nodes would then keep growing day by day.

Any suggestions would be highly appreciated.

Thanks
Austin