We imported a large set of files via the import_html module and made them all book nodes. More precisely, we imported them into a local install and then ran a PHP script to massage the imported data (adjustments and corrections, some of them made directly on the body field of the records in node_revisions). I then uploaded that DB to the web server and clicked "Re-index site", and now 100% of the site has been indexed.
When I search for "Hull", however, I get 7 results, which is very nice, but node 21206 contains that word (twice, in fact) and yet doesn't appear in the results list. Nodes 10037 and 23089 do show up in the results, so nodes with IDs both below and above the 'missing' one are being found.
I am using the standard search facility. I left "Minimum word length to index" at its default of 3, and "Simple CJK handling" is checked.
Any ideas how I can debug this?
Thanks!
I tried the trip_search module, and while it's better, it still doesn't find all the matches. For example, if I search for a certain word like "Hull" with Drupal's regular search module, I get 2 results. With trip_search I get 19, but if I execute this SQL:
select * from node_revisions where body like '%Hull%';
I get 34 results! And I confirmed that all 34 are unique nodes as well.
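One thing I could check is whether part of the gap comes from the LIKE query matching substrings (inside longer words, or inside HTML tags and attributes) that a word-based index would not count. Below is a rough sketch of that comparison. It is Python against an in-memory SQLite table rather than the real Drupal DB, the sample rows and the helper names `substring_matches` and `whole_word_matches` are made up for illustration, and it assumes the search index only matches whole words in the tag-stripped text, which I have not verified. Note also that case sensitivity of LIKE in MySQL depends on the column collation, whereas SQLite's LIKE ignores ASCII case by default.

```python
import re
import sqlite3

# Hypothetical sample rows standing in for node_revisions; not real site data.
SAMPLE_ROWS = [
    (1, "The ship's Hull was repaired."),
    (2, "Hullabaloo in the harbour."),
    (3, '<a href="/people/hull">profile</a>'),
    (4, "Nothing relevant here."),
    (5, "Visit Hull, then Hull again."),
]


def build_demo_db():
    """Create an in-memory table with the same column names used in the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE node_revisions (nid INTEGER, body TEXT)")
    conn.executemany("INSERT INTO node_revisions VALUES (?, ?)", SAMPLE_ROWS)
    return conn


def substring_matches(conn, term):
    """Node IDs whose body contains the term anywhere, as LIKE '%term%' does."""
    cur = conn.execute(
        "SELECT nid FROM node_revisions WHERE body LIKE ? ORDER BY nid",
        ("%" + term + "%",),
    )
    return [row[0] for row in cur]


def whole_word_matches(conn, term):
    """Node IDs where the term is a whole word in the tag-stripped text."""
    word = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = []
    rows = conn.execute("SELECT nid, body FROM node_revisions ORDER BY nid")
    for nid, body in rows:
        text = re.sub(r"<[^>]*>", " ", body)  # crude tag stripping, demo only
        if word.search(text):
            hits.append(nid)
    return hits


conn = build_demo_db()
like_hits = substring_matches(conn, "Hull")
word_hits = whole_word_matches(conn, "Hull")
print("LIKE matches:      ", like_hits)
print("whole-word matches:", word_hits)
print("only found by LIKE:", sorted(set(like_hits) - set(word_hits)))
```

In this toy data, rows 2 and 3 are found only by LIKE. If the same kind of comparison on the real table explains only some of the 15 nodes that trip_search misses, the rest would point at an indexing problem rather than a matching difference.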
I suppose if no better solution exists, I will just build a module around that SQL. It seems to be the only 100% solution.
Unless I am missing something here.
Thanks.