[drupal-devel] [bug] wrong search indexing in some cases

robertgarrigos drupal-devel at drupal.org
Mon Aug 22 01:29:37 UTC 2005

 Project:      Drupal
 Version:      4.5.5
 Component:    search.module
 Category:     bug reports
 Priority:     normal
-Assigned to:  Anonymous
+Assigned to:  robertgarrigos
 Reported by:  robertgarrigos
 Updated by:   robertgarrigos
-Status:       active
+Status:       patch (ready to be committed)
 Attachment:   http://drupal.org/files/issues/search.module_0.patch (607 bytes)

I enclose a patch for this.

Please forgive me if this is not the right way of doing it. This is the
first time I have used CVS on Mac OS X, and also the first time I have
used diff to produce a patch file, so take it as a simple "hello world"
patch file, which should nevertheless apply and fix the problem.


Previous comments:

Tue, 28 Jun 2005 15:16:10 +0000 : robertgarrigos

In some cases the search module fails to index some words, for instance
when there are only tags, and no whitespace, between words. In that case
the words are indexed together as a single token.

This is part of the actual node text from one of my web pages (in Catalan):


This was indexed as follows in the search_index table:

17321735instrumentació        169        1

which means a search for 'instrumentació' returned no results.

I fixed that by making search.module replace each tag with a space
instead of an empty string:

original file (lines 253-254):
      // Strip heaps of stuff out of it.
      $wordlist = preg_replace("'<[\/\!]*?[^<>]*?>'si", '',

fixed file (lines 253-254):
      // Strip heaps of stuff out of it.
      $wordlist = preg_replace("'<[\/\!]*?[^<>]*?>'si", ' ',
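
The effect of that one-character change can be reproduced outside Drupal.
The following is only a minimal sketch: the HTML fragment is hypothetical
(the original node text is not available here) and was chosen so that it
produces the same fused token shown above. Only the pattern is taken from
the lines quoted from search.module.

```php
<?php
// Hypothetical fragment with tags but no whitespace between the words.
$html = '<li>1732</li><li>1735</li><li>instrumentació</li>';

// Tag-stripping pattern as quoted from search.module 4.5.5.
$pattern = "'<[\/\!]*?[^<>]*?>'si";

// Original behaviour: tags are deleted, so neighbouring words fuse.
$fused = preg_replace($pattern, '', $html);

// Patched behaviour: each tag becomes a space, so boundaries survive.
$spaced = preg_replace($pattern, ' ', $html);

// Split on runs of whitespace, dropping empty strings.
$words_fused  = preg_split('/\s+/', $fused, -1, PREG_SPLIT_NO_EMPTY);
$words_spaced = preg_split('/\s+/', $spaced, -1, PREG_SPLIT_NO_EMPTY);

print_r($words_fused);   // one token: 17321735instrumentació
print_r($words_spaced);  // three tokens: 1732, 1735, instrumentació
```

With the empty replacement the indexer sees a single word, so a search for
'instrumentació' cannot match it; with the space it sees three.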


Wed, 29 Jun 2005 21:12:59 +0000 : benshell

Have you tried this on 4.6.x?  I read this issue because I'm also having
search indexing problems, but this particular problem looks like it has
been fixed in 4.6.1.  On line 344 of search.module, I see:

  // Strip off all ignored tags to speed up processing, but insert space
  // before/after them to keep word boundaries.
  $text = str_replace(array('<', '>'), array(' <', '> '), $text);
  $text = strip_tags($text, '<'. implode('><', array_keys($tags))
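
To see why this approach keeps the words apart, here is a small standalone
sketch. The $tags array and the sample text are hypothetical (only the keys
of $tags matter here, and the values are placeholders), and the closing
'>' on the strip_tags() call is my assumption about how the truncated line
above ends.

```php
<?php
// Hypothetical list of tags to keep; only the keys are used below.
$tags = array('h1' => 1, 'strong' => 1);

// Hypothetical node text with no whitespace between the tagged words.
$text = '<h1>Index</h1><li>1732</li><li>1735</li><li><strong>instrumentació</strong></li>';

// Pad every tag with a space on each side so that removing the tag
// still leaves whitespace between the words it separated.
$text = str_replace(array('<', '>'), array(' <', '> '), $text);

// Drop every tag except those listed in $tags.
$text = strip_tags($text, '<' . implode('><', array_keys($tags)) . '>');

// For inspection only: remove the kept tags too and split into words.
$words = preg_split('/\s+/', strip_tags($text), -1, PREG_SPLIT_NO_EMPTY);
print_r($words);
```

The <li> tags are gone after the second call, yet 1732, 1735 and
instrumentació remain separate words because of the padding added first.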


Thu, 30 Jun 2005 06:02:29 +0000 : robertgarrigos

No, I haven't. The site where I had this problem is on a shared server
running PHP 4, so there is no way to get Drupal 4.6 on it.


Mon, 22 Aug 2005 00:08:03 +0000 : robertgarrigos

This is still not fixed in 4.5.5. Apparently the problem does not occur
in 4.6.x versions.
