[drupal-devel] [bug] wrong search indexing in some cases
robertgarrigos
drupal-devel at drupal.org
Mon Aug 22 01:29:37 UTC 2005
Issue status update for
http://drupal.org/node/25923
Post a follow up:
http://drupal.org/project/comments/add/25923
Project: Drupal
Version: 4.5.5
Component: search.module
Category: bug reports
Priority: normal
-Assigned to: Anonymous
+Assigned to: robertgarrigos
Reported by: robertgarrigos
Updated by: robertgarrigos
-Status: active
+Status: patch (ready to be committed)
Attachment: http://drupal.org/files/issues/search.module_0.patch (607 bytes)
I enclose a patch for this.
Please forgive me if this is not the right way of doing it. It's the
first time I'm using CVS on Mac OS X, and also the first time I'm using
diff to produce a patch file, so take it as a simple "hello world" patch
file, which should work and fix the problem anyway.
robertgarrigos
Previous comments:
------------------------------------------------------------------------
Tue, 28 Jun 2005 15:16:10 +0000 : robertgarrigos
In some cases the search module fails to index words separately, for
instance when only tags stand between them. In that case they are
indexed together as a single word.
This is part of a real node text in one of my web pages (in Catalan):
1732/1735<br\><b>Instrumentació:</b>
this got indexed like this in the search_index table:
17321735instrumentació 169 1
which means I couldn't get a search result for 'instrumentació'.
I fixed that by adding a white space in the code of the search.module
file:
original file (lines 253-254):
// Strip heaps of stuff out of it.
$wordlist = preg_replace("'<[\/\!]*?[^<>]*?>'si", '', $wordlist);
fixed file (lines 253-254):
// Strip heaps of stuff out of it.
$wordlist = preg_replace("'<[\/\!]*?[^<>]*?>'si", ' ', $wordlist);
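To see the difference between the two replacement strings, here is a minimal standalone sketch that runs the same pattern over the sample text from this report. It only illustrates the word-boundary effect; the variable names and the whitespace split are assumptions made for the demonstration and are not part of search.module.

```php
<?php
// Tag-stripping pattern as quoted from search.module (4.5.x).
$pattern = "'<[\/\!]*?[^<>]*?>'si";

// Sample node text from the report.
$sample = '1732/1735<br\><b>Instrumentació:</b>';

// Original behaviour: tags are removed and the neighbouring words get glued together.
$glued = preg_replace($pattern, '', $sample);

// Patched behaviour: each tag becomes a space, so word boundaries survive.
$spaced = preg_replace($pattern, ' ', $sample);

// Illustrative tokenisation only: split on runs of whitespace.
$glued_words = preg_split('/\s+/', trim($glued));
$spaced_words = preg_split('/\s+/', trim($spaced));

print_r($glued_words);
print_r($spaced_words);
```

With the empty replacement the number and the word end up in one token, while with the space they come out as two tokens that can be indexed separately.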
------------------------------------------------------------------------
Wed, 29 Jun 2005 21:12:59 +0000 : benshell
Have you tried this on 4.6.x? I read this issue because I'm also having
search indexing problems, but this particular problem looks like it has
been fixed in 4.6.1. On line 344 of search.module, I see this:
// Strip off all ignored tags to speed up processing, but insert space before/after
// them to keep word boundaries.
$text = str_replace(array('<', '>'), array(' <', '> '), $text);
$text = strip_tags($text, '<'. implode('><', array_keys($tags)) .'>');
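As a rough check of what those two lines do, the sketch below applies them to the sample text from the original report. The contents of $tags are hypothetical (the real array in search.module is not shown in this thread), and the final whitespace split is only there to make the word boundaries visible.

```php
<?php
// Hypothetical list of tags to keep; the real $tags array is not quoted in this thread.
$tags = array('h1' => 1, 'b' => 1);

// Sample node text from the original report.
$text = '1732/1735<br\><b>Instrumentació:</b>';

// Pad every tag with spaces so that removing it cannot join two words.
$text = str_replace(array('<', '>'), array(' <', '> '), $text);

// Drop every tag that is not in the list of kept tags.
$text = strip_tags($text, '<'. implode('><', array_keys($tags)) .'>');

// Illustration only: remove the remaining tags and split on whitespace.
$words = preg_split('/\s+/', trim(strip_tags($text)));

print_r($words);
```

Because the spaces are inserted before any tag is removed, the number and the following word stay apart whether or not the tag between them is kept.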
------------------------------------------------------------------------
Thu, 30 Jun 2005 06:02:29 +0000 : robertgarrigos
No, I haven't. The site where I was having this problem is on a shared
server running PHP 4, so there is no way to get Drupal 4.6 on it.
------------------------------------------------------------------------
Mon, 22 Aug 2005 00:08:03 +0000 : robertgarrigos
This is not yet fixed in 4.5.5. Apparently there is no problem with the
4.6.x versions.