[development] [PATCH] drupal/filter: fix broken multi-line HTML tags

Jeremy Kerr jk at ozlabs.org
Mon Jul 21 23:36:14 UTC 2008

Currently, the filter module will mis-handle HTML tags that span more
than one line. For example:

<a href="#"
>

Will be filtered to:

<a href="#"
&gt;

This is caused by the '.' that must match immediately before the
closing '>' in the filter_xss() pattern. Since the . doesn't (by
default) match a newline, the tag alternative fails to match, and the
filter escapes the closing > of the <a> tag as if it were a stray >.

This results in aggregated feeds with badly-formed HTML.

This change removes the . match, so that a newline as the last
character in an opening HTML tag is accepted. So that we still require
at least one character in a tag, we also change the [^>]* to [^>]+.
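
For illustration, the difference between the two patterns can be
reproduced outside Drupal with a rough sketch like the one below. The
demo_split() helper is hypothetical and much simpler than the real
_filter_xss_split() callback: it passes anything starting with '<'
through untouched and only escapes a bare '>'. It also assumes PHP 5.3
or later for the closure.

```php
<?php
// Illustrative sketch only: demo_split() is a hypothetical stand-in for
// filter_xss() and _filter_xss_split(). It leaves anything starting with
// '<' untouched and escapes a bare '>' as '&gt;'.
function demo_split($tag_pattern, $string) {
  $regex = '%
    (
    <(?=[^a-zA-Z!/])  # a lone <
    |                 # or
    ' . $tag_pattern . '  # the tag alternative under test
    |                 # or
    >                 # just a >
    )%x';
  return preg_replace_callback($regex, function ($m) {
    return $m[1][0] === '<' ? $m[1] : '&gt;';
  }, $string);
}

$input = "<a href=\"#\"\n>link</a>";

// Old pattern: '.' cannot match the newline before '>', so the tag
// alternative fails and the '>' is later matched alone and escaped.
echo demo_split('<[^>]*.(>|$)', $input), "\n";

// New pattern: [^>]+ consumes the newline, so the whole tag matches.
echo demo_split('<[^>]+(>|$)', $input), "\n";
```

With the old pattern the first call should print the tag with its
closing > turned into &gt;, while the second call should print the
input unchanged.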

Signed-off-by: Jeremy Kerr <jk at ozlabs.org>


 modules/filter/filter.module |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: drupal-6.2/modules/filter/filter.module
--- drupal-6.2.orig/modules/filter/filter.module	2008-07-01 09:28:22.000000000 +1000
+++ drupal-6.2/modules/filter/filter.module	2008-07-22 09:24:48.000000000 +1000
@@ -983,7 +983,7 @@ function filter_xss($string, $allowed_ta
     <(?=[^a-zA-Z!/])  # a lone <
     |                 # or
-    <[^>]*.(>|$)      # a string that starts with a <, up until the > or the end of the string
+    <[^>]+(>|$)       # a string that starts with a <, up until the > or the end of the string
     |                 # or
     >                 # just a >
     )%x', '_filter_xss_split', $string);
