[PATCH] drupal/filter: fix broken multi-line HTML tags
Currently, the filter module mishandles HTML tags that span more than one line. For example:

  <a href="#"
  >link</a>

will be filtered to:

  <a href="#" &gt;link</a>

This is caused by the attempt to match '.>' in filter_xss(). Since the . doesn't (by default) match a newline, the pattern fails to match the opening tag as a whole, and the filter instead escapes the closing > on the <a> tag.

This results in aggregated feeds with badly-formed HTML.

This change removes the . match, so that a newline as the last character in an opening HTML tag is accepted. So that we still require at least one character in a tag, we also change the [^>]* to [^>]+

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>

---
 modules/filter/filter.module |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: drupal-6.2/modules/filter/filter.module
===================================================================
--- drupal-6.2.orig/modules/filter/filter.module	2008-07-01 09:28:22.000000000 +1000
+++ drupal-6.2/modules/filter/filter.module	2008-07-22 09:24:48.000000000 +1000
@@ -983,7 +983,7 @@ function filter_xss($string, $allowed_ta
     (
     <(?=[^a-zA-Z!/])  # a lone <
     |                 # or
-    <[^>]*.(>|$)      # a string that starts with a <, up until the > or the end of the string
+    <[^>]+(>|$)       # a string that starts with a <, up until the > or the end of the string
     |                 # or
     >                 # just a >
     )%x', '_filter_xss_split', $string);
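
To make the difference concrete, the following PHP sketch runs both versions of the pattern over the multi-line example from the description. It is an illustration only, not part of the patch: the helper name demo_matches() and the use of preg_match_all() are assumptions made here, whereas filter_xss() itself hands each match to the _filter_xss_split callback.

```php
<?php
// Hypothetical helper (not part of Drupal): returns the chunks that the
// splitting pattern matches in $string, in order of appearance.
function demo_matches($pattern, $string) {
  preg_match_all($pattern, $string, $matches);
  return $matches[0];
}

// Pattern before the patch, as shown in the diff context.
$old = '%
  (
  <(?=[^a-zA-Z!/])  # a lone <
  |                 # or
  <[^>]*.(>|$)      # a string that starts with a <, up until the > or the end of the string
  |                 # or
  >                 # just a >
  )%x';

// Pattern after the patch: the "." is gone and "*" becomes "+".
$new = '%
  (
  <(?=[^a-zA-Z!/])  # a lone <
  |                 # or
  <[^>]+(>|$)       # a string that starts with a <, up until the > or the end of the string
  |                 # or
  >                 # just a >
  )%x';

// Opening tag whose last character before ">" is a newline.
$input = "<a href=\"#\"\n>link</a>";

var_export(demo_matches($old, $input));
echo "\n";
var_export(demo_matches($new, $input));
echo "\n";
```

With the old pattern, the opening tag is never matched as a unit: the only matches should be a lone > and the closing </a>, and a lone > is what ends up escaped. With the patched pattern, the first match should be the whole opening tag, newline included, followed by </a>.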
Your work is respected and appreciated; please, though, can you create an issue in the drupal.org queue against the Drupal project (category filter.module) and attach this patch there? Then set its status to "patch (code needs review)".
Sure! Will do that now.

Cheers,

Jeremy
Participants (2): Earl Miles, Jeremy Kerr