From jk@ozlabs.org Mon Jul 21 23:37:24 2008
From: Jeremy Kerr
To: development@drupal.org
Subject: [development] [PATCH] drupal/filter: fix broken multi-line HTML tags
Date: Tue, 22 Jul 2008 09:36:14 +1000
Message-ID: <1216683374.199267.203058232764.qpush@pingu>

Currently, the filter module will mis-handle HTML tags that span more
than one line. For example:

 link

Will be filtered to:

This is caused by the attempt to match '.>' in filter_xss(). Since
the . doesn't (by default) match a newline, the filter will try to
escape the closing > on the tag.

This results in aggregated feeds with badly-formed HTML.

This change removes the . match, so that a newline as the last
character in an opening HTML tag is accepted. So that we always require
at least one character in a tag, we also change the [^>]* to [^>]+

Signed-off-by: Jeremy Kerr

---

 modules/filter/filter.module | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: drupal-6.2/modules/filter/filter.module
===================================================================
--- drupal-6.2.orig/modules/filter/filter.module  2008-07-01 09:28:22.000000000 +1000
+++ drupal-6.2/modules/filter/filter.module  2008-07-22 09:24:48.000000000 +1000
@@ -983,7 +983,7 @@ function filter_xss($string, $allowed_ta
     (
     <(?=[^a-zA-Z!/])  # a lone <
     |                 # or
-    <[^>]*.(>|$)      # a string that starts with a <, up until the > or the end of the string
+    <[^>]+(>|$)       # a string that starts with a <, up until the > or the end of the string
     |                 # or
     >                 # just a >
     )%x', '_filter_xss_split', $string);

From merlin@logrus.com Tue Jul 22 00:36:57 2008
From: Earl Miles
To: development@drupal.org
Subject: Re: [development] [PATCH] drupal/filter: fix broken multi-line HTML tags
Date: Mon, 21 Jul 2008 17:36:18 -0700
Message-ID: <48852B82.8070301@logrus.com>
In-Reply-To: <1216683374.199267.203058232764.qpush@pingu>

Jeremy Kerr wrote:
> Currently, the filter module will mis-handle HTML tags that span more
> than one line. For example:
>
> > link
>
> Will be filtered to:
>
> >link
>
> This is caused by the attempt to match '.>' in filter_xss(). Since
> the . doesn't (by default) match a newline, the filter will try to
> escape the closing > on the tag.
>
> This results in aggregated feeds with badly-formed HTML.
>
> This change removes the . match, so that a newline as the last
> character in an opening HTML tag is accepted. So that we always require
> at least one character in a tag, we also change the [^>]* to [^>]+
>
> Signed-off-by: Jeremy Kerr
>
> ---
>
>  modules/filter/filter.module | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: drupal-6.2/modules/filter/filter.module
> ===================================================================
> --- drupal-6.2.orig/modules/filter/filter.module  2008-07-01 09:28:22.000000000 +1000
> +++ drupal-6.2/modules/filter/filter.module  2008-07-22 09:24:48.000000000 +1000
> @@ -983,7 +983,7 @@ function filter_xss($string, $allowed_ta
>      (
>      <(?=[^a-zA-Z!/])  # a lone <
>      |                 # or
> -    <[^>]*.(>|$)      # a string that starts with a <, up until the > or the end of the string
> +    <[^>]+(>|$)       # a string that starts with a <, up until the > or the end of the string
>      |                 # or
>      >                 # just a >
>      )%x', '_filter_xss_split', $string);

Your work is respected and appreciated; please, though, can you create
an issue in the drupal.org queue against the Drupal project (category
filter.module) and attach this patch there? Then set it to patch (code
needs review).

From jk@ozlabs.org Tue Jul 22 00:47:18 2008
From: Jeremy Kerr
To: development@drupal.org
Subject: Re: [development] [PATCH] drupal/filter: fix broken multi-line HTML tags
Date: Tue, 22 Jul 2008 10:42:59 +1000
Message-ID: <200807221042.59970.jk@ozlabs.org>
In-Reply-To: <48852B82.8070301@logrus.com>

> Your work is respected and appreciated; please, though, can you
> create an issue in the drupal.org queue against the Drupal project
> (category filter.module) and attach this patch there? Then set it
> to patch (code needs review).

Sure! Will do that now.

Cheers,

Jeremy
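
The regex behaviour described in the patch at the top of this thread can be checked outside Drupal with a small sketch. It is a Python approximation, not the PHP code from filter.module: Python's re module is assumed to stand in for PCRE here, since in both a plain . does not match a newline while a negated class such as [^>] does. The helper names (build, matches) and the example.com URL are made up for illustration.

```python
import re


def build(tag_alternative):
    # Same three alternatives as the pattern in the diff, written on one
    # line instead of using the extended (x) syntax.
    return re.compile(r"(<(?=[^a-zA-Z!/])|" + tag_alternative + r"|>)")


OLD = build(r"<[^>]*.(>|$)")  # before the patch
NEW = build(r"<[^>]+(>|$)")   # after the patch


def matches(pattern, text):
    # Full text of every match, in order of appearance.
    return [m.group(0) for m in pattern.finditer(text)]


# Hypothetical input: an opening tag whose closing > sits on the next line.
multi_line = '<a href="http://example.com/"\n>link</a>'

# Old pattern: the . cannot match the newline, so the opening tag is never
# recognised as a tag and its closing > is picked up as "just a >".
print(matches(OLD, multi_line))  # ['>', '</a>']

# New pattern: [^>]+ runs across the newline, so the whole tag matches.
print(matches(NEW, multi_line))  # ['<a href="http://example.com/"\n>', '</a>']
```

With the old pattern the lone > is what would be handed to the split callback and escaped, which is consistent with the broken output the patch description reports.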