[consulting] HTML Character Sanitization Solution

Travis Carden travis.carden at gmail.com
Fri Apr 23 20:43:19 UTC 2010


That's a good question, Nancy. I've encountered the same thing where a
client copies curly quotes into a page title and then the Pathauto-generated
path doesn't work. Unfortunately, HTML Purifier can't do anything about that
because it's implemented as an input filter, and you can't apply input
formats to titles.
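
For the body-text conversion I asked about earlier (quoted below), here is a
rough sketch of the idea. It is plain Python rather than Drupal code, the
function name to_entities is made up for illustration, and it assumes that
named HTML 4 entities are preferred, with numeric character references as the
fallback. As Nancy notes, entities can themselves cause trouble in titles, so
this is only meant for body text.

```python
from html.entities import codepoint2name  # standard-library map: code point -> entity name


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with an HTML entity.

    Hypothetical helper for illustration only; not part of any Drupal module.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII passes through untouched, including any markup.
            out.append(ch)
        elif cp in codepoint2name:
            # Use the named entity where HTML 4 defines one, e.g. &ldquo;.
            out.append(f"&{codepoint2name[cp]};")
        else:
            # Otherwise fall back to a numeric character reference.
            out.append(f"&#{cp};")
    return "".join(out)


if __name__ == "__main__":
    print(to_entities("“My problem,” he said, “is simple—WYSIWYGs.”"))
```

Because ASCII is left alone, existing tags and already-encoded entities in the
input should survive unchanged; only the pasted typographic characters are
rewritten.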


On Fri, Apr 23, 2010 at 3:38 PM, nan wich <nan_wich at bellsouth.net> wrote:

> Does it do titles too? My problem is that even the entities for things
> like left double quotes create headaches for titles, in some cases causing
> Drupal to store the title as a blank. This, in turn, causes Pathauto to
> screw up. CKEditor does not fix this because it only deals with the body.
>
>
> *Nancy E. Wichmann, PMP*
>
> Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L.
> King, Jr.
>
>
>  ------------------------------
> *From:* Travis Carden <travis.carden at gmail.com>
> *To:* A list for Drupal consultants and Drupal service/hosting providers
> <consulting at drupal.org>
> *Sent:* Fri, April 23, 2010 4:11:28 PM
> *Subject:* Re: [consulting] HTML Character Sanitization Solution
>
> For correcting invalid (x)HTML—even Microsoft Word crap—I know of no better
> solution than HTML Purifier <http://drupal.org/project/htmlpurifier>,
> which actually does an outstanding job, in my experience. It can be a little
> too restrictive for some use cases as it filters out JavaScript,
> OBJECT/EMBED, and IFRAME (and you can't configure it not to, as far as I can
> tell). In some such situations it can be helpfully paired with Video
> Filter <http://drupal.org/project/video_filter> and Iframe Filter
> <http://drupal.org/project/iframe_filter> or
> insertFrame <http://drupal.org/project/insertFrame>. (I don't have a
> solution for using it with JavaScript.) I suspect this module would solve
> most people's issues—I *think* it will even strip non-ASCII characters.
> Benjamin Finklea gives a good explanation of the module's installation and
> use in his book Drupal 6 Search Engine Optimization
> <http://amazon.com/o/ASIN/1847198228/ref=nosim/traviscardenc-20> from Packt.
>
> Unfortunately, I can't use HTML Purifier with my current client because
> it's too restrictive for his needs. So what I'm looking for is something
> that does nothing other than strip or (preferably) convert non-ASCII
> characters to their equivalent HTML entities. For example, “My problem,” he
> said, “is simple—WYSIWYGs.” would become &ldquo;My problem,&rdquo; he said,
> &ldquo;is simple&mdash;WYSIWYGs.&rdquo;. I have a sense that a good WYSIWYG
> should do this, but I haven't had any success with FCKEditor's "paste from
> Word" feature. Has anyone else? Does TinyMCE do any better?