[consulting] HTML Character Sanitization Solution

nan wich nan_wich at bellsouth.net
Fri Apr 23 20:38:23 UTC 2010


Does it do titles too? My problem is that even the entities for things like left double quotes create headaches in titles, in some cases causing Drupal to store the title as blank. This, in turn, causes Pathauto to screw up. CKEditor does not fix this because it only deals with the body.
 
Nancy E. Wichmann, PMP
Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.




________________________________
From: Travis Carden <travis.carden at gmail.com>
To: A list for Drupal consultants and Drupal service/hosting providers <consulting at drupal.org>
Sent: Fri, April 23, 2010 4:11:28 PM
Subject: Re: [consulting] HTML Character Sanitization Solution

For correcting invalid (x)HTML—even Microsoft Word crap—I know of no better solution than HTML Purifier, which actually does an outstanding job, in my experience. It can be a little too restrictive for some use cases as it filters out JavaScript, OBJECT/EMBED, and IFRAME (and you can't configure it not to, as far as I can tell). In some such situations it can be helpfully paired with Video Filter and Iframe Filter or insertFrame. (I don't have a solution for using it with JavaScript.) I suspect this module would solve most people's issues—I think it will even strip non-ASCII characters. Benjamin Finklea gives a good explanation of the module's installation and use in his book Drupal 6 Search Engine Optimization from Packt.

Unfortunately, I can't use HTML Purifier with my current client because it's too restrictive for his needs. So what I'm looking for is something that does nothing other than strip or (preferably) convert non-ASCII characters to their equivalent HTML entities. For example, “My problem,” he said, “is simple—WYSIWYGs.” would become &ldquo;My problem,&rdquo; he said, &ldquo;is simple&mdash;WYSIWYGs.&rdquo; I have a sense that a good WYSIWYG should do this, but I haven't had any success with FCKEditor's "paste from Word" feature. Has anyone else? Does TinyMCE do any better?
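
To make the conversion described above concrete, here is a minimal sketch in Python (not Drupal code, and not how any of the modules mentioned works). The function name to_entities is hypothetical. It assumes the input is already a decoded Unicode string, leaves every ASCII character alone (so existing markup is not touched), uses a named entity where the standard library's table has one, and falls back to a numeric reference otherwise.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace each non-ASCII character with an HTML entity.

    ASCII characters, including <, > and &, pass through unchanged,
    so markup already present in the text is left as-is.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII: keep as-is.
            out.append(ch)
        elif cp in codepoint2name:
            # Named entity, e.g. U+201C -> &ldquo;
            out.append("&" + codepoint2name[cp] + ";")
        else:
            # No named entity in the table: numeric character reference.
            out.append("&#" + str(cp) + ";")
    return "".join(out)


if __name__ == "__main__":
    sample = "\u201cMy problem,\u201d he said, \u201cis simple\u2014WYSIWYGs.\u201d"
    print(to_entities(sample))
```

Run on the sample sentence, this prints the entity-encoded form given above. The same loop could strip instead of convert by dropping the two non-ASCII branches, which may be the safer choice for titles.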