I was in pretty much the same position as you with Firedoglake and Crooks & Liars. Even though they use WordPress (despite my efforts to get them to move to Drupal LOL), TinyMCE is still the post editor, and both sites have a lot of contributors who love to paste directly from Word into TinyMCE.

What I ended up doing was using HTMLPurifier as an extra level of filtering on form submits. It has worked out great for filtering out all the extra crap Word puts in. The only problem people have had is using an indent in Word as a blockquote and losing it (since Word does an actual indent with paragraph styling).

The best part is that I haven't had a single problem with a tag left unclosed since making the switch. Before that, I was getting an email every day or so where someone had pasted something in and it broke the layout. In 3 months of using this system, I haven't had any. The few emails I did get were from people pasting into the RTE (usually from Gmail) and losing the text, simply because it was wrapped inside some JavaScript that Gmail uses. That does force them to use the "paste as text" or "paste as word" button more.

We don't use the RTE for comment posting, but the comment moderators do use it when editing comments. Comments are still run through HTMLPurifier when submitted, and that has worked great.

The only other problem I ran into was with embeds being pasted directly into the post. To fix that, I ended up grepping out all embed code and replacing it with a token before running the post through the purifier. Afterwards the tokens are swapped back for the actual embed code. This opened up another feature I introduced: whitelists/blacklists for embeds. I can decide which domains to allow or disallow embedding from, and check them before reinserting the embed code into the post.

Jamie Holly
http://www.intoxination.net
http://www.hollyit.net
Skype:intoxination
Phone: 1-513-252-2919

Sean Robertson wrote:
Disallowing paste might actually help solve my other issue. If they can't paste from Word, then they can't paste some document prepared for some other media and have to think about typing something more in the web idiom.
My clients would kill me if I did that. The problem is that in political campaigns, much of what's posted has to get vetted by several people, and these people are all older types who seem to have to see it in a word doc before it's real to them.
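
For anyone who wants to try the embed handling described above (swap embeds for tokens, purify, check the domain, swap back), here is a rough sketch. It is written in Python rather than PHP purely for illustration and is not the code running on either site. The function names, the token format, the example whitelist domains, and the `purify` callable (a stand-in for whatever sanitizer you use, such as HTMLPurifier) are all assumptions. It only looks at `src` attributes, so embeds that carry their URL elsewhere are dropped.

```python
import re
import uuid
from urllib.parse import urlparse

# Hypothetical whitelist; these domains are examples only.
ALLOWED_EMBED_DOMAINS = {"www.youtube.com", "player.vimeo.com"}

# Matches <object>...</object>, <iframe>...</iframe>, and <embed ...> tags.
EMBED_RE = re.compile(
    r"<(object|iframe)\b.*?</\1\s*>|<embed\b[^>]*>(?:\s*</embed\s*>)?",
    re.IGNORECASE | re.DOTALL,
)
SRC_RE = re.compile(r"""\bsrc\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def extract_embeds(html):
    """Replace each embed with a unique plain-text token.

    Returns the tokenized HTML and a dict mapping token -> original embed.
    """
    embeds = {}

    def stash(match):
        token = "EMBEDTOKEN" + uuid.uuid4().hex
        embeds[token] = match.group(0)
        return token

    return EMBED_RE.sub(stash, html), embeds


def embed_allowed(embed_html):
    """True only if the embed has src URLs and every host is whitelisted."""
    hosts = [urlparse(url).hostname for url in SRC_RE.findall(embed_html)]
    return bool(hosts) and all(h in ALLOWED_EMBED_DOMAINS for h in hosts)


def restore_embeds(html, embeds):
    """Put allowed embeds back in place of their tokens; drop the rest."""
    for token, embed in embeds.items():
        html = html.replace(token, embed if embed_allowed(embed) else "")
    return html


def clean_post(html, purify):
    """Tokenize embeds, run the sanitizer, then reinsert allowed embeds."""
    tokenized, embeds = extract_embeds(html)
    return restore_embeds(purify(tokenized), embeds)
```

The tokens are plain alphanumeric text so that a sanitizer has no reason to touch them. A blacklist would work the same way with the check in `embed_allowed` inverted.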