[drupal-support] Charset problem
Steve Dondley
stevedondley at comcast.net
Wed Mar 2 00:55:53 UTC 2005
> Switching Drupal's encoding from UTF-8 to something else is not very
> advisable and furthermore it is unnecessary, as Unicode includes every
> character in ISO-8859-1. For example, I bet your feed is broken now,
> as it is still saying it's encoded with UTF-8. You will experience
> similar problems with e-mails sent by Drupal for example.
>
> A properly set up Drupal site should handle any characters through
> UTF-8 and correctly convert everything into it. If you have old
> content, convert it to UTF-8 with iconv before importing. Oh and note
> that stuff like "curly quotes" is actually not ISO-8859-1, it's
> Windows-1252.
>
Yes, I understand the distinction between ISO-8859-x and Windows-1252.
At any rate, the content is coming in from e-mails handled by the
mailhandler module, and those e-mails are often written by people using
MS applications. As far as I can tell, they are not converted to UTF-8.
I have already decided against changing the output encoding to
ISO-8859-1 and have changed it back to UTF-8. Instead, I'm in the
middle of addressing the problem by inserting conversion functions
into the mailhandler module. However, PHP needs to be compiled with
"--with-iconv" for this to work, which can be a problem for some
users.
But you say that Drupal already handles this conversion. I'm not so
sure about that. I did a grep for 'iconv' and the only place I saw it
used was in an XML parser function in common.inc for the feeds.
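
For illustration, the kind of conversion step described above might look
roughly like the following. This is a Python sketch, not the actual
mailhandler PHP code. The function name to_utf8, the declared_charset
parameter, and the fallback order are all assumptions made for the
example. It treats mail labelled ISO-8859-1 as Windows-1252, since curly
quotes from MS applications sit in the 0x80-0x9F range that ISO-8859-1
reserves for control characters.

```python
import codecs


def to_utf8(raw, declared_charset=None):
    """Return the bytes `raw` re-encoded as UTF-8 (hypothetical helper)."""
    candidates = []
    if declared_charset:
        try:
            name = codecs.lookup(declared_charset).name
        except LookupError:
            name = None  # unknown label: ignore it and rely on the fallbacks
        # Assumption: mail labelled ISO-8859-1 is often really Windows-1252.
        if name == codecs.lookup("iso-8859-1").name:
            name = "cp1252"
        if name:
            candidates.append(name)
    # Fallback order is an assumption: strict UTF-8 first, then Windows-1252.
    candidates += ["utf-8", "cp1252"]
    for name in candidates:
        try:
            return raw.decode(name).encode("utf-8")
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a code point, so this last step cannot fail.
    return raw.decode("latin-1").encode("utf-8")
```

In PHP the equivalent step would presumably go through iconv(), which is
why the build option matters. Where iconv is not available, the module
would need some other fallback or would have to leave the text as it
arrived.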