[drupal-support] Charset problem

Ross Kendall drupal at rosskendall.com
Tue Mar 1 13:14:39 UTC 2005


I think I've sent this link before, but just in case...

The Absolute Minimum Every Software Developer Absolutely, Positively 
Must Know About Unicode and Character Sets (No Excuses!)
http://www.joelonsoftware.com/articles/Unicode.html

Hope it's helpful to someone.

Steven Wittens wrote:

> Steve Dondley wrote:
>
>> I'm using MySQL 3.23 and Drupal 4.5 on an Apache/Linux server, PHP 
>> version 4.10.  My older version of MySQL stores all text as latin1, 
>> the equivalent of iso-8859-1 extended.  But I notice that Drupal 
>> outputs pages using the utf-8 character set. This is causing problems 
>> with the extended iso-8859-1 characters (Microsoft's curly quotes, 
>> etc.) and they usually show up as question marks in the text.
>>
>> To solve the problem, I changed the charset argument in the 
>> drupal_set_header() function to iso-8859-1.  This took care of the 
>> problem.  But now, of course, any UTF-8 encoded text shows funky 
>> characters.
>>
>> What's the best way to get Drupal to output both UTF-8 and iso-8859-1 
>> extended characters properly?
>
>
> Switching Drupal's encoding from UTF-8 to something else is not 
> advisable, and furthermore it is unnecessary, as Unicode includes every 
> character in ISO-8859-1.  For example, I bet your feed is broken now, 
> as it is still saying it's encoded with UTF-8. You will experience 
> similar problems with e-mails sent by Drupal for example.
>
> A properly set up Drupal site should handle any characters through 
> UTF-8 and correctly convert everything into it. If you have old 
> content, convert it to UTF-8 with iconv before importing. Oh and note 
> that stuff like "curly quotes" is actually not ISO-8859-1, it's 
> Windows-1252.
>
>
> Steven Wittens
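
To make Steven's last point concrete, here is a rough sketch in Python
(not something Drupal itself provides) of the conversion he describes.
The sample bytes and the function name legacy_to_utf8 are made up for
illustration, and it assumes the old content really is Windows-1252,
which is worth checking against your own data first. The iconv
equivalent would be something along the lines of
"iconv -f WINDOWS-1252 -t UTF-8".

```python
def legacy_to_utf8(raw: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as UTF-8 (hypothetical helper)."""
    # Decode with cp1252, not iso-8859-1, so bytes 0x80-0x9F become
    # the printable characters (curly quotes etc.) the author typed.
    return raw.decode("cp1252").encode("utf-8")


# Made-up sample: curly quotes (0x93, 0x94) and an e-acute (0xE9),
# as they might be stored in an old single-byte database column.
legacy = b"\x93Hello\x94 caf\xe9"

converted = legacy_to_utf8(legacy)
text = converted.decode("utf-8")

# For comparison: ISO-8859-1 maps 0x93 and 0x94 to C1 control
# characters, not to quotes, which is why the page looked wrong.
as_latin1 = legacy.decode("iso-8859-1")
```

The e-acute comes out the same under either decoding; only the bytes in
the 0x80-0x9F range differ, and that range is exactly where the curly
quotes live.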
