[drupal-devel] Email filter

Mark Leicester mark.leicester at efurbishment.com
Thu Mar 17 16:32:00 UTC 2005


Hi Steven,

After some investigation I discovered the mailhandler was already using 
the complementary function to mime_header_encode(): 
imap_mime_header_decode(). I've altered mailhandler to call 
drupal_convert_to_utf8() with the charset that 
imap_mime_header_decode() returns. I have patched listhandler to do the 
same thing with the From address, so international characters are now 
preserved in the user names that listhandler creates.
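
For anyone following along, here is a minimal sketch of the same idea
(decode the RFC 2047 header, then convert each piece using the charset
the decoder reports). It is written in Python with the standard
email.header module purely for illustration; it is not the mailhandler
or listhandler patch, and decode_mime_header() is a hypothetical name.

```python
from email.header import decode_header


def decode_mime_header(raw_header, fallback_charset="ascii"):
    """Decode an RFC 2047 header value into one Unicode string."""
    parts = []
    for chunk, charset in decode_header(raw_header):
        if isinstance(chunk, bytes):
            # Convert using the charset the header declared; fall back
            # to a default when the piece carried no charset at all.
            parts.append(chunk.decode(charset or fallback_charset,
                                      errors="replace"))
        else:
            # Headers without encoded words come back as str already.
            parts.append(chunk)
    return "".join(parts)


name = decode_mime_header("=?iso-8859-1?q?J=F8rgen_Hansen?=")
```

An unknown charset name would raise LookupError here; a real handler
would need to decide what to do in that case.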

Another question: how should we handle the case where a user's PHP has 
been compiled without iconv support? I made this mistake initially, 
and as a result drupal_convert_to_utf8() returned empty strings. 
drupal_convert_to_utf8() checks for the availability of several 
libraries, and returns nothing if none are available. I wonder if 
drupal_convert_to_utf8() shouldn't be patched to return the original 
string if no conversion library exists?
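
To make the proposal concrete, here is a rough sketch of the fallback
behaviour I have in mind. It is Python rather than PHP and is not the
actual drupal_convert_to_utf8() code; convert_to_utf8(), the converters
list and codec_converter() are hypothetical stand-ins for "whichever
conversion libraries happen to be available".

```python
def convert_to_utf8(data, charset, converters):
    """Return data converted to UTF-8 bytes, or data unchanged if no
    converter is available."""
    for convert in converters:
        result = convert(data, charset)
        if result is not None:
            return result
    # No conversion library exists: hand back the original bytes
    # instead of an empty result.
    return data


def codec_converter(data, charset):
    """Hypothetical converter standing in for one available library."""
    return data.decode(charset).encode("utf-8")


unconverted = convert_to_utf8(b"J\xf8rgen", "iso-8859-1", [])
converted = convert_to_utf8(b"J\xf8rgen", "iso-8859-1", [codec_converter])
```

The trade-off is that callers may then receive bytes that are not valid
UTF-8, so this only helps if that is preferable to losing the text.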

Cheers,
Mark



On 13 Mar 2005, at 15:41, Steven Wittens wrote:

>
>> On another related topic, does anyone know how to tweak the 
>> mailhandler and listhandler to deal with the non-ASCII characters 
>> (such as ø, etc.) that come through in the usernames, subject lines 
>> and message bodies? At the moment these characters are all converted 
>> to ?. I've tried some experiments with drupal_convert_to_utf8() but 
>> I've had no luck so far.
>
> Message bodies should be easy to convert with 
> drupal_convert_to_utf8(), provided they are transferred in 8-bit mode 
> (Content-Transfer-Encoding), which is what the large majority of mail 
> clients do today. You will need iconv/mbstring/recode support. I 
> would very much advise against hacking in utf8_encode() as Tim Altman 
> suggests, as this function can only handle ISO-8859-1 and not 
> Windows-1252 (the Microsoft-specific variant of ISO-8859-1, with smart 
> quotes, euro-sign, etc), which is used a lot.
>
> For subject lines and such, the situation is trickier, as a separate 
> method of encoding these parameters is used. See RFC 2047:
> http://www.rfc-editor.org/rfc/rfc2047.txt
>
> We have a function mime_header_encode(), but no mime_header_decode().
>
> As far as "characters are being converted to '?'" goes, is this a 
> real question mark or the replacement character U+FFFD (�)?
>
> Steven Wittens
>
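
On the two points in the quoted message (ISO-8859-1 versus
Windows-1252, and a real question mark versus U+FFFD), a small Python
illustration may help. The sample bytes are made up for the example and
nothing here is Drupal code.

```python
# Bytes as a Windows-1252 mail client might send them:
# 0x93 and 0x94 are curly quotes, 0x80 is the euro sign.
raw = b"\x93smart quotes\x94 cost \x80 5"

as_cp1252 = raw.decode("cp1252")   # curly quotes and euro sign survive
as_latin1 = raw.decode("latin-1")  # same bytes become C1 control characters

# A literal question mark appears when text is forced into a charset
# that cannot represent a character.
forced_ascii = "J\u00f8rgen".encode("ascii", errors="replace")

# U+FFFD appears when bytes are decoded with the wrong charset.
wrong_charset = b"J\xf8rgen".decode("utf-8", errors="replace")
```

Which of the two symptoms shows up is a hint about where the conversion
goes wrong: on the way into a narrower charset, or on the way out of a
mislabelled one.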



