[drupal-devel] Email filter
Mark Leicester
mark.leicester at efurbishment.com
Thu Mar 17 16:32:00 UTC 2005
Hi Steven,
After some investigation I discovered the mailhandler was already using
the complementary function to mime_header_encode():
imap_mime_header_decode(). I've altered mailhandler to call
drupal_convert_to_utf8() with the charset that
imap_mime_header_decode() returns. I have patched listhandler to do the
same thing with the From address. International characters are now
preserved in the user names that listhandler creates.
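To illustrate the decode-then-convert step, here is a minimal sketch in Python rather than the PHP that mailhandler uses. It is not the actual patch: decode_header() from Python's standard library stands in for imap_mime_header_decode(), and the helper name decode_mime_header and its fallback parameter are made up for this example.

```python
from email.header import decode_header


def decode_mime_header(raw, fallback="ascii"):
    """Decode an RFC 2047 header value into a Unicode string.

    Hypothetical helper: each decoded chunk is converted using the
    charset reported for that chunk, which mirrors passing the charset
    from the decoder on to a conversion function.
    """
    parts = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            try:
                parts.append(text.decode(charset or fallback, errors="replace"))
            except LookupError:
                # Unknown charset label: fall back rather than fail.
                parts.append(text.decode(fallback, errors="replace"))
        else:
            # Headers without encoded words come back as plain str.
            parts.append(text)
    return "".join(parts)


print(decode_mime_header("=?ISO-8859-1?Q?J=F8rgen?= <j@example.com>"))
```

The point of the sketch is only that the charset travels with each chunk, so the conversion never has to guess the source encoding.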
Another question. How should we handle a situation where a user has not
compiled their PHP with iconv support? I made this mistake initially,
and as a result drupal_convert_to_utf8() returned empty strings.
drupal_convert_to_utf8() checks for the availability of several
libraries, and returns nothing if none are available. I wonder if
drupal_convert_to_utf8() shouldn't be patched to return the original
string if no conversion library exists?
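The behaviour I have in mind could look roughly like the sketch below. It is Python, not the Drupal function itself, and it models "no conversion library exists" as "no codec is registered for this charset", which is an assumption made purely for illustration. The function name convert_to_utf8 is likewise hypothetical.

```python
import codecs


def convert_to_utf8(data: bytes, charset: str) -> bytes:
    """Convert data from charset to UTF-8, passing it through if impossible.

    Hypothetical stand-in for the proposed behaviour: when no converter
    is available, return the original bytes instead of an empty result.
    """
    try:
        codecs.lookup(charset)
    except LookupError:
        # No converter available: hand back the input unchanged.
        return data
    return data.decode(charset, errors="replace").encode("utf-8")


print(convert_to_utf8(b"J\xf8rgen", "iso-8859-1"))
print(convert_to_utf8(b"abc", "no-such-charset"))
```

One caveat with this approach: the caller may then receive bytes that are not valid UTF-8, so it would still need some way to tell that no conversion took place.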
Cheers,
Mark
On 13 Mar 2005, at 15:41, Steven Wittens wrote:
>
>> On another related topic, does anyone know how to tweak the
>> mailhandler and listhandler to deal with the non-ASCII characters
>> (such as ø, etc.) that come through in the usernames, subject lines
>> and message bodies? At the moment these characters are all converted
>> to ?. I've tried some experiments with drupal_convert_to_utf8() but
>> I've had no luck so far.
>
> Message bodies should be easy to convert with
> drupal_convert_to_utf8(), provided they are transferred in 8-bit mode
> (Content-Transfer-Encoding), which is what the large majority of mail
> clients does today. You will need iconv/mbstring/recode support. I
> would very much advise against hacking in utf8_encode() as Tim Altman
> suggests, as this function can only handle ISO-8859-1 and not
> Windows-1252 (the Microsoft-specific variant of ISO-8859-1, with smart
> quotes, euro-sign, etc), which is used a lot.
>
> For subject lines and such, the situation is trickier, as a separate
> method of encoding these parameters is used. See RFC 2047:
> http://www.rfc-editor.org/rfc/rfc2047.txt
>
> We have a function mime_header_encode(), but no mime_header_decode().
>
> As far as "characters are being converted to '?'" goes, is this a
> real question mark or the replacement character U+FFFD (�)?
>
> Steven Wittens
>