[drupal-devel] multibyte/mbstring in Drupal

Steven Wittens steven at acko.net
Fri May 27 11:41:06 UTC 2005


>I'm not trying to argue here; if you decide on a "no overloading" policy
>in the end, then so be it - it's just that all of the problems raised
>so far seem solvable. :o)
>  
>
What I'm trying to avoid is the situation where every call to a string 
function has to have baggage around it. A lot of the standard string 
functions work perfectly fine on UTF-8, so there is no need to use 
mbstring for them.
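
As an illustration, here is a minimal sketch of byte-oriented functions 
that stay correct on UTF-8, assuming the input is valid UTF-8. The sample 
string and variable names are made up for this example.

```php
<?php
// Assumption: $text holds valid UTF-8. In valid UTF-8, the byte sequence of
// one character never appears inside the encoding of another character, so
// byte-wise searching and splitting cannot cut a character in half.
$text = "naïve café, déjà vu";

// Search and replace works on whole byte sequences.
$replaced = str_replace("café", "thé", $text);

// Splitting on an ASCII delimiter never lands inside a multibyte character.
$parts = explode(", ", $text);

// strpos() returns a byte offset; passing that offset to the byte-oriented
// substr() is safe because it points at the start of a character.
$offset = strpos($text, "café");
$tail = substr($text, $offset);

echo $replaced, "\n";
echo $parts[1], "\n";
echo $tail, "\n";
```

None of these calls needs to know where the character boundaries are, 
which is why they need no mbstring counterpart.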

Only some functions, like strtolower() and strtoupper(), mess up UTF-8 
when you don't have mbstring. Another unsafe example is substr(): in most 
cases you cannot use it without mbstring, because it counts bytes and you 
need to split on a character boundary. So we need wrappers for at least 
these functions to ensure Drupal is still UTF-8-safe without mbstring.
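
A rough sketch of what such wrappers could look like follows. The names 
example_substr() and example_strtolower() are hypothetical and not taken 
from the Drupal code base; the fallbacks assume valid UTF-8 input, a PCRE 
build with UTF-8 support, and a PHP version where a null length means 
"to the end of the string".

```php
<?php
// Hypothetical wrapper: character-based substring with or without mbstring.
function example_substr($text, $start, $length = null) {
  if (function_exists('mb_substr')) {
    return mb_substr($text, $start, $length, 'UTF-8');
  }
  // Fallback: split into characters using PCRE's UTF-8 mode, then slice.
  $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
  return implode('', array_slice($chars, $start, $length));
}

// Hypothetical wrapper: lowercase without corrupting multibyte characters.
function example_strtolower($text) {
  if (function_exists('mb_strtolower')) {
    return mb_strtolower($text, 'UTF-8');
  }
  // Degraded fallback: only ASCII letters are lowercased; bytes of
  // multibyte characters are left untouched, so the result stays valid UTF-8.
  return strtr($text, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// The plain byte-based call cuts "é" (two bytes) in half.
$broken = substr("déjà vu", 0, 2);
echo bin2hex($broken), "\n";               // 64c3: "d" plus half of "é"

echo example_substr("déjà vu", 0, 4), "\n";
echo example_strtolower("Café"), "\n";
```

Only the wrappers know about mbstring; calling code uses the same 
function whether or not the extension is installed.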

Furthermore, mbstring behaves subtly differently for some functions, e.g. 
throwing warnings where the standard ones don't complain. It seems to me 
that limiting our usage of mbstring to a few well-known and tested cases 
is much less likely to cause problems than overloading everywhere.

I also believe that because PHP does not distinguish between characters 
and bytes, mbstring overloading is a bad idea regardless. If we allow 
mbstring overloading, there is no longer any guarantee about what even a 
simple PHP API call does.
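
To make the bytes-versus-characters point concrete, here is a small 
sketch. It runs without overloading; the character count is computed with 
PCRE as a stand-in for what strlen() would return if overloading replaced 
it with its character-counting counterpart. The variable names and the 
header example are illustrative assumptions.

```php
<?php
// "é" is one character but two bytes in UTF-8.
$text = "héllo";

// Without overloading, strlen() counts bytes.
$byte_count = strlen($text);

// Character count, computed via PCRE's UTF-8 mode. This stands in for the
// value an overloaded strlen() would report.
$char_count = preg_match_all('/./us', $text);

// Code that needs a byte count, such as a length header, silently gets the
// wrong number if the same call starts counting characters instead.
$header = "Content-Length: " . $byte_count;

echo $byte_count, " bytes, ", $char_count, " characters\n";
echo $header, "\n";
```

The same source line, strlen($text), yields a different number depending 
on a server setting, and the code cannot tell which meaning it is getting.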

Steven Wittens



