[development] Drupal and i18n, the holy grail?

Khalid B kb at 2bits.com
Sun Feb 26 04:42:48 UTC 2006


On 2/25/06, Rob Thorne <rob at torenware.com> wrote:
> This is why Arabic is a great script for doing these kinds of tests:

The first problem with Arabic is that it used to have several very
different encodings, even from the same vendor. This is now mostly a
non-issue with UTF-8. Some sites still use windows-1256, though.
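
For sites that still hold windows-1256 content, the conversion itself is
simple. Here is a minimal sketch in Python (chosen only for brevity, it
is not Drupal code). The sample word and the to_utf8 helper are my own
illustrations, and it assumes you already know the source encoding:

```python
# Hypothetical sample: the Arabic word "marhaba", written with escapes so
# the source stays readable in mail clients without bidi support.
text = "\u0645\u0631\u062d\u0628\u0627"

# Simulate legacy content stored as windows-1256 (Python codec "cp1256").
legacy_bytes = text.encode("cp1256")


def to_utf8(raw: bytes, source_encoding: str = "cp1256") -> bytes:
    """Decode bytes from a known legacy encoding and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


utf8_bytes = to_utf8(legacy_bytes)

# windows-1256 uses one byte per Arabic letter, UTF-8 uses two.
print(len(legacy_bytes), len(utf8_bytes))
```

The hard part in practice is usually finding out which encoding a given
piece of old content was actually stored in, not the conversion.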

> it's contextual (meaning that letters need to be drawn differently
> depending upon what glyphs they're near),

This is also a non-issue now. The major operating systems and browsers
handle the contextualization well. At least KDE on Linux and Microsoft
Windows do, as well as MS IE and Firefox. Even Konqueror is decent here.
I am not sure about Safari, but I imagine it works too.
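
To illustrate what "contextual" means at the character level: Unicode
stores one code point per letter, and the rendering layer picks the
shape. The legacy "presentation form" code points make the four shapes
visible. A small sketch using Python's standard unicodedata module (the
choice of the letter BEH is just an example):

```python
import unicodedata

# The abstract letter BEH, as it should be stored in content.
BEH = "\u0628"

# Legacy presentation forms of the same letter: isolated, final,
# initial and medial shapes.
forms = ["\ufe8f", "\ufe90", "\ufe91", "\ufe92"]

for form in forms:
    # NFKC normalization folds each presentation form back to the letter.
    folded = unicodedata.normalize("NFKC", form)
    print(unicodedata.name(form), folded == BEH)
```

Content should carry only the abstract letter; choosing the right shape
is the job of the OS or browser, which is why this mostly just works.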

> bi-directional (byte
> direction != physical layout), and written right-to-left, so very often,
> page layout needs to change as well.  If you've done Arabic right,
> you'll do almost anything else right as well.

This is where it gets tricky. The analogy I use is writing your CSS to
standards: it mostly works in Firefox, and life is good. Then you start
testing in MS IE and find that your layout is broken. You then have to
fix the bugs iteratively.

Arabic is the same, and there are many surprises. For example, numbers
in the middle of an Arabic string, or a line that has both Arabic and
English in it.
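
A rough way to see why such strings surprise you is to look at the
Unicode bidirectional class of each character. The sketch below is only
an illustration, not a layout algorithm; the bidi_runs helper and the
sample string are made up for this mail:

```python
import unicodedata


def bidi_runs(text):
    """Group consecutive characters that share a Unicode bidi class."""
    runs = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if runs and runs[-1][0] == cls:
            runs[-1][1] += ch
        else:
            runs.append([cls, ch])
    return [(cls, chunk) for cls, chunk in runs]


# Hypothetical sample: an Arabic word, a number, then an English word.
sample = "\u0639\u0627\u0645 2006 Drupal"

for cls, chunk in bidi_runs(sample):
    # AL = Arabic letter, EN = European number, L = left-to-right letter,
    # WS = whitespace
    print(cls, repr(chunk))
```

The digits and the Latin word run left-to-right inside a right-to-left
line, and the spaces between them take their direction from their
neighbours. That is where the visual order starts to differ from what
you typed.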

The rule of thumb: whatever you estimate for doing this, multiply it by
ten and go by trial and error. At least for the first project or two.

Other issues: Arabic in URLs does not work well. If you want pathauto,
for example, to put Arabic in the alias, then think twice: MS IE and
Firefox will treat these differently (MS IE leaves the letters as Arabic,
Firefox percent-encodes them to %CE, etc.)
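
To show what the encoded form looks like, and why two clients can
disagree, here is a sketch using Python's urllib.parse. The alias and
the same_alias helper are hypothetical, and this is not how pathauto or
Drupal handle aliases:

```python
from urllib.parse import quote, unquote

# Hypothetical alias: the Arabic word "marhaba", written with escapes.
alias = "\u0645\u0631\u062d\u0628\u0627"

# Percent-encode the UTF-8 bytes (the default for quote()).
as_utf8 = quote(alias)
print(as_utf8)  # %D9%85%D8%B1%D8%AD%D8%A8%D8%A7

# The same alias percent-encoded from windows-1256 bytes looks different.
as_cp1256 = quote(alias, encoding="cp1256")
print(as_cp1256)


def same_alias(a, b, encoding="utf-8"):
    """Compare two aliases after undoing any percent-encoding."""
    return unquote(a, encoding=encoding) == unquote(b, encoding=encoding)


print(same_alias(alias, as_utf8))      # raw vs. UTF-8 encoded: True
print(same_alias(as_utf8, as_cp1256))  # mixed encodings: False
```

So the server may see the same alias arrive in more than one spelling,
and unless it knows which encoding the client used, it cannot reliably
match them up.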

> Although you would get this by testing Arabic as well.  So we'd solve a
> lot of problems if we'd just develop Drupal in Arabic and then localize
> it into English, come to think of it.

The way I see Arabic being used in Drupal (as well as Hebrew, being the
other Semitic language in use today) is that sites are either
monolingual or "indifferent" lingual.

By monolingual, I mean that the site is purely Arabic (e.g.
http://csc-sy.net/). By indifferent, I mean a site with an English front
end whose content can be either Arabic or English (e.g.
http://manalaa.net or http://foolab.org). This is mostly true for blogs.

