On Wed, 10 Oct 2007 16:29:19 +0200 "Gábor Hojtsy" <gabor@hojtsy.hu> wrote:
- currencies are easy if the module in question has that in t(), like t('@symbol @amount') can be translated to '@symbol@amount' or '@amount@symbol', although I am not sure you mean this by currencies
I'll have to dig deeper into t(). I'll see whether I can bend it to my needs, but the following paragraph doesn't give me much hope.
- float numbers are printed with number_format(), or they should be at least, which does not support different formatting based on languages. Feel free to suggest new functionality to do this, which can be tested in real life scenarios in D6 and integrated to D7.
- date formats are Drupal settings and as such are translatable with the settings translations features of i18n and localizer
Uh, you gave me the chance to pontificate... why should I miss it? ;) Ready for a full encyclical?

I used i18n in the past (4.7?) and had mixed feelings. I've read about localizer and I like some of its features and some of its author's ideas, mainly that the different language versions of the same web site should be very lightly coupled.

First, I'd say there are three kinds of users of a CMS: readers, editors and programmers.

To sum up what I expect:
- currency, dates and floats in the localised format (measures too? maybe lbs to kg later? why not?)
- avoiding duplication of code ( t() + the former should be enough )
- building bridges between content in different languages (more on this later)
- SEO ( people won't actually remember URLs (nearly sure) and will rarely read them (less sure) )

Having a multi-language site means:
- avoiding code duplication [ t() + localised numbers/dates ]
- helping people switch language

I don't think editors need (or may have) too much help (more on this later).

Avoiding code duplication helps programmers. Helping people switch language is useful only for people who know more than one language. I think people switch because they find something in one language and would like to know whether there is more or different content on the same topic in another language. The switch may be from mother tongue to a foreign language, the opposite, or between foreign languages. Say you're Italian and you know French: you search in French (because you know the topic is *very French*), then you see there is Italian content, etc. You land on Italian content, then you see there is more material in English and you'd like to switch... People look for the same stuff they were looking for, just in another language, so you have to provide bridges between similar or, when possible, identical content across languages.

In my experience different versions of a web site may grow at completely different speeds, have slightly different content etc...
so the coupling should be very light.

So I've taken care of programmers and readers. Now a brief parenthesis on editors. If editors ignore that their site is multi-language, each version will go its own way, so it won't really be a multi-language site: you just have to provide facilities to programmers, and content will be so lightly coupled that there won't be any automatic way to help people switch between languages. If editors are conscious that the site is multi-language, they are the ones who will need the facilities to build bridges between languages.

OK... let's get a bit more technical and provide some details.

First let me court the category I belong to. Programmers don't provide content, they provide the interface. The interface has little content to be translated. Programmers shouldn't be distracted by translations while coding. I think functions that format the usual things according to the current language could be introduced smoothly. People will just have to wrap stuff in those functions. If they do, they get localised output; if they don't, they get what they had before, just as you'd expect for t() vs. plain strings. Everything, including modules, can be updated gradually.

t() is perfect as it is, and I'd consider t() just for the interface and not for the content. IDs are too hard to remember:
t('this_is_the_id_of_a_sentence_that_was_shorter_than_the_id')
Maybe one transparent addition to t() could be to change the "base" language, so that people could write something like t('frase','it') and programmers could concentrate on coding rather than thinking in a language that's not their mother tongue. I understand the extra cost of adding indexes to the translation table, but some programmers may appreciate it. I'm not among them.

I need localised formatting shortly and I'm willing to write it.
To me it looks like a no-brainer, if only I knew:
1) where to put it so that those functions are available everywhere
2) which variable should be used to decide which language is set for the session.
I'd like to do it kosher. If it is not clear where to put this stuff, fortunately the code where I need it is well defined, and I can write some terrible hack without too much shame in the module I'm writing for other purposes.

To be clear about the second point: there is content and there is interface, and people may want the interface in their own language while reading content in others. If the reader makes no specific choice, how you decide which interface to use is a bit tricky, but I think it is just a matter of "taste". At first landing you could choose:
1) according to the browser configuration
2) according to the language of the content
But what if the user switches language? Should you change both the interface and the content, or just the content, keeping the interface in sync with the browser configuration? Either way you have to give users a clear path to switch the interface and let them know they can switch content and interface independently.

I would pair the interface (menus etc.) with number, currency... localisation. Separating the localisation of currency, dates and floats from that of the interface is very flexible, but I think it could be confusing. So a function that localises dates, floats etc. should take as an argument a function that decides which format to use according to browser settings, the user profile and a policy defined through settings in the admin interface. In the admin panel it should be possible to add languages and to add format strings for dates, floats etc.

Content language shouldn't be a "variable", as content should be lightly coupled. Content language should be similar to the root of a taxonomy; language facilities should just help people move between different taxonomies/languages.
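
Going back to the formatting functions for a moment, here is a minimal sketch of what I mean by a formatter that takes a "which language?" function as an argument. Everything in it is an assumption made up for illustration: the example_* names, the format table (which would really come from the admin panel) and the policy order (profile, then browser, then site default). It is plain PHP and doesn't touch any Drupal API.

```php
<?php
// Hypothetical per-language formats, standing in for what an admin
// panel would store. The language codes and keys are made up here.
function example_locale_formats() {
  return array(
    'en' => array('dec_point' => '.', 'thousands_sep' => ',', 'date' => 'm/d/Y'),
    'it' => array('dec_point' => ',', 'thousands_sep' => '.', 'date' => 'd/m/Y'),
  );
}

// Hypothetical policy: user profile first, then the first language of the
// browser's Accept-Language header, then the site default.
function example_resolve_language($profile_lang, $accept_language, $default = 'en') {
  $formats = example_locale_formats();
  if ($profile_lang !== NULL && isset($formats[$profile_lang])) {
    return $profile_lang;
  }
  $browser_lang = strtolower(substr(trim($accept_language), 0, 2));
  return isset($formats[$browser_lang]) ? $browser_lang : $default;
}

// The formatters take the resolver as a callback instead of a language code.
function example_format_float($number, $resolver, $decimals = 2) {
  $formats = example_locale_formats();
  $format = $formats[call_user_func($resolver)];
  return number_format($number, $decimals, $format['dec_point'], $format['thousands_sep']);
}

function example_format_date($timestamp, $resolver) {
  $formats = example_locale_formats();
  $format = $formats[call_user_func($resolver)];
  return gmdate($format['date'], $timestamp);
}

// Example resolver: no profile language, Italian browser.
function example_italian_visitor() {
  return example_resolve_language(NULL, 'it-IT,it;q=0.8,en;q=0.5');
}

echo example_format_float(1234567.891, 'example_italian_visitor'), "\n"; // 1.234.567,89
echo example_format_date(gmmktime(12, 0, 0, 12, 25, 2007), 'example_italian_visitor'), "\n"; // 25/12/2007
```

Code that doesn't wrap its numbers in such functions keeps printing what it printed before, which is the gradual upgrade path I have in mind. Where the resolver should really read the session language from is exactly the open question in point 2) above.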
People should be helped to land on the "right" language and then helped to stay focused (filter by language) on the content in that language for as long as they want (it.site.com, en.site.com, site.com, site.it, browser setting etc... + collapse taxonomies). Some stuff isn't actually categorised, so "language" shouldn't literally be the root of different taxonomies: you may have important content that you didn't have the chance to translate and that you want to show to everyone too. You assign a language or a jolly (a wildcard) to every node and a language or a jolly to every vocabulary, then you build bridges between taxonomies and nodes. So language should be a property of vocabularies and nodes, and there should be a jolly language. The jolly language should mean "not filtered by language".

Let's see what I'd expect when people switch language. First, there should be a language selector block. People could be browsing nodes or taxonomies. When they switch, they should land on the most similar content in the other language. You could have a one-to-one mapping between nodes, or you could slide to the nearest bridge that editors built between taxonomies. As if you had two different trees, you could decide to make a one-to-one mapping between leaves and branches, or you could link just some branches of the two taxonomies. The algorithm should go down towards the root until it finds a bridge and cross it. So if there is no perfect match, the user will be taken to the most similar content available.

Let's come to the editors... Editors have three tasks:
- writing nodes
- building the taxonomies
- building bridges between taxonomies
Oversimplifying, you may have people responsible for just one or for several tasks, with different access rights. So it is important to decouple these activities as much as possible, with an eye on performance too.

I think localised URLs have lower priority, but they are a strategic decision and may be a PITA.
I don't think people remember URLs. They may remember http://www.mysite.com/blog or http://www.mysite.com/gallery, but once you get to things like http://www.mysite.com/literature/modern/italian/war/biographies/ I doubt people can really remember them... But there is a but:
a) search engines do read URLs to rank pages
b) some people do read URLs to rank pages
For high ranking, content language and URL language should match. Automatic URL translation makes no sense, just as automatic taxonomy translation makes no sense.

The i18n module used a cheap trick to build a bridge between nodes: the same URL with a language prefix. No DB access, and a 404 if nothing is there. You could build a custom 404 to redirect the user to some related material in the other language... etc... With URL translation you have to add a DB access.

From the editor's point of view there is no difference... or maybe a new system may even be better; maybe the new i18n already implements something similar. Editors who were aware of working on a multi-language site with the previous i18n system had to know the URL of the article about to be translated and put it into the path alias, so there won't be any extra cost in finding the original article and adding a button and a drop-down to "add a translation", just to add a link between the two nodes. Engineering the DB structure to store this link efficiently shouldn't be terribly hard. The taxonomy could be filled in automatically with the best match, reached through the nearest bridge between the two taxonomies. The system should provide a way to add such a link not just at creation time. A good interface for linking two nodes may be a bit trickier.

Similarly, there should be a way to link different branches of taxonomies. Not all branches need to be linked: you could build a one-to-one match, or the taxonomies may grow semi-independently. An algorithm should go down the tree until it finds a bridge to cross and present the content on the other tree.
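
To make the "go down the tree until you find a bridge" idea concrete, here is a rough sketch in plain PHP. The data structures are assumptions of mine, not anything i18n or localizer provides: a parent map for the terms of one taxonomy, and a map of the bridges editors have built, keyed by term and target language. The function name and the term names are made up too.

```php
<?php
// $parents maps each term to its parent term (NULL for a root).
// $bridges maps a term to its editor-made counterparts, keyed by language.
// Both structures and all term names are hypothetical.
function example_find_bridge($term, $target_lang, $parents, $bridges) {
  $seen = array();
  while ($term !== NULL && !isset($seen[$term])) {
    $seen[$term] = TRUE; // guard against accidental cycles in the data
    if (isset($bridges[$term][$target_lang])) {
      return $bridges[$term][$target_lang];
    }
    $term = isset($parents[$term]) ? $parents[$term] : NULL;
  }
  return NULL; // no bridge on the way to the root
}

$parents = array(
  'it:letteratura' => NULL,
  'it:moderna'     => 'it:letteratura',
  'it:biografie'   => 'it:moderna',
);
$bridges = array(
  'it:letteratura' => array('en' => 'en:literature'),
);

// No bridge on 'it:biografie' or 'it:moderna', so the walk ends at the root's bridge.
echo example_find_bridge('it:biografie', 'en', $parents, $bridges), "\n"; // en:literature
```

If an editor later adds a bridge on 'it:moderna', the same call lands on the closer match without any other change, which is why the trees can grow semi-independently. A NULL result would be the case for falling back to a front page or a custom 404.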
Editors should edit menus too... but menus aren't taxonomies. They may contain taxonomies, but they aren't taxonomies. Menus may differ between languages (maybe some services aren't provided in all languages). Again, I'd have localised menus merged with a jolly menu and taxonomies. In a multi-language site, tools for mass editing of taxonomies may come in handy: language versions tend to diverge, then you try to make them converge once more and you may have to move branches of a taxonomy around... but this is "independent" of the "core" localisation functions. Localised menus will be "hand written" and have an assigned language. The jolly menu may use t(). And taxonomy menus will follow the same rules as taxonomies. Menus, with the exception of taxonomy menus, aren't so large that you can't deal with them manually.

The cherry on top would be to signal whether there are one-to-one matches for nodes, so you know whether switching language will take you to the same content or just to the most similar content. At the interface level that could be done by highlighting the flags in the switching block if there is matching content in those languages. But once you have a function that tells you there is matching content, everyone can build the interface of their choice.

I see media duplication just as a problem of the node interface. If people can choose between uploading to the server and picking a file already on the server, there shouldn't be any media duplication across languages, unless the media have to be different.

Oh... and someone more knowledgeable should take RTL languages into account too. I've no idea of the problems involved in that kind of localisation. I wrote a few pages in Simplified Chinese for a multi-language website and didn't have any problem, so Chinese at least, as an exotic language for us Europeans, doesn't seem to be an issue. I didn't test localisation functions for dates and currency; I didn't need them. I remember HKLUG uses Drupal too, with the i18n module.
They helped me install a Chinese-localised Linux for my wife, and they suggested a great Linux router to me, mmm, more than 6 years ago I think.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it