[drupal-devel] Help system redesign
Last night there was a discussion in #drupal regarding the help system, and the need for an overhaul therein. Dries asked me to send a note to this list with what was discussed and the thoughts on where to go with it. This is a developer, documenter, and translator issue, so input from all three parties is welcome.

Problems with the current help system, viz., hook_help()
-----------------------------------------------------------------------

- Huge blocks o' text in the source code file.
- Because the strings are large, translation of help files via PO is slooooow.
- Synchronization with the online handbook is tedious and difficult, which means it sometimes doesn't get done.

Proposed solution
-----------------------

Replace hook_help() with a database-based help lookup system that incorporates translations directly in the database. The database table would look something like this:

    help (
      hid,
      module,
      key,
      locale,
      helptext
    );

The key would be a short textual string that identifies the help text in question, serving the same function as $section in hook_help(). (Possibly with the same naming scheme, possibly not.)

Then, hook_help() would be replaced with a function help_get():

    /* Example function, missing lots of necessary error checks */
    function help_get($module, $key, $locale='en') {
      $help = db_fetch_object(db_query("SELECT helptext FROM {help} WHERE module='%s' AND key='%s' AND locale='%s'", $module, $key, $locale));
      return $help->helptext;
    }

What gets returned, then, is an already-localized string, ready for printing/theming/whatevering. The help lookup is always a single database hit, no slow textual substr() matching needed. That also means that PO files are much smaller, and module source code doesn't have to have giant masses of text in it.

Issues and solutions
--------------------------

Links: I'd say the solution here is to have links flagged in the database for easy conversion by help_get() before they're returned. For example:

    ... Go to the <link url="admin/modules">modules page</link> to enable...

help_get() would then, before returning the string, check for <link> tags, extract the data, and replace it with the output from an l() call. I'm not sure of the efficiency of the syntax above, but something along those lines should be easiest. (Good place for a regex?)

Data loading: Of course, there needs to be some way to get helptext into the database in the first place. That would require an auxiliary file along with the .module file, say .help. See below for comments on the format of .help. When the module is enabled on admin/modules, the .help file is parsed into the help table once. When the module is disabled, the help table is cleared of any records for that module to avoid cruft. While this does make enabling and disabling a module a bit slower, that's not done often enough that it should make any practical difference.

We would need to then make sure that the module's name and short description are available in the module itself, as those are needed on the admin/modules page. The simplest solution would be to have a hook_module_info() function, which returns an array of the form:

    array('name' => 'node',
          'description' => 'Allows content to be submitted to the site and displayed on pages.');

Those text strings are small enough that they can be handled by the current translation system. admin/modules would then include every .module file (as now) and then call the _module_info() hooks to get the data to display.

Data format, data entry, handbook sync
--------------------------------------------------

These all dovetail together, so I'm going to discuss them all at once.

- The Docs team wants to be able to write help text for core and contrib modules without futzing about with MySQL dumps or some other obscure format. A web-frontend is preferred.
- Developers need to be able to enter their own help text without going through the drupal.org docs team. That's especially important for custom modules that need help text but don't end up on Drupal.org for whatever reason. (Custom for a client, commercial modules, etc.)
- The Translation people don't want to have to deal with obscure file formats either.
- Keeping the text in the online handbook in sync with the in-system text in the actual module should be sufficiently trivial that no one thinks to not do so.

That's a tall order, and a problem that hasn't been solved yet. :-)

My recommendation for the intermediary help files would be an XML format, one reasonably easy to import/export from the online handbook and reasonably easy to hand-edit. The Docs and Translation teams could then edit the text in the online handbook and export it to the necessary XML files, while developers can write the XML file directly via their XML editing method of choice. The file needn't be complicated. For instance:

    <drupal:help module='node' locale='en'>
      <drupal:entry key="description">Some <b>HTML</b> here.</drupal:entry>
      <drupal:entry key="stuff">Some other text.</drupal:entry>
    </drupal:help>

When the module is enabled, the XML file is parsed into the help table, as mentioned. It's a one-time event, so performance is a non-issue. There would be a separate file for each locale. So the node module, for instance, would have:

    node.module
    node.mysql
    node.help
    node.es.help
    node.de.help
    ...

As now, English would be the default language. Besides being easier to edit, using XML files instead of an SQL dump keeps the help text database-independent. Otherwise, we'd need separate mysql and pgsql files for each locale.

Other thoughts
-------------------

I have not worked with the current help system that extensively, so I'm not sure how it handles context-sensitive help. By that I mean "What's this?" type help, or other "what do I do on this page" help text. One advantage to this method is that keys reserved for the main help system could be pre-defined, while a module author would be free to add additional help keys for his module. So on the foobar/4/edit page, the module author could write:

    $output .= theme('help_link', 'foobar', 'edit');

Which would return a formatted link to context-appropriate help, such as:

    <div class="help-link"><a href="help/foobar/edit">Help on this page</a></div>

help/foobar/edit then would trigger the callback for 'help', which would dutifully display the theme('help') for module 'foobar', key 'edit'. Both of those could then be re-themed to allow pop-up help windows, javascript overLib (http://www.bosrup.com/web/overlib/) calls, or whatever else a theme-author felt like doing.

Some more thought here is definitely needed, but I'd love to see more non-admin-user context sensitive help in Drupal.

Conclusion
--------------

I'm sure at least some of the above is insanely stupid, so please point out where gently. :-) Dries said he suspects this would be a 4.8-targeted improvement, which I think is likely given the amount of legacy code and how close we are to 4.7's release. Devs, Docs, and Trans, your input please!

My thanks to Dries, Amazon, webcheck, killes, and everyone else who was in the channel last night for this discussion.

*dons flame retardant suit*

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
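A minimal sketch of the <link> conversion floated in the message above, assuming each tag carries exactly one double-quoted url attribute. The names sketch_l(), sketch_link_callback() and sketch_help_expand_links() are hypothetical, and sketch_l() is only a stand-in for Drupal's l(), which does considerably more. It suggests a regex is a workable fit here:

```php
<?php
// Hypothetical stand-in for Drupal's l(); it only builds a bare anchor tag.
function sketch_l($text, $path) {
  return '<a href="/' . htmlspecialchars($path, ENT_QUOTES) . '">' . $text . '</a>';
}

// Callback for preg_replace_callback(): $m[1] is the url, $m[2] the link text.
function sketch_link_callback($m) {
  return sketch_l($m[2], $m[1]);
}

// Replace every <link url="...">text</link> in a help string with an anchor.
function sketch_help_expand_links($helptext) {
  return preg_replace_callback('#<link\s+url="([^"]*)">(.*?)</link>#s', 'sketch_link_callback', $helptext);
}

echo sketch_help_expand_links('Go to the <link url="admin/modules">modules page</link> to enable it.'), "\n";
```

Text without any <link> tags passes through unchanged, so help_get() could run every string through such a filter before returning it.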
Hey Larry

Thanks for taking the time to update those who do not frequent IRC on this important topic.

Some comments:

1. Use of <link>

The <link>blah</link> is not something I like to see. This is nonstandard, even though it will not make it into the final rendered HTML as is. I prefer something that is syntactically valid, such as <div class="drupalhelp">blah</div>.

A better option would be a call to a function, e.g. drupal_help('key');

2. XML

While everyone is using XML today, and it is very much in vogue, do we really need it? Wouldn't a simple:

    key1:text goes here
    key2:other text goes here

be just as adequate?

3. .po files

Where are the .po files in all this? If we are using a modulename.es.help, will that contain the help only, or all the strings? If it is the help only, then translating a module will require two steps (one using the locale string translation, and another using the .help file). If it is for all strings, then does this replace t(), or just augment it? Are we confusing people by having more than one interface?

Here is a late night thought: can t() be extended so that if it is passed an array with a 'key', it would look up a string by that key rather than by the string itself?

Something that ties into t() would be great, for the sake of ease of use, unification of interface, and not confusing developers/documenters...

Regards
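The late night thought above could look roughly like the following. This is only a sketch under stated assumptions: sketch_t() and sketch_translation_table() are hypothetical names, the table is a hard-coded array with made-up sample data rather than the real locale storage, and the explicit $locale parameter is not part of Drupal's t().

```php
<?php
// Hypothetical translation table; in Drupal this data would live in the database.
function sketch_translation_table() {
  return array(
    'fr' => array(
      'Open the door.' => 'Ouvrez la porte.',
      'key:stuff' => 'Et maintenant, ouvrez la porte.',
    ),
  );
}

// Accepts a plain English string, or array('key' => ..., 'default' => ...).
function sketch_t($string, $locale = 'en') {
  $table = sketch_translation_table();
  if (is_array($string)) {
    // Keyed lookup: the short key replaces the full English text as the index.
    $lookup = 'key:' . $string['key'];
    $fallback = isset($string['default']) ? $string['default'] : '';
  }
  else {
    $lookup = $string;
    $fallback = $string;
  }
  if (isset($table[$locale][$lookup])) {
    return $table[$locale][$lookup];
  }
  return $fallback;
}

echo sketch_t('Open the door.', 'fr'), "\n";
echo sketch_t(array('key' => 'stuff', 'default' => 'And now, open the door.'), 'fr'), "\n";
```

The 'key:' prefix is just one way to keep keyed entries apart from full-string entries in the same table; a separate column would serve equally well.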
On Saturday 24 September 2005 11:25 pm, Khalid B wrote:
Hey Larry
Thanks for taking the time to update those who do not frequent IRC on this important topic.
Some comments:
1. Use of <link>
The <link>blah</link> is not something I like to see. This is nonstandard, even though it will not make it into the final rendered HTML as is. I prefer something that is syntactically valid, such as <div class="drupalhelp">blah</div>.
A better option will be a call to a function: e.g. drupal_help('key');
Well, the problem with putting a function call into the help text is that we'd then have to eval() it. I've not looked at Drupal's eval wrapper, but I generally shy away from eval() for security reasons. Also, mixing PHP code with the help text is what we're trying to not do.
2. XML While everyone is using XML today, and it is very much in vogue, do we really need it? Wouldn't a simple:
key1:text goes here
key2:other text goes here
Be just as adequate?
Probably. As long as it's easily hand-editable it should work fine. I suggested XML because of the plethora of editing tools available and its good design vis-à-vis escape-needing characters. A colon-delimited text file or some such would work too, as long as we handle escaping of real colons in the string.
3. .po files
Where are the .po files in all this? If we are using a modulename.es.help, will that contain the help only, or all the strings? If it is the help only, then translating a module will require two steps (one using the locale string translation, and another using the .help file). If it is for all strings, then does this replace t(), or just augment it? Are we confusing people by having more than one interface?
The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit. The idea was for modulename.es.help to contain the full translated text. So (using the XML example before) module.help would read:

    <drupal:entry key="stuff">And now, open the door.</drupal:entry>

While module.fr.help would read:

    <drupal:entry key="stuff">Et maintenant, ouvrez la porte.</drupal:entry>
Here is a late night thought: Can t() be extended so that if it is passed an array, with 'key', then it would lookup a string that has the key passed rather than just the string?
I don't know, as I've never used PO files myself. :-) My only use for translation so far personally has been to customize the built-in text for a client by creating a fake locale and "translating" just a few strings to have more app-specific text. Translation people, any thoughts here?
Something that ties into t() would be great, for the sake of ease of use, unification of interface, and not confusing developers/documentors...
Regards
On Sat, Sep 24, 2005 at 11:52:46PM -0500, Larry Garfield wrote:
On Saturday 24 September 2005 11:25 pm, Khalid B wrote:
2. XML While everyone is using XML today, and it is very much in vogue, do we really need it? Wouldn't a simple:
key1:text goes here
key2:other text goes here
Be just as adequate?
Probably. As long as it's easily hand-editable it should work fine. I suggested XML because of the plethora of editing tools available and its good design vis-à-vis escape-needing characters. A colon-delimited text file or some such would work too, as long as we handle escaping of real colons in the string.
Assuming that keys do not contain colons (which is quite safe to assume/impose) you don't need any escaping. Everything before the first : is the key, everything after it is the text...
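A minimal sketch of that first-colon rule, assuming one entry per line (multi-line help text would need some further convention, which is not handled here). The function name sketch_parse_help_lines() is hypothetical:

```php
<?php
// Parse "key:text" lines; only the first colon separates key from text.
function sketch_parse_help_lines($contents) {
  $entries = array();
  foreach (preg_split('/\r\n|\r|\n/', $contents) as $line) {
    if (trim($line) === '' || strpos($line, ':') === false) {
      continue; // skip blank lines and lines without a key
    }
    // A limit of 2 leaves any later colons untouched inside the text.
    list($key, $text) = explode(':', $line, 2);
    $entries[trim($key)] = trim($text);
  }
  return $entries;
}

print_r(sketch_parse_help_lines("description:Allows content: nodes and pages.\nstuff:Other text"));
```

Because explode() is given a limit of 2, colons inside the help text need no escaping, which is the point being made above.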
The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit.
How are they slower? -- Piotrek irc: #debian.pl Mors Drosophilis melanogastribus!
On 9/25/05, piotrwww@krukowiecki.net <piotrwww@krukowiecki.net> wrote:
On Sat, Sep 24, 2005 at 11:52:46PM -0500, Larry Garfield wrote:
On Saturday 24 September 2005 11:25 pm, Khalid B wrote:
2. XML While everyone is using XML today, and it is very much in vogue, do we really need it? Wouldn't a simple:
key1:text goes here
key2:other text goes here
Be just as adequate?
Probably. As long as it's easily hand-editable it should work fine. I suggested XML because of the plethora of editing tools available and its good design vis-à-vis escape-needing characters. A colon-delimited text file or some such would work too, as long as we handle escaping of real colons in the string.
The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit.
How are they slower?
I think I am with Dries that a language web editing front end is probably the most important thing right now. Adding a new language could be a SQL dump and module. The main function of the module could just be to add its language to the various language selection menus.

One useful thing could be to get this language web editor working on Drupal.org, so various people can collaborate to create the translations. It may be useful to have some kind of 'notes' feature, where translators can add notes about a particular change (e.g. I changed 'ty' to 'uchaf' as this link takes users up a level, rather than to home).

If any import/export of translations is worked on, I would suggest that an XLIFF [1] standards-based XML format would be better than:

* a 'roll your own' XML format: as XLIFF translation tools are already available.
* PO files: as XML has, I think, better facilities for full description of international character sets.

- Grug

[1] http://developers.sun.com/dev/gadc/technicalpublications/articles/xliff.html
http://www.oasis-open.org/committees/xliff/documents/cs-xliff-core-1.1-20031...
I am concerned about fragmentation: different ways of handling things. The documenter and translator have to learn several interfaces for doing the same thing. Unifying the interface on all PO or all NewSystem is better than learning and maintaining two sets of formats, tools, etc. XML is nice and all, but if all we need is key->data then it is overkill. If we use PO for help, the only drawback I see is that English will not have a translation by default, and will need to go through the translation hoops. I guess the ultimate idea has yet to be found in this discussion.
I think I am with Dries that a language web editing front end is probably the most important thing right now. Adding a new language could be a SQL dump and module. The main function of the module could just be to add its language to the various language selection menus.
One useful thing could be to get this language web editor working on Drupal.org, so various people can collaborate to create the translations. It may be useful to have some kind of 'notes' feature, where translators can add notes about a particular change (e.g. I changed 'ty' to 'uchaf' as this link takes users up a level, rather than to home).
This is all possible with CVS, but of course CVS is not easy enough for quite some folks. You get commit messages exactly as you described, plus you have version control, which is already proven. With simple text files (PO files) the diffs are even easier to read than with XML files, where you reindent and all the lines suddenly change. It is unfortunate that CVS is not easy for quite some people, but IMHO it does not mean we should create a whole new system. The Tortoise tools for Windows are quite good at simplifying CVS/SVN.
If any import/export of translations is worked on I would suggest a XLIFF [1] standards based XML format would be better than: * a 'roll your own' XML format: as XLIFF translation tools are already available. * PO files: as XML has, I think, better facilities for full description of international character sets.
Note that we have no need to deal with international charsets, as we only support utf8, nothing else. PO editors do utf8 well. Please let us know the advantages of switching formats, switching desktop tools, and writing new import/export code along the lines of your suggestion. What is not possible now? What is going to be better?
Goba
On 9/25/05, Gabor Hojtsy <gabor@hojtsy.hu> wrote:
This is all possible with CVS, but of course CVS is not easy enough for quite some folks. You get commit messages exactly as you described, plus you have version control, which is already proven. With simple text files (PO files) the diffs are even easier to read than with XML files, where you reindent and all the lines suddenly change. It is unfortunate that CVS is not easy for quite some people, but IMHO it does not mean we should create a whole new system. The Tortoise tools for Windows are quite good at simplifying CVS/SVN.
[snip]
Note that we have no need to deal with international charsets, as we only support utf8, nothing else. PO editors do utf8 well. Please let us know the advantages of switching formats, switching desktop tools, and writing new import/export code along the lines of your suggestion.
I don't think there is any need to abandon PO as a format for translators to work in. My main point was that if the text is going to be moved to the database (as per the original suggestion) then an online translation tool is an obvious next step (see below). We would have to write import code to import the existing translations anyhow!

As for XML, I just wanted to point out that a standard already exists, and we should avoid making up our own (if we choose to support XML import/export) unless there are good reasons. I don't have much knowledge (or opinions) about PO vs. XLIFF, but http://mail.gnome.org/archives/gnome-i18n/2003-October/msg00032.html seems to have some interesting points.
What is not possible now? What is going to be better?
I think you explained it yourself in your first paragraph. The technical hurdles to learning translation tools and CVS are quite high. While there is no problem with what we have now (other than those suggested in Larry's original summary), there are always non-technical people who want to contribute to open source projects, and translation is an ideal opportunity to allow this. I feel that having the option of using a web-based tool to allow this is good. If I18N is important to Drupal (and it obviously widens the potential market) then the volume of translation will have to increase exponentially, and hence the more accessible we can make the translation the better.
- Grug
I don't think there is any need to abandon PO as a format for translators to work in. My main point was that if the text is going to be moved to the database (as per the original suggestion) then an online translation tool is an obvious next step (see below). We would have to write import code to import the existing translations anyhow!
As for XML, I just wanted to point out that a standard already exists, and we should avoid making up our own (if we choose to support XML import/export) unless there are good reasons. I don't have much knowledge (or opinions) about PO vs. XLIFF, but http://mail.gnome.org/archives/gnome-i18n/2003-October/msg00032.html seems to have some interesting points.
Seems like you really don't know how things are done now. We already store translation strings in the database; in fact, original English strings are stored a multitude of times along with translations. And then we already have an import/export facility for PO files. Note that once we started to support PO files (instead of using a web interface), the number of translations dramatically increased, very much surpassing our expectations.

The fact that XLIFF might be better for someone than PO (explained at the link you posted above) does not mean it is better for us. Someone who knows both needs to do a comparison. Show us the needs in Drupal which XLIFF fulfills, and which are not possible with PO files.
What is not possible now? What is going to be better?
I think you explained it yourself in your first paragraph. The technical hurdles to learning translation tools and CVS are quite high. While there is no problem with what we have now (other than those suggested in Larry's original summary), there are always non-technical people who want to contribute to open source projects, and translation is an ideal opportunity to allow this. I feel that having the option of using a web-based tool to allow this is good. If I18N is important to Drupal (and it obviously widens the potential market) then the volume of translation will have to increase exponentially, and hence the more accessible we can make the translation the better.
I have seen the number of translations increasing once we started to provide a PO import/export interface, despite the CVS storage. Sure, there were translations imported by Gerhard and others for the translators. Going on a completely different route does not solve current problems in itself; in fact, it might be harder for people to adapt.
Goba
The fact that XLIFF might be better for someone than PO (explained at the link you posted above) does not mean it is better for us. Someone who knows both needs to do a comparison. Show us the needs in Drupal which XLIFF fulfills, and which are not possible with PO files.
I don't actually know anything about either translation process, however I did come across this link in my research about what PO and XLIFF are and there are quite a few pros/cons laid out here, as well as use cases from someone at Sun and Novell: http://mail.gnome.org/archives/gnome-i18n/2003-October/msg00022.html
The fact that XLIFF might be better for someone than PO (explained at the link you posted above) does not mean it is better for us. Someone who knows both needs to do a comparison. Show us the needs in Drupal which XLIFF fulfills, and which are not possible with PO files.
I don't actually know anything about either translation process, however I did come across this link in my research about what PO and XLIFF are and there are quite a few pros/cons laid out here, as well as use cases from someone at Sun and Novell:
http://mail.gnome.org/archives/gnome-i18n/2003-October/msg00022.html
Angie, unfortunately the questions raised at the beginning of that thread are not relevant to us. We are not using the binary counterparts of PO (MO) at all, we have no interest in the capabilities of the library provided (i.e. memory consumption, etc., as we are not going to use that Java lib :), and thus we know how partial translations, etc. are handled in our implementation :)
Goba
Wow this thread picked up steam today. :-) To comment on all the replies to date so far, in no particular order:

1) I freely admit that my knowledge of PO files is virtually nil. I suppose they could serve as the delivery file format, but how many developers know a PO editor? I'd never even heard of one until about 2 weeks ago, but have been writing PHP for over 5 years. Amazon insists that the Docs team will be writing all or nearly all of the help text anyway, but I still believe we should make it as easy as possible for non-Docs Team developers to write their own help text. Remember, not all Drupal modules end up on drupal.org! I mention XML as an alternative because developers are far more likely to know XML than PO, I think.

2) I wasn't aware of the way in which translations are cached now. If, as Khalid suggested, we could modify t() to handle lookups using a short key as well as the full English text, perhaps we could leverage the existing cache mechanism and let it do whatever it does? That would be necessary to achieve the goal of getting big blocks o' text out of the .module file, as well as making the matching faster (see next).

3) The statement that very-long-string PO lookups are slow is based on comments killes made in the aforementioned IRC discussion. I will take his word on it unless someone can show otherwise. :-)

4) I firmly believe that any new help system should be designed in such a way as to encourage developers to provide context sensitive help. Even if most of the context sensitive help *text* is written by the Docs and Translation teams, developers should be encouraged to include the hooks for it in the first place. I suspect they'll want to provide at least an effort at text for it, if only to not ship code that reads "help goes here", so again some developer-friendly way of creating said text should be included. Again, that's especially important for non-drupal.org modules that the Docs and Translation teams won't get a crack at either way.

On Sunday 25 September 2005 10:48 am, Gabor Hojtsy wrote:
I don't think there is any need to abandon PO as a format for translators to work in. My main point was that if the text is going to be moved to the database (as per the original suggestion) then an online translation tool is an obvious next step (see below). We would have to write import code to import the existing translations anyhow!
As for XML, I just wanted to point out that a standard already exists, and we should avoid making up our own (if we choose to support XML import/export) unless there are good reasons. I don't have much knowledge (or opinions) about PO vs. XLIFF, but http://mail.gnome.org/archives/gnome-i18n/2003-October/msg00032.html seems to have some interesting points.
Seems like you really don't know how things are done now. We already store translation strings in the database, in fact original English strings are stored a multitude of times along with translations. And then we already have an import/export facility for PO files. Note that once we started to support PO files (instead of using a web interface), the number of translations dramatically increased, very much passing our expectations.
The fact that XLIFF might be better for someone than PO (explained at the link you posted above) does not mean it is better for us. Someone who knows both needs to do a comparison. Show us the needs in Drupal which XLIFF fulfills, and which are not possible with PO files.
What is not possible now? What is going to be better?
I think you explained it yourself in your first paragraph. The technical hurdles to learning translation tools and CVS are quite high. While there is no problem with what we have now (other than those suggested in Larry's original summary), there are always non-technical people who want to contribute to open source projects, and translation is an ideal opportunity to allow this. I feel that having the option of using a web-based tool to allow this is good. If I18N is important to Drupal (and it obviously widens the potential market) then the volume of translation will have to increase exponentially, and hence the more accessible we can make the translation the better.
I have seen the number of translations increasing once we started to provide a PO import/export interface, despite the CVS storage. Sure, there were translations imported by Gerhard and others for the translators. Going on a completely different route does not solve current problems in itself; in fact, it might be harder for people to adapt.
Goba
On Sun, 25 Sep 2005, Larry Garfield wrote:
Wow this thread picked up steam today. :-)
I like to think that this is also due to the removal of the issues posted to this list. :p
To comment on all the replies to date so far, in no particular order:
1) I freely admit that my knowledge of PO files is virtually nil. I suppose they could serve as the delivery file format, but how many developers know a PO editor? I'd never even heard of one until about 2 weeks ago, but have been writing PHP for over 5 years. Amazon insists that the Docs team will be writing all or nearly all of the help text anyway, but I still believe we should make it as easy as possible for non-Docs Team developers to write their own help text. Remember, not all Drupal modules end up on drupal.org! I mention XML as an alternative because developers are far more likely to know XML than PO, I think.
Developers don't need to know anything about PO editors as long as they aren't also translators.
2) I wasn't aware of the way in which translations are cached now. If, as Khalid suggested, we could modify t() to handle lookups using a short key as well as the full English text, perhaps we could leverage the existing cache mechanism and let it do whatever it does? That would be necessary to achieve the goal of getting big blocks o' text out of the .module file, as well as making the matching faster (see next).
We are currently using the English text strings directly. We could move to using md5 checksums, if this improves something. I guess introducing an index on the first couple of characters would be helpful as well.
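As a rough illustration of the md5 idea, here is a sketch that keys an in-memory table by the checksum of the English source string instead of by the string itself. The function names are hypothetical and the sample translations are made up; how this would map onto the actual locale tables is left open, and whether it is faster in practice would still need measuring.

```php
<?php
// Build a lookup table keyed by the md5 checksum of each English source string.
function sketch_build_index($pairs) {
  $index = array();
  foreach ($pairs as $english => $translation) {
    $index[md5($english)] = $translation;
  }
  return $index;
}

// Look up a translation by checksum; fall back to the English text on a miss.
function sketch_lookup($index, $english) {
  $key = md5($english);
  return isset($index[$key]) ? $index[$key] : $english;
}

$long = str_repeat('A long paragraph of help text. ', 200);
$index = sketch_build_index(array($long => 'Un long paragraphe.', 'Home' => 'Accueil'));
echo strlen(md5($long)), "\n"; // md5() always yields 32 hex characters
echo sketch_lookup($index, 'Home'), "\n";
```

The lookup key has a fixed length of 32 characters no matter how long the help text is, which is what would make it easy to index.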
3) The statement that very-long-string PO lookups are slow is based on comments killes made in the aforementioned IRC discussion. I will take his word on it unless someone can show otherwise. :-)
My comment was based on the fact that long strings get retrieved directly from the db. Whether the t() function is slower for long strings than for short ones (I suspected this) needs to be proven.
4) I firmly believe that any new help system should be designed in such a way as to encourage developers to provide context sensitive help. Even if most of the context sensitive help *text* is written by the Docs and Translation teams, developers should be encouraged to include the hooks for it in the first place. I suspect they'll want to provide at least an effort at text for it, if only to not ship code that reads "help goes here", so again some developer-friendly way of creating said text should be included. Again, that's especially important for non-drupal.org modules that the Docs and Translation teams won't get a crack at either way.
The way to retrieve help from the database that has been proposed in #drupal would still allow you to hardcode translatable help texts into the module as is done currently.
Cheers, Gerhard
Seems like you really don't know how things are done now. We already store translation strings in the database, in fact original English strings are stored a multitude of times along with translations. And then we already have an import/export facility for PO files. Note that once we started to support PO files (instead of using a web interface), the number of translations dramatically increased, very much passing our expectations.
Agreed. - The English version (original) should be shipped as a .mysql file. - Translations should be using .PO files. Remaining question: - How do module developers and the documentation team write and revise the original English documentation? -- Dries Buytaert :: http://www.buytaert.net/
On Sun, 25 Sep 2005, Grugnog wrote:
On 9/25/05, piotrwww@krukowiecki.net <piotrwww@krukowiecki.net> wrote:
On Sat, Sep 24, 2005 at 11:52:46PM -0500, Larry Garfield wrote:
On Saturday 24 September 2005 11:25 pm, Khalid B wrote: The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit.
How are they slower?
I think I am with Dries that a language web editing front end is probably the most important thing right now. Adding a new language could be a SQL dump and module. The main function of the module could just be to add its language to the various language selection menus.
While I like the idea of adding a web interface for translations, I am still of the opinion that PO files should be used for shipping translations. They are a de-facto standard in the free software world. I have been looking at available web interfaces. There seem to be two: pootle and rosetta. Rosetta isn't really available as it is a hosted service offered by the Ubuntu people. Uri Sharf uploaded some Drupal 4.5 (?) files and there are actually translations there. Sometimes in locales not available on drupal.org. If we want to have a web interface for translations, we can a) use the Ubuntu service or b) set up pootle on our shiny new hardware. For pootle I'll check with the CiviCRM people as they are using it and have expressed interest in collaboration wrt translations. My only problem with pootle is that it is written in Python. :p My main aim is to get more complete (!) translations in more locales. I've been working with translators for some time now and I think the need to maintain translations in CVS is something that limits our growth in this area. Ideally a web service should not only provide a web interface but also allow translators to upload their PO files.
One useful thing could be to get this language web editor working on Drupal.org, so various people can collaborate to create the translations. It may be useful to have some kind of 'notes' feature, where translators can add notes about a particular change (e.g. I changed 'ty' to 'uchaf' as this link takes users up a level, rather than to home).
I'd need to check if and which of the two interfaces offers this.
If any import/export of translations is worked on I would suggest an XLIFF [1] standards-based XML format would be better than: * a 'roll your own' XML format: as XLIFF translation tools are already available. * PO files: as XML has, I think, better facilities for full description of international character sets.
As Goba pointed out, this is not needed. And PO files of course support charsets too. Cheers, Gerhard
Probably. As long as it's easily hand-editable it should work fine. I suggested XML because of the plethora of editing tools available and its good design vis a vis escape-needing characters. A colon-delimited text file or some such would work too, as long as we handle escaping of real colons in the string.
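To make the escaping concern concrete, here is a minimal sketch of a colon-delimited key:value line, written in Python rather than Drupal's PHP. The backslash escape character and the function names `escape_field`, `format_line`, and `parse_line` are assumptions for illustration; no such file format was specified in the thread.

```python
def escape_field(text):
    # Escape the escape character first, then the delimiter, so the
    # backslashes added for colons are not doubled afterwards.
    return text.replace("\\", "\\\\").replace(":", "\\:")

def format_line(key, value):
    return escape_field(key) + ":" + escape_field(value)

def parse_line(line):
    """Split a line on its single unescaped colon and undo the escaping."""
    fields = [""]
    chars = iter(line)
    for ch in chars:
        if ch == "\\":
            # Take the character after a backslash literally.
            fields[-1] += next(chars, "")
        elif ch == ":":
            fields.append("")
        else:
            fields[-1] += ch
    if len(fields) != 2:
        raise ValueError("expected exactly one unescaped colon")
    return fields[0], fields[1]

line = format_line("node:description", "See admin: settings")
print(line)
print(parse_line(line))
```

Even in this small form, every real colon and backslash in the help text has to be escaped by whoever edits the file by hand, which is the cost being weighed against XML and PO files.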
3. .po files
[...]
The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit.
[...]
I don't know, as I've never used PO files myself. :-) My only use for translation so far personally has been to customize the built-in text for a client by creating a fake locale and "translating" just a few strings to have more app-specific text.
Translation people, any thoughts here?
If you only need to have key->value pairs, there is no reason to abandon PO files. If you try to contact some real people doing translation, you will notice that 'easily hand editable with XML editors' is not easy enough. Goba
On Sun, 25 Sep 2005, Gabor Hojtsy wrote:
3. .po files
[...]
The idea is to not use PO files and t() in the first place for the help text, because for larger text strings they're much slower than a simple database hit.
Since we've decided to split up large t()-ed texts into smaller paragraphs this changes. There will in fact be more database hits per page with help texts. I guess we can live with it, though. The obviously better solution would be to use libgettext. We could also consider changing the way the locale caching works. Currently all short strings (< 75 characters) are stored in a serialized array in the cache table. These strings are loaded for each page view. Maybe we should instead store all strings that occur on a particular page similar to the page cache for anon users? This would of course cause a lot of entries in the cache table (node/1, node/2, ...). We could opt to replace all numerics from the stored urls since all node pages (of the same type!) will have the same strings. I guess that needs more thought, but I think it could be made to work. Gain: Strictly one cache hit per page (assuming all strings are known) Currently we can have many db queries if there are long strings on a page. Loss: Storage of redundant information. Cache would need to be invalidated on block changes etc.
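The per-page cache idea could be sketched roughly as follows, in Python rather than Drupal's PHP. The names `page_cache_key` and `strings_for_page`, the `%` placeholder, and the `load_strings` callback (standing in for the per-string database queries) are all hypothetical. The sketch also ignores the node-type caveat raised above: it treats every numeric path segment alike.

```python
def page_cache_key(path, locale):
    """Collapse numeric path segments so node/1 and node/2 share one entry."""
    segments = ["%" if seg.isdigit() else seg for seg in path.split("/")]
    return locale + ":" + "/".join(segments)

# A dict stands in for the cache table (assumption for this sketch).
page_cache = {}

def strings_for_page(path, locale, load_strings):
    key = page_cache_key(path, locale)
    if key not in page_cache:
        # Cache miss: fall back to the slow per-string lookups once.
        page_cache[key] = load_strings(path, locale)
    return page_cache[key]

print(page_cache_key("node/1", "de"))
print(page_cache_key("admin/modules", "de"))
```

Invalidation (on block changes and the like) is left out entirely; in this sketch it would amount to clearing `page_cache`.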
[...]
I don't know, as I've never used PO files myself. :-) My only use for
I suggest at least trying them before suggesting that we not use them. They are a very convenient means for translation and widely used.
translation so far personally has been to customize the built-in text for a client by creating a fake locale and "translating" just a few strings to have more app-specific text.
Translation people, any thoughts here?
If you only need to have key->value pairs, there is no reason to abandon PO files. If you try to contact some real people doing translation, you will notice that 'easily hand editable with XML editors' is not easy enough.
Yeah, I agree with Goba. XML editing generally sucks. My main reason to look for a CMS years ago was to not write html by hand anymore. Don't let it creep in through the backdoor again. Cheers Gerhard
Larry -- thanks for taking this on. This has been needed for some time. Hopefully we can still get this into 4.7 under the guise of usability. On 24-Sep-05, at 8:59 PM, Larry Garfield wrote:
Last night there was a discussion in #drupal regarding the help system, and the need for an overhaul therein. Dries asked me to send a note to this list with what was discussed and the thoughts on where to go with it. This is a developer, documenter, and translator issue, so input from all three parties is welcome.
1. <link> * I agree with Khalid that we should use something XHTML based * any reason not to use <A href=""> directly? Like, <a href="admin/modules" class="drupalhelplink"> 2. XML vs. keys vs. ? * the big thing here is support of tools; XML sounds better than a completely non-standard colon delimited file * is there any way that PO files can be used here as well, especially for import/export? would be nice to have just one tool needed Sounds very excellent for the support of install profiles, where you could have different wording of help. I guess we would ship a database-help-en.sql file? Cheers, -- Boris Mann http://www.bmannconsulting.com
Thanks for the write-up. On 25 Sep 2005, at 09:01, Boris Mann wrote:
2. XML vs. keys vs. ? * the big thing here is support of tools; XML sounds better than a completely non-standard colon delimited file * is there any way that PO files can be used here as well, especially for import/export? would be nice to have just one tool needed
I'd leave the XML format out of this for the time being. We can export and import help texts using SQL dumps. If editing translations can be done using a web interface (either on your local host or directly on drupal.org), there is no need for editing XML files and no need for having to master CVS, patch and diff. Remember that I've been encouraging the documentation team to write static DocBook documents (i.e. XML + CVS + patches); they found it too complex. This wouldn't be any different, would it? -- Dries Buytaert :: http://www.buytaert.net/
My recommendation for the intermediary help files would be an XML format, one reasonably easy to import/export from the online handbook and reasonably easy to hand-edit. The Docs and Translations teams could then edit the text in the online handbook and export it to the necessary XML files, as can the translation team, while developers can write the XML file directly via their XML editing method of choice. The file needn't be complicated. For instance:
<drupal:help module="node" locale="en">
  <drupal:entry key="description">Some <b>HTML</b> here.</drupal:entry>
  <drupal:entry key="stuff">Some other text.</drupal:entry>
</drupal:help>
When the module is enabled, the XML file is parsed into the help table, as mentioned. It's a one-time event, so performance is a non-issue.
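A minimal sketch of that one-time parse, in Python rather than Drupal's PHP. It turns a help file into (module, key, locale, helptext) rows matching the proposed help table. The namespace URI bound to the drupal: prefix is made up for this sketch (the example above does not declare one, and an XML parser needs it), and the names `inner_markup` and `help_rows` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the drupal: prefix is bound to some namespace; this URI is invented.
HELP_NS = {"drupal": "http://example.org/drupal-help"}

SAMPLE = """<drupal:help xmlns:drupal="http://example.org/drupal-help" module="node" locale="en">
  <drupal:entry key="description">Some <b>HTML</b> here.</drupal:entry>
  <drupal:entry key="stuff">Some other text.</drupal:entry>
</drupal:help>"""

def inner_markup(element):
    """Return the element's content with inline child tags preserved."""
    parts = [element.text or ""]
    for child in element:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

def help_rows(xml_text):
    """Build one (module, key, locale, helptext) row per entry."""
    root = ET.fromstring(xml_text)
    module, locale = root.get("module"), root.get("locale")
    return [
        (module, entry.get("key"), locale, inner_markup(entry))
        for entry in root.findall("drupal:entry", HELP_NS)
    ]

for row in help_rows(SAMPLE):
    print(row)
```

Each row would then be inserted into the help table; the insert itself is omitted here because it depends on the database layer.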
There would be a separate file for each locale. So the node module, for instance, would have:
node.module
node.mysql
node.help
node.es.help
node.de.help
...
As now, English would be the default language. Besides being easier to edit, using XML files instead of an SQL dump keeps the help text database-independent. Otherwise, we'd need separate mysql and pgsql files for each locale.
Besides being easier to edit with existing tools like poedit and kbabel, PO editing is already mastered by Drupal interface translators. If you are going to introduce a custom XML format, provide a ready-made, easy-to-use editor for translators (at least as easy as any PO editor). Also, you might then provide all PO files as XML files, once you have that cool translation tool. Goba
participants (10)
- Angie Byron
- Boris Mann
- Dries Buytaert
- Dries Buytaert
- Gabor Hojtsy
- Gerhard Killesreiter
- Grugnog
- Khalid B
- Larry Garfield
- piotrwww@krukowiecki.net