Issue status update for http://drupal.org/node/29030 Post a follow up: http://drupal.org/project/comments/add/29030 Project: Drupal Version: cvs Component: base system Category: feature requests Priority: normal Assigned to: chx Reported by: chx Updated by: eldarin Status: patch (code needs review) Another approach which also works well: in menu.module:menu_execute_active_handler() right after setting $path to the q variable, I call a module which I called "urlpatterns" which resolves incoming URLs. There I match against a set of URL regexp patterns configured by the module admin. That way the URL rewrite happens very early in the useragent request to the server. The reason for this is further down in menu.module:menu_execute_active_handler(), where I have a AAA function which decides if access should be given on a configurable URL basis - configured with another module doing AAA. That way security settings mimic .htaccess in some way, while having the power of regexp flexibility as well as very good extendibility. The get_normal_path() and get_path_alias() functions then are routed to a check and lookup in the "urlpatterns" module as well as the AAA module. The immediate benefits of this is that I don't link to anywhere on the server, where access is denied to the user. Just improves security somewhat, as well as performance when I only let one "subscriber" hang on to the hook given in menu_execute_active_handler(). In my specialized case, I see no reason to have more than one URL rewriting module, since it would possible become a large spaghetti mess with possible unpredictable results if there is no central way of assuring URLs. I take care of "sub-URL aliasing" with regexp rules, so there should be no reason to do so either. It keeps security a bit tidier. eldarin Previous comments: ------------------------------------------------------------------------ Wed, 17 Aug 2005 11:39:12 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite.patch (1.35 KB) I need a URL rewrite hook sometimes, here is an implementation, with ample comments. I looked into arg() to see whether it needs a reset parameter, and it does not, but it contained a minor bug, which is also fixed. ------------------------------------------------------------------------ Wed, 17 Aug 2005 11:46:45 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite_0.patch (1.73 KB) Performance boost. ------------------------------------------------------------------------ Wed, 17 Aug 2005 11:48:51 +0000 : Jose A Reyero +1 This is really needed by i18n module, and maybe other modules could use it to add some extra information in the query string. ------------------------------------------------------------------------ Wed, 17 Aug 2005 11:50:16 +0000 : killes@www.drop.org this seems just an evil plot to cater for the evil hack that i18n.module is. -- ------------------------------------------------------------------------ Wed, 17 Aug 2005 11:57:23 +0000 : chx killes, while I know you do not like the current implementation of i18n, that module exists and even works. If you do not like the current approach, you can always write a better one. And also, please note we try to introduce non-i18n specific solutions this time. ------------------------------------------------------------------------ Wed, 17 Aug 2005 12:05:10 +0000 : killes@www.drop.org "it works" has never been a consideration in Drupal development, don't let us start to use it. I don't need an i18n module. Had I had the urge to write one, it would have been based on walkah's excellent start to be seen here: http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/walkah/translate_... I don't see anything non-i18n specific here. let us not pollute low level functions such as url() with custom hacks. ------------------------------------------------------------------------ Wed, 17 Aug 2005 12:26:11 +0000 : Goba Why extend url() and why not the alias retrieval functions? BTW at that time it was decided that a single function should be used (conf_url_rewrite()) and not a hook, because of performance reasons. If you provide the functionality this was, then conf_url_rewrite() gets confusing, and even meaningless if you do it in the alias retrieval functions themselfs. ------------------------------------------------------------------------ Wed, 17 Aug 2005 12:49:30 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite_1.patch (1.95 KB) Goba is so right, that I was already working on it. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:11:39 +0000 : Jose A Reyero This second version is more limited, and wont be that useful, because in case an alias exists, the rewritting is skipped. IMHO, the aim for this patch should be allowing modules to add information in the path or in the query string, for *all* the outgoing urls. I was thinking of i18n module and language information, but this can be useful for other modules too. So please, let's stick to the previous version of the patch ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:45:23 +0000 : Goba Jose, so you advocate only mangling URLs on output? Really? ------------------------------------------------------------------------ Thu, 18 Aug 2005 15:41:41 +0000 : Jose A Reyero Goba, yes, my idea was to provide a hook, so any module can mangle the path/query string. What will happen with that incoming paths is a different story, I mean any module can access the query string any time later... And I'd like this to be separated from path aliasing, which can be done in a different step... Ideally, of course, this would be done for outgoing and incoming urls, but there's a number of issues, like incoming path processing being done before module loading, in common.inc... Another issue, IMHO, is that current url handling is a bit messy and could be better streamlined. But the first patch was simple enough and quite straight forward, at least it works for *all* outgoing URLs, and allows to rewrite query strings, while the second one, I see much more limited use for it.... ------------------------------------------------------------------------ Thu, 18 Aug 2005 17:07:13 +0000 : Goba Jose, AFAIS if you mangle with the outgoing URLs (eg. put 'en/' before all URLs), and there will be nothing in the incoming processing to strip that, this will directly lead to 'page not found' replies. But I probably fail to see the exact use case you would like to propose this for. ------------------------------------------------------------------------ Thu, 18 Aug 2005 17:48:13 +0000 : Jose A Reyero Goba, yes, basically you are right... But these are my use cases: For i18n ---------- - Add language prefix at the beginning of the outgoing urls - Remove it in the module init hook. Yes, I know... this is probably some of what killes calls evil hacks :-( But with this new hook, I could also think of adding language in the query string. And maybe some other modules could want to add some info in the query string. There are a number of reasons for language to be in the URL -search engines, links...- and also the ability to create path aliases with or without language... Hackish? Yes. But right now we are facing the following dilemma: no specific calls for non core modules -which I dont disagree with- but then any implementation of this, for not to require patching, will have to be in a module, and then incoming paths are first processed before module loading, and on top of that we have the cache system.... so its quite a complex thing.... I'd be happy with any idea to implement this more cleanly, or maybe we should aim higher, like reworking the whole init thing and path pre-processing... What we are trying for the moment is to introduce only some general use hooks, like this small patch... otherwise it is a too big all-or-nothing question to get this working in Drupal.... Thanks for your comments and I'd appreciate any suggestion. ------------------------------------------------------------------------ Mon, 29 Aug 2005 18:54:48 +0000 : Jose A Reyero Attachment: http://drupal.org/files/issues/url_rewrite_2.patch (713 bytes) Updated simplified patch. As other patches are already in, we only need this to run i18n module with Drupal 4.7 without patching!! ------------------------------------------------------------------------ Mon, 29 Aug 2005 19:29:45 +0000 : DFG I would be more useful to have a similar hook in drupal_get_path_alias() and drupal_get_internal_path(). ------------------------------------------------------------------------ Tue, 30 Aug 2005 00:26:27 +0000 : fago patch applies (with offset) and doesn't break anything, as i could see. a lot of people are interested in i18n, so please include this last one. further the possibilty inject a query-string might be useful in other cases. +1 ------------------------------------------------------------------------ Tue, 13 Sep 2005 12:48:29 +0000 : mgifford This is a worth while hook to add to the core code. It will ease the implementation of more multi-lingual sites in drupal and provide better support for a broader community of users/developers. +1 ------------------------------------------------------------------------ Tue, 13 Sep 2005 15:22:10 +0000 : Souvent22 +1. This patch is needed. We don't live in a vacum ya know, there are more languages than english. :). Hope this gets in, I just made a site for someone in Italy, and this could help out when making modules. ------------------------------------------------------------------------ Tue, 13 Sep 2005 18:02:25 +0000 : Dries If this patch gets committed, there will be a third mechanism to rewrite URLs. Also, it is a well-known fact that url() and l() are a performance bottleneck. I think we need to take a step back, see how we can overcome the limitations of the current system and come to a simple yet fast URL rewrite mechanism. There ought to be a better way. ------------------------------------------------------------------------ Tue, 13 Sep 2005 18:24:22 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite_3.patch (1.87 KB) This version implements a hook in drupal_get_path_alias and in drupal_get_normal_path . Performance hit should be negligeble in most cases: a foreach on an empty array. ------------------------------------------------------------------------ Tue, 13 Sep 2005 18:46:39 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite_4.patch (1.89 KB) ------------------------------------------------------------------------ Tue, 13 Sep 2005 18:50:22 +0000 : Dries Moving the code around doesn't change a thing; it's still a third/new mechanism to rewrite URLs. I'll take a closer look at this as time permits. ------------------------------------------------------------------------ Tue, 13 Sep 2005 19:00:54 +0000 : chx Attachment: http://drupal.org/files/issues/url_rewrite_5.patch (1.89 KB) No, it's a second mechanism only as it removes conf_url_rewrite -- conf_url_rewrite routines can be moved to a module and thus shared as a module. And you can have more than one rewrite, this way. ------------------------------------------------------------------------ Tue, 13 Sep 2005 23:48:42 +0000 : Jose A Reyero Attachment: http://drupal.org/files/issues/url_rewrite_alt.patch (1.61 KB) +1 for chx (plus alternative patch) I like chx's patch, and I think also that getting rid of 'conf_url_rewrite' and replacing it with a hook is a good thing. However, if the main concern is performance, we could use too 'conf_url_rewrite', if only all paths were run through it unconditionally. So, making clear I'd prefer chx's solution, here's an alternative one. ------------------------------------------------------------------------ Wed, 14 Sep 2005 09:09:18 +0000 : Goba I like chx's generic version (url_rewrite_5.patch) best, as it replaces an awkward URL rewrite mechanism (introduced by myself, pressed by performance reasons), with a lot cleaner, albeit a little bit less performant solution. BTW there is a spelling mistake, chx written 'inccoming' in the patch, plus I see no reason to check for empty($arguments) in arg() at all, since having the $q set properly (after this patch) would also ensure that $arguments is properly set. Note that this patch also fixes a small performance problem in arg(): now it always tries to do an explode, if $_GET['q'] is empty, that is on the homepage.