strange inconsistency: 'custom_url_rewrite'
Hi there,

while writing a tiny module to make these fancy 'web2.0' *cough* URLs in the form of tag1+tag2+tag3, I ran across a rather strange issue.

custom_url_rewrite works as expected, but is rather OO-unfriendly. I mean, if I define it, Drupal will fail to even load if someone adds another module that defines it too, or if a person has this function defined in her settings.php.

Why is this not hook_url_rewrite? Am I missing something? Is it a performance thing?

Bèr

NB: I am aware of pathauto. But there is no bleeping way that I am going to alias my 1500 (and still growing) tags plus all their combinations in there! That would mean ~5e4114 aliases. That's a five with 4114 zeros! I doubt MySQL can handle that at all.

--
| Bèr Kessels | webschuur.com | website development |
| Jabber & Google Talk: ber@jabber.webschuur.com
| http://bler.webschuur.com | http://www.webschuur.com |
Bèr Kessels wrote:
Hi there,
while writing a tiny module to make these fancy 'web2.0' *cough* URLs in the form of tag1+tag2+tag3, I ran across a rather strange issue.
custom_url_rewrite works as expected, but is rather OO-unfriendly. I mean, if I define it, Drupal will fail to even load if someone adds another module that defines it too, or if a person has this function defined in her settings.php.
Why is this not hook_url_rewrite? Am I missing something? Is it a performance thing?
Big +1 from me to transitioning URL rewrite to a hook.
On Mon, 26 Dec 2005 19:03:31 +0100, Earl Miles <merlin@logrus.com> wrote:
Bèr Kessels wrote:
Hi there,
while writing a tiny module to make these fancy 'web2.0' *cough* URLs in the form of tag1+tag2+tag3, I ran across a rather strange issue.
custom_url_rewrite works as expected, but is rather OO-unfriendly. I mean, if I define it, Drupal will fail to even load if someone adds another module that defines it too, or if a person has this function defined in her settings.php.
Why is this not hook_url_rewrite? Am I missing something? Is it a performance thing?
Big +1 from me to transitioning URL rewrite to a hook.
/me yawns and clicks duplicate. http://drupal.org/node/29030
On Monday 26 December 2005 19:28, Karoly Negyesi wrote:
/me yawns and clicks duplicate. http://drupal.org/node/29030
/me thinks CHX is a bit tired and hence did not really read the patch that was committed and my message.
Bèr
--
[ Bèr Kessels | Drupal services www.webschuur.com ]
On Mon, 26 Dec 2005 19:30:09 +0100, Bèr Kessels <ber@webschuur.com> wrote:
On Monday 26 December 2005 19:28, Karoly Negyesi wrote:
/me yawns and clicks duplicate. http://drupal.org/node/29030
/me thinks CHX is a bit tired and hence did not really read the patch that was committed and my message.
You sure of that? http://drupal.org/node/29030#comment-44664
Here is what I wrote:
* There is an odd inconsistency. It's not OO (object oriented) friendly to allow only one function. We don't do this anywhere else.
* I cannot **find** the way to achieve what I want: mass aliasing of URLs based on certain dynamic parameters ('foo' to 12 if foo is the title of the term with tid 12).

Here is what I asked:
* Am I missing something? Did we deliberately introduce this inconsistency for performance reasons?
* Or for some other reason?

What I see in that thread only answers part of my problem:
* Yes, it's a performance thing.
* No, there is no way I can use anything in that direction to allow rewriting of URLs.

So, I get from this that we will have to live with the inconsistency. And I will probably (?) not be able to write a tag-to-tid mapping module that rewrites the URLs. Because honestly, a module cannot have that function. It /will/ break! It would mean that my module excludes any module that wants to do something (remotely) similar.

And I think it is indeed not i18n-specific, as my use case very well points out. Another use case would be to map /user/CHX to user/12345. Again, this is not possible at the moment. (Though this could possibly be stored in the database, IMO that is silly and odd overkill.)

Oh, and thank you for your friendly answer. ;)

Bèr 'not being able to keep track of all the core threads, because he is not undistractableCHX' Kessels.

--
PGP ber@webschuur.com http://www.webschuur.com/sites/webschuur.com/files/ber_webschuur.asc
PGP berkessels@gmx.net http://www.webschuur.com/sites/webschuur.com/files/ber_gmx.asc
Bèr Kessels wrote:
What I see in that thread only answers part of my problem: * Yes, it's a performance thing. * No, there is no way I can use anything in that direction to allow rewriting of URLs.
So, I get from this that we will have to live with the inconsistency. And I will probably (?) not be able to write a tag-to-tid mapping module that rewrites the URLs. Because honestly, a module cannot have that function. It /will/ break! It would mean that my module excludes any module that wants to do something (remotely) similar.
And I think it is indeed not i18n-specific, as my use case very well points out. Another use case would be to map /user/CHX to user/12345. Again, this is not possible at the moment. (Though this could possibly be stored in the database, IMO that is silly and odd overkill.)
Hi Ber,

Well, the idea is that we needed something to allow URL rewriting by a single module, and it is not a hook for performance reasons. But anyway, this is not a new thing; it is an updated and more powerful 'config_url_rewrite' function... And it doesn't really make sense to make it a hook, because more than one module using it would clash anyway.

About breaking the site, this doesn't need to happen. I've used a conditional definition in the i18n module (CVS), so the module won't work but won't break the site either.

And if you want to use it with more than one module, I suggest having 'custom_url_rewrite' defined in your settings file, then calling the other modules in order, but it would need some 'fine tuning' for two modules using it at the same time...

Hope this at least makes sense :-)
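The settings-file approach suggested above could be sketched roughly as follows. This is only an illustration, not code from i18n or from core: the three-argument signature ($op, $result, $path) is an assumption, and the per-module functions tagadelic_url_rewrite and i18n_url_rewrite are hypothetical names invented for the example.

```php
<?php
// Hypothetical per-module rewriters; neither exists in any real module.
// Each receives the current result and returns it changed or untouched.
function tagadelic_url_rewrite($op, $result, $path) {
  // Outgoing: turn taxonomy/term/1+2 into tags/1+2 (illustrative mapping only).
  if ($op == 'alias' && strpos($result, 'taxonomy/term/') === 0) {
    return 'tags/' . substr($result, strlen('taxonomy/term/'));
  }
  return $result;
}

function i18n_url_rewrite($op, $result, $path) {
  // Outgoing: prefix every path with a language code (hard-coded here).
  if ($op == 'alias') {
    return 'en/' . $result;
  }
  return $result;
}

// Conditional definition: skip ours if the function was already defined
// by code loaded earlier.
if (!function_exists('custom_url_rewrite')) {
  function custom_url_rewrite($op, $result, $path) {
    // The order of this list is the "fine tuning": each rewriter sees the
    // output of the one before it.
    $rewriters = array('tagadelic_url_rewrite', 'i18n_url_rewrite');
    foreach ($rewriters as $function) {
      if (function_exists($function)) {
        $result = $function($op, $result, $path);
      }
    }
    return $result;
  }
}
```

With these two made-up rewriters, custom_url_rewrite('alias', 'taxonomy/term/1+2', 'taxonomy/term/1+2') yields 'en/tags/1+2'. Swapping the list order changes which module sees the untouched path, which is exactly the clash being discussed.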
On Monday 26 December 2005 22:45, Jose A. Reyero wrote:
About breaking the site, this doesn't need to happen. I've used a conditional definition in the i18n module (CVS), so the module won't work but won't break the site either. And if you want to use it with more than one module, I suggest having 'custom_url_rewrite' defined in your settings file, then calling the other modules in order, but it would need some 'fine tuning' for two modules using it at the same time...
My whole point is: I can get it to work. No problem. Hell, given enough time I can get *anything* to work in Drupal. Just hack the core FUBAR.

Now, all I wanted was to provide a nice small module that changed our beloved taxonomy into something even cooler. Into something that is going to push Drupal fully into that web 2.0 spotlight: nice and clean tag support. For that, I am working on some (small and simple) modules, tagadelic_browser and tagadelic_filter (and of course there still is tagadelic for the clouds).

And these modules should -obviously- work out of the box. We seriously cannot expect people to hack functions into their settings.php or comment away stuff in other modules. Especially not since we are now aiming at the installer. (Or will the installer be so smart that it can write code for us?)

Apparently this is a brick wall I ran against (head first). A brick wall we put there for performance reasons (good reasons, IMO, don't get me wrong).

So I will rephrase my question then:
* Do we want to raise impenetrable brick walls to gain performance?
* And if so, where do we raise them?
* Do we raise more such walls if that helps performance (removing hooks, or limiting flexibility)?
* Who decides, where and when, who needs the flexibility and who needs the performance?

I have to note that I recall the days of the nuke, where a module often had the following install:
* unzip this in your nuke root, and answer yes if it asks you to overwrite any files.

There was hardly any flexibility. A module was often half actual module and half rewritten core files.

Personally I can somehow live with this. Somehow. For I am savvy enough to hack away. But really, we all know this is a bad habit. And we all know that it is no real solution. Adding stuff to your install like 'you must copy-paste this piece of PHP into this and this location in your settings.php' is just deadly. It's a bad thing and I prefer not to go down that route.

Hence I am now pondering just adding some nifty drupal_gotos and letting menu + callback handle the rest. I will leave the core-generated taxonomy/ugly/urls in place.

Bèr
--
[ Bèr Kessels | Drupal services www.webschuur.com ]
Bèr Kessels wrote:
And these modules should -obviously- work out of the box. We seriously cannot expect people to hack functions into their settings.php or comment away stuff in other modules. Especially not since we are now aiming at the installer. (Or will the installer be so smart that it can write code for us?)
Of course not, but for end users, modules that use custom_url_rewrite should maybe be labeled somehow as incompatible with each other.
Apparently this is a brick wall I ran against (head first). A brick wall we put there for performance reasons (good reasons, IMO, don't get me wrong).
So I will rephrase my question then: * Do we want to raise impenetrable brick walls to gain performance? * And if so, where do we raise them? * Do we raise more such walls if that helps performance (removing hooks, or limiting flexibility)? * Who decides, where and when, who needs the flexibility and who needs the performance?
Well, I guess it depends on how many people need that function call or that hook, and how badly, so in this case it looks to me like a reasonable trade-off. This is a function that can be called something like a hundred times for a single page. The question is maybe what the performance impact is of an empty hook, called hundreds of times, when no module using it is enabled. Does anyone know?

And this one is not that simple, because it's not about modules adding to each other's results. This function has to produce a single result, which is a URL, so it doesn't fit the general hook mechanism either. And if you want two modules to be rewriting URLs at the same time, this should be specifically coded somehow.

But anyway, if in the end it remains not a hook and our two modules are the only ones using it, we can manage it so users don't need to hack functions into settings.php. Maybe the first module defining the function makes it a hook and calls the others? I know it's not the perfect solution, but if it proves to be useful, then maybe it can be a hook in the next version.

- I hope you don't want to name yours 'aaa_tagadelic' to define the function first ;-)
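One rough way to put a number on the "empty hook" question is to time the dispatch loop itself. The sketch below is a stand-in under stated assumptions, not Drupal's hook machinery: the module list and the _url_rewrite suffix are made up, and none of the listed modules implements the function, so only the lookup overhead is measured. Timings vary by machine and PHP version, so no figure is claimed here.

```php
<?php
// Hypothetical list of enabled modules; none implements MODULE_url_rewrite.
$modules = array('node', 'user', 'taxonomy', 'path', 'comment');

// Stand-in for a hook invocation: ask each module whether it implements
// the hook, call it if so, and otherwise pass the path through unchanged.
function invoke_url_rewrite_hook($modules, $path) {
  foreach ($modules as $module) {
    $function = $module . '_url_rewrite';
    if (function_exists($function)) {
      $path = $function($path);
    }
  }
  return $path;
}

// Time a number of no-op invocations, roughly one page worth of links.
function time_empty_hook($modules, $calls) {
  $start = microtime(true);
  for ($i = 0; $i < $calls; $i++) {
    invoke_url_rewrite_hook($modules, 'node/' . $i);
  }
  return microtime(true) - $start;
}

$elapsed = time_empty_hook($modules, 100);
printf("100 empty hook calls took %.6f seconds\n", $elapsed);
```

Comparing that figure against the total page generation time would show whether the no-op case matters. The cost once a module actually rewrites something is a separate question.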
On Monday 26 December 2005 01:57 pm, Bèr Kessels wrote:
So, I get from this that we will have to live with the inconsistency. And I will probably (?) not be able to write a tag-to-tid mapping module that rewrites the URLs. Because honestly, a module cannot have that function. It /will/ break! It would mean that my module excludes any module that wants to do something (remotely) similar.
And I think it is indeed not i18n-specific, as my use case very well points out. Another use case would be to map /user/CHX to user/12345. Again, this is not possible at the moment. (Though this could possibly be stored in the database, IMO that is silly and odd overkill.)
I agree completely, Ber. This topic came up on the dev list a few weeks ago as well when discussing the limitations of path and pathauto. The basic problem is that they don't scale well, particularly for dynamic content. Making /user/myname/* work for all /user/ functions, just in core, using url_alias, would require at minimum 4 records per user. (view, edit, track, contact, plus some number of profile pages.) Putting that in url_alias is simply not going to scale past a few hundred users. Try it on drupal.org (45000 users and counting), and it dies horribly.

I did spend some time trying to implement it manually for the user module, actually (usernames are unique, so they can be used in place of uids in URLs without increasing the number of db hits), but ended up creating all sorts of strange and mysterious errors I couldn't identify so I eventually abandoned it. (It also had to be all-or-nothing, or other user-extending modules wouldn't work either.)

The performance concern is real, though, given the number of links a typical page has. Some fast, well-cached module-based rewrite mechanism is needed, I agree. Perhaps just prefix-based? I've been trying to avoid proposing anything at this point, though, as any changes now would just slow down the release of 4.7. :-)

--
Larry Garfield AIM: LOLG42
larry@garfieldtech.com ICQ: 6817012

"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
I agree completely, Ber. This topic came up on the dev list a few weeks ago as well when discussing the limitations of path and pathauto. The basic problem is that they don't scale well, particularly for dynamic content. Making /user/myname/* work for all /user/ functions, just in core, using url_alias, would require at minimum 4 records per user. (view, edit, track, contact, plus some number of profile pages.) Putting that in url_alias is simply not going to scale past a few hundred users. Try it on drupal.org (45000 users and counting), and it dies horribly.
I did spend some time trying to implement it manually for the user module, actually (usernames are unique, so they can be used in place of uids in URLs without increasing the number of db hits), but ended up creating all sorts of strange and mysterious errors I couldn't identify so I eventually abandoned it. (It also had to be all-or-nothing, or other user-extending modules wouldn't work either.)
The performance concern is real, though, given the number of links a typical page has. Some fast, well-cached module-based rewrite mechanism is needed, I agree. Perhaps just prefix-based? I've been trying to avoid proposing anything at this point, though, as any changes now would just slow down the release of 4.7. :-)
Speaking of users, these are quite easy to do with regexps. This is an edited excerpt from the weblabor.hu aliasing code (generates Hungarian URLs):

  // Map of English path suffixes to their Hungarian equivalents.
  $userautoalias = array(
    '' => '',
    '/edit' => '/szerkesztes',
    '/track' => '/kovetes',
    '/track/navigation' => '/kovetes/navigacio',
    '/contact' => '/kapcsolat',
  );

  // $match[1] is the uid, $match[2] is the rest of the path.
  if (preg_match("!^user/(\\d+)(.*)$!", $path, $match)) {
    if (isset($userautoalias[$match[2]])) {
      return 'tagok/' . $match[1] . $userautoalias[$match[2]];
    }
  }

This is all it takes to convert user/1234/track to tagok/1234/kovetes. User names can be done similarly, provided that you either query the database for the name or have it cached somewhere.

Goba
On Tuesday 27 December 2005 10:10 am, Gabor Hojtsy wrote:
The performance concern is real, though, given the number of links a typical page has. Some fast, well-cached module-based rewrite mechanism is needed, I agree. Perhaps just prefix-based? I've been trying to avoid proposing anything at this point, though, as any changes now would just slow down the release of 4.7. :-)
Speaking of users, these are quite easy to do with regexps. This is an edited excerpt from the weblabor.hu aliasing code (generates Hungarian URLs):

  $userautoalias = array(
    '' => '',
    '/edit' => '/szerkesztes',
    '/track' => '/kovetes',
    '/track/navigation' => '/kovetes/navigacio',
    '/contact' => '/kapcsolat',
  );

  if (preg_match("!^user/(\\d+)(.*)$!", $path, $match)) {
    if (isset($userautoalias[$match[2]])) {
      return 'tagok/' . $match[1] . $userautoalias[$match[2]];
    }
  }

This is all it takes to convert user/1234/track to tagok/1234/kovetes. User names can be similarly done, given that you either query the database for the name, or have it cached somewhere.
Goba
Hm. I hadn't thought of internationalization. I was thinking of something where the user module could map URL prefixes, so that all user-extending modules would work as well without having to be individually changed.

E.g., for incoming URLs the user module would grab /user/ as a prefix, read the second parameter, and if it's a string translate it to the corresponding int via a single db hit (SELECT uid FROM users WHERE username='%s'). Then all other modules just see the numeric version, just as they do now with url_alias. On outgoing, l() would again check for any prefix and translate the other way. Add caching or flavor where necessary.

The same mechanism would allow for a node to also have /nodetype/label or /nodetype/class/label URLs, and all actions (view, edit, etc.) would still work properly because by the time other modules see it, it's already been translated back down to /node/id# as now.

I suppose that could still work with internationalization if you just throw some t() calls into it.

<waits for someone to tell him why that is dumb>

--
Larry Garfield
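The prefix idea above might look something like the sketch below. Everything in it is illustrative: the function names user_prefix_inbound, user_prefix_outbound and example_user_map are invented, and a hard-coded array stands in for the users table and whatever cache would sit in front of it (in the proposal, the inbound lookup would be the single SELECT mentioned above).

```php
<?php
// Stand-in for the users table: uid => name. A real version would query
// the database once and cache the result.
function example_user_map() {
  return array(12345 => 'CHX', 42 => 'larry');
}

// Incoming: user/NAME/... becomes user/UID/... before other modules see it.
function user_prefix_inbound($path) {
  $parts = explode('/', $path);
  if (count($parts) >= 2 && $parts[0] == 'user' && !is_numeric($parts[1])) {
    // Strict search, so only an exact name match is translated.
    $uid = array_search($parts[1], example_user_map(), true);
    if ($uid !== false) {
      $parts[1] = $uid;
    }
  }
  return implode('/', $parts);
}

// Outgoing: user/UID/... becomes user/NAME/... when building links.
function user_prefix_outbound($path) {
  $parts = explode('/', $path);
  $map = example_user_map();
  if (count($parts) >= 2 && $parts[0] == 'user' && isset($map[$parts[1]])) {
    $parts[1] = $map[$parts[1]];
  }
  return implode('/', $parts);
}
```

Only the second path segment is touched, so user/CHX/edit comes in as user/12345/edit and the suffix passes through unchanged. That is what would let user-extending modules keep working without individual changes. Paths under other prefixes, and names with no match, are returned as given.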
while writing a tiny module to make these fancy 'web2.0' *cough* urls in the form of tag1+tag2+tag3 I ran across a rather strange issue.
custom_url_rewrite works as expected, but is rather OO-unfriendly. I mean, if I define it, Drupal will fail to even load if someone adds another module that defines it too, or if a person has this function defined in her settings.php.
Why is this not hook_url_rewrite? Am I missing something? Is it a performance thing?
It is mostly a performance thing. It was discussed that allowing more than one URL rewrite function would lead to great unexpected pains (execution order, speed), and the single named function method was chosen. You can always do conditional function definitions in PHP.

  if (!function_exists('somefunction')) {
    function somefunction() {
    }
  }

Goba
participants (6)
- Bèr Kessels
- Earl Miles
- Gabor Hojtsy
- Jose A. Reyero
- Karoly Negyesi
- Larry Garfield