Hi, Note: this post is very long. But it worths a reading. I dare to say that one of the biggest results of DrupalCon is that a new menu system is born. It all began almost a year ago: http://lists.drupal.org/archives/development/2005-10/msg00708.html [quote] There are three basic tasks to consider when talking about performance improvements: - displaying the visible menu (in general, this means all accessible items since we may need them for DHTML solutions) - displaying the breadcrumb trail - finding the active callback - checking access rights [/quote] OK that's four. I already have code for the last three in my sandbox and Adrian have outlined how the first will happen. My focus was finding the active callback. I simply followed JonBob's directions: "We could simplify menu declaration by allowing some sort of wildcards in menu paths. For example, if a module could declare the path "node/%d/edit", then we could dispense with the logic that defines the function only if we are in a node path. The code would certainly be cleaner in this case." I am using the percent sign as a wildcard. There is no %d or %s just %. So you define node/%/edit. Let's say someone navigates to node/12/edit/foo. Now, how do we find out that node/%/edit needs to be run? We generate all possible callbacks and check which one exists: 1111 node/12/edit/foo 1110 node/12/edit/% 1101 node/12/%/foo 1100 node/12/%/% 1011 node/%/edit/foo 1010 node/%/edit/% 1001 node/%/%/foo 1000 node/%/%/% 111 node/12/edit 110 node/12/% 101 node/%/edit 100 node/%/% 11 node/12 10 node/% 1 node The binary numbers are generated naturally: where you see a %, that's a 0, where you see a specific part, that's a 1. This will nicely indicate the the order in which they need to be checked (this is my idea and I am very proud of it -- nothing else in here is my idea). So, we have found node/%/edit. Can we access it? Instead of specifying 'access' => node_access('update', arg(1)) now we do 'access' => 'node_access', 'access arguments' => array('update', 1) . So we use Adrian's favorite: callbacks. I map 1 to arg(1) on runtime for both access and callback arguments. (Side note: I know currently node_access gets a node and not a node id. Adding node_load won't stop us.) Our life gets a bit complicated here, because we do not need to grab one menu item but three: one that defines the title, one that defines the (page) callback and one that defines access. Instead of running three queries against a huge table, I planned to retrieve all possible matches, and pick my three items. As a bonus, I will get the breadcrumb trail. Someone at the second menu meeting (Steven? Adrian?) pointed out that I could move this to the builder phase, and that's done. So we are back to retrieving one item which has title, callback and access inherited as needed. At mentioned in the third meeting, Steven pointed out that my system is vulnerable to a simple DoS attack so now menu item definitions are limited to 6 parts. I can hardly imagine this being a real limitation, I can consider raising it to seven but definitely not more. We are shooting at what's useable not all possible cases. Finally we need to display the navigation block. As said, I shoot at what's probable so instead of caching per user I will check every menu item's access on the fly. The probable case is that these are user_access calls. Maybe user_access needs to be changed to use a multi dimensional array if the repeated strpos calls are too slow. With that said, we need a mechanism to discover which menu items are possible. We will simply cache by parent id and unserialize that (credit goes to Adrian for this). Note that these structures are nowhere near the size of the current menu array. As a bonus feature, we will have hook_menu_alter. To make that usable, we moved the path to the menu item key. (Silly me was doing it in the builder but Adrian pointed out that I should simply define the menu items this way) Aside from Adrian and Steven, big kudos goes to Earl Milles (merlinofchaos). What's done? On the menu level, I need to write the navigation block, all else is done. I also need to convert modules -- node and user already done. All this took only a dozen hours or so, beginning with the first evening/night of DrupalCon first day ending on the train to Charleroi converting user and three meetings at DrupalCon / BarCamp. Regards NK
What's done? On the menu level, I need to write the navigation block, all else is done. I also need to convert modules -- node and user already done. All this took only a dozen hours or so, beginning with the first evening/night of DrupalCon first day ending on the train to Charleroi converting user and three meetings at DrupalCon / BarCamp.
yes, this is extremely exciting. in my testing, the menu system is the slowest part of drupal (beside parsing php code - use an opcode cache to fix). this new system gets rid of the $may_cache switch in hook_menu(). hook_menu is now called only upon module enable/disable and admin/menu. this is outstanding. however, we no longer have a place to stick our drupal_add_css() and drupal_add_js() type calls. so, lets create a new hook_signal($op) which fires at various points during a request. Initially, it replace hook_init() and hook_exit() with $op=precache, $op=postcache, and $op=exit. We shall also add $op=postbootstrap which is where the drupal_add_css calls will go. One minor complication is that we no longer have hook_init and hook_exit for determining which modules need to be bootstrapped (see system.module and the bootstrap column of system table). But we do have .install files now so my proposal is that modules which require loading during bootstrap perform an UPDATE system SET bootstrap=1 WHERE module=<foo> during their hook_install(). This hook_signal() patch should probably go in before this menu system.
Moshe Can we find more descriptive names other than "signal", "precache" and "postcache"? I can imagine these be confusing to module developers who will most probably not be familiar with the inner workings of Drupal and the boot strap.
Khalid B wrote:
Moshe
Can we find more descriptive names other than "signal", "precache" and "postcache"? I can imagine these be confusing to module developers who will most probably not be familiar with the inner workings of Drupal and the boot strap.
this thread started with 2 successive messages filled with juicy technical details aimed at speeding up core drupal. it makes zero sense to talk about variable names at this stage.
Moshe Weitzman wrote:
this thread started with 2 successive messages filled with juicy technical details aimed at speeding up core drupal. it makes zero sense to talk about variable names at this stage. Then I'll branch the thread so it will make sense.
As a module maintainer and former clueless noob on Drupal I believe that the variable and hook names used are important to facilitating developer adoption. I am coming up on two years of working within Drupal and I still keep discovering new hooks, nooks and crannies where the interface names didn't indicate what I would use the hook for. One could argue that this is because I'm slow or don't "get it". I won't, but it is entirely possible. :) I'll argue that it shows the richness of Drupal. I am repeatedly impressed. Core performance is important. Please don't forget that many of the fantastic improvements you all are working on are used in the context of sites with non-core modules, and it takes time on our part (the somewhat involved module maintainer crowd) to rework our modules to match the latest and greatest APIs. I subscribe to this development list because I've found it is the most effective way to understand the changes that are pending. 5.0 adoption will be slowed while modules are ported to it. Making that easier is important. If I saw something called "hook_signal()" when browsing the hook summary portion of the API I would not see immediate relevance to me - I don't think it communicates the intended meaning. My *nix C background would have me wondering why a php script needed a signal handler. In stylistic terms I too prefer hook_whatever_function() over hook_with_a_massive_switch_on($op). I do agree that it is not clear when to return some tidbit for display and when to effect state but not return a value. Hooks with switches do not imply "no return value" to me. Drupal is a wonderful development platform. One of my challenges with it continues to be why I would use a particular hook and where. How is easy to glean from the golden code. Hook names aid with the why and where. Scott -- *Scott McLewin* www.folkjam.org find jams. post jams. play well with others. <http://www.folkjam.org>
As a module maintainer and former clueless noob on Drupal I believe that the variable and hook names used are important to facilitating developer adoption.
of course function names and variable names are important. i'm stressing that we don't need to debate this endlessly in karoly's thread about the menu system. noone beside me has replied yet with any substance on his original message.
Moshe Weitzman wrote:
of course function names and variable names are important. i'm stressing that we don't need to debate this endlessly in karoly's thread about the menu system. noone beside me has replied yet with any substance on his original message. Moshe - Great. That's not what I read in your message and I'm glad I misread it.
-- *Scott McLewin* www.folkjam.org find jams. post jams. play well with others. <http://www.folkjam.org>
Hi Scott, On 26 Sep 2006, at 17:46, Scott McLewin wrote:
this thread started with 2 successive messages filled with juicy technical details aimed at speeding up core drupal. it makes zero sense to talk about variable names at this stage. Then I'll branch the thread so it will make sense.
As a module maintainer and former clueless noob on Drupal I believe that the variable and hook names used are important to facilitating developer adoption. I am coming up on two years of working within Drupal and I still keep discovering new hooks, nooks and crannies where the interface names didn't indicate what I would use the hook for. One could argue that this is because I'm slow or don't "get it". I won't, but it is entirely possible. :) I'll argue that it shows the richness of Drupal. I am repeatedly impressed.
making Drupal easier to develop for is an important goal (see my blog posts about it). Please take the time to point out what function names you found to be confusing, and provide concrete suggestions as how we could made them easier to discover. For extra points, file issues with patches. All of what I said is obvious, and is a fairly easy problem to fix. It's not worth discussing over and over again. Just make it happen. -- Dries Buytaert :: http://www.buytaert.net/
Moshe Weitzman wrote But we do have .install files now so my proposal is that modules which require loading during bootstrap perform an UPDATE system SET bootstrap=1 WHERE module=<foo> during their hook_install(). I would suggest a function call instead of SQL, that way it is independent of the implementation. Something like drupal_load_during_bootstrap($flag = FALSE). So a module that needs to be loaded during bootstrap would in the .install file do a call like this drupal_load_during_bootstrap(TRUE); The only reason for the argument is if at some point in the when the module is upgraded it no longer needs to be loaded during bootstrap it can call the function with a value of FALSE.
At 9:44 AM -0500 9/26/06, John VanDyk wrote:
this new system gets rid of the $may_cache switch in hook_menu(). hook_menu is now called only upon module enable/disable and admin/menu. this is outstanding. however, we no longer have a place to stick our drupal_add_css() and drupal_add_js() type calls.
so, lets create a new hook_signal($op) which fires at various points during a request. Initially, it replace hook_init() and hook_exit() with $op=precache, $op=postcache, and $op=exit. We shall also add $op=postbootstrap which is where the drupal_add_css calls will go.
This is an unneeded level of abstraction. We already have signals that fire at various points during a request. They are called hooks. They are faster and more descriptive than hook_signal($op). hook_precache hook_postcache hook_exit etc.
On 26-Sep-06, at 10:50 AM, John VanDyk wrote:
At 9:44 AM -0500 9/26/06, John VanDyk wrote:
this new system gets rid of the $may_cache switch in hook_menu(). hook_menu is now called only upon module enable/disable and admin/ menu. this is outstanding. however, we no longer have a place to stick our drupal_add_css() and drupal_add_js() type calls.
so, lets create a new hook_signal($op) which fires at various points during a request. Initially, it replace hook_init() and hook_exit() with $op=precache, $op=postcache, and $op=exit. We shall also add $op=postbootstrap which is where the drupal_add_css calls will go.
This is an unneeded level of abstraction. We already have signals that fire at various points during a request. They are called hooks. They are faster and more descriptive than hook_signal($op).
hook_precache hook_postcache hook_exit
i agree... but also - none of those makes it clear to me *at all* that I should be putting my add_js/add_css code in them. -- James Walker :: http://walkah.net/ :: xmpp:walkah@walkah.net
James Walker wrote:
On 26-Sep-06, at 10:50 AM, John VanDyk wrote:
At 9:44 AM -0500 9/26/06, John VanDyk wrote:
this new system gets rid of the $may_cache switch in hook_menu(). hook_menu is now called only upon module enable/disable and admin/menu. this is outstanding. however, we no longer have a place to stick our drupal_add_css() and drupal_add_js() type calls.
so, lets create a new hook_signal($op) which fires at various points during a request. Initially, it replace hook_init() and hook_exit() with $op=precache, $op=postcache, and $op=exit. We shall also add $op=postbootstrap which is where the drupal_add_css calls will go.
This is an unneeded level of abstraction. We already have signals that fire at various points during a request. They are called hooks. They are faster and more descriptive than hook_signal($op).
hook_precache hook_postcache hook_exit
i agree... but also - none of those makes it clear to me *at all* that I should be putting my add_js/add_css code in them.
Maybe add a hook_header? Cheers, Gerhard
so, lets create a new hook_signal($op) which fires at various points during a request. Initially, it replace hook_init() and hook_exit() with $op=precache, $op=postcache, and $op=exit. We shall also add $op=postbootstrap which is where the drupal_add_css calls will go.
This is an unneeded level of abstraction. We already have signals that fire at various points during a request. They are called hooks. They are faster and more descriptive than hook_signal($op).
hook_precache hook_postcache hook_exit
sure, thats a reasonable option ... i find it useful though to distinguish between hooks that care about your return value and hooks that don't. those that don't are candidates for grouping within a single hook_signal($op). the principle advantage is cutting down on the number of hooks ... not a big deal though.
Moshe Weitzman wrote:
sure, thats a reasonable option ... i find it useful though to distinguish between hooks that care about your return value and hooks that don't. those that don't are candidates for grouping within a single hook_signal($op). the principle advantage is cutting down on the number of hooks ... not a big deal though.
If there's one thing I've become convinced of after working with NodeAPI, it's that multi-operation hooks are evil. In some cases it makes sense -- providing an $op is a way to distinguish between what triggered a given call to the function. But in many cases, we use and abuse $op to squish many disparate functions into a monster-function. Separate hooks with clear meanings is a happy thing, IMO at least. --Jeff
On Tue, 26 Sep 2006, Moshe Weitzman wrote:
sure, thats a reasonable option ... i find it useful though to distinguish between hooks that care about your return value and hooks that don't. those that don't are candidates for grouping within a single hook_signal($op). the principle advantage is cutting down on the number of hooks ... not a big deal though.
IMHO it is not considered to be an advantage if we add another layer of indirection here. Gabor
On Tuesday 26 September 2006 09:17, Moshe Weitzman wrote:
this new system gets rid of the $may_cache switch in hook_menu(). hook_menu is now called only upon module enable/disable and admin/menu. this is outstanding. however, we no longer have a place to stick our drupal_add_css() and drupal_add_js() type calls.
hook_menu() was a bad place for them in the first place anyway. They should be called selectively when needed so as to not cause needless calls to CSS files that don't get used. We don't need another hook for that, we need to just move the drupal_add_css() calls. See: http://drupal.org/node/81835 http://drupal.org/node/82133 I am willing to take care of moving the calls for other core modules if there's buy-in to actually commit them, but I don't want to create still more issues that will get left to rot until the second issue above is committed. -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
A little off topic - but as long as we are talking about changing the menu system a bit... About a kazillion years ago I had wondered if we could allow modules to build 'custom' menus that were outside of the 'navigation' menu tree. I had proposed adding a few lines of code to menu .inc http://drupal.org/files/issues/menu_inc_3.patch (a bit out of date - but you get the idea). I was wondering if anyone has any new thoughts on the idea of modules not being so tied to navigation menu. Basically allowing them to build their own menus that are true menu system menus - (not just navigation jammed into a hook_block). At the time of the patch - I had a specific implementation in mind - where the book module would build a true menu system menu instead of the way it builds its navigation block. This concept would still apply regardless of the menu array structure. Any thoughts? Any other killer applications for modules building menus outside of 'navigation'? andre
About a kazillion years ago I had wondered if we could allow modules to build 'custom' menus that were outside of the 'navigation' menu tree.
This is a desired feature IMO as well. Just like we can create our own blocks on module enable, we should be able to create our custom menus. As well as defining our own menus, I think we should also be able to manually specify the parent/child hierarchy and not be forced to use paths to denote hierarchy (although in most cases it is good practice to do so, parent parent/child parent/child/grandchild). What's the URL of your issue that contains that patch? As maybe we should further discuss this there or in another thread. Rob Roy Barreca Founder and COO Electronic Insight Corporation http://www.electronicinsight.com rob@electronicinsight.com Andre Molnar wrote:
A little off topic - but as long as we are talking about changing the menu system a bit...
About a kazillion years ago I had wondered if we could allow modules to build 'custom' menus that were outside of the 'navigation' menu tree.
I had proposed adding a few lines of code to menu .inc
http://drupal.org/files/issues/menu_inc_3.patch (a bit out of date - but you get the idea).
I was wondering if anyone has any new thoughts on the idea of modules not being so tied to navigation menu. Basically allowing them to build their own menus that are true menu system menus - (not just navigation jammed into a hook_block).
At the time of the patch - I had a specific implementation in mind - where the book module would build a true menu system menu instead of the way it builds its navigation block.
This concept would still apply regardless of the menu array structure.
Any thoughts? Any other killer applications for modules building menus outside of 'navigation'?
andre
Rob Barreca wrote:
What's the URL of your issue that contains that patch? As maybe we should further discuss this there or in another thread.
Exciting! Op dinsdag 26 september 2006 13:58, schreef Karoly Negyesi:
So, we have found node/%/edit. Can we access it? Instead of specifying 'access' => node_access('update', arg(1)) now we do 'access' => 'node_access', 'access arguments' => array('update', 1) . So we use Adrian's favorite: callbacks. I map 1 to arg(1) on runtime for both access and callback arguments. (Side note: I know currently node_access gets a node and not a node id. Adding node_load won't stop us.)
I really suggest again to have a look at ruby on rails routers. They do all you say, people have thought about it very well, they have proved to be working extremely well, without any performance issues. It will also allow us a few steps closer to MVC, without inventing too much wheels ourselves. Words to look for: Routers, Model and Controller. Bèr -- PGP ber@webschuur.com http://www.webschuur.com/sites/webschuur.com/files/ber_webschuur.asc Written while listening to Kiss You Off by Scissor Sisters on Ta-Dah!
On 26 Sep 2006, at 20:01, Bèr Kessels wrote:
I really suggest again to have a look at ruby on rails routers. They do all you say, people have thought about it very well, they have proved to be working extremely well, without any performance issues. It will also allow us a few steps closer to MVC, without inventing too much wheels ourselves.
Words to look for: Routers, Model and Controller.
I tried searching for these but couldn't come up with a satisfying explanation of RoR's routers. I did, however, found http://framework.zend.com/issues/browse/ ZF-166. It demonstrates a PHP implementation of RoR's router conecpt (as implemented in the Zenf framework). -- Dries Buytaert :: http://www.buytaert.net/
Op woensdag 27 september 2006 09:21, schreef Dries Buytaert:
On 26 Sep 2006, at 20:01, Bèr Kessels wrote:
I really suggest again to have a look at ruby on rails routers. They do all you say, people have thought about it very well, they have proved to be working extremely well, without any performance issues. It will also allow us a few steps closer to MVC, without inventing too much wheels ourselves.
Words to look for: Routers, Model and Controller.
I tried searching for these but couldn't come up with a satisfying explanation of RoR's routers.
This is a hands-on tutorial on how to /use/ them: http://manuals.rubyonrails.com/read/chapter/65 When reading that, it roughly translates to Drupal as folloows: Ruby ':year/:month/:day', :controller => 'blog', :action => 'by_date' Druphp : array('#path' => '%year/%month/%day', '#module' => 'blog', '#callback' => blog_by_date) %year/%month/%day will be available in blog_by_date() as $year, $month, $day, or as array('year' => 2006, 'month' => xyz, etc ). If you have some more time, dive into the large docs. http://wiki.rubyonrails.org/rails/pages/routes You will see that routers in RoR go really wild and funky at some point by allowing regexps etc. I think we should focus on the basis, the concept.
I did, however, found http://framework.zend.com/issues/browse/ ZF-166. It demonstrates a PHP implementation of RoR's router conecpt (as implemented in the Zenf framework).
Interesting. Esp since the amout of code is quite small (yet too complex for me to understand :-) ) I am not particularly fond of the :notation, since that is a native Ruby concept (sort of like the Globals in PHP), but here ported to PHP. I think we should try to keep it more native PHP :-) Bèr -- | Bèr Kessels | webschuur.com | Drupal, Joomla and Ruby on Rails web development | | Jabber & Google Talk: ber@jabber.webschuur.com | | http://bler.webschuur.com | http://www.webschuur.com |
On Tuesday 26 September 2006 06:58, Karoly Negyesi wrote:
Hi,
Note: this post is very long. But it worths a reading.
Indeed!
I dare to say that one of the biggest results of DrupalCon is that a new menu system is born.
Who was the midwife?
It all began almost a year ago:
http://lists.drupal.org/archives/development/2005-10/msg00708.html
[quote] There are three basic tasks to consider when talking about performance improvements: - displaying the visible menu (in general, this means all accessible items since we may need them for DHTML solutions) - displaying the breadcrumb trail - finding the active callback - checking access rights [/quote]
OK that's four. I already have code for the last three in my sandbox and Adrian have outlined how the first will happen.
My focus was finding the active callback. I simply followed JonBob's directions: "We could simplify menu declaration by allowing some sort of wildcards in menu paths. For example, if a module could declare the path "node/%d/edit", then we could dispense with the logic that defines the function only if we are in a node path. The code would certainly be cleaner in this case."
I am using the percent sign as a wildcard. There is no %d or %s just %. So you define node/%/edit. Let's say someone navigates to node/12/edit/foo. Now, how do we find out that node/%/edit needs to be run? We generate all possible callbacks and check which one exists:
So far I really like this concept.
1111 node/12/edit/foo 1110 node/12/edit/% 1101 node/12/%/foo 1100 node/12/%/% 1011 node/%/edit/foo 1010 node/%/edit/% 1001 node/%/%/foo 1000 node/%/%/% 111 node/12/edit 110 node/12/% 101 node/%/edit 100 node/%/% 11 node/12 10 node/% 1 node
The binary numbers are generated naturally: where you see a %, that's a 0, where you see a specific part, that's a 1. This will nicely indicate the the order in which they need to be checked (this is my idea and I am very proud of it -- nothing else in here is my idea).
*gives chx a cookie. So then, you'd generate all interpolations of an incoming path (above), then loop and do an isset() on the cached menu array for the path? Something like that?
So, we have found node/%/edit. Can we access it? Instead of specifying 'access' => node_access('update', arg(1)) now we do 'access' => 'node_access', 'access arguments' => array('update', 1) . So we use Adrian's favorite: callbacks. I map 1 to arg(1) on runtime for both access and callback arguments. (Side note: I know currently node_access gets a node and not a node id. Adding node_load won't stop us.)
How well would this support more complex access rules? Currently the end result is a boolean and that's all that's saved. That allows for complex rules like: user_access('break things') || $uid=1 || $uid == $nid->uid (or whatever). How would we do similar compound access rules with this system? It looks like it would only support a single function call, which is not always appropriate.
Our life gets a bit complicated here, because we do not need to grab one menu item but three: one that defines the title, one that defines the (page) callback and one that defines access. Instead of running three queries against a huge table, I planned to retrieve all possible matches, and pick my three items. As a bonus, I will get the breadcrumb trail. Someone at the second menu meeting (Steven? Adrian?) pointed out that I could move this to the builder phase, and that's done. So we are back to retrieving one item which has title, callback and access inherited as needed.
I do not recall the original thread in detail, and was (sadly) not at DrupalCon. Why would we need 3 calls?
At mentioned in the third meeting, Steven pointed out that my system is vulnerable to a simple DoS attack so now menu item definitions are limited to 6 parts. I can hardly imagine this being a real limitation, I can consider raising it to seven but definitely not more. We are shooting at what's useable not all possible cases.
That's probably safe for internal paths. The vast majority of paths accessed are node/# anyway, which doesn't exactly have many permutations. As long as the path alias is unaffected, this should be fine.
Finally we need to display the navigation block. As said, I shoot at what's probable so instead of caching per user I will check every menu item's access on the fly. The probable case is that these are user_access calls. Maybe user_access needs to be changed to use a multi dimensional array if the repeated strpos calls are too slow. With that said, we need a mechanism to discover which menu items are possible. We will simply cache by parent id and unserialize that (credit goes to Adrian for this). Note that these structures are nowhere near the size of the current menu array.
User permissions shouldn't be stored as a comma-delimited string anyway. They should be independent entries, so that we can just to an in_array() or even isset() on an array of permissions a user has.
As a bonus feature, we will have hook_menu_alter. To make that usable, we moved the path to the menu item key. (Silly me was doing it in the builder but Adrian pointed out that I should simply define the menu items this way)
Sweet, more FAPI-like. :-) Would that allow us to *remove* menu items via _menu_alter()?
Aside from Adrian and Steven, big kudos goes to Earl Milles (merlinofchaos).
What's done? On the menu level, I need to write the navigation block, all else is done. I also need to convert modules -- node and user already done. All this took only a dozen hours or so, beginning with the first evening/night of DrupalCon first day ending on the train to Charleroi converting user and three meetings at DrupalCon / BarCamp.
This sounds like a good move overall, but I am curious what a "typical" hook_menu() will look like when this is done? Can you provide or point to a basic example? (An excerpt from the new and improved node_menu() would be fine.) Spiffiness! -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
On 27 Sep 2006, at 4:17 AM, Larry Garfield wrote:
How well would this support more complex access rules? Currently the end result is a boolean and that's all that's saved. That allows for complex rules like:
The issue with a boolean like that however is that it's result is different for every single user. This new menu system entirely removes the may_cache element, and is almost guaranteed to be several orders of magnitude faster and less memory intensive. (disclaimer: 64.5% of statistics are made up on the spot).
user_access('break things') || $uid=1 || $uid == $nid->uid
You can still wrap this in a function and call it. This is sadly the only way this can be done, due to php's unfortunate lack of support for closures.
(or whatever). How would we do similar compound access rules with this system? It looks like it would only support a single function call, which is not always appropriate.
I have a mechanism to allow multiple function calls, but it is still only for a limited amount of flexibility. It is built on the callback specification of forms api, and is formalised using a function as a constructor. : function cb($function) { $args = func_get_args(); $callback = array_shift($args); $data['_properties_'] = array('type' => 'callback')); $data[microtime()] => array('callback' => $callback, 'arguments' => $args); return $data; } Which now allows you to do : 'access' => cb('function', 'arg') + cb('function', 'arg') + cb ('function','arg') In the data api, any property can be a callback instead of a value, in fact it's preferred as callbacks can be cached, but values can't be.
Larry Garfield wrote:
So then, you'd generate all interpolations of an incoming path (above), then loop and do an isset() on the cached menu array for the path? Something like that?
Better. Create all possible variants on the current URL, do a fetch from the menu table using an IN, and ORDER the query by the weight descending, so the highest comes to the top. Then take the first result that comes up. For example: The actual URL entered is: node/12/edit/foo the simplistic version of the query would be: SELECT * FROM menu WHERE path IN ('node/12/edit/foo', 'node/12/edit', 'node/12/%/foo', 'node/%/edit/foo', 'node/%/%/foo', 'node/%/edit', 'node/%') ORDER BY weight DESC; This does have an order of N complexity in that there are 2^N-1 URL possibilities for every URL fragment you have -- which gets unwieldy around 7, but it is rare that you get URLs that long, though I imagine it does happen from time to time. At this time we believe this is an acceptable cost compared to the cost of the system it is replacing, while leaving us with maximum flexibility.
On 28 Sep 2006, at 14:37, Earl Miles wrote:
The actual URL entered is: node/12/edit/foo
the simplistic version of the query would be:
SELECT * FROM menu WHERE path IN ('node/12/edit/foo', 'node/12/ edit', 'node/12/%/foo', 'node/%/edit/foo', 'node/%/%/foo', 'node/%/ edit', 'node/%') ORDER BY weight DESC;
This does have an order of N complexity in that there are 2^N-1 URL possibilities for every URL fragment you have -- which gets unwieldy around 7, but it is rare that you get URLs that long, though I imagine it does happen from time to time. At this time we believe this is an acceptable cost compared to the cost of the system it is replacing, while leaving us with maximum flexibility.
Interesting aproach, Earl. I think this might work, and relatively fast too. :-) -- Dries Buytaert :: http://www.buytaert.net/
Side comment on the use of %. Since we use % for a special meaning, will there be a readability issue here? Too much escaping of %? Would another special character do the trick here?
On 28 Sep 2006, at 14:37, Earl Miles wrote:
The actual URL entered is: node/12/edit/foo
the simplistic version of the query would be:
SELECT * FROM menu WHERE path IN ('node/12/edit/foo', 'node/12/ edit', 'node/12/%/foo', 'node/%/edit/foo', 'node/%/%/foo', 'node/%/ edit', 'node/%') ORDER BY weight DESC;
This does have an order of N complexity in that there are 2^N-1 URL possibilities for every URL fragment you have -- which gets unwieldy around 7, but it is rare that you get URLs that long, though I imagine it does happen from time to time. At this time we believe this is an acceptable cost compared to the cost of the system it is replacing, while leaving us with maximum flexibility.
On Thursday 28 September 2006 12:45, Khalid B wrote:
Side comment on the use of %.
Since we use % for a special meaning, will there be a readability issue here? Too much escaping of %? Would another special character do the trick here?
How about asterisk ("*")? It already implies wildcard in regexp and filename globbing, which makes it an intuitive fit. And in most databases it has no special meaning and therefore doesn't have to be escaped in a string. Remember that in the context of a URL path, we shouldn't be thinking of it as a "database query". That's an internal implementation detail, not relevant to the site designer. Syscrusher -- ------------------------------------------------------------------------------- Syscrusher (Scott Courtney) Drupal page: http://drupal.org/user/9184 syscrusher at 4th dot com Home page: http://4th.com/
On 28 Sep 2006, at 5:56 PM, Dries Buytaert wrote:
Interesting aproach, Earl. I think this might work, and relatively fast too. :-)
And a lot less memory intensive. no building huge arrays that you keep in memory. Also. we get rid of the need for the $may_cache operation.
On 28-Sep-06, at 1:38 PM, adrian rossouw wrote:
Also. we get rid of the need for the $may_cache operation.
er... that was always the case with chx's new approach, no? -- James Walker :: http://walkah.net/ :: xmpp:walkah@walkah.net
On Thursday 28 September 2006 10:56, Dries Buytaert wrote:
On 28 Sep 2006, at 14:37, Earl Miles wrote:
The actual URL entered is: node/12/edit/foo
the simplistic version of the query would be:
SELECT * FROM menu WHERE path IN ('node/12/edit/foo', 'node/12/ edit', 'node/12/%/foo', 'node/%/edit/foo', 'node/%/%/foo', 'node/%/ edit', 'node/%') ORDER BY weight DESC;
This does have an order of N complexity in that there are 2^N-1 URL possibilities for every URL fragment you have -- which gets unwieldy around 7, but it is rare that you get URLs that long, though I imagine it does happen from time to time. At this time we believe this is an acceptable cost compared to the cost of the system it is replacing, while leaving us with maximum flexibility.
Interesting aproach, Earl. I think this might work, and relatively fast too. :-)
Indeed! This qualifies as an "Ooo" in my book. And even if it is a 2^n algorithm, it's still being done in SQL as a not-unusual SQL query. It's absolutely guaranteed that MySQL or Postgres will be faster than any searching we do in PHP. I also agree with Kahlid that we should use something other than % for simplicity. *, ?, #, there's plenty of other characters we can use that wouldn't make the SQL uglier. -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
On 29 Sep 2006, at 04:55, Larry Garfield wrote:
I also agree with Kahlid that we should use something other than % for simplicity. *, ?, #, there's plenty of other characters we can use that wouldn't make the SQL uglier.
I don't care about the wildcard character. We are already using % in some place so I'd optimize for speed. -- Dries Buytaert :: http://www.buytaert.net/
Op 29-sep-06, om 09:06 heeft Dries Buytaert het volgende geschreven:
On 29 Sep 2006, at 04:55, Larry Garfield wrote:
I also agree with Kahlid that we should use something other than % for simplicity. *, ?, #, there's plenty of other characters we can use that wouldn't make the SQL uglier.
I don't care about the wildcard character. We are already using % in some place so I'd optimize for speed.
The wildcard choice does have an effect. There is a problem if a module wants to define a path with a percentage character in it. This can happen for example if you search for "some % search string", the resulting path is "search/node/some % search string". The only solution would be to remove % signs from the path. This is not a problem in search (it is considered a garbage character), but it might be elsewhere (pathauto). However, we only allow full wildcards, i.E. where an entire argument is "%", so this problem is not too big. We just need to be aware of it. Note that this has nothing to do with the "%XX" urlencoding escape. Urlencoding is only applied when outputting the menu path as a url (e.g. when we add base_path() and ?q=), and it is also decoded before PHP passes the data into $_GET. Steven
On Sep 26, 2006, at 4:58 AM, Karoly Negyesi wrote:
At mentioned in the third meeting, Steven pointed out that my system is vulnerable to a simple DoS attack so now menu item definitions are limited to 6 parts. I can hardly imagine this being a real limitation, I can consider raising it to seven but definitely not more. We are shooting at what's useable not all possible cases.
folks should keep in mind that the 6th part would still contain the rest of the URL (we just limit an explode on '/', so we get an array of at most 6 elements). if you *desperately* needed to register different menu items, you could (with some limitations), and you'd still be able to get to the whole path via arg(X), even if some crazy URL ended up being 8 or 9 levels deep. On Sep 26, 2006, at 8:36 AM, Jeff Eaton wrote:
Separate hooks with clear meanings is a happy thing, IMO at least.
+1 to that. ;) overall, i love it. thanks to everyone involved, especially chx "the midwife". ;) -derek
On 26 Sep 2006, at 13:58, Karoly Negyesi wrote:
My focus was finding the active callback. I simply followed JonBob's directions: "We could simplify menu declaration by allowing some sort of wildcards in menu paths. For example, if a module could declare the path "node/%d/edit", then we could dispense with the logic that defines the function only if we are in a node path. The code would certainly be cleaner in this case."
I am using the percent sign as a wildcard. There is no %d or %s just %. So you define node/%/edit. Let's say someone navigates to node/12/edit/foo. Now, how do we find out that node/%/edit needs to be run? We generate all possible callbacks and check which one exists:
Any particular reason not to use %d and %s? I'd favor the use of %d and %s (instead of %) for two reasons: 1. Security. It allows us to cast URL parameters to their proper type. This helps to prevent XSS/SQL injection attacks. 2. Consistency. People familiar with the database API can easily guess what they do.
The binary numbers are generated naturally: where you see a %, that's a 0, where you see a specific part, that's a 1. This will nicely indicate the the order in which they need to be checked (this is my idea and I am very proud of it -- nothing else in here is my idea).
Mmm. So, say we have 250 menu callbacks in the system, and we're trying to match 'node/12/edit/foo'. There exist 15 permutations of 'node/12/edit/foo' so in the worst case scenario we'd have to do 250 * 15 = 3750 string comparisons to find the proper handler? That sounds excessive ... and is bound to be really slow. Maybe I'm mistaken?
Our life gets a bit complicated here, because we do not need to grab one menu item but three: one that defines the title, one that defines the (page) callback and one that defines access. Instead of running three queries against a huge table, I planned to retrieve all possible matches, and pick my three items. As a bonus, I will get the breadcrumb trail. Someone at the second menu meeting (Steven? Adrian?) pointed out that I could move this to the builder phase, and that's done. So we are back to retrieving one item which has title, callback and access inherited as needed.
This part I don't understand ... -- Dries Buytaert :: http://www.buytaert.net/
On 27 Sep 2006, at 9:35 AM, Dries Buytaert wrote:
Any particular reason not to use %d and %s? I'd favor the use of % d and %s (instead of %) for two reasons:
1. Security. It allows us to cast URL parameters to their proper type. This helps to prevent XSS/SQL injection attacks.
2. Consistency. People familiar with the database API can easily guess what they do. It DRAMATICALLY increases the number of possible matches.
Secondly, I don't feel that chx actually explained this correctly, but we store the generated menu items, into a normalised database table. So we can get the callback through a single 'select * from {menu_tree} where path in (/* generated list of possible options */) order by likelihood, slashes, wildcards'; so you get 1 single sql query that can get the result, instead of unserialising the massive tree.
adrian rossouw wrote:
Secondly, I don't feel that chx actually explained this correctly, but we store the generated menu items, into a normalised database table.
So we can get the callback through a single 'select * from {menu_tree} where path in (/* generated list of possible options */) order by likelihood, slashes, wildcards';
so you get 1 single sql query that can get the result, instead of unserialising the massive tree.
Is the number of items in "generated list of possible options" -- that is, the SQL IN LIST small? Is "path" an indexed column? Doing select statements with IN lists is often quite inefficient. Is the expected result set a single item, a few items or a lot of items? (The answer is probably evident from an understanding of just exactly how this menu scheme works, but I admit I have not quite grasped it completely yet.) Unserializing a massive tree is certainly slow. Let's be sure we don't shoot ourselves in the foot with an algorithm which may be subject to SQL end-cases that take forever to retrieve. :-) If this makes a big improvement in performance as apparently it is believed to make possible, then it will be one of the best improvements to core.
Is the number of items in "generated list of possible options" -- that is, the SQL IN LIST small? Is "path" an indexed column? Doing select statements with IN lists is often quite inefficient.
For n pieces of URL parts, it's 2^n - 1. We cut at six, so it's maximum is 63. Here it is for three pieces: node/12/edit node/12/% node/%/edit node/%/% node/12 node/% node this is seven. Whether path is an indexed column or not is not a question at this point. It's more like an assert: path must be indexed.
Is the expected result set a single item, a few items or a lot of items? (The answer is probably evident from an understanding of just exactly how this menu scheme works, but I admit I have not quite grasped it completely yet.)
One item. Everything is packed into that item by the builder. At some point I thought you will have three items, then approximately n items or less (depending on how many children you have). But then came the BarCamp menu talk and now everything is in the builder.
Unserializing a massive tree is certainly slow. Let's be sure we don't shoot ourselves in the foot with an algorithm which may be subject to SQL end-cases that take forever to retrieve. :-)
Let's hope not. We are trying to make it as fast as possible. Currently, I work on the builder, to build an array out of depths and weights, use array_multisort and then vancode the stuff so that the navigation block is in correct order. Regards NK
On Sep 27, 2006, at 12:35 AM, Dries Buytaert wrote:
Any particular reason not to use %d and %s?
for example, a large # of the menu items registered in the project module and friends require an argument to specify which project you're operating on, but that could be either numeric (project node id) or a string (the project's short name). in most cases, the project id is currently the last element of the URL, so it's not registered as part of the menu path anyway, but i could certainly imagine a situation where i'd need the arg in the middle of the URL with additional path elements at the end (especially in some of the schemes i've thought of to try to solve http://drupal.org/node/ 66981). assuming i need some menu items like: project/[id]/[function] where [function] is a string describing the action this page will perform, and [id] is either an int (project nid) or string (project name), why should i have to register this as 2 separate menu items (one with a %d and the other with a %s)? they're both going to be identical in every other respect, and unless i'm insane, the callback function for both items will be a single function that just checks "is_numeric(arg(1))" and does the right thing depending on what kind of argument it is. just providing a potential use-case for a generic % in the menu path -- i'm not set on it (i've got more important holy wars to wage). ;) thanks, -derek
On 28 Sep 2006, at 6:00 PM, Derek Wright wrote:
where [function] is a string describing the action this page will perform, and [id] is either an int (project nid) or string (project name), why should i have to register this as 2 separate menu items (one with a %d and the other with a %s)? they're both going to be identical in every other respect, and unless i'm insane, the callback function for both items will be a single function that just checks "is_numeric(arg(1))" and does the right thing depending on what kind of argument it is.
Yeah. we decided that a page callback could just do if (!is_numeric) { drupal_page_not_found(); } that's completely acceptable imo
----- Start Original Message ----- Sent: Thu, 28 Sep 2006 09:00:38 -0700 From: Derek Wright <drupal@dwwright.net> To: development@drupal.org Subject: Re: [development] The new menu system
On Sep 27, 2006, at 12:35 AM, Dries Buytaert wrote:
Any particular reason not to use %d and %s?
Yes. 3^n possibiltiies instead of 2^n is a horrible cost. And also, we are matching in the database not with sprintf. Argument type is not to be solved IMNSHO on this level. Regarding *, I think some modules already use that, namely views. There won't be too much escaping. You will never ever emit a path like node/%/foo , that's an internal id. Regards NK
Yes. 3^n possibiltiies instead of 2^n is a horrible cost. And also, we are matching in the database not with sprintf. Argument type is not to be solved IMNSHO on this level.
Regarding *, I think some modules already use that, namely views. There won't be too much escaping. You will never ever emit a path like node/%/foo , that's an internal id.
Have you considered using %s and %d, but considering them equivalent for the lookup process? e.g. I could not define node/%s/edit and node/%d/edit, but if I define the second one, the argument gets cast to integer when passed in. Steven
Have you considered using %s and %d, but considering them equivalent for the lookup process?
e.g. I could not define node/%s/edit and node/%d/edit, but if I define the second one, the argument gets cast to integer when passed in.
Yes. This is called mask (a callback, defaults to sprintf) and mask arguments (like array('%s/%s/%d') but can be whatever). This will be applied like: $_GET['q'] = call_user_func_array($mask_callback, array_merge($mask_arguments, explode('/', $_GET['q']))); It is planned on the second run of the patch or Drumm will have my hide for stuffing too many things in one patch.
On 28 Sep 2006, at 18:00, Derek Wright wrote:
just providing a potential use-case for a generic % in the menu path -- i'm not set on it (i've got more important holy wars to wage). ;)
I think the use-case is somewhat moot. %s would be a generic attribute (as only %d's would be cast to integers). The performance related argument and the query-ability argument, however, makes sense to me. Thanks for the various clarifications. I'll look at the sandbox as soon time permits! :) -- Dries Buytaert :: http://www.buytaert.net/
On Tuesday 26 September 2006 07:58, Karoly Negyesi wrote:
At mentioned in the third meeting, Steven pointed out that my system is vulnerable to a simple DoS attack so now menu item definitions are limited to 6 parts. I can hardly imagine this being a real limitation, I can consider raising it to seven but definitely not more. We are shooting at what's useable not all possible cases.
I have a custom module developed for my employer's intranet that has what I consider to be pretty deep menus. From that perspective, I still agree with you that 6 is plenty, because even this ugly module of mine only uses 5 maximum. I can't think what I'd ever do with more than 6. I'm assuming, though, that your restriction doesn't affect user-defined paths from path.module, right? On that assumption... +1 Syscrusher -- ------------------------------------------------------------------------------- Syscrusher (Scott Courtney) Drupal page: http://drupal.org/user/9184 syscrusher at 4th dot com Home page: http://4th.com/
participants (22)
-
adrian rossouw -
Andre Molnar -
Bèr Kessels -
Chris Johnson -
Derek Wright -
Dries Buytaert -
Earl Miles -
Gabor Hojtsy -
Gerhard Killesreiter -
James Walker -
Jeff Eaton -
John VanDyk -
Karoly Negyesi -
karoly@negyesi.net -
Khalid B -
Larry Garfield -
Moshe Weitzman -
Rob Barreca -
Scott McLewin -
Steve Ringwood -
Steven Wittens -
Syscrusher