After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains). There are 2 parts to this. Part 1: drupal_register_callback function drupal_register_callback($callback, $function = NULL, $file = NULL) This function is only called from a particular location; ideally in its own separate .register file that only needs to be loaded when the callback registry is being rebuilt. However, it could also be a function within the .module file. There are some questions here, primarily balancing ease of use for the developer versus performance for the system. When it is called, the system should already know which module context is being used. This function would register a given callback (examples: nodeapi, form_alter, etc) with the Drupal system. The registry would replace the logic module_implements uses to determine which module implements which hook. The $function would be what's called. In today's world the function would be called mymodule_nodeapi; and it would likely not change, but we should allow registration to specify the function used. If left blank, the default function name would be constructed. (i.e, modulename_hookname). The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present. When you can conditionally load, however, the required size of the .module becomes very small. (in fact you would only keep items in that file that are called on every or nearly every page load. Currently that's hook_init and hook_menu but with changes to hook_menu that one may well be eliminated from every-page-load calls, and the .module file itself may not even need to be loaded for every page). Invoking a callback would still be the same: module_invoke('callback', arguments...); Part 2: function drupal_register_theme($theme, $args, $function = NULL, $file = NULL) This works very similarly to callbacks, but it registers a theming function that can be overridden by the theme. The only difference in is the $args argument, which is an array that tells the function how to translate function arguments into the $vars array, for use in .tpl.php files. For example, let's say I have the following theme function: function theme_foo($node, $bar, $baz); The $args array would be: array('node', 'bar', 'baz'). Themes would also have a similar registry; we may need to make it slightly more complex so that it could work with the theme engine to automatically register a template. Which means we might want to find a way to make it simpler for themers to register the templates they're using (for example, a directory scan could find well-named template files and automatically register them based upon already known registered info). This entire system (callbacks and functions) would be cached, so information would only need to be re-read when modules and theme settings are updated. A 'development' mode might be necessary that re-reads this information every page load. There would probably be some way for devel.module to expose this setting and automatically disable development mode when devel.module is enabled. Overall benefits to this system: 1) Some execution time is saved looking up module hooks. I don't believe this is a great deal of execution time, but it is definitely some. 2) A great deal of code can be loaded conditionally, reducing Drupal's footprint. I believe this could be a *very* significant boost to Drupal and could vastly reduce the amount of code that is loaded, making Drupal far, far friendlier to hosts that cannot use an opcode cache. 3) Certain namespace collisions will be avoided. Developers will no longer be penalized for forgetting that mymodule_user() is the user hook. 4) Theme functions will no longer have to hunt around for their function; they will not have to test the file-system to see if templates exist. They will know exactly where to go when the time is right. 5) Theme template files will be easier to include in modules. In fact, if implemented properly, modules could actually include .tpl.php files, make the proper registration, and no theme code will be harmed by being in the .module file *at all*. 6) phptemplate.engine can have much of its code removed to core, making it easier to implement alternative template engines. 7) As an interesting side effect, themes could be made to implement hook_form_alter, if we coded it that way. There is a value to this, and also an argument why this might not be a good thing. Overall cons: 1) Having to register a callback is less convenient than simply naming a function. It's even more inconvenient to have a .register file. However, using the existing naming convention, the module_builder.module *could* be made to scan a module and produce a .register file or a register function. 2) Having to register theme functions is similarly inconvenient but I think it's no worse than we have now. 3) Having to reregister upon making changes is very inconvenient; a development mode would be a must. Thoughts?
On Tue, 23 Jan 2007, Earl Miles wrote:
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present.
Well, I am increasingly adopting a similar coding practice. A few days ago, I also split up archive.module into a small stub which registers the menu item and has a page callback, and then that page callback includes archive.inc which contains all the magic to display an archive page. This does eliminate a great deal of parsing for PHP. Although we use an opcode cache here, we observed greatly reduced memory usage if we take care of doing our own modules this way. (I don't have hard numbers unfortunately). Gabor
Gabor Hojtsy wrote:
On Tue, 23 Jan 2007, Earl Miles wrote:
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present.
Well, I am increasingly adopting a similar coding practice. A few days ago, I also split up archive.module into a small stub which registers the menu item and has a page callback, and then that page callback includes archive.inc which contains all the magic to display an archive page.
Can I just quickly claim prior art for this ;-? http://cvs.drupal.org/viewcvs/drupal/contributions/modules/notify/notify.mod...
This does eliminate a great deal of parsing for PHP. Although we use an opcode cache here, we observed greatly reduced memory usage if we take care of doing our own modules this way. (I don't have hard numbers unfortunately).
The problem with this approach is that it needs to be finetuned for every module. For most modules you can safely but admin related stuff into an .inc file, sometimes you can put other stuff into an .inc too. The ultimate vesion of this was Karoly's split patch, which he has abandoned. I think it would be intersting to learn about any advantage this idea has even if you use APC. Cheers, Gerhard
http://cvs.drupal.org/viewcvs/drupal/contributions/modules/notify/notify.mod...
Heh, and I combined that .inc back into notify.module to have all the code in one place. Shows how much I know! :-[ Rob Roy Barreca Founder and COO Electronic Insight Corporation http://www.electronicinsight.com rob@electronicinsight.com Gerhard Killesreiter wrote:
Gabor Hojtsy wrote:
On Tue, 23 Jan 2007, Earl Miles wrote:
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present.
Well, I am increasingly adopting a similar coding practice. A few days ago, I also split up archive.module into a small stub which registers the menu item and has a page callback, and then that page callback includes archive.inc which contains all the magic to display an archive page.
Can I just quickly claim prior art for this ;-?
http://cvs.drupal.org/viewcvs/drupal/contributions/modules/notify/notify.mod...
This does eliminate a great deal of parsing for PHP. Although we use an opcode cache here, we observed greatly reduced memory usage if we take care of doing our own modules this way. (I don't have hard numbers unfortunately).
The problem with this approach is that it needs to be finetuned for every module. For most modules you can safely but admin related stuff into an .inc file, sometimes you can put other stuff into an .inc too.
The ultimate vesion of this was Karoly's split patch, which he has abandoned.
I think it would be intersting to learn about any advantage this idea has even if you use APC.
Cheers, Gerhard
On Jan 23, 2007, at 4:34 PM, Gabor Hojtsy wrote:
Well, I am increasingly adopting a similar coding practice. A few days ago, I also split up archive.module into a small stub which registers the menu item and has a page callback, and then that page callback includes archive.inc which contains all the magic to display an archive page. This does eliminate a great deal of parsing for PHP. Although we use an opcode cache here, we observed greatly reduced memory usage if we take care of doing our own modules this way. (I don't have hard numbers unfortunately).
FWIW, I've been doing the same. To keep things consistent, each of my modules that's more than few hundred lines has a _module function that includes the inc file: function foo_menu($may_cache) { $items[] = array('path' => 'admin/user/foo', 'title' => t('Foo'), 'callback' => '_foo', 'callback arguments' => array('admin'), ); // ... return $items; } function foo_form($node) { return _foo('node_form', $node); } function _foo($func) { require_once dirname(__FILE__).'/foo.inc'; if (function_exists($func = '_foo_'.$func)) { $args = func_get_args(); unset($args[0]); return call_user_func_array($func, $args); } } And then, in the foo.inc file, I can implement _foo_admin, _foo_node_form and any other functions I'd want to call. Naturally, I try to keep all of the code required for normal operations in the .module file. (foo_user, view/load portions of foo_nodeapi, foo_block, etc) It would be counter-productive to invoke _foo() on every page load. Allie Micka pajunas interactive, inc. http://www.pajunas.com scalable web hosting and open source strategies
+1 from me provided the performance gain is real. -- Sammy Spets Synerger Pty Ltd http://synerger.com On 23-Jan-07 14:21, Earl Miles wrote:
After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains). There are 2 parts to this.
Part 1: drupal_register_callback
function drupal_register_callback($callback, $function = NULL, $file = NULL)
This function is only called from a particular location; ideally in its own separate .register file that only needs to be loaded when the callback registry is being rebuilt. However, it could also be a function within the .module file. There are some questions here, primarily balancing ease of use for the developer versus performance for the system. When it is called, the system should already know which module context is being used.
This function would register a given callback (examples: nodeapi, form_alter, etc) with the Drupal system. The registry would replace the logic module_implements uses to determine which module implements which hook.
The $function would be what's called. In today's world the function would be called mymodule_nodeapi; and it would likely not change, but we should allow registration to specify the function used. If left blank, the default function name would be constructed. (i.e, modulename_hookname).
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present. When you can conditionally load, however, the required size of the .module becomes very small. (in fact you would only keep items in that file that are called on every or nearly every page load. Currently that's hook_init and hook_menu but with changes to hook_menu that one may well be eliminated from every-page-load calls, and the .module file itself may not even need to be loaded for every page).
Invoking a callback would still be the same: module_invoke('callback', arguments...);
Part 2: function drupal_register_theme($theme, $args, $function = NULL, $file = NULL)
This works very similarly to callbacks, but it registers a theming function that can be overridden by the theme. The only difference in is the $args argument, which is an array that tells the function how to translate function arguments into the $vars array, for use in .tpl.php files.
For example, let's say I have the following theme function:
function theme_foo($node, $bar, $baz);
The $args array would be: array('node', 'bar', 'baz').
Themes would also have a similar registry; we may need to make it slightly more complex so that it could work with the theme engine to automatically register a template. Which means we might want to find a way to make it simpler for themers to register the templates they're using (for example, a directory scan could find well-named template files and automatically register them based upon already known registered info).
This entire system (callbacks and functions) would be cached, so information would only need to be re-read when modules and theme settings are updated. A 'development' mode might be necessary that re-reads this information every page load. There would probably be some way for devel.module to expose this setting and automatically disable development mode when devel.module is enabled.
Overall benefits to this system:
1) Some execution time is saved looking up module hooks. I don't believe this is a great deal of execution time, but it is definitely some.
2) A great deal of code can be loaded conditionally, reducing Drupal's footprint. I believe this could be a *very* significant boost to Drupal and could vastly reduce the amount of code that is loaded, making Drupal far, far friendlier to hosts that cannot use an opcode cache.
3) Certain namespace collisions will be avoided. Developers will no longer be penalized for forgetting that mymodule_user() is the user hook.
4) Theme functions will no longer have to hunt around for their function; they will not have to test the file-system to see if templates exist. They will know exactly where to go when the time is right.
5) Theme template files will be easier to include in modules. In fact, if implemented properly, modules could actually include .tpl.php files, make the proper registration, and no theme code will be harmed by being in the .module file *at all*.
6) phptemplate.engine can have much of its code removed to core, making it easier to implement alternative template engines.
7) As an interesting side effect, themes could be made to implement hook_form_alter, if we coded it that way. There is a value to this, and also an argument why this might not be a good thing.
Overall cons:
1) Having to register a callback is less convenient than simply naming a function. It's even more inconvenient to have a .register file. However, using the existing naming convention, the module_builder.module *could* be made to scan a module and produce a .register file or a register function.
2) Having to register theme functions is similarly inconvenient but I think it's no worse than we have now.
3) Having to reregister upon making changes is very inconvenient; a development mode would be a must.
Thoughts?
Sammy Spets wrote:
+1 from me provided the performance gain is real.
I think the first actual work step would be to do a light implementation which does the following things: o Implements a simple drupal_register_callback o Changes module_implements and module_invoke appropriately o Looks for instances where module_invoke is being used improperly o Identifies some chunk of code to be set aside And then some benchmarks to be run. I think the biggest gains will be in theming, in the end, but the conditional includes should be able to make a fair chunk of code not be loaded. One point I'd like to make: Drupal uses a LOT of CPU on many shared hosts. Theoretically, if done well, this method could reduce the amount of CPU it uses simply by reducing its footprint, especially if some of the larger modules (*cough* Views, CCK, ecommerce) are able to restructure to take advantage of it.
+1 from me too. I think the potential for speed is huge for this patch. The less files we load the faster Drupal will be for sure. Some comments: - I am torn about the .register file though. Why? Because we are moving towards too many files already (first .install, then .info, then .register, then .foo, ad nauseum). Perhaps this is overcome by one hook only modulename_register() or modulename_callbacks(). - Another option is to have a compiler for Drupal. Once a site is "compiled" all the modulename_callbacks() functions are written to ONE file for all the site's modules and themes, or perhaps two files, one for core and the other for contrib. This single file (or two), can be loaded once instead of some 30 .callback or .module files, which will be faster. We then get the benefits of Karoly's split patch as a bonus. - In development mode, the compile step is dynamic, making the site slower, but guaranteed to be "fresh". - We can even make the development mode be the default if we want, and the compile step is an optimization step for sites who need it.
Khalid B wrote:
- I am torn about the .register file though. Why? Because we are moving towards too many files already (first .install, then .info, then .register, then .foo, ad nauseum). Perhaps this is overcome by one hook only modulename_register() or modulename_callbacks().
See Karoly's response. I think that combined with some kind of special registry hook is probably better than a .register file. And it easily incorporates his before/after bits, which are brilliant and I had forgotten about them.
On 1/23/07, Khalid B <kb@2bits.com> wrote:
+1 from me too.
I think the potential for speed is huge for this patch. The less files we load the faster Drupal will be for sure.
Some comments:
- I am torn about the .register file though. Why? Because we are moving towards too many files already (first .install, then .info, then .register, then .foo, ad nauseum). Perhaps this is overcome by one hook only modulename_register() or modulename_callbacks().
- Another option is to have a compiler for Drupal. Once a site is "compiled" all the modulename_callbacks() functions are written to ONE file for all the site's modules and themes, or perhaps two files, one for core and the other for contrib. This single file (or two), can be loaded once instead of some
30 .callback or .module files, which will be faster. We then get the benefits of Karoly's split patch as a bonus.
- In development mode, the compile step is dynamic, making the site slower, but guaranteed to be "fresh".
- We can even make the development mode be the default if we want, and the compile step is an optimization step for sites who need it.
More thinking aloud: The compile step can be done on-demand as well, but requires the web server to have write access to the master .callback. It can detect if the time of the files have changed and then generate the master .callback file. Much like we do the menu cache in the database today. Heck, we can store the callbacks in the database as well. Not sure if that is faster or not. We already store the modules and themes in the system table.
On 1/24/07, Khalid B <kb@2bits.com> wrote:
- Another option is to have a compiler for Drupal. Once a site is "compiled" all the modulename_callbacks() functions are written to ONE file for all the site's modules and themes, or perhaps two files, one for core and the other for contrib. This single file (or two), can be loaded once instead of some 30 .callback or .module files, which will be faster. We then get the benefits of Karoly's split patch as a bonus.
Why would you want to compile all the callback functions into ONE file? Sure, that would reduce disk I/O, but it would mean that we're not reducing our memory footprint by nearly as much as we could/should. A better option, IMO, would be to compile code based on the menu paths for which that code is needed. Because of wildcards (e.g. 'node/%' is just one path, rather than 'node/1', 'node/2', etc. being however many thousand paths), this will make much more sense in the Drupal 6 menu system than it would for the Drupal 5 menu system. For each menu path, we could have a file that is a cache of the functions needed by that path. Kind of analogous to page caching, except we're caching logic, rather than output. This would give us the best of both worlds: 1. Minimum disk I/O: we're only loading ONE file (for modules) on each page view. 2. Minimum memory footprint: that one file contains ONLY the functions/callbacks that are actually used on that page view. It would, however, have the big disadvantage of Drupal generating hundreds of compiled-code cache files. Perhaps we could take advantage of the cache API instead? E.g. store the compiled code for each page in the database by default, but use file-based caching if that cache engine is installed, or memcached, etc. Cheers, Jaza.
On 1/23/07, Jeremy Epstein <jazepstein@gmail.com> wrote:
On 1/24/07, Khalid B <kb@2bits.com> wrote:
- Another option is to have a compiler for Drupal. Once a site is "compiled" all the modulename_callbacks() functions are written to ONE file for all the site's modules and themes, or perhaps two files, one for core and the other for contrib. This single file (or two), can be loaded once instead of some 30 .callback or .module files, which will be faster. We then get the benefits of Karoly's split patch as a bonus.
Why would you want to compile all the callback functions into ONE file? Sure, that would reduce disk I/O, but it would mean that we're not reducing our memory footprint by nearly as much as we could/should.
By "compile", I did not mean real compilation of PHP code into byte code or something. Just a pre-processing step that is Drupal specific of merging all the callback definitions into one file to load (or one table in the DB). Just the definitions, not the actual code. So, only one file (table) needs to get loaded on each page view, and that contains a pointer to which files contain which functions, and what callback (like what Earl said initially). Loading one file is faster than loading 30 of them for every page. Then we go to the actual file on an as needed basis, not having to read all the files in as we do today. A better option, IMO, would be to compile code based on the menu paths
for which that code is needed. Because of wildcards (e.g. 'node/%' is just one path, rather than 'node/1', 'node/2', etc. being however many thousand paths), this will make much more sense in the Drupal 6 menu system than it would for the Drupal 5 menu system. For each menu path, we could have a file that is a cache of the functions needed by that path. Kind of analogous to page caching, except we're caching logic, rather than output.
This would give us the best of both worlds:
1. Minimum disk I/O: we're only loading ONE file (for modules) on each page view.
That is what I was saying, but with Earl's original idea of callbacks. 2. Minimum memory footprint: that one file contains ONLY the
functions/callbacks that are actually used on that page view.
Exactly what I said. It would, however, have the big disadvantage of Drupal generating
hundreds of compiled-code cache files. Perhaps we could take advantage of the cache API instead? E.g. store the compiled code for each page in the database by default, but use file-based caching if that cache engine is installed, or memcached, etc.
Not sure what the actual impact of APC and friends will be here. Cheers,
Jaza.
I agree that callbacks make more sense than hooks, in general, and that we should be trying to use callbacks instead of hooks wherever possible. Non-technical analogy that I just thought of: If you have 100 people in a room, and you're trying to find all the people in the room who are from Africa (let's say there's 30 African people in the room - but you don't know that), then it will take a lot longer to look at each person's name tag, and see if it says 'Africa' on it, than to simply have the 30 people come up to you, and tell you that they're the ones from Africa. Similarly, it takes Drupal a lot longer to go around and look if 50 modules implement hook_nodeapi, than it would take for the 12 modules in question to just tell Drupal: "hey btw, I implement the nodeapi callback". However, if you have 100 people in the room, and 98 of them are Africans, then it's probably easier to just go through each one. Otherwise the Africans will get annoyed, and say: "hey, we're almost all African, so why does each one of us have to tell you where we're from?" Similarly, I think that the (few) hooks that are implemented by virtually every module, should remain hooks. It would just be a pain in the a$$ for module developers, otherwise. This is currently only hook_help() and hook_menu(), AFAIK. I'm not sure if the new menu system in Drupal 6 will mean less hook_menu() implementations; but if it doesn't, then hook_menu() should definitely remain a hook, not a callback. Also, I agree that the performance gain needs to be worth the pain (of having to register heaps of callbacks). Benchmark results will be needed. The whole 'linking include files to registered callbacks' idea seems to me like a GREAT way to kill two birds with one stone. Good thinking, Earl! Hang on... you're thinking of applying it all to theme functions/templates as well... make that 3 birds. With ketchup. :P Cheers, Jaza. On 1/24/07, Earl Miles <merlin@logrus.com> wrote:
After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains). There are 2 parts to this.
Part 1: drupal_register_callback
function drupal_register_callback($callback, $function = NULL, $file = NULL)
This function is only called from a particular location; ideally in its own separate .register file that only needs to be loaded when the callback registry is being rebuilt. However, it could also be a function within the .module file. There are some questions here, primarily balancing ease of use for the developer versus performance for the system. When it is called, the system should already know which module context is being used.
This function would register a given callback (examples: nodeapi, form_alter, etc) with the Drupal system. The registry would replace the logic module_implements uses to determine which module implements which hook.
The $function would be what's called. In today's world the function would be called mymodule_nodeapi; and it would likely not change, but we should allow registration to specify the function used. If left blank, the default function name would be constructed. (i.e, modulename_hookname).
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present. When you can conditionally load, however, the required size of the .module becomes very small. (in fact you would only keep items in that file that are called on every or nearly every page load. Currently that's hook_init and hook_menu but with changes to hook_menu that one may well be eliminated from every-page-load calls, and the .module file itself may not even need to be loaded for every page).
Invoking a callback would still be the same: module_invoke('callback', arguments...);
Part 2: function drupal_register_theme($theme, $args, $function = NULL, $file = NULL)
This works very similarly to callbacks, but it registers a theming function that can be overridden by the theme. The only difference in is the $args argument, which is an array that tells the function how to translate function arguments into the $vars array, for use in .tpl.php files.
For example, let's say I have the following theme function:
function theme_foo($node, $bar, $baz);
The $args array would be: array('node', 'bar', 'baz').
Themes would also have a similar registry; we may need to make it slightly more complex so that it could work with the theme engine to automatically register a template. Which means we might want to find a way to make it simpler for themers to register the templates they're using (for example, a directory scan could find well-named template files and automatically register them based upon already known registered info).
This entire system (callbacks and functions) would be cached, so information would only need to be re-read when modules and theme settings are updated. A 'development' mode might be necessary that re-reads this information every page load. There would probably be some way for devel.module to expose this setting and automatically disable development mode when devel.module is enabled.
Overall benefits to this system:
1) Some execution time is saved looking up module hooks. I don't believe this is a great deal of execution time, but it is definitely some.
2) A great deal of code can be loaded conditionally, reducing Drupal's footprint. I believe this could be a *very* significant boost to Drupal and could vastly reduce the amount of code that is loaded, making Drupal far, far friendlier to hosts that cannot use an opcode cache.
3) Certain namespace collisions will be avoided. Developers will no longer be penalized for forgetting that mymodule_user() is the user hook.
4) Theme functions will no longer have to hunt around for their function; they will not have to test the file-system to see if templates exist. They will know exactly where to go when the time is right.
5) Theme template files will be easier to include in modules. In fact, if implemented properly, modules could actually include .tpl.php files, make the proper registration, and no theme code will be harmed by being in the .module file *at all*.
6) phptemplate.engine can have much of its code removed to core, making it easier to implement alternative template engines.
7) As an interesting side effect, themes could be made to implement hook_form_alter, if we coded it that way. There is a value to this, and also an argument why this might not be a good thing.
Overall cons:
1) Having to register a callback is less convenient than simply naming a function. It's even more inconvenient to have a .register file. However, using the existing naming convention, the module_builder.module *could* be made to scan a module and produce a .register file or a register function.
2) Having to register theme functions is similarly inconvenient but I think it's no worse than we have now.
3) Having to reregister upon making changes is very inconvenient; a development mode would be a must.
Thoughts?
A big -1 on eliminating the hook concept. A big +1 on registering and adding file concept. The way I see the future of hooks: a) hook_define_hooks, herein you can define hooks. b) on module enable time, Drupal scans your module for hooks and registers them for you -- this would merely be a cached version of module_implements. But, if you want, you can register for yourself. You could define whether you want to come before or after other module's implementation (see weights.php in my sandbox for an implementation of this -- THAT is prior art :) ) and now with Merlin's idea you could define files as well. Yes, files, it needs to be an array for what I believe, obvious reasons. This way if you do not want to bother with registering then you do not need to. If you want to then it benefits you. Split mode is woe because of references and because everything becomes call_user_func_array called -- cufa is not fast enough. Also, too many files need to be included. Let's task the humans with grouping their functions together into chunks. Regards, NK
The way I see the future of hooks:
a) hook_define_hooks, herein you can define hooks.
b) on module enable time, Drupal scans your module for hooks and registers them for you -- this would merely be a cached version of module_implements. But, if you want, you can register for yourself. You could define whether you want to come before or after other module's implementation (see weights.php in my sandbox for an implementation of this -- THAT is prior art :) ) and now with Merlin's idea you could define files as well. Yes, files, it needs to be an array for what I believe, obvious reasons.
This way if you do
This is just perfect.
Assorted thoughts in no special order: The biggest benefit, IMO, is being able to conditionally load code that on a production site is needed only rarely. Eg, hook_menu after chx's mega-patch goes in, most of the views definition hooks, etc. Those are only needed on cache rebuilding. Parsing them on every page load is a waste. At the same time, however, parsing and compiling PHP is not the slowest step in a page hit. The disk IO is. Every single PHP-Internals person I've talked to online or off has been very adamant that disk IO is the #1 killer of PHP applications, and simply reading the source file off disk is slower than compiling it. I vaguely recall an issue queue that was working on splitting out all theme functions to .tpl.php files many months ago that saw a noticeable performance hit when theme.inc was dissected into a bunch of files. How much compiling we'd have to remove to make up for the increased disk IO on those page loads that would require it is something only benchmarking will tell us. I'm not sure why we would need arbitrary files in which to put conditional functions, though. I'd think a logical grouping done by the developer would be better, as it forces us to consider when things will be used as well as saves some processing logic. For instance, .module for always-load, .cache for things that only need to be loaded to rebuild a cache (hook_menu2, hook_views_tables(), etc.), I'd even go as far as a .node for the node method/callback/pseudo-hook things. From an organizational standpoint I'd prefer having many files to one huge file, if it weren't for the performance hit. I also agree with chx that most hooks/callbacks/whatevers can and should be auto-derived. That reduces the amount of code needed in .module to define what's not in .module. :-) Theme functions couldn't be auto-detected because of the parameters to them being so variable, but I'm perfectly fine with that. In fact, hook_theme() in which one just calls drupal_register_theme() a bunch of times would be another perfect example to go in the "not always" file. I don't think themes themselves need to register what they override. That could be scanned and cached, too, on the theme admin page. A theme() call then just becomes a lookup to see which of the three locations, in order, a theme file exists. Just borrow some of chx's menu magic SQL to find the "right" template. As for development, a flag in devel module to clear the cache table on every page load, a more draconian form of cacheexclude[1], shouldn't be too difficult to add. All kinds spiffiness. :-) [1] http://drupal.org/project/cacheexclude On Tuesday 23 January 2007 4:21 pm, Earl Miles wrote:
After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains). There are 2 parts to this.
Part 1: drupal_register_callback
function drupal_register_callback($callback, $function = NULL, $file = NULL)
This function is only called from a particular location; ideally in its own separate .register file that only needs to be loaded when the callback registry is being rebuilt. However, it could also be a function within the .module file. There are some questions here, primarily balancing ease of use for the developer versus performance for the system. When it is called, the system should already know which module context is being used.
This function would register a given callback (examples: nodeapi, form_alter, etc) with the Drupal system. The registry would replace the logic module_implements uses to determine which module implements which hook.
The $function would be what's called. In today's world the function would be called mymodule_nodeapi; and it would likely not change, but we should allow registration to specify the function used. If left blank, the default function name would be constructed. (i.e, modulename_hookname).
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present. When you can conditionally load, however, the required size of the .module becomes very small. (in fact you would only keep items in that file that are called on every or nearly every page load. Currently that's hook_init and hook_menu but with changes to hook_menu that one may well be eliminated from every-page-load calls, and the .module file itself may not even need to be loaded for every page).
Invoking a callback would still be the same: module_invoke('callback', arguments...);
Part 2: function drupal_register_theme($theme, $args, $function = NULL, $file = NULL)
This works very similarly to callbacks, but it registers a theming function that can be overridden by the theme. The only difference in is the $args argument, which is an array that tells the function how to translate function arguments into the $vars array, for use in .tpl.php files.
For example, let's say I have the following theme function:
function theme_foo($node, $bar, $baz);
The $args array would be: array('node', 'bar', 'baz').
Themes would also have a similar registry; we may need to make it slightly more complex so that it could work with the theme engine to automatically register a template. Which means we might want to find a way to make it simpler for themers to register the templates they're using (for example, a directory scan could find well-named template files and automatically register them based upon already known registered info).
This entire system (callbacks and functions) would be cached, so information would only need to be re-read when modules and theme settings are updated. A 'development' mode might be necessary that re-reads this information every page load. There would probably be some way for devel.module to expose this setting and automatically disable development mode when devel.module is enabled.
Overall benefits to this system:
1) Some execution time is saved looking up module hooks. I don't believe this is a great deal of execution time, but it is definitely some.
2) A great deal of code can be loaded conditionally, reducing Drupal's footprint. I believe this could be a *very* significant boost to Drupal and could vastly reduce the amount of code that is loaded, making Drupal far, far friendlier to hosts that cannot use an opcode cache.
3) Certain namespace collisions will be avoided. Developers will no longer be penalized for forgetting that mymodule_user() is the user hook.
4) Theme functions will no longer have to hunt around for their function; they will not have to test the file-system to see if templates exist. They will know exactly where to go when the time is right.
5) Theme template files will be easier to include in modules. In fact, if implemented properly, modules could actually include .tpl.php files, make the proper registration, and no theme code will be harmed by being in the .module file *at all*.
6) phptemplate.engine can have much of its code removed to core, making it easier to implement alternative template engines.
7) As an interesting side effect, themes could be made to implement hook_form_alter, if we coded it that way. There is a value to this, and also an argument why this might not be a good thing.
Overall cons:
1) Having to register a callback is less convenient than simply naming a function. It's even more inconvenient to have a .register file. However, using the existing naming convention, the module_builder.module *could* be made to scan a module and produce a .register file or a register function.
2) Having to register theme functions is similarly inconvenient but I think it's no worse than we have now.
3) Having to reregister upon making changes is very inconvenient; a development mode would be a must.
Thoughts?
-- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
Larry Garfield wrote:
I don't think themes themselves need to register what they override. That could be scanned and cached, too, on the theme admin page. A theme() call then just becomes a lookup to see which of the three locations, in order, a theme file exists. Just borrow some of chx's menu magic SQL to find the "right" template.
I'm not completely sure that scanning would be adequate here; but I'm also not sure it wouldn't be. This will need to be analyzed, I think. For example, code that changes templates on some condition needs to be retained, and scanning might break this. I'm not sure, scanning might be able to take it into account in some fashion.
On 23-Jan-07, at 5:21 PM, Earl Miles wrote:
Part 1: drupal_register_callback
My favourite part of this technique is that it should make Drupal much more "deterministic" (?)... You can check your registered callbacks and have a much clearer idea of what will happen on any given function call instead of having to search through every single module and inc file to find something that looks similar to an invoke_all hook call. And this maps well to the "event listener" type thing used on the client site. Careful with all the .something files or you'll turn Drupal into Helma ;) While Adrian works harder at turning it into Lisp or Smalltak . hehe. -Rowan
My favourite part of this technique is that it should make Drupal much more "deterministic" (?)... You can check your registered callbacks and have a much clearer idea of what will happen on any given function call instead of having to search through every single module and inc file to find something that looks similar to an invoke_all hook call.
-1 on a readability. Sure, it takes a couple of days to pick up the hook concept, but now, I can pop open any text file and know exactly what functions are going to be called when. With a registry, I need to go ask Drupal when they will be called, return to the file, look up the function call I want (which is going to be named mymodules_silly_excuse_for_not_writing_hook_foo), .... I remain skeptical of this plan. I welcome a patch and benchmarks, however. -M
Mark Fredrickson wrote:
My favourite part of this technique is that it should make Drupal much more "deterministic" (?)... You can check your registered callbacks and have a much clearer idea of what will happen on any given function call instead of having to search through every single module and inc file to find something that looks similar to an invoke_all hook call.
-1 on a readability. Sure, it takes a couple of days to pick up the hook concept, but now, I can pop open any text file and know exactly what functions are going to be called when. With a registry, I need to go ask Drupal when they will be called, return to the file, look up the function call I want (which is going to be named mymodules_silly_excuse_for_not_writing_hook_foo), ....
I remain skeptical of this plan. I welcome a patch and benchmarks, however.
I honestly had the same concern as you with regards to landing in a new module and getting your feet wet quickly. With my passing understanding of the proposal there doesnt actually seem to be a reason not to follow a standard naming convention anyway is there? This would seem to provide both the benefits of computational efficiency AND the lower barrier to entry and ease of use that we are used to. Worst case scenario, should we decide to ditch /any/ naming convention the "map" as it were would exist in the proposed .registry file/function for quick reference no? Much like we currently use and lookup present menu items as drupal callbacks. -- Michael Favia michael@favias.org tel. 512.585.5650 http://michael.favias.org
On 24 Jan 2007, at 7:49 PM, Michael Favia wrote:
I honestly had the same concern as you with regards to landing in a new module and getting your feet wet quickly. With my passing understanding of the proposal there doesnt actually seem to be a reason not to follow a standard naming convention anyway is there?
If you don't follow the standard naming scheme, you would have to add an entry for that hook in your _registry hook. Whereas it could be automatically detected otherwise. Meaning by not following the standard you are actually making more work for yourself.
adrian rossouw wrote:
On 24 Jan 2007, at 7:49 PM, Michael Favia wrote:
I honestly had the same concern as you with regards to landing in a new module and getting your feet wet quickly. With my passing understanding of the proposal there doesnt actually seem to be a reason not to follow a standard naming convention anyway is there?
If you don't follow the standard naming scheme, you would have to add an entry for that hook in your _registry hook. Whereas it could be automatically detected otherwise.
Meaning by not following the standard you are actually making more work for yourself.
Is there any reason to allow non-standard entries in the first place? Cheers, Gerhard
Earl Miles wrote:
Gerhard Killesreiter wrote:
Is there any reason to allow non-standard entries in the first place?
Yes. Because if we do, we can let non-module files utilize these callbacks. That could be a benefit.
furthermore, we will allow hook ordering as chx suggested and has shown in his sandbox. scanned hooks will just get the default weight, like today.
Hi Earl, On 23 Jan 2007, at 23:21, Earl Miles wrote:
After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains).
I think these ideas have merit. However, we'll want to benchmark them -- with and without APC. I think this is a case where we have to let the performance numbers speak. (When we started moving the theme_ functions into template files, we developed similar concepts.) -- Dries Buytaert :: http://www.buytaert.net/
Hi everyone, I'm all for the use of some hook caching. I totally agree with chx's proposal that all Drupal hooks get automatically scanned for when modules are installed. Similar to the menu system. It would be handy if a page was presented allowing an admin to fiddle with the ordering (i'm sure chx is working on this). In addition, it would be fantastic to allow modules to perform automatic hook registration on its sub-modules. For example, E-commerce would need this for its hooks. +1 for registration of hooks with an automatic registration for known hooks. -1 for getting rid of the hook concept (it's the same thing as callbacks!) -- Sammy Spets Synerger Pty Ltd http://synerger.com On 23-Jan-07 14:21, Earl Miles wrote:
After some discussion this morning, I am putting up a proposal to eliminate hooks entirely (though not the concept of hooks) and replace them with officially registered callbacks. I have identified several areas where this will improve Drupal (primarily in terms of probable performance gains). There are 2 parts to this.
Part 1: drupal_register_callback
function drupal_register_callback($callback, $function = NULL, $file = NULL)
This function is only called from a particular location; ideally in its own separate .register file that only needs to be loaded when the callback registry is being rebuilt. However, it could also be a function within the .module file. There are some questions here, primarily balancing ease of use for the developer versus performance for the system. When it is called, the system should already know which module context is being used.
This function would register a given callback (examples: nodeapi, form_alter, etc) with the Drupal system. The registry would replace the logic module_implements uses to determine which module implements which hook.
The $function would be what's called. In today's world the function would be called mymodule_nodeapi; and it would likely not change, but we should allow registration to specify the function used. If left blank, the default function name would be constructed. (i.e, modulename_hookname).
The $file would be a file that should live in the same directory as the .module file and tells the system which file to conditionally load when that callback is invoked. This is one of the things that creates a big advantage in this system: We could move callbacks into conditionally loaded files, allowing Drupal's minimum per-page codesize to be smaller. The 'nodeapi' hook is not run for every page load, but because hooks are anonymous right now, that code *must* be present. When you can conditionally load, however, the required size of the .module becomes very small. (in fact you would only keep items in that file that are called on every or nearly every page load. Currently that's hook_init and hook_menu but with changes to hook_menu that one may well be eliminated from every-page-load calls, and the .module file itself may not even need to be loaded for every page).
Invoking a callback would still be the same: module_invoke('callback', arguments...);
Part 2: function drupal_register_theme($theme, $args, $function = NULL, $file = NULL)
This works very similarly to callbacks, but it registers a theming function that can be overridden by the theme. The only difference in is the $args argument, which is an array that tells the function how to translate function arguments into the $vars array, for use in .tpl.php files.
For example, let's say I have the following theme function:
function theme_foo($node, $bar, $baz);
The $args array would be: array('node', 'bar', 'baz').
Themes would also have a similar registry; we may need to make it slightly more complex so that it could work with the theme engine to automatically register a template. Which means we might want to find a way to make it simpler for themers to register the templates they're using (for example, a directory scan could find well-named template files and automatically register them based upon already known registered info).
This entire system (callbacks and functions) would be cached, so information would only need to be re-read when modules and theme settings are updated. A 'development' mode might be necessary that re-reads this information every page load. There would probably be some way for devel.module to expose this setting and automatically disable development mode when devel.module is enabled.
Overall benefits to this system:
1) Some execution time is saved looking up module hooks. I don't believe this is a great deal of execution time, but it is definitely some.
2) A great deal of code can be loaded conditionally, reducing Drupal's footprint. I believe this could be a *very* significant boost to Drupal and could vastly reduce the amount of code that is loaded, making Drupal far, far friendlier to hosts that cannot use an opcode cache.
3) Certain namespace collisions will be avoided. Developers will no longer be penalized for forgetting that mymodule_user() is the user hook.
4) Theme functions will no longer have to hunt around for their function; they will not have to test the file-system to see if templates exist. They will know exactly where to go when the time is right.
5) Theme template files will be easier to include in modules. In fact, if implemented properly, modules could actually include .tpl.php files, make the proper registration, and no theme code will be harmed by being in the .module file *at all*.
6) phptemplate.engine can have much of its code removed to core, making it easier to implement alternative template engines.
7) As an interesting side effect, themes could be made to implement hook_form_alter, if we coded it that way. There is a value to this, and also an argument why this might not be a good thing.
Overall cons:
1) Having to register a callback is less convenient than simply naming a function. It's even more inconvenient to have a .register file. However, using the existing naming convention, the module_builder.module *could* be made to scan a module and produce a .register file or a register function.
2) Having to register theme functions is similarly inconvenient but I think it's no worse than we have now.
3) Having to reregister upon making changes is very inconvenient; a development mode would be a must.
Thoughts?
Sammy Spets wrote:
-1 for getting rid of the hook concept (it's the same thing as callbacks!)
There is a subtle difference between a hook and a callback, which has to do with registration. But auto-registration does sort of revert them back into being hooks, so I'm not sure where the terminology really stands now.
On 25 Jan 2007, at 01:31, Earl Miles wrote:
Sammy Spets wrote:
-1 for getting rid of the hook concept (it's the same thing as callbacks!)
There is a subtle difference between a hook and a callback, which has to do with registration.
Talking about terminology; this is also referred to as 'listeners'. The word 'callback' somehow /feels/ more generic than /listeners/. -- Dries Buytaert :: http://www.buytaert.net/
Dries Buytaert wrote:
Talking about terminology; this is also referred to as 'listeners'. The word 'callback' somehow /feels/ more generic than /listeners/.
When I think of listeners, I envision something that gets onto a well-named interface and then awaits instruction. Basic socket listeners, for example, which is where I know the word from, in fact. The main difference I can think of with a callback is a callback is generally specifically registered to one client or server, and a listener tends to be open to multiple clients. But I dunno what the official definitions might be, that's just a gut feeling I get from various things I've worked with over time.
On 1/25/07, Earl Miles <merlin@logrus.com> wrote:
The main difference I can think of with a callback is a callback is generally specifically registered to one client or server, and a listener tends to be open to multiple clients. But I dunno what the official definitions might be, that's just a gut feeling I get from various things I've worked with over time.
participants (17)
-
adrian rossouw -
Allie Micka -
Dries Buytaert -
Earl Miles -
Fernando Silva -
Gabor Hojtsy -
Gerhard Killesreiter -
Jeremy Epstein -
Karoly Negyesi -
Khalid B -
Larry Garfield -
Mark Fredrickson -
Michael Favia -
Moshe Weitzman -
Rob Barreca -
Rowan Kerr -
Sammy Spets