RFC: info hook standardization
Unlike the IETF, I am actually looking for comments and feedback at this point. :-) In recent versions, Drupal has developed an increasing number of registry-style hooks. By "registry style", I mean a hook that serves to build a ginormous (technical term) associative array and return it, and does nothing else. hook_menu(), hook_theme(), hook_node_info(), those are just a few of such hooks. CCK has one for formatters and another for widgets. Views has hook_views_default_views() and, in Views 2, a couple of others as well. There's an issue[1] to convert hook_blocks() over to that format, too, for a variety of reasons. In most cases, these hooks are called rarely and then cached, either in the cache table or in a dedicated table such as menu_router, or they aren't but easily could be and probably should be. However, it is not always readily apparent which is the case. For instance, hook_forms() is not a registry hook in that it is contextual, and cannot be cached that way in its current implementation. (Perhaps it should be, but that's not what I'm getting at.) Some such hooks have a corresponding _alter hook, others do not. I therefore am going to propose a conventional standard for such registry-style hooks. I will dub them "info hooks", because about half of such hooks currently end in _info() already. An info hook would be defined as follows: - It has a name of the form hook_$singularNoun_info(). - It takes no parameters. - It returns a nested associative array of arbitrary complexity (case-specific). - it has a corresponding _alter hook (vis, hook_$singularNoun_info_alter()). - It is structured in such a way that its results can and are cached (either in the cache table or a dedicated table, depending on the use case). - That cache is permanent, so under normal operation the hook never needs to be called again. - The new registry system knows to never pre-load the file in which an info hook resides, or rather to never flag a file included for an info hook for pre-caching. (We don't want to inadvertently flush the cache and then trigger a rebuild of those hooks on a node view page, because then we end up loading thousands of unnecessary lines.) Thus, any hook that ends in _info() you can reliably know exhibits those properties (and possibly others), and a hook that does not end in _info() does not exhibit all of those properties (although it may certainly exhibit some of them, c.f. hook_form_alter()). This has a number of advantages: - Clarity. Right now, when you see a hook you don't know inherently if it's cached or not, if it has an alter hook, etc. This way, you know that if it ends in _info(), it either does all of the above or it's a bug. - Consistency. Consistent APIs are good APIs. - Ease of development. When developing a new hook, a developer can say to himself "Ah, this is starting to look like an info hook, which means I don't need to guess what features I should add; I should just go ahead and make a full info hook out of it." You never know when that alter hook will be useful to someone. :-) - Encourages declarative programming. Eh? Declarative programming in PHP? Just so! Declarative programming is harder to make obscure, stupid bugs in (assuming you know the syntax), because most bugs are either syntax errors or blatantly obvious. Yes there are exceptions (some of the more complex SQL queries I've written come to mind), but in general you're less likely to get subtle bugs with an associative array. - Encourages generic engines. A generic engine that relies on declarative input is flexible, because you can then program it via a declarative syntax. Solving the generic problem may be more difficult in the short term, but in the long-term it buys you a great deal of flexibility. - Easier to document. Drupal's APIs are admittedly sometimes wacky and obscure. By separating the engine from the declarative interface to it, we can put a lot of power into the hands of hook authors with a very simple, easy to document syntax. We also have a clearly documented behavior for how an info hook is supposed to behave. Parameter orders also then are not a problem, since array key orders don't matter. The array keys also then serve as a form of self-documentation. - Backward compatibility. Yes, the unholy words! Just like one of the advantages of XML is that you can arbitrarily add new tags and existing well-written code won't automatically choke, you can add new keys to an associative array and as long as there is a sane default, existing code doesn't have to change unless it needs to. - Less executable code. The key goal of the Drupal 7 code registry is to load less code. Well, if we move more power into tightly written engines that are controlled by run-once hooks that we no longer need to load, that's less code, and more less code the more modules you have. Take every module's hook_menu, hook_theme, hook_node_info, hook_views_default_views, hook_block, etc. and simply remove them from the common case. Good bye several thousand lines of code. Naturally not everything can or should be a registry hook, and we shouldn't try to shoe-horn Drupal into being just an array-processor. However, this is a trend I have observed in Drupal development that it would behoove us to embrace and standardize and (dare I say!) extend the use of. Implementation: Implementing this plan would involve documenting it, of course. We would then rename hook_theme() to hook_theme_info(), hook_menu() to hook_menu_info(), etc. At that point we could investigate other subsystems that it would make sense to turn into info hooks. I suspect we will find a couple besides the block system(hook_block_info() anyone)? In the process, we may discover places where "obvious" flexibility is missing. Excuse me while I go get my flame-retardant suit. [1] http://drupal.org/node/257032 -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
Great write-up, Larry, I wholeheartedly support your proposal. On May 18, 2008, at 10:15 PM, Larry Garfield wrote:
At that point we could investigate other subsystems that it would make sense to turn into info hooks.
For example, hook_variable_info() (or whatever we'd end up calling it) to provide the defaults for every variable/setting exposed by a given module. This way, all the call sites for variable_get() don't have to take a 2nd parameter for the default value to use if there's nothing defined. http://drupal.org/node/145164 Cheers, -Derek (dww)
On May 18, 2008, at 10:15 PM, Larry Garfield wrote:
At that point we could investigate other subsystems that it would make
sense to turn into info hooks.
I opened a feature request [1] against hook_link() for grouping. This is currently available in taxonomy, but as David Rothstein pointed out in the issue, not in a nice way (calls to taxonomy_link are hardcoded in the theme system). For that and other reasons it seems like a candidate. 1. http://drupal.org/node/255686
proposal looks good to me. i only take issue with its verbosity :) hook_link() returns different info for each node and each person. it is not a good candidate IMO. the hook_schema() looks like good candidate. On Mon, May 19, 2008 at 5:04 AM, catch <catch56@googlemail.com> wrote:
On May 18, 2008, at 10:15 PM, Larry Garfield wrote:
At that point we could investigate other subsystems that it would make sense to turn into info hooks.
I opened a feature request [1] against hook_link() for grouping. This is currently available in taxonomy, but as David Rothstein pointed out in the issue, not in a nice way (calls to taxonomy_link are hardcoded in the theme system). For that and other reasons it seems like a candidate.
Great proposal. One addition: a way to invalidate the cache of an info hook if your info hook does something like reading from a database table. This would be the case if, say, your module generates a configurable set of ad blocks. -----Original Message----- From: Larry Garfield <larry@garfieldtech.com> Date: Mon, 19 May 2008 00:15:11 To:development@drupal.org Subject: [development] RFC: info hook standardization Unlike the IETF, I am actually looking for comments and feedback at this point. :-) In recent versions, Drupal has developed an increasing number of registry-style hooks. By "registry style", I mean a hook that serves to build a ginormous (technical term) associative array and return it, and does nothing else. hook_menu(), hook_theme(), hook_node_info(), those are just a few of such hooks. CCK has one for formatters and another for widgets. Views has hook_views_default_views() and, in Views 2, a couple of others as well. There's an issue[1] to convert hook_blocks() over to that format, too, for a variety of reasons. In most cases, these hooks are called rarely and then cached, either in the cache table or in a dedicated table such as menu_router, or they aren't but easily could be and probably should be. However, it is not always readily apparent which is the case. For instance, hook_forms() is not a registry hook in that it is contextual, and cannot be cached that way in its current implementation. (Perhaps it should be, but that's not what I'm getting at.) Some such hooks have a corresponding _alter hook, others do not. I therefore am going to propose a conventional standard for such registry-style hooks. I will dub them "info hooks", because about half of such hooks currently end in _info() already. An info hook would be defined as follows: - It has a name of the form hook_$singularNoun_info(). - It takes no parameters. - It returns a nested associative array of arbitrary complexity (case-specific). - it has a corresponding _alter hook (vis, hook_$singularNoun_info_alter()). - It is structured in such a way that its results can and are cached (either in the cache table or a dedicated table, depending on the use case). - That cache is permanent, so under normal operation the hook never needs to be called again. - The new registry system knows to never pre-load the file in which an info hook resides, or rather to never flag a file included for an info hook for pre-caching. (We don't want to inadvertently flush the cache and then trigger a rebuild of those hooks on a node view page, because then we end up loading thousands of unnecessary lines.) Thus, any hook that ends in _info() you can reliably know exhibits those properties (and possibly others), and a hook that does not end in _info() does not exhibit all of those properties (although it may certainly exhibit some of them, c.f. hook_form_alter()). This has a number of advantages: - Clarity. Right now, when you see a hook you don't know inherently if it's cached or not, if it has an alter hook, etc. This way, you know that if it ends in _info(), it either does all of the above or it's a bug. - Consistency. Consistent APIs are good APIs. - Ease of development. When developing a new hook, a developer can say to himself "Ah, this is starting to look like an info hook, which means I don't need to guess what features I should add; I should just go ahead and make a full info hook out of it." You never know when that alter hook will be useful to someone. :-) - Encourages declarative programming. Eh? Declarative programming in PHP? Just so! Declarative programming is harder to make obscure, stupid bugs in (assuming you know the syntax), because most bugs are either syntax errors or blatantly obvious. Yes there are exceptions (some of the more complex SQL queries I've written come to mind), but in general you're less likely to get subtle bugs with an associative array. - Encourages generic engines. A generic engine that relies on declarative input is flexible, because you can then program it via a declarative syntax. Solving the generic problem may be more difficult in the short term, but in the long-term it buys you a great deal of flexibility. - Easier to document. Drupal's APIs are admittedly sometimes wacky and obscure. By separating the engine from the declarative interface to it, we can put a lot of power into the hands of hook authors with a very simple, easy to document syntax. We also have a clearly documented behavior for how an info hook is supposed to behave. Parameter orders also then are not a problem, since array key orders don't matter. The array keys also then serve as a form of self-documentation. - Backward compatibility. Yes, the unholy words! Just like one of the advantages of XML is that you can arbitrarily add new tags and existing well-written code won't automatically choke, you can add new keys to an associative array and as long as there is a sane default, existing code doesn't have to change unless it needs to. - Less executable code. The key goal of the Drupal 7 code registry is to load less code. Well, if we move more power into tightly written engines that are controlled by run-once hooks that we no longer need to load, that's less code, and more less code the more modules you have. Take every module's hook_menu, hook_theme, hook_node_info, hook_views_default_views, hook_block, etc. and simply remove them from the common case. Good bye several thousand lines of code. Naturally not everything can or should be a registry hook, and we shouldn't try to shoe-horn Drupal into being just an array-processor. However, this is a trend I have observed in Drupal development that it would behoove us to embrace and standardize and (dare I say!) extend the use of. Implementation: Implementing this plan would involve documenting it, of course. We would then rename hook_theme() to hook_theme_info(), hook_menu() to hook_menu_info(), etc. At that point we could investigate other subsystems that it would make sense to turn into info hooks. I suspect we will find a couple besides the block system(hook_block_info() anyone)? In the process, we may discover places where "obvious" flexibility is missing. Excuse me while I go get my flame-retardant suit. [1] http://drupal.org/node/257032 -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
Larry nice proposal. but doesn't module_invoke_all already do basically what you're describing without the caching and naming enforcement? I agree with the naming convention, but not the enforcement. Maybe we just need a module_invoke_all_cacheable($expire, $hook, ....) { // expire is same as cache_set, only CACHE_STATIC indicates use a static var, no db. static $cache; $cid = $hook .':'. md5(serialize(args)); // test static cache if (!isset($cache[$cid])) { // test db cache if necessary, CACHE_STATIC indicates no DB cache if ($expires == CACHE_STATIC || !$cache[$cid] = cache_get('cache_hook_all', $cid)) $cache[$cid] = module_invoke_all(); } // store to db cache if necessary if ($expires != CACHE_STATIC) { cache_set('cache_hook_all',....); } } return $cache[$cid]) }
David Strauss wrote:
Great proposal.
One addition: a way to invalidate the cache of an info hook if your info hook does something like reading from a database table. This would be the case if, say, your module generates a configurable set of ad blocks.
You echoed my issue comment as well. It's very very important to be able to do this easily, IMO.
On Mon, 19 May 2008 11:23:02 -0700, Earl Miles <merlin@logrus.com> wrote:
David Strauss wrote:
Great proposal.
One addition: a way to invalidate the cache of an info hook if your info hook does something like reading from a database table. This would be the case if, say, your module generates a configurable set of ad blocks.
You echoed my issue comment as well. It's very very important to be able to do this easily, IMO.
Ah, indeed! Make that 2 more attributes of an info hook: - It has a commonly-named rebuild function: drupal_rebuild_$singularNoun() (or something). - drupal_rebuild_$singularNoun() is called as part of the master cache reset function. --Larry Garfield
On Mon, May 19, 2008 at 3:06 PM, Larry Garfield <larry@garfieldtech.com> wrote:
On Mon, 19 May 2008 11:23:02 -0700, Earl Miles <merlin@logrus.com> wrote:
David Strauss wrote:
Great proposal.
One addition: a way to invalidate the cache of an info hook if your info hook does something like reading from a database table. This would be the case if, say, your module generates a configurable set of ad blocks.
You echoed my issue comment as well. It's very very important to be able to do this easily, IMO.
Ah, indeed! Make that 2 more attributes of an info hook:
- It has a commonly-named rebuild function: drupal_rebuild_$singularNoun() (or something). - drupal_rebuild_$singularNoun() is called as part of the master cache reset function.
--Larry Garfield
but this wouldn't be able to reset a static cache variable? would it? unless we had something like. I really like to two stage cache for things that might possibly only want to be cached per request... function module_invoke_all_cached() { static $cache = array(); $args = func_get_args(); $hook = $args[0]; unset($args[0]); if (!is_string($hook)) { $cache = array(); } } OT: I micro benchmarked { $var = arr[0]; unset($arr[0]); } vs { $var = array_shift($arr); } and the assign and unset is consistently about 10%-30% faster.
That's not really what we're looking for. We're looking for a sort of drupal_reset_info($key) function that forces a reload of the data. -----Original Message----- From: "Darrel O'Pry" <darrel.opry@gmail.com> Date: Mon, 19 May 2008 18:26:47 To:development@drupal.org Subject: Re: [development] RFC: info hook standardization On Mon, May 19, 2008 at 3:06 PM, Larry Garfield <larry@garfieldtech.com <mailto:larry@garfieldtech.com> > wrote: On Mon, 19 May 2008 11:23:02 -0700, Earl Miles <merlin@logrus.com <mailto:merlin@logrus.com> > wrote:
David Strauss wrote:
Great proposal.
One addition: a way to invalidate the cache of an info hook if your info hook does something like reading from a database table. This would be the case if, say, your module generates a configurable set of ad blocks.
You echoed my issue comment as well. It's very very important to be able to do this easily, IMO.
Ah, indeed! Make that 2 more attributes of an info hook: - It has a commonly-named rebuild function: drupal_rebuild_$singularNoun() (or something). - drupal_rebuild_$singularNoun() is called as part of the master cache reset function. --Larry Garfield but this wouldn't be able to reset a static cache variable? would it? unless we had something like. I really like to two stage cache for things that might possibly only want to be cached per request... function module_invoke_all_cached() { static $cache = array(); $args = func_get_args(); $hook = $args[0]; unset($args[0]); if (!is_string($hook)) { $cache = array(); } } OT: I micro benchmarked { $var = arr[0]; unset($arr[0]); } vs { $var = array_shift($arr); } and the assign and unset is consistently about 10%-30% faster.
On Monday 19 May 2008, Darrel O'Pry wrote:
Ah, indeed! Make that 2 more attributes of an info hook:
- It has a commonly-named rebuild function: drupal_rebuild_$singularNoun() (or something). - drupal_rebuild_$singularNoun() is called as part of the master cache reset function.
--Larry Garfield
but this wouldn't be able to reset a static cache variable? would it? unless we had something like. I really like to two stage cache for things that might possibly only want to be cached per request...
function module_invoke_all_cached() { static $cache = array();
$args = func_get_args(); $hook = $args[0]; unset($args[0]); if (!is_string($hook)) { $cache = array(); } }
I think you're misunderstanding what I mean by cached. For an info hook, I mean cached to the database (either cache_set() or something like {menu_router}) so that it never gets called again except in unusual circumstances. A static cache only lasts for the duration of the request, and is not what I mean by caching here. Think hook_menu(), not hook_forms(). -- Larry Garfield AIM: LOLG42 larry@garfieldtech.com ICQ: 6817012 "If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea, which an individual may exclusively possess as long as he keeps it to himself; but the moment it is divulged, it forces itself into the possession of every one, and the receiver cannot dispossess himself of it." -- Thomas Jefferson
HOOK_theme() fits the criteria you outlined, except for one item: "It takes no parameters." HOOK_theme() actually has 4 parameters that are essential for theme developers. Of course, HOOK_theme() is a special case since themes (and not just modules) can call that hook. (And, btw, themes can't call the corresponding HOOK_theme_registry_alter().) There might be other info hooks that could make use of parameters. Anybody know? - John John Wilkins | Albin.Net | nickname: JohnAlbin On May 18, 2008, at 10:15 PM, Larry Garfield wrote:
An info hook would be defined as follows:
- It has a name of the form hook_$singularNoun_info(). - It takes no parameters. - It returns a nested associative array of arbitrary complexity (case-specific). - it has a corresponding _alter hook (vis, hook_ $singularNoun_info_alter()). - It is structured in such a way that its results can and are cached (either in the cache table or a dedicated table, depending on the use case). - That cache is permanent, so under normal operation the hook never needs to be called again. - The new registry system knows to never pre-load the file in which an info hook resides, or rather to never flag a file included for an info hook for pre-caching. (We don't want to inadvertently flush the cache and then trigger a rebuild of those hooks on a node view page, because then we end up loading thousands of unnecessary lines.)
On Mon, May 19, 2008 at 12:46 PM, John Wilkins <drupal.user@albin.net> wrote:
HOOK_theme() fits the criteria you outlined, except for one item: "It takes no parameters."
Many other hooks also have no args, module_invoke_all has been able to support 0-n args for a long time. This is a problem long since solve in php and in various parts of drupal already. func_get_args, call_user_func_array(), etc...
HOOK_theme() actually has 4 parameters that are essential for theme developers. Of course, HOOK_theme() is a special case since themes (and not just modules) can call that hook. (And, btw, themes can't call the corresponding HOOK_theme_registry_alter().)
Themes can all call any hook. This is not a special case. Really you're going to have to explain, "themes can't call the corresponding HOOK_theme_registry_alter()". Why not?
Darrel O'Pry wrote:
On Mon, May 19, 2008 at 12:46 PM, John Wilkins <drupal.user@albin.net <mailto:drupal.user@albin.net>> wrote:
HOOK_theme() fits the criteria you outlined, except for one item: "It takes no parameters."
Many other hooks also have no args, module_invoke_all has been able to support 0-n args for a long time. This is a problem long since solve in php and in various parts of drupal already. func_get_args, call_user_func_array(), etc...
HOOK_theme() actually has 4 parameters that are essential for theme developers. Of course, HOOK_theme() is a special case since themes (and not just modules) can call that hook. (And, btw, themes can't call the corresponding HOOK_theme_registry_alter().)
Themes can all call any hook. This is not a special case. Really you're going to have to explain, "themes can't call the corresponding HOOK_theme_registry_alter()". Why not?
By 'call' he meant 'implement'. Themes get a special HOOK_theme() that is exactly like the module version, but they do not get a corresponding alter.
On Mon, May 19, 2008 at 2:24 PM, Earl Miles <merlin@logrus.com> wrote:
Darrel O'Pry wrote:
On Mon, May 19, 2008 at 12:46 PM, John Wilkins <drupal.user@albin.net<mailto:
drupal.user@albin.net>> wrote:
HOOK_theme() actually has 4 parameters that are essential for theme developers. Of course, HOOK_theme() is a special case since themes (and not just modules) can call that hook. (And, btw, themes can't call the corresponding HOOK_theme_registry_alter().)
Themes can all call any hook. This is not a special case. Really you're going to have to explain, "themes can't call the corresponding HOOK_theme_registry_alter()". Why not?
By 'call' he meant 'implement'.
Themes get a special HOOK_theme() that is exactly like the module version, but they do not get a corresponding alter.
That makes more sense. I'm a proponent of ignoring that special case for now since it's doing it's own caching and just implementing a caching version of module_invoke_all.... although I don't think we need to add the logic for caching in module_invoke_all itself since it is a called my many other modules, just extend what we have with a wrapper that implements the caching.
participants (8)
-
catch -
Darrel O'Pry -
David Strauss -
Derek Wright -
Earl Miles -
John Wilkins -
Larry Garfield -
Moshe Weitzman