[drupal-devel] [bug] Make node_save() recursion-aware
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: jvandyk Status: patch Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 jvandyk
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: Dries Status: patch I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. Dries Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326
Dries wrote:
I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place.
I hope we can put on our thinking caps to achieve jvandyk's intent as stated:
Previous comments: ------------------------------------------------------------------------
Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk
Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes)
[snip] I'd like to see cvs and 4.6 patched.
This patch is necessary so I can release actions and workflow for 4.6.
I believe Drupal at the 4.6+ level with Matt Westgate's eCommerce (mega)module/framework plus jvandyk's actions, and workflow modules has incredible potential as an innovative microenterprise and small business network/community platform. (Not to suggest that this incredible potential is not there for conventional ecommerce and business users, just that we're not too focused on those already served constituencies.) --Sohodojo Jim-- -- Jim Salmons and Timlynn Babitsky Founders and Research Directors Sohodojo - http://sohodojo.com
On 19 Jul 2005, at 21:04, Sohodojo Jim wrote:
I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place.
I hope we can put on our thinking caps to achieve jvandyk's intent as stated:
We discussed and brainstormed about this some more on IRC. :) -- Dries Buytaert :: http://www.buytaert.net/
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: moshe weitzman Status: patch (code needs work) gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ moshe weitzman Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: chx Status: patch (code needs work) If there would be a little comment why this is needed, then I'd +1 it heartily. chx Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: killes@www.drop.org -Status: patch (code needs work) +Status: patch (ready to be committed) Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. killes@www.drop.org Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: moshe weitzman Status: patch (ready to be committed) The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. moshe weitzman Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: moshe weitzman Status: patch (ready to be committed) I meant to say - "there may be more *modules* which benefit from this." moshe weitzman Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: Dries Status: patch (ready to be committed) Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. Dries Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this."
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: chx Status: patch (ready to be committed) I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. chx Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution.
Sorry if this isn't the right etiquette - I have a question(s) about this issue which didn't seem right to clutter up the "official" issue comments with: Why is node_save() called before the insert/update event fires (called via nodeapi) and not after? It seems a bit arbitrary. I guess there are situations where you don't want to fire that until the node is saved, but that would seem to preclude anything else from modifying that node without having to save it again. Was this a decision to be a bit more simple and accept the requirement of other modules having to re-save if they want to modify the node itself, and not just affect other things? Maybe what's needed is this: http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html Well I'm sorta joking, but the key thing here is as far as I can see (not very, I'm fairly new to the deep internals of drupal) is having a distinction between pre- and post-save [aka insert/update], since "on save" is ambigious isn't it? (well in name at least, the actual behavior in this case it appears to actually be a "post save" event, right?). Until I followed all the calls around, I assumed the callbacks had a chance to run before node->save was actually called -- like everyone gets to do their thing before the node is stored away in the DB -- but now I see you actually gotta use your copy and stick it in the database yourself, basically overwriting the core functionality -- so potentially you could have 0..* saves in response to the insert/update callback, yes? Am I anywhere near understanding what's going on here? -- ________________________________ toddgrimason*todd[ at ]slack.net
On 18 Aug 2005, at 4:48 PM, Todd Grimason wrote:
Sorry if this isn't the right etiquette - I have a question(s) about this issue which didn't seem right to clutter up the "official" issue comments with:
Why is node_save() called before the insert/update event fires (called via nodeapi) and not after? It seems a bit arbitrary.
I guess there are situations where you don't want to fire that until the node is saved, but that would seem to preclude anything else from modifying that node without having to save it again. Was this a decision to be a bit more simple and accept the requirement of other modules having to re-save if they want to modify the node itself, and not just affect other things?
Maybe what's needed is this: http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html
Well I'm sorta joking, but the key thing here is as far as I can see (not very, I'm fairly new to the deep internals of drupal) is having a distinction between pre- and post-save [aka insert/update], since "on save" is ambigious isn't it? (well in name at least, the actual behavior in this case it appears to actually be a "post save" event, right?).
I think that this is a very good point. I ran into something similar to this while working on HTML export for books - here the nodes were part of a tree, and children needed to be processed in the course of processing a node. In my situation, I implemented a function to traverse the structure - it accepts callbacks called at the start of a node visit and and the end. I think a similar approach would work for other structures. The problem of having nodes tracking how often they are visited is that the node can't (shouldn't) know about any context or global structure it appears in. So, I agree with Dries - having to save this kind of state in a node is an indication that something is wrong with the design.
Until I followed all the calls around, I assumed the callbacks had a chance to run before node->save was actually called -- like everyone gets to do their thing before the node is stored away in the DB -- but now I see you actually gotta use your copy and stick it in the database yourself, basically overwriting the core functionality -- so potentially you could have 0..* saves in response to the insert/update callback, yes?
Am I anywhere near understanding what's going on here?
--
________________________________ toddgrimason*todd[ at ]slack.net
puregin http://www.puregin.org
On Thu, 18 Aug 2005, Todd Grimason wrote:
Well I'm sorta joking, but the key thing here is as far as I can see (not very, I'm fairly new to the deep internals of drupal) is having a distinction between pre- and post-save [aka insert/update], since "on save" is ambigious isn't it? (well in name at least, the actual behavior in this case it appears to actually be a "post save" event, right?).
Right.
Until I followed all the calls around, I assumed the callbacks had a chance to run before node->save was actually called -- like everyone gets to do their thing before the node is stored away in the DB -- but now I see you actually gotta use your copy and stick it in the database yourself, basically overwriting the core functionality -- so potentially you could have 0..* saves in response to the insert/update callback, yes?
Am I anywhere near understanding what's going on here?
Yes. If you want to change "insert" to "post insert" and introduce "pre insert" and do the same to "update" I'd welcome the change. Not sure it would find general approval, though. Cheers, Gerhard
Friday, August 19, 2005, 1:48:35 AM, Todd Grimason wrote:
Why is node_save() called before the insert/update event fires (called via nodeapi) and not after? It seems a bit arbitrary.
Basically when you get to the _save part the node should be finalised, and all that remains is saving it. All modifications should be done during _validate. The reason node_save is called before results in a clean $node object, as it is re-fetched after saving it. Then the module that defines the node can save it's stuff, just to prevent other modules from screwing with its data, then the nodeapi is triggered to save all the other stuff. -- Kjartan <kjartan@zind.net> :: "Programming is like sex: one mistake and you have to support it for the rest of your life." - Michael Sinz
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: jvandyk -Status: patch (ready to be committed) +Status: patch (code needs review) Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) jvandyk Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: jvandyk Status: patch (code needs review) Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html jvandyk Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification)
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: chx Status: patch (code needs review) I like this, but if this gets in, next issue will be to clean up 'validate'... chx Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: Kjartan Status: patch (code needs review)
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. Kjartan Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'...
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: normal Assigned to: jvandyk Reported by: jvandyk Updated by: jvandyk Status: patch (code needs review) Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. jvandyk Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports -Priority: normal +Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: I1uv4t4r Status: patch (code needs review) I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. I1uv4t4r Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: jvandyk Status: patch (code needs review) I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change. jvandyk Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : I1uv4t4r I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: chx Status: patch (code needs review) Recently, I was given tasks that were solved easiest by setting up various node relations. When you do that, this patch becomes a godsend. chx Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : I1uv4t4r I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:43:34 +0000 : jvandyk I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: moshe weitzman -Status: patch (code needs review) +Status: patch (ready to be committed) The approach of this most recent patch has seen a lot of review and comment here. I don't see any outstanding objections, so I set to 'ready for commit' moshe weitzman Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : I1uv4t4r I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:43:34 +0000 : jvandyk I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:52:08 +0000 : chx Recently, I was given tasks that were solved easiest by setting up various node relations. When you do that, this patch becomes a godsend.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: mathias Status: patch (ready to be committed) +1 for this 4-line patch. Currently, bizarre module names (so they are the last to execute) and extraneous node_save calls are the only way around this problem. Since Drupal is built around the concept of a node it makes sense that core provides a sophisticated node workflow as offerred here. mathias Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : Iluvatar I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:43:34 +0000 : jvandyk I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:52:08 +0000 : chx Recently, I was given tasks that were solved easiest by setting up various node relations. When you do that, this patch becomes a godsend. ------------------------------------------------------------------------ Mon, 12 Sep 2005 16:43:22 +0000 : moshe weitzman The approach of this most recent patch has seen a lot of review and comment here. I don't see any outstanding objections, so I set to 'ready for commit'
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: killes@www.drop.org Status: patch (ready to be committed) I'd also like this patch to be committed. It will enable us to do cool and exciting new stuff with Drupal. killes@www.drop.org Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : Iluvatar I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:43:34 +0000 : jvandyk I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:52:08 +0000 : chx Recently, I was given tasks that were solved easiest by setting up various node relations. When you do that, this patch becomes a godsend. ------------------------------------------------------------------------ Mon, 12 Sep 2005 16:43:22 +0000 : moshe weitzman The approach of this most recent patch has seen a lot of review and comment here. I don't see any outstanding objections, so I set to 'ready for commit' ------------------------------------------------------------------------ Thu, 15 Sep 2005 13:57:16 +0000 : mathias +1 for this 4-line patch. Currently, bizarre module names (so they are the last to execute) and extraneous node_save calls are the only way around this problem. Since Drupal is built around the concept of a node it makes sense that core provides a sophisticated node workflow as offerred here.
Issue status update for http://drupal.org/node/27007 Post a follow up: http://drupal.org/project/comments/add/27007 Project: Drupal Version: cvs Component: node system Category: bug reports Priority: critical Assigned to: jvandyk Reported by: jvandyk Updated by: Iluvatar Status: patch (ready to be committed) This is a very nice addition to the nodeapi hook, and I'm using it already for one of my modules. +1 from me. Iluvatar Previous comments: ------------------------------------------------------------------------ Tue, 19 Jul 2005 02:14:16 +0000 : jvandyk Attachment: http://drupal.org/files/issues/node_serialized2x.patch (664 bytes) node_save() saves a node, then calls insert or update hooks. These hooks may run functions that, in turn, call node_save(). This all works fine except that if a revision is being created, the revision gets serialized twice -- once on the original node_save() call and once on the callback's node_save() call. The attached two-line patch simply inserts a boolean variable called rev_serialized to keep track of whether the serialization has already been done. I'd like to see cvs and 4.6 patched. This patch is necessary so I can release actions and workflow for 4.6. Background: http://drupal.org/node/24326 ------------------------------------------------------------------------ Tue, 19 Jul 2005 18:36:23 +0000 : Dries I don't like this patch. It's a hairy fix. There must be a better way to avoid recursion in first place. ------------------------------------------------------------------------ Thu, 18 Aug 2005 01:26:21 +0000 : moshe weitzman gosh, this a two line change with zero performanhce impact. seems like a useful and minimally invasive fix for our current dilemma. we'll remove it once a better fix comes around. is a shame to hold back workflow/actions for such a minor issue. any others want to comment or suggest a different solution?/ ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:57:11 +0000 : chx If there would be a little comment why this is needed, then I'd +1 it heartily. ------------------------------------------------------------------------ Thu, 18 Aug 2005 11:59:14 +0000 : killes@www.drop.org Please commit. In case revisions shouldn't make it (for performance reasons ;) we need it fixed for 4.7. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:22:15 +0000 : moshe weitzman The first user of this will be workflow+actions modules. They listen on nodeapi's insert/update operations. When a node is seen there, arbitrary actions may fire according to the admin's wishes. So for example, a node may be promoted or may become sticky. To do that properly, the actions should call node_save() again with $node->promote=1. Thus, we are recursively calling node_save() which leads to a double serialize problem. This 2 line fixes it. There may be more nodes which benefit from this. ------------------------------------------------------------------------ Thu, 18 Aug 2005 14:23:24 +0000 : moshe weitzman I meant to say - "there may be more *modules* which benefit from this." ------------------------------------------------------------------------ Thu, 18 Aug 2005 21:50:36 +0000 : Dries Won't commit. If you are calling node_save() recursively, something is really wrong. For one, you can no longer guarantee that the node is in a 'safe state'. Like, what happens if you saved half a node and you recursively try saving that node again? This needs a better solution. ------------------------------------------------------------------------ Thu, 18 Aug 2005 23:21:41 +0000 : chx I have a possible solution, which does not need core change. Keep a stack of necessary actions in SESSION and pop the stack first in _exit then subsequent _init calls 'till the stack is empty. ------------------------------------------------------------------------ Fri, 19 Aug 2005 00:58:11 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks.diff (1.83 KB) OK, here is an alternate approach. We leave node_save() as it is, but add two new nodeapi hooks to node_submit(). For reference, here's the meat from my message to the devel list, as chx wants background: The nodeapi calls for nodes go like this on node submission: 1. nodeapi validate hook 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... workflow modules responds, fires actions, the last of which saves the node (go to 2) x... modules respond y... modules respond z... modules respond As you can see, we have some problems. Because module_invoke() works alphabetically, any modules with names alphabetically after "workflow" have not yet responded when the workflow hook is invoked. Remember, the workflow hook activity looks like this: 1. nodeapi insert/update is caught by workflow module 2. workflow module fires the 'transition pre' workflow hook to let modules know a transition is about to occur 3. modules have a chance to veto the workflow transition 4. if no module vetoes the transition, the 'transition post' workflow hook is fired so modules can respond to the workflow state change. So to move ahead, we need a 1a nodeapi hook that says, "validation passed and this node will now be saved" so workflow changes can happen there, and/or a 4 nodeapi hook that says, "the node has been saved, all modules have done their thing, it's safe to do a node_load and get a node that is in a consistent state". I'd like to see both because it gives maximum flexibility for workflow. For example, it's much more efficient to change the status of the node to published after validation but before saving than it is to use the current method of saving the node again. So the revised flow looks like this: 1. nodeapi validate hook 1a. node is guaranteed validated; nodeapi save pre hook (workflow fires efficient actions like changing status) 2. (node is saved) 3. nodeapi insert/update hook a... modules respond b... modules respond c... modules respond ... x... modules respond y... modules respond z... modules respond 4. node is guaranteed saved; nodeapi save post hook (workflow fires actions like email notification) ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:00:53 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks2.diff (1.83 KB) After discussion with chx and moshe, changed names of hooks to be update pre, update post, insert pre, insert post so that listeners have a better context in which to make decisions. I don't like the _exit hook solution because that will slow down every page. Todd Grimason linked to this page [1] of Ruby on Rails hooks on drupal-dev. Although these are 4 hooks, note that only 2 will ever execute at a given time, because we are either doing an insert or an update. [1] http://api.rubyonrails.com/classes/ActiveRecord/Callbacks.html ------------------------------------------------------------------------ Fri, 19 Aug 2005 19:10:46 +0000 : chx I like this, but if this gets in, next issue will be to clean up 'validate'... ------------------------------------------------------------------------ Fri, 26 Aug 2005 16:18:46 +0000 : Kjartan
From the conversations on IRC here is where I ended up on nodeapi calls.
Nodeapi calls when creating a new node: default values -> form pre -> form post -> validate -> finalize insert -> insert Nodeapi calls when updating an existing node: load -> form pre -> form post -> validate -> finalize update -> update Not sure I completely agree with splitting finalize, but I am sure those who support it will post nice examples to make it a clear choice :) This would remove the hack in node_form with validating an empty node, and let modules perform actions just before its saved. ------------------------------------------------------------------------ Tue, 30 Aug 2005 16:17:12 +0000 : jvandyk Attachment: http://drupal.org/files/issues/save_hooks3.diff (1.78 KB) I have updated this patch to work with HEAD after the recent node changes. Here's a review of this patch: Why do we need the 'update pre' hook? Because we need a nodeapi call where modules can assume that they are working with a node that has passed validation. For example, if actions such as e-mail sending are going to fire, you don't want them to fire and then a module later in the node_invoke_nodeapi() sequence to throw a validation error. You want validation to pass as a separate step, and then be able to fire actions without worrying that a node_save() might not happen. Why do we need the 'update post' hook? Because we need to guarantee that we are working with a "clean" node. If you use the 'insert' or 'update' hooks, a module later in the node_invoke_nodeapi() sequence may make changes to database tables that will affect the node. Modules typically use the 'insert' and 'update' hooks to modify their own private tables. To ensure consistency and avoid corruption, you want to make certain that all modules have completed their own private database updates before moving on. Why do you want to make certain of this? Because you do not know what actions are going to do. Actions (as in actions.module) are by definition arbitrary. It is much better to have all tables in a consistent state before firing actions. You want to make sure that if an action does a node_load it will get a clean node, including all the updates that listening modules do. Steven, please chime in on this as this was your concern. One other point on the 'update post' hook. It is conceivable that actions fired may run a long time or run into problems. Of course actions need to be written as cleanly as possible. 'update post' gives actions a chance to decide that they will do things that may have problems (xmlrpc posts that may timeout, for example) and breathe easy knowing that even if there is a timeout or other error, the node has already been saved. Do we need 'insert pre' and 'update pre'? Why not just use 'save pre'? I think that since we already know the context, it is nice to pass that along to modules so they do not have to test for the existence of $node->nid to waste a cycle figuring out something we already know. What about naming? Kjartan has suggested changing the name 'update pre' to 'finalize update' and 'update insert' to 'finalize insert'. I have left the names as is because I think both hooks are important and I think the current names are easiest to understand. 'finalize' does not give a clue as to whether it is before or after the actual node_save(). But I am open to better naming if it is easy to understand for the developer. What's the downside? I do not see a downside to this patch. It adds two nodeapi calls on insert, or two on update. It is not a performance hit. It adds only 4 lines of code. It does not require anyone to rewrite anything. And it is the best solution we have to the problem. ------------------------------------------------------------------------ Tue, 06 Sep 2005 14:28:08 +0000 : Iluvatar I am creating a module that creates an additional node in hook_nodeapi('insert', ...). The additional node was created OK, but the actual node (where hook_nodeapi was invoked for in the first place) not: information like taxonomy was lost. I couldn't find anything to fix this bug, untill I came accross this post. The problem appearently was that I was using node_save recursive, and my module name starts with an A and the taxonomy module with a T (duh). Well, the patch above (save_hooks3.diff) certainly fixed this problem for me by creating the node in the 'insert post' hook instead of the 'insert' hook. Very nice! I agree with jvandyk, there's no downside to this patch, there are some additional calls but at least module_implements 'caches' the list of modules that implement a certain hook, hook_nodeapi in this case. I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre'), though. jvandyk says something about sending an e-mail, but this can be done just as wel in either 'insert' or 'insert post'. By the way, another thing that worked for me was changing in node_invoke_nodeapi: foreach (module_implements('nodeapi') as $name) { to $implementations = module_implements('nodeapi'); foreach ($implementations as $name) { I don't know why this worked, but it probably has something todo with the static variable in module_implements. I think this patch is very usefull and this bug /should be fixed in 4.7/. Thank you. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:43:34 +0000 : jvandyk I1uv4t4r says /I myself cannot find a practical implementation of the 'insert pre' hook (or 'update pre')/ For one thing, it's for performance. A quick example: it is faster to set the node status to "published" during the 'insert-pre' hook than to do an entire node_load and node_save again later to accomplish this change. ------------------------------------------------------------------------ Sun, 11 Sep 2005 03:52:08 +0000 : chx Recently, I was given tasks that were solved easiest by setting up various node relations. When you do that, this patch becomes a godsend. ------------------------------------------------------------------------ Mon, 12 Sep 2005 16:43:22 +0000 : moshe weitzman The approach of this most recent patch has seen a lot of review and comment here. I don't see any outstanding objections, so I set to 'ready for commit' ------------------------------------------------------------------------ Thu, 15 Sep 2005 13:57:16 +0000 : mathias +1 for this 4-line patch. Currently, bizarre module names (so they are the last to execute) and extraneous node_save calls are the only way around this problem. Since Drupal is built around the concept of a node it makes sense that core provides a sophisticated node workflow as offerred here. ------------------------------------------------------------------------ Thu, 15 Sep 2005 15:08:15 +0000 : killes@www.drop.org I'd also like this patch to be committed. It will enable us to do cool and exciting new stuff with Drupal.
participants (15)
-
chx -
Dries -
Dries Buytaert -
Gerhard Killesreiter -
I1uv4t4r -
Iluvatar -
jvandyk -
killes -
Kjartan -
Kjartan Mannes -
mathias -
moshe weitzman -
puregin -
Sohodojo Jim -
Todd Grimason