Development
Threads by month
- ----- 2026 -----
- June
- May
- April
- March
- February
- January
- ----- 2025 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2024 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2023 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2022 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2021 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2020 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2019 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2018 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2017 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2016 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2015 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2014 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2013 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2012 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2011 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2010 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2009 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2008 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2007 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2006 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
- ----- 2005 -----
- December
- November
- October
- September
- August
- July
- June
- May
- April
- March
- February
- January
August 2005
- 73 participants
- 637 discussions
[drupal-devel] [feature] create system option to prevent users from changing the theme
by robertDouglass 20 Aug '05
by robertDouglass 20 Aug '05
20 Aug '05
Issue status update for
http://drupal.org/node/26302
Post a follow up:
http://drupal.org/project/comments/add/26302
Project: Drupal
Version: 4.6.0
Component: system.module
Category: feature requests
Priority: normal
Assigned to: Torenware
Reported by: Torenware
Updated by: robertDouglass
Status: patch (code needs work)
I'm not for removing it. There are a lot of efforts going forward to do
things like theme an individual's blog differently (blog_theme module,
I think), and this is very interesting, especially if combined with
some form of theme editor. The best interface we have for choosing
themes is part of the user edit page. Let's not get rid of it. What's
wrong with an permission to allow users to choose their own theme?
robertDouglass
Previous comments:
------------------------------------------------------------------------
Mon, 04 Jul 2005 01:50:45 +0000 : Torenware
Attachment: http://drupal.org/files/issues/add-user-themable-option.diff (1.86 KB)
>From the forum thread How To Prevent Users From Changing The Theme. I
asked:
"
I'm working on a module that uses organic groups. I want to allow
different groups to choose different themes, but I don't want
individual users to have that capability.
I'm assuming that this is easy to do, but I can't find a setting. Can
anybody point me in the right direction?
Thanks,
Rob
"
At sepeck's suggestion, I'm submitting a patch to do this. It's
against 4.6.0, and modifies theme.inc and system.module. It creates a
variable 'configurable_user_themes' which defaults to "enabled"
(4.6.0's current behavior), and if activated, makes system.module
remove the UI for selected a theme, and also causes theme.inc to ignore
any setting for $user->theme if it exists.
------------------------------------------------------------------------
Mon, 04 Jul 2005 01:54:40 +0000 : Torenware
Attachment: http://drupal.org/files/issues/add-user-themable-option_0.diff (1.86 KB)
Didn't set this to "patch" initially, so I'm including the patch again.
Not sure what the procedure is just yet.
------------------------------------------------------------------------
Mon, 04 Jul 2005 05:07:44 +0000 : TDobes
-1 -- sorry, but I feel that we should use a different method for this
sort of thing. Any theme that is checked in the admin-> themes screen
should be selectable by users. Modules should get their list of
available themes from list_themes(), which provides a list of all
available themes, not just those that are selected for use by users.
------------------------------------------------------------------------
Mon, 04 Jul 2005 09:21:57 +0000 : Bèr Kessels
I think we should simply rename the "enabled" into "selecatble". Or any
better term.
For the only thing tha checkbox does is enable it for users to select.
It does not enable or disable the theme at all.
------------------------------------------------------------------------
Mon, 04 Jul 2005 18:56:53 +0000 : Torenware
A couple of comments:
To TDobbs: for most applications, you're right. But for some
applications where it's important to the site that user experience is
well tested, letting individual users set the theme is something of a
nightmare; a problem with the theme is perceived as a problem with the
site. So for /those/ sites (such as the one I'm currently working on),
letting the user set the theme was APITA.
Still, for a community site, I agree that the current behavior is fine.
So I made the current behavior the default.
To Bèr Kessels: Let me see if I understand what you mean.
Currently, I've coded the UI to say in its English default:
User theme settings
Configurable Themes:
* Disabled
* Enabled
Enable or disable user-configurable themes. When enabled, users can
set their own theme to override the site's default theme.
Would you prefer:
User theme settings
Selectable Themes:
* Disabled
* Enabled
Enable or disable user-selectable themes. When enabled, users can
select their own theme to override the site's default theme.
I don't really want to lose the "Disable"/"Enable" part, because if you
look at the current UI in 4.6, all similar settings are done that way.
So using a checkbox would be inconsistent with the rest of the page.
How's that?
Rob
------------------------------------------------------------------------
Mon, 04 Jul 2005 21:31:14 +0000 : Bèr Kessels
Attachment: http://drupal.org/files/issues/theme.png (17.19 KB)
you indeed misunderstood me.
We have a "enable theme" checkbox in the theme overview. That checkbos
*does not enable anything* it only gives users the opportunity to
select that theme.
Please se the screeny (sorry, Dutch locale enabled) for what I mean. Th
e second comumn. That chackbox. Just investigte its use. You will its
complete and utter cruft, that checkbox.
Ber
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:20:55 +0000 : Torenware
Ber,
I have a site that may well use themes set at the group level, so
unactivating the theme via admin/theme is not a great solution for me.
The problem with your solution is that if a theme is deactivated, it is
not available for anyone.
I need multiple themes available, but need to make the rather confusing
UI that appears in user/n/edit to go away. Your options will not do
that.
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:28:27 +0000 : Torenware
Ber,
/Please se the screeny (sorry, Dutch locale enabled) for what I mean.
Th e second comumn. That chackbox. Just investigte its use. You will
its complete and utter cruft, that checkbox./
I'm not following you. Could you please explain a bit more?
Thanks,
Rob
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:37:13 +0000 : TDobes
/I have a site that may well use themes set at the group level, so
unactivating the theme via admin/theme is not a great solution for me./
Sure it is! Deactivating themes in admin->theme removes them *ONLY*
from the list of themes available for users to select. It should not
remove them from the list of available themes for modules to use (via.
the $custom_theme variable). If themes are disappearing from the
modules' lists, these are bugs in the modules and should be reported
elsewhere.
I was under the impression that you were writing your own module, which
is why I suggested you use the list_themes function. This function
returns all available themes, not just those that are selected in
admin->themes.
Ber has mentioned that the term "enabled" is poorly chosen and should
be replaced with something else, such as "user-selectable". I agree
with him.
/The problem with your solution is that if a theme is deactivated, it
is not available for anyone./
This is completely false. It is not available to be chosen in the "my
account" screen, but it is still very readily available for use by
modules through the $custom_theme variable.
/I need multiple themes available, but need to make the rather
confusing UI that appears in user/n/edit to go away. Your options will
not do that./
Yes they do. How are you determining the list of available themes in
your module? If the themes are disappearing when you deselect them
from admin->themes, then your module is working improperly.
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:59:25 +0000 : Torenware
OK, if I'm understanding you, then indeed the UI in admin/themes is bad
done enough to fool a native English speaker. What y'all from Europe
make of some of the terms, I can't even guess :-)
I'll see if the UI disappears if all of the items are "deactivated"
makes the UI go away (and doesn't break Organic Groups...). But it
sounds like the real "bug" is in the documentation. This string:
/
Select which themes are available to your users and specify the default
theme. To configure site-wide display settings, click the "configure"
task above. Alternately, to override these settings in a specific
theme, click the "configure" link for the corresponding theme.
/
needs changing. Perhaps something like "Choose a default theme for
your site, and select which themes your users can choose to override
the default."
For the record: I do not have a clue what "Alternately, to override
these settings in a specific theme, click the "configure" link for the
corresponding theme.". If someone will tell me what that means, maybe
some of us can translate it into English :-) Then someone can translate
it into whatever else we need to support.
------------------------------------------------------------------------
Sat, 20 Aug 2005 05:22:53 +0000 : TDobes
Please see the patch I've attached to this issue [1], which renames
"enabled" to "user-selectable"... this should hopefully clear up a lot
of confusion.
[1] http://drupal.org/node/29002
------------------------------------------------------------------------
Sat, 20 Aug 2005 07:52:28 +0000 : robertDouglass
I'd like to propose a different solution; create a "can select own
theme" permission for users.
I agree that there are cases where some people should be able to choose
their theme and others shouldn't. Personally, I don't make many sites
where you get to "skin" the site just for fun. More likely, I use theme
selection for development purposes (I develop the next version of a
site's theme and can switch back and forth, and so can the developer,
but nobody else).
If we had a permission that could be given to roles, then og group
administrators could be given this permission, if I'm not mistaken.
Would this fit the bill, Torenware?
------------------------------------------------------------------------
Sat, 20 Aug 2005 17:30:29 +0000 : Bèr Kessels
personally I am all for removing that option alltogether. "for
development reasons" is simply not a reason to leave this in.
What about getting rid of this "let users select their alternative
theme". And focus more on sections/themeswitchers ?
1
0
[drupal-devel] [feature] create system option to prevent users from changing the theme
by Bèr Kessels 20 Aug '05
by Bèr Kessels 20 Aug '05
20 Aug '05
Issue status update for
http://drupal.org/node/26302
Post a follow up:
http://drupal.org/project/comments/add/26302
Project: Drupal
Version: 4.6.0
Component: system.module
Category: feature requests
Priority: normal
Assigned to: Torenware
Reported by: Torenware
Updated by: Bèr Kessels
Status: patch (code needs work)
personally I am all for removing that option alltogether. "for
development reasons" is simply not a reason to leave this in.
What about getting rid of this "let users select their alternative
theme". And focus more on sections/themeswitchers ?
Bèr Kessels
Previous comments:
------------------------------------------------------------------------
Mon, 04 Jul 2005 01:50:45 +0000 : Torenware
Attachment: http://drupal.org/files/issues/add-user-themable-option.diff (1.86 KB)
>From the forum thread How To Prevent Users From Changing The Theme. I
asked:
"
I'm working on a module that uses organic groups. I want to allow
different groups to choose different themes, but I don't want
individual users to have that capability.
I'm assuming that this is easy to do, but I can't find a setting. Can
anybody point me in the right direction?
Thanks,
Rob
"
At sepeck's suggestion, I'm submitting a patch to do this. It's
against 4.6.0, and modifies theme.inc and system.module. It creates a
variable 'configurable_user_themes' which defaults to "enabled"
(4.6.0's current behavior), and if activated, makes system.module
remove the UI for selected a theme, and also causes theme.inc to ignore
any setting for $user->theme if it exists.
------------------------------------------------------------------------
Mon, 04 Jul 2005 01:54:40 +0000 : Torenware
Attachment: http://drupal.org/files/issues/add-user-themable-option_0.diff (1.86 KB)
Didn't set this to "patch" initially, so I'm including the patch again.
Not sure what the procedure is just yet.
------------------------------------------------------------------------
Mon, 04 Jul 2005 05:07:44 +0000 : TDobes
-1 -- sorry, but I feel that we should use a different method for this
sort of thing. Any theme that is checked in the admin-> themes screen
should be selectable by users. Modules should get their list of
available themes from list_themes(), which provides a list of all
available themes, not just those that are selected for use by users.
------------------------------------------------------------------------
Mon, 04 Jul 2005 09:21:57 +0000 : Bèr Kessels
I think we should simply rename the "enabled" into "selecatble". Or any
better term.
For the only thing tha checkbox does is enable it for users to select.
It does not enable or disable the theme at all.
------------------------------------------------------------------------
Mon, 04 Jul 2005 18:56:53 +0000 : Torenware
A couple of comments:
To TDobbs: for most applications, you're right. But for some
applications where it's important to the site that user experience is
well tested, letting individual users set the theme is something of a
nightmare; a problem with the theme is perceived as a problem with the
site. So for /those/ sites (such as the one I'm currently working on),
letting the user set the theme was APITA.
Still, for a community site, I agree that the current behavior is fine.
So I made the current behavior the default.
To Bèr Kessels: Let me see if I understand what you mean.
Currently, I've coded the UI to say in its English default:
User theme settings
Configurable Themes:
* Disabled
* Enabled
Enable or disable user-configurable themes. When enabled, users can
set their own theme to override the site's default theme.
Would you prefer:
User theme settings
Selectable Themes:
* Disabled
* Enabled
Enable or disable user-selectable themes. When enabled, users can
select their own theme to override the site's default theme.
I don't really want to lose the "Disable"/"Enable" part, because if you
look at the current UI in 4.6, all similar settings are done that way.
So using a checkbox would be inconsistent with the rest of the page.
How's that?
Rob
------------------------------------------------------------------------
Mon, 04 Jul 2005 21:31:14 +0000 : Bèr Kessels
Attachment: http://drupal.org/files/issues/theme.png (17.19 KB)
you indeed misunderstood me.
We have a "enable theme" checkbox in the theme overview. That checkbos
*does not enable anything* it only gives users the opportunity to
select that theme.
Please se the screeny (sorry, Dutch locale enabled) for what I mean. Th
e second comumn. That chackbox. Just investigte its use. You will its
complete and utter cruft, that checkbox.
Ber
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:20:55 +0000 : Torenware
Ber,
I have a site that may well use themes set at the group level, so
unactivating the theme via admin/theme is not a great solution for me.
The problem with your solution is that if a theme is deactivated, it is
not available for anyone.
I need multiple themes available, but need to make the rather confusing
UI that appears in user/n/edit to go away. Your options will not do
that.
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:28:27 +0000 : Torenware
Ber,
/Please se the screeny (sorry, Dutch locale enabled) for what I mean.
Th e second comumn. That chackbox. Just investigte its use. You will
its complete and utter cruft, that checkbox./
I'm not following you. Could you please explain a bit more?
Thanks,
Rob
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:37:13 +0000 : TDobes
/I have a site that may well use themes set at the group level, so
unactivating the theme via admin/theme is not a great solution for me./
Sure it is! Deactivating themes in admin->theme removes them *ONLY*
from the list of themes available for users to select. It should not
remove them from the list of available themes for modules to use (via.
the $custom_theme variable). If themes are disappearing from the
modules' lists, these are bugs in the modules and should be reported
elsewhere.
I was under the impression that you were writing your own module, which
is why I suggested you use the list_themes function. This function
returns all available themes, not just those that are selected in
admin->themes.
Ber has mentioned that the term "enabled" is poorly chosen and should
be replaced with something else, such as "user-selectable". I agree
with him.
/The problem with your solution is that if a theme is deactivated, it
is not available for anyone./
This is completely false. It is not available to be chosen in the "my
account" screen, but it is still very readily available for use by
modules through the $custom_theme variable.
/I need multiple themes available, but need to make the rather
confusing UI that appears in user/n/edit to go away. Your options will
not do that./
Yes they do. How are you determining the list of available themes in
your module? If the themes are disappearing when you deselect them
from admin->themes, then your module is working improperly.
------------------------------------------------------------------------
Mon, 04 Jul 2005 23:59:25 +0000 : Torenware
OK, if I'm understanding you, then indeed the UI in admin/themes is bad
done enough to fool a native English speaker. What y'all from Europe
make of some of the terms, I can't even guess :-)
I'll see if the UI disappears if all of the items are "deactivated"
makes the UI go away (and doesn't break Organic Groups...). But it
sounds like the real "bug" is in the documentation. This string:
/
Select which themes are available to your users and specify the default
theme. To configure site-wide display settings, click the "configure"
task above. Alternately, to override these settings in a specific
theme, click the "configure" link for the corresponding theme.
/
needs changing. Perhaps something like "Choose a default theme for
your site, and select which themes your users can choose to override
the default."
For the record: I do not have a clue what "Alternately, to override
these settings in a specific theme, click the "configure" link for the
corresponding theme.". If someone will tell me what that means, maybe
some of us can translate it into English :-) Then someone can translate
it into whatever else we need to support.
------------------------------------------------------------------------
Sat, 20 Aug 2005 05:22:53 +0000 : TDobes
Please see the patch I've attached to this issue [1], which renames
"enabled" to "user-selectable"... this should hopefully clear up a lot
of confusion.
[1] http://drupal.org/node/29002
------------------------------------------------------------------------
Sat, 20 Aug 2005 07:52:28 +0000 : robertDouglass
I'd like to propose a different solution; create a "can select own
theme" permission for users.
I agree that there are cases where some people should be able to choose
their theme and others shouldn't. Personally, I don't make many sites
where you get to "skin" the site just for fun. More likely, I use theme
selection for development purposes (I develop the next version of a
site's theme and can switch back and forth, and so can the developer,
but nobody else).
If we had a permission that could be given to roles, then og group
administrators could be given this permission, if I'm not mistaken.
Would this fit the bill, Torenware?
1
0
Issue status update for
http://drupal.org/node/7582
Post a follow up:
http://drupal.org/project/comments/add/7582
Project: Drupal
Version: cvs
Component: node system
Category: bug reports
Priority: normal
Assigned to: killes(a)www.drop.org
Reported by: killes(a)www.drop.org
Updated by: killes(a)www.drop.org
Status: patch (code needs review)
Attachment: http://drupal.org/files/issues/revisions_35.patch (55.51 KB)
Ok, I coudl hear them in advance "1000 nodes without revisions are not
sufficient". Well, here are results for 30000 nodes without revisions.
/node
new:
Requests per second: 0.45 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 2162 2202 58.9 2188 2485
Waiting: 2143 2182 54.1 2169 2413
Total: 2162 2202 58.9 2188 2485
old:
Requests per second: 0.43 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 2319 2346 77.6 2332 2761
Waiting: 2223 2324 79.0 2311 2742
Total: 2319 2346 77.6 2332 2761
One could argue that the new approach s slightly faster. I'd like to
see mroe results to believe it, but it is pretty clear it isn't slower.
/tracker
new:
Requests per second: 0.41 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 2330 2418 116.9 2403 3194
Waiting: 2302 2397 117.7 2382 3175
Total: 2330 2418 116.9 2403 3194
old:
Requests per second: 0.39 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 2496 2548 50.1 2536 2841
Waiting: 2477 2528 50.0 2518 2822
Total: 2496 2548 50.1 2536 2841
Statistically insignificant differences.
All those tests where done with ab and as user No 1.
model name : Intel(R) Celeron(R) CPU 2.40GHz
512MB RAM
Besides the differences for revisions the change in served pages with
different configurations is interesting.
A new version is attached. node_save now returns the complete node (not
only the nid).
killes(a)www.drop.org
Previous comments:
------------------------------------------------------------------------
Wed, 05 May 2004 16:25:27 +0000 : killes(a)www.drop.org
Currently all node revisions are stored in a serialized field in
node.table and retrieved for _each_ page view although they are rarely
needed. However, we have agreed that serializing data is bad and that
we should try to keep the memory foot print pf Drupal small.
Therefore I propose to create a separate revisions table which would be
in principle identical to the node table, only that it could have
several old copies of the same node. Extra data added by other modules
could be added in a serialized field unless we find a better solution.
------------------------------------------------------------------------
Wed, 05 May 2004 17:06:35 +0000 : jhriggs
I too think the serialized approach is less than desirable, but here's
an alternative. This would likely take some considerable rework in
core and contrib, but the following is how we handle similar types of
situations in our databases at work. It is more elegant that a
separate table, and avoids the (almost exact) duplication of a table.
Instead of separate tables, keep all revisions of nodes in the node
table as follows:
* add field: active (0/1 or Y/N)
* add field: revision
* every revision of a node is stored in the node table; however, only
one revision can be active at any given time
* nid can no longer be unique -- primary/unique key becomes (nid,
active)
* any time a node is loaded, updated (without revision), etc., the
active version is used.
Thoughts?
------------------------------------------------------------------------
Wed, 05 May 2004 17:57:48 +0000 : killes(a)www.drop.org
I am not opposed to your scheme, but I want to stress the following:
* Duplicating a table's structure is not bad (IMHO) as long as the
content is different.
* having two tables will allow us to have a rather small node table.
This is (maybe) a performance gain.
------------------------------------------------------------------------
Wed, 05 May 2004 18:37:29 +0000 : jhriggs
I don't necessarily think that duplicating a table's structure is _bad_.
It just seems to be wasteful and a pain to maintain. (Every change to
the node table is made twice...easy to do, but also easy to miss
perhaps.)
As for performance, as long as nid and the active indicator are
indexed, there shouldn't be any performance loss. Also, archiving an
old version when making a new revision will be much simpler: just
change the active indicator rather than copying an entire node to
another table (and ensuring everything gets copied...again a potential
maintainance issue).
To be honest, I would just like to see the serialized data go away,
regarless of what approach is taken.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:49:33 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_07-30-2004.p… (10.47 KB)
I'm interested in using Drupal for a large scale wiki-type project. In
order to do this, I need revisions to be in their own table.
Attached is a patch to do just that. Most of the changes are pretty
self explanatory. Spreading out node data across two tables meant that
I had to add database functions to do locking/transactions. Without
this, race conditions in which the database becomes corrupted are
possible.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:54:36 +0000 : Nick Nassar
Oh yeah... The patch is a diff against Drupal CVS
------------------------------------------------------------------------
Sat, 31 Jul 2004 00:00:08 +0000 : Anonymous
Gerhard speaking.
Nick, thanks a lot for your nice patch! It saves me a great deal of
labour. I looked through it and immediately liked it. You not only put
the old revisions into a new table but also the current one. Do you
have an estimate how much more expensive the additional join is?
Besides a few minor coding style issues I found a major one: Just a few
hours before you uploaded your patch JonBob's node access patch hit
core. That means your patch won't apply anymore as all the queries you
change have been changed. Can I bug you to update your patch?
------------------------------------------------------------------------
Sat, 31 Jul 2004 01:11:59 +0000 : Anonymous
Also I think that your upgrade path loses existing revisions.
------------------------------------------------------------------------
Sat, 31 Jul 2004 02:39:12 +0000 : drumm
I think this is the proper way to do things. No columns are duplicated,
there is no serialized data, and only the fields that are logically
revised are stored. Nothing jumped out at me as a way to have my node
modules be able to keep a table of revisions of additional fields. I'm
guessing this could be done within the confines of _insert and _update.
Assuming the upgrade path works and modules can extend it I give it a
+1.
------------------------------------------------------------------------
Sat, 31 Jul 2004 14:40:15 +0000 : Nick Nassar
It figures that just as I finish a big patch, another patch comes along
and breaks it. Oh well, it should be a pretty easy to fix. I'll work on
it.
Fixing the upgrade path to keep revisions should be fairly painless.
I found another issue that needs to be fixed before this patch gets
merged. There format of a node needs to be stored for each revision.
Otherwise, for modules that store a format for the nodes, such as page
and book, if you write one revision in PHP and the next in HTML, the
PHP revision will be displayed as HTML. This is part of a larger issue
of how node modules should store revisions of additional fields. I
think each module that wants to do this should create another table
with (nid, revid) as the primary key. Just as when they want to add
fields to a node they create another table with nid as the primary key.
As far as performance goes, for sites that make heavy use of revisions,
an extra join on primary keys is going to be a lot faster than grabbing
all of the revisions from that database everytime. We would need to run
benchmarks to determine is the overall difference in speed is for an
average site is a gain or a loss. I'm guessing it's very minor either
way.
------------------------------------------------------------------------
Mon, 23 Aug 2004 13:55:49 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_08-23-2004.p… (10.92 KB)
Here's an updated patch against CVS that puts revisions in their own
table, provides an upgrade path, and fixes the format related bugs in
the last patch.
Hopefully, this can make it into CVS as soon as the freeze is over.
------------------------------------------------------------------------
Mon, 23 Aug 2004 14:10:39 +0000 : moshe weitzman
Interesting patch ... drumm's question is still outstanding. how do
modules store revisions of their fields? Are they expected to manage
this on their own? Thats not how it works today.
As an aside, i am seeing profile_ fields in my node.revisions column.
One could argue that those need not be saved. They pertain to the node
author, not to the node itself.
------------------------------------------------------------------------
Mon, 23 Aug 2004 16:14:39 +0000 : Nick Nassar
Having modules be responsible for storing revisions of their own fields
is a side-effect of storing revision data in tables. There's really no
way around it. However, revisions generally don't make sense for node
types that don't have PHP/HTML content, such as polls. I think it's
going to be a pretty rare scenario for a new node type to want another
field to change per-revision, so it's a pretty good trade-off.
Storing fields that shouldn't be part of revisions, such as the
profile_ fields, is a side-effect of storing revisions as serialized
objects. Applying this patch will free up that wasted space. :)
------------------------------------------------------------------------
Mon, 23 Aug 2004 17:20:57 +0000 : Anonymous
There should be a hook that let's the module choose whether it supports
history. This way a module author can prevent the user from doing
something that may break his module or just cause undefined behavior.
If the module doesn't support history then don't let the user/admin
choose to add history to nodes of that type.
Craig
------------------------------------------------------------------------
Mon, 23 Aug 2004 19:23:29 +0000 : Nick Nassar
I agree, there should be an API change to make specifying support for
revisions easier. In the interests of keeping patches small and keeping
to one change per patch, I think the API change should be a separate
issue.
A sort of ad-hoc API to decide whether or not a module supports
revisions by default already exists. Instead of having a hook, modules
set the default value of the "Create new revision" field in the edit
form. The admin can change this option in
admin/node/configure/defaults. This patch doesn't change that.
Revisions are broken for node types that have their own database
structure, like polls, even when storing them as serialized objects.
This patch doesn't change that, either.
------------------------------------------------------------------------
Tue, 26 Oct 2004 02:35:06 +0000 : moshe weitzman
I'm guessing that someone is going to have to demonstrate that this
patch performs as well as current drupal before it gets comitted. i
think this patch is a few benchmarks from being comitted.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:04:09 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:16 +0000 : Nick Nassar
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:26 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Mon, 15 Nov 2004 05:05:30 +0000 : elias1884
please overthink the revision system default workflow as well. don't
look at the revision system as an isolated system but as a part of the
whole workflow system!
if you combine revisions with the moderation queue the most logic
default workflow would be like that:
auth user creates node (revision #0)
admin approves the node (status = 1, moderation = 0)
=> node publicly available
auth user finds typo and changes node (revision #1, status = 0,
moderation = 1)
-------
what happens at that point at the moment is, that the node is not
accessible anymore at all until the new revision is approved by admin.
of course the new revision should not go online until reviewed and
approved, this is absolutely correct, but there is no reason to not
take the old revision offline, since it was already approved and should
therefore be online until the new revision is approved. it is not
practical if a node disappears only because the author corrected a
typo.
-------
admin approves the node (status = 1, moderation = 0)
eventhough I first thought a plain boolean active field would not be
capable of providing that functionality if finally came to the
conclusion, that it can. The only thing to do is to not set that bit,
when a new revision is created, but when it is approved (in case
moderation is activated under default workflow). Every revision should
have its own moderation, status and active field and on approval they
are set like this (status=1, moderation=0, active=Y).
When you wanna rollback to an old revision, you can chose between all
revisions that already have the moderation bit set back to 0 again and
the published to 1. There should be an extra permission for rollback!
another concern that I have about the default workflow is, that users
can't see the content, they have just created, when moderation is
enabled. Eventhough, there is a big fat "submission accepted" presented
after submissions, unexperienced users tend to question the information
those stupid tincans give them, if they can't find their content
afterwards. Many users are really lazy bastards and they don't even
read the status messages. The best feedback about whether his story was
submitted successfully or not of course is, if he can find the story
somewhere on the site, maybe with a status message on top of it,
mentioning, that the content is currently not publicly available since
it has not been approved yet. there should be a my content section
under my account, like somebody is trying to do with the workspace
module I guess.
so my suggestion is to make (status=0, moderation=1) still available
for the creator under a my content section somewhere!
------------------------------------------------------------------------
Wed, 24 Nov 2004 04:21:18 +0000 : Nick Nassar
I agree. The current workflow for moderation queues and revisions needs
to change, but this patch isn't the place for it. The patch is already
too big, and it only does the backend stuff.
Instead of adding more to this patch and making it take even longer to
get into core, would you mind creating a new issue for your UI
suggestions, so the those changes can be added as a separate patch?
Thanks,
Nick
------------------------------------------------------------------------
Sat, 11 Dec 2004 12:26:02 +0000 : Dries
This patch is _much_ needed so I'd love to see someone revive it. In
order for this patch to be accepted, the following needs to be done:
* Update this patch to CVS HEAD.
* Rename revid to vid.
* Rename node_rev to node_revisions.
* Rename node_rev.changed to node_revisions.timestamp.
* Rename $rnode to $revision.
* Fix the coding style to match Drupal's: proper spacing, single quotes
where possible, proper variable names.
* Benchmark this patch with a large database with enough revisions.
I'd be happy to benchmark this on my local copy of the drupal.org
database.
* The book.log field should probably move to the node_revisions table.
This can be done in a separate patch.
* Investigate whether transactions are well-supported.
------------------------------------------------------------------------
Mon, 13 Dec 2004 00:25:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004-r… (11.02 KB)
I've worked a bit on the patch (coding style issues as mentioned by
Dries). One thing I noticed is that the patch uses REPLACE. IIRC this
needs to be chagned to "UPDATE, if fail INSERT" for pgsql
compatibility.
Nick, are you still interested in working on that patch? I'd like to
know how it works on your site and work on getting it into core.
------------------------------------------------------------------------
Mon, 13 Dec 2004 12:33:08 +0000 : Dries
Gerhard: your patch does not apply.
------------------------------------------------------------------------
Mon, 13 Dec 2004 17:10:12 +0000 : killes(a)www.drop.org
Yes, I know, that was the same version as I mailed to you earlier.
------------------------------------------------------------------------
Mon, 13 Dec 2004 21:02:06 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions.patch (52.96 KB)
Ok, upüdated the patch to cvs.
------------------------------------------------------------------------
Tue, 14 Dec 2004 08:58:36 +0000 : Dries
Some more comments:
* db_begin_transaction() and db_end_transaction() do not belong in
database.inc, but in database.mysql.inc and database.pgsql.inc
respectively.
* The node module calls node_revisionsision_list() which is not
defined. (Fxed that on my local copy.)
* Do db_begin_transaction() and db_end_transaction() deprecate Jeremy's
table locking patch?
* The upgrade path assigns the wrong user ID to each revision.
* The upgrade path assigns the wrong date to each revision (that or a
node's revision page shows the wrong usernames/dates).
* The coding style needs a bit of work, but we can worry about that
later.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:34:44 +0000 : Nick Nassar
If you need any help getting those things fixed, just let me know.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:50:30 +0000 : Nick Nassar
How this relates to Jeremy's node locking patch:
There was lots of discussion, and node locking was decided against
because from an end user point of view you never want a node to be
locked. He's now advocating for a much simpler patch that warns users
if their changes will overwrite someone elses. That patch still has a
race condition, which might be fixed using db_begin_transaction().
http://drupal.org/node/6025
------------------------------------------------------------------------
Tue, 14 Dec 2004 22:26:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_0.patch (55.96 KB)
Here is an updated patch that tries to address Dries concerns.
------------------------------------------------------------------------
Wed, 15 Dec 2004 08:32:50 +0000 : Dries
Attachment: http://drupal.org/files/issues/revisions-bug.png (76.06 KB)
It didn't fix the aforementioned bugs. See attached screenshot.
------------------------------------------------------------------------
Thu, 06 Jan 2005 20:15:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_1.patch (51.77 KB)
Ok, here is a new version. Dries and myself worked hart at it, so please
have a look.
what is still missing
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
------------------------------------------------------------------------
Wed, 19 Jan 2005 21:43:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_2.patch (49.49 KB)
Here is an updated patch. We discussed to keep the current title in node
module and also in the revisiosn table. This is content duplication but
will save many joins as many queries only need the title of a node.
Discussion is welcome.
------------------------------------------------------------------------
Wed, 19 Jan 2005 23:33:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_3.patch (29.93 KB)
I've implemented the aforementioned solution. This makes the patch much
smaller. The patch now also removes taxonomy_node_has_term() which
wasn't used anywhere. I'd really apprciate if some people could test
drive the patch. It will be another huge improvement for 4.6.
------------------------------------------------------------------------
Thu, 20 Jan 2005 00:05:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_4.patch (30 KB)
Another revision. Steven didn't like my literal $node->vid in queries.
------------------------------------------------------------------------
Thu, 20 Jan 2005 01:10:50 +0000 : killes(a)www.drop.org
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
These issues are still open, btw. Especially the first one needs to be
tackled.
------------------------------------------------------------------------
Tue, 25 Jan 2005 20:11:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_5.patch (51.13 KB)
Here is a patch that has the database tables updated for forum, book,
and page module.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:55:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_6.patch (49.18 KB)
Yet another update to keep it working with head. The patch now also
removes the table definitons for the page table.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:57:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_7.patch (55.69 KB)
Sorry, that was the old version, this is the right one.
------------------------------------------------------------------------
Mon, 31 Jan 2005 19:55:03 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_8.patch (55.71 KB)
Updated once more.
------------------------------------------------------------------------
Mon, 31 Jan 2005 20:52:08 +0000 : Dries
Anyone to help review/test this?
------------------------------------------------------------------------
Mon, 31 Jan 2005 21:22:36 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_9.patch (49.29 KB)
Updated again, the update functions occurred twice. Thanks Bart.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:05 +0000 : killes(a)www.drop.org
Don't know if the db I am using is corrupted or what. I still do have
some didficulties.
The latest patch is attached.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_10.patch (49.67 KB)
I am probably slowly going mad ...
------------------------------------------------------------------------
Wed, 02 Feb 2005 01:54:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_11.patch (48.95 KB)
The update issue still needs investigating. This patch is updated for
cvs.
------------------------------------------------------------------------
Wed, 02 Feb 2005 20:20:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_12.patch (49.83 KB)
Ok, here is a new version. I've solved my troubles with book.module.
There are still some issues with forum module. Possibly due to
inconsistent database.
------------------------------------------------------------------------
Wed, 02 Feb 2005 21:31:05 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_13.patch (49.83 KB)
Turns out the drupal.org database had indeed some quirks. Please run
this query in your oldest db and tell me the result:
select nid,type from node where type like '%/%';
If you get a non-zero result we might need to add another security
update.
The patch could use still more testing, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 01:16:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_14.patch (49 KB)
Ok, we are getting somewhere. At a first glance the update is working.
There is a problem remaining: the revisions tab will be shown whether
the node has revisions or not. Not sure we can/need to fix this.
People with a drupal.org account can log in at
http://killes.drupaldevs.org/revision/ and poke around. Your
permissions will be the same as on drupal.org. Feel free to vreak
everything but don't forget to file complaints here. (Note: this is
only a pruned version of the drupal.org database with all project nodes
and nodes with nids > 7000 dropped).
------------------------------------------------------------------------
Thu, 03 Feb 2005 04:19:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_15.patch (52.39 KB)
There was some error in node_save and also the patches to the
database.inc files got lost...
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:07:27 +0000 : robertDouglass
Submitting book pages doesn't work on your test site. It puts the entire
content of the preview inside the body textarea. I wrote a sentence in
the body and the log, and pressing preview put several lines of HTML
containing both sentences in the body textarea on the preview page,
plus the book page wouldn't submit.
-R
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:50:59 +0000 : Junyor
0 results here. I started using Drupal with version 4.4, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 23:56:18 +0000 : killes(a)www.drop.org
@Junyor: Thanks, that's a good sign. Maybe somebody else has an older db
to try.
@robertDouglass: The first effect you describe is due to drupaldevs
running on PHP 5. I am unsure why the second thing does not work. In
node_save() the node object has a nid although there is none in the
form. Very strange.
I've enabled display of db queries on the testsite.
------------------------------------------------------------------------
Fri, 04 Feb 2005 19:17:55 +0000 : dmjossel
No results here on the query:
select nid,type from node where type like '%/%';
On a database that was put in place prior to Drupal 4 and is now
running on 4.5.2.
------------------------------------------------------------------------
Fri, 04 Feb 2005 20:44:23 +0000 : killes(a)www.drop.org
@dmjossel: thanks.
@all. The strange problem I reported was apparently php 5 related.
After applying Steven's php 5 patch it went away. One error is
remaining: If I create a new forum topic it will be shown as part of
the book on preview. Hmm, that was due to a db that got corrupted
during testing so that is fixed too.
Please poke around at the test site and look if you find more errors.
------------------------------------------------------------------------
Sat, 05 Feb 2005 07:16:22 +0000 : Steven
By the way, I just remembered that Drupal.org has some blogs lingering
on in the database even though blog.module is not enabled. Perhaps this
is causing troubles?
------------------------------------------------------------------------
Sat, 05 Feb 2005 11:22:59 +0000 : Anonymous
I can't see why it would. Drupal.org will need extra updates for images
and project nodes because those have their own tables. GK.
------------------------------------------------------------------------
Sun, 06 Feb 2005 12:49:55 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_16.patch (52.49 KB)
Updated to apply to cvs again.
------------------------------------------------------------------------
Tue, 22 Feb 2005 20:15:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_17.patch (49.64 KB)
Updated again.
All we need is a patch to upload module and an upgrade path for it.
------------------------------------------------------------------------
Fri, 04 Mar 2005 04:22:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_18.patch (52.31 KB)
Updated once more. Moved log field from book to node_revisions table as
discussed in Antwerp. upload module still missing.
We need to decide under which circumstances the log field should be
displayed. Should that be added to the workflow? Should it depend on
the revisions setting?
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:27:03 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_20.patch (75.52 KB)
Ok, here it is: Yet another revision of this grrrrreat patch.
Changes from previous versions:
- supports versioning for uploaded files. A problem is that if you
delete a file, it will be gone for all revisions.
- the log field is now in the node_revsions table, but each module has
to decide whether to show it or not.
I've implemented it for the page and the book type odes. Also, the
field can be edited when adding non-book nodes to the book. The log is
displayed on the revisions page and if a node is moderated.
- the revisions are moved to an old_revisions table to a) get the node
table smaller and b) still leave the mavailable for contrib modules
that want to retreive old version data.
The patch has been applied to killes.drupaldevs.org/revision where it
can be tested by anybody (especially people who have "site admin"
rights on drupal.org)
The database is from drupal.org and you should b able to log in with
your pass or simply mail yourself a new one.
Gerhard
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:51:56 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_21.patch (59.42 KB)
BTW, I marked this a bug because atm the revisions field can grow quite
big. Neil has reported problems from some users who were not able to
load some nodes due to to many large revisions.
Also, som unrelated stuff crept into the patch. New version attached.
------------------------------------------------------------------------
Tue, 08 Mar 2005 05:56:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_22.patch (60.29 KB)
Ok, I think I got it.
Changes to last version:
- uploads are no properly versioned.
Missing are still pgsql checks and updates.
------------------------------------------------------------------------
Thu, 10 Mar 2005 16:58:41 +0000 : Anonymous
Was able to get http://drupal.org/files/issues/revisions_21.patch to
work with drupal-cvs.tar.gz (10 March 2005) by:
- includes/database.mysql.inc: Commenting out duplicates for functions
function db_begin_transaction and function db_commit_transaction
- modules: node.module: Removing "'title' => $node->title," from
$node_table_values variable declaration and removing "'title' =>
"'%s'"," from "$node_table_types" variable declaration.
Happy to submit a patch if requested. I'll watch this thread.
------------------------------------------------------------------------
Fri, 11 Mar 2005 23:59:45 +0000 : killes(a)www.drop.org
The duplicate function has been removed in rev 22 of this patch.
Why do you think the changes in node_save are needed? Titles are saved
in both tables for performance reasons.
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:12:21 +0000 : jlerner
Hi - I posted comment #62. The changes to node_save appear to be needed
because recent patches (both 21 and 22) remove the field 'title' from
table 'node'. So without the changes to node_save, node.module is
broken and generates errors.
Joshua
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:29:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_23.patch (61.17 KB)
Thanks, Joshua, for catching this. node:title is there to stay.
------------------------------------------------------------------------
Wed, 13 Apr 2005 16:29:23 +0000 : moshe weitzman
since HEAD is open again, perhaps it is a good time to revisit this
patch.
once this is committed, lets address - http://drupal.org/node/11071
"node_validate does not respect group editing"
------------------------------------------------------------------------
Mon, 18 Apr 2005 15:43:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Mon, 18 Apr 2005 16:16:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24_0.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:19:42 +0000 : Dries
I'll commit this patch later this week! If you haven't checked this
patch already, I urge you to test/check it out because it will have
significant impact on existing code and modules!
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:21:33 +0000 : Dries
Also, what do people think about the n.title being duplicated?
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:26:58 +0000 : chx
I won't lose any sleep because of duplicated titles...
------------------------------------------------------------------------
Tue, 19 Apr 2005 18:35:58 +0000 : killes(a)www.drop.org
Let me explain why I have chosen to duplicate the title (and also the
uid): If you look at all the queries in Drupal, you will find that most
of them only need the title and th uid of a node. That is, by
duplicating it, we save expensive joins on the node_revisions table.
Due to this fact, this patch is actually a performance improvement.
A note about updating contrib module:
Strictly speaking they wouldn't need to be updated. They only need to
if their authors decide that their info should be available for
revisioning. The upgrade path for forum.module in my update.inc patch
(+ the forum patch)
should show you what needs to be done.
I will write a note for the update page once the patch hits core.
------------------------------------------------------------------------
Sun, 24 Apr 2005 21:21:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_25.patch (60.38 KB)
Updated to cvs.
Dries: Based on some remarks in #drupal this is the last update I am
going to do. Apply it or won't fix it.
------------------------------------------------------------------------
Sat, 30 Apr 2005 03:42:39 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/revisions_25.patch.patch (528 bytes)
That's a big patch. I've only started looking through it. I noticed
one little typo, affecting updates. A patch to your last patch is
attached.
I'm running with the revision patch on my dev server now happily. I
like the concept.
What happens if you click 'stop' on your browser in the middle of a
MySQL "transaction"? I assume that kills the connection to MySQL, and
the lock is freed? But this then leaves changes only partially
applied?
What exactly does locking/unlocking the tables buy us in MySQL? I
don't see anywhere that we detect if an apply fails part way through,
and thus roll back...? What am I missing?
------------------------------------------------------------------------
Sat, 30 Apr 2005 07:11:28 +0000 : Dries
Jeremy: many of us are worried about the performance ramifications this
patch introduces. Early experiments showed a small performance
improvement (while a performance regression might be expected). More
performance reports from large sites like kerneltrap.org will certainly
help this patch. Mind to do a quick performance comparision and to
report back with some numbers? Thanks.
------------------------------------------------------------------------
Sat, 30 Apr 2005 12:38:02 +0000 : Jeremy(a)kerneltrap.org
Dries: I'm not running HEAD on kerneltrap, so this really isn't a
possibility. Furthermore, until I understand why we're locking tables,
I don't like it. The idea of revisions in their own tables is great.
The idea of locking tables to get (without any obvious benefit) there
really worries me.
------------------------------------------------------------------------
Sat, 30 Apr 2005 14:16:01 +0000 : killes(a)www.drop.org
@Jeremy: Thanks for looking at the patch! Also for catching the typo. :)
Did you try to upgrade your database? If yes, how did it go? One of
Dries' concerns is the complexity of the upgrade. How many nodes and
revisions did the db have?
About database locking: This part of the patch was created by Nick and
I simply continued to use it.
Maybe the code should rather be:
if(db_begin_transaction(array('{node}', '{node_revisions}',
'{watchdog}', '{sessions}', '{files}'))) {
db_query($node_query, $node_table_values);
db_query($revisions_query, $revisions_table_values);
db_commit_transaction();
...
}
The idea is probably to avoid two updates at the same time. I don't
think the locking helps if you abort the script at an inconvenient
time. Rollbacks aren't implemented in all mysql versions.
We could omit the db locking if deemed inappropriate. Maybe Nick can
explain his ideas behind this.
@Dries: I wonder who the "many of us" are. They certainly haven't
spoken to me. Moshe had some reservations about the upgrade path and
project module, but the time that project module abused revisions to
store issue updates was long ago and his reservations were resolved.
Nobody else (besides you of course and now Jeremy) has voiced
reservations in a way that was audible to me.
If you grep through the patch you will notice that there are only four
queries which have a join on the node_revisions table. Two of them are
in node_load and in the other cases the join replaced a join on the
node table. The two queries in node_load are the only ones that have
both a join on the node and the revisions query. Thus, loading of
individual nodes might become somewhat slower. All other queries will
be faster since the node table is now much smaller. Also, node loading
does not have to be slower, it depends on your node table. If you had
a lot of revisions and thus a large table, then the new scheme will
make your queries actually faster since we do not load the revisions
on each and every node load anymore. If you didn't have many revisions
your node_load migth be somewhat slower.
WRT to the update script Karoly pointed out that we could use multiple
insert queries instead one query per revision. This would probably
make the update somewhat faster. I am willing to work on this iff you
declare that you will commit the patch afterwards. I'd need to know if
this will work on pgsql and on all supported mysql versions before I
work on it.
About locking: Database locking is dog slow, at least on mysql. I was
using locks in an earlier version of the upgrade script but had to
remove it for (serious!) performance reasons.
------------------------------------------------------------------------
Mon, 09 May 2005 15:07:34 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_26.patch (46.45 KB)
Ok, another update, cause I need it myself.
I've left out the transaction stuff for now. It is in principle
unrelated to this patch and should be discussed elsewhere.
This also makes the patch smaller and easier to review (hint, hint).
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:09 +0000 : killes(a)www.drop.org
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_27.patch (39.05 KB)
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 21:23:06 +0000 : Dries
Got one error during the upgrade path:
ALTER TABLE {book} ADD PRIMARY KEY vid (vid)
FAILED
------------------------------------------------------------------------
Mon, 09 May 2005 21:26:19 +0000 : killes(a)www.drop.org
This had happend to me as well, when I tested this patch. The reason is
that for some reason the vid is not unique. Most likely there are some
entries with vid = 0 in there. Can you check which node types those
have? it always was an error in the test database. See:
http://drupal.org/node/7582#comment-20678
------------------------------------------------------------------------
Mon, 09 May 2005 21:27:06 +0000 : Dries
Actually, I got 2850 errors during the upgrade.
Some of these:
sprintf() [function.sprintf]: Too few arguments in
drupal-cvs/includes/database.inc on line 154.
Some of these:
Query was empty query: in drupal-cvs/includes/database.mysql.inc on
line 66.
And this:
Unknown table 'n' in field list query: SELECT n.nid, n.vid FROM node
INNER JOIN files f ON n.nid = f.nid in
drupal-cvs/includes/database.mysql.inc on line 66.
:-)
------------------------------------------------------------------------
Mon, 09 May 2005 21:29:19 +0000 : Dries
Or this:
user error: Unknown column 'log' in 'field list'
query: SELECT parent, weight, log FROM book WHERE nid = 1 in
drupal-cvs/includes/database.mysql.inc on line 66
------------------------------------------------------------------------
Mon, 09 May 2005 21:52:12 +0000 : Dries
The time required to generate my main page went from 902 ms (before
upgrade) to 2139 ms (after upgrade).
The time required to generate a forum listing (?q=forum/x) went from
1872 ms (before upgrade) to 2874 ms (after upgrade).
Maybe this is because my database is not consistent as the result of
the upgrade errors (yet I don't see any errors on the pages I
benchmarked).
------------------------------------------------------------------------
Tue, 10 May 2005 00:24:38 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_28.patch (53.47 KB)
Ok, let me get to this from the bottom to the top:
- my test runs indicated a different development wrt timing. If I had
gotten your results, I had stopped working on this long ago. So your
results are wrong for some reason.
- user error: Unknown column 'log' in 'field list'
Wasn't my day, the book patch got lost. It is contained now. First -R
the old patch, then apply this one.
- Unknown table 'n' in field list query:
Walkah found this, but I forgot to fix it. Fixed now.
- I've no idea where the other queries come from. I am hoping that
either your test db is borken or they are follow ups from the other
ones.
If you let me have your test db, I'll try some debugging.
Thanks for wasting your time, too.
------------------------------------------------------------------------
Tue, 10 May 2005 05:07:31 +0000 : Dries
I double-checked and the numbers don't seem to lie. I'll test some more
after work on another machine to make sure it is not platform-specific.
------------------------------------------------------------------------
Wed, 11 May 2005 03:32:47 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_29.patch (54.83 KB)
Ok, here I am again.
What I did:
1) Ask Dries to let me have drupal.org database
2) get 400MB of SQL inserts...
3) take 23 minutes to import said data
4) Remove all image and project nodes (don't want to install their
modules), 11765 nodes left
5) back up data
6) take tests on non-cached /node page (as anonymous user).
ab results:
-c 1 -n 25:
Requests per second: 1.29 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 663 775 179.7 689 1264
7) Do the same for the tracker page:
Requests per second: 0.83 [#/sec] (mean)
Total: 1182 1199 7.4 1199 1217
8) Apply my patch (rev. 28).
9) run db update and hold breath
10) update times out...
11) play back backup from 5)
12) wait
13) getting annoyed and removing cache, watchdog, and accesslog before
playing in backup.
14) wait again. Understand why Dries doesn't try this patch often.
Maybe a smaller DB would do for testing?
15) wait more. get really annoyed.
16) Set time limit to 18000 in update.php
17) try again
18) fails again before the second update is completed.
19) curse.
20) delete search stuff from db. Ooops, sooo much smaller...
21) import again, below 2 minutes...
22) rewrite to use extended insert. Found a bug.
23) still does not complete. Mysql logging to the rescue!
24) tid = 0? Not good.
25) Well, the update works fine till node 10834. 5595 nodes done, 6136
to go.
26) Writing shell based update script. Discovery: 24MB aren't enough.
Hopefully 64 are. Nope.
extended inserts for revisions are apparently not the brightest idea:
Huge memory consumption.
Hmm, no, all updates got through. Selecting the revisions to put them
into old_revisions table screwed it. Learned about CREATE TABLE
old_revisions SELECT syntax.
Yay! finally. 24 MB are just not enough the update.php script seems to
still break.
27) Benchmarks!
/node
Requests per second: 1.54 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 632 649 40.5 636 791
/tracker
Requests per second: 0.86 [#/sec] (mean)
Total: 1119 1165 65.8 1160 1461
Ok: So we get an improvement for many node_loads, but none for simple
selects from node.
More tests can be done.
28) roll new patch
Ain't Drupal fun?
------------------------------------------------------------------------
Wed, 18 May 2005 13:38:05 +0000 : Dries
I did another round of tests on _another_ machine and it is not looking
good:
Before upgrade After upgrade
?q= (main page) 218 ms/request 340 ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request
?q=about (book page) 375 ms/request 5400 ms/request
The upgrade process itself gave me a number of 'query was empty' and
'sprintf(): too few arguments' reports. Everything seems to work fine
though.
Looking at the ?q=about page, I see that the following query is
executed twice _and_ that each time, it take more than 2 seconds to
complete:
SELECT n.nid, n.title, b.parent, b.weight FROM node n INNER JOIN book b
ON n.vid = b.vid WHERE n.status = 1 AND n.moderate = 0 ORDER BY
b.weight, n.title;
--8 SHOW INDEX FROM book;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| book | 1 | book_parent | 1 | parent | A
| 92 | NULL | NULL | | BTREE | |
| book | 1 | nid | 1 | nid | A
| 369 | NULL | NULL | | BTREE | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)
The book module does not appear to have a primary key? Sounds like a
bad idea so I added one:
mysql> ALTER TABLE book ADD PRIMARY KEY nid (nid);
Query OK, 369 rows affected (0.02 sec)
Records: 369 Duplicates: 0 Warnings: 0
Next, I wanted to make the vid column a unique key in all node tables:
mysql> ALTER TABLE node ADD UNIQUE vid (vid);
Query OK, 20392 rows affected (0.81 sec)
Records: 20392 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
ERROR 1062: Duplicate entry '0' for key 2
mysql> ALTER TABLE forum ADD UNIQUE vid (vid);
Query OK, 10806 rows affected (0.10 sec)
Records: 10806 Duplicates: 0 Warnings: 0
As you can see, it fails for the book table which makes me believe
there is some inconsistent data ... I set out to fix that:
mysql> SELECT nid, COUNT(nid) AS vids FROM book GROUP BY vid HAVING
vids > 1;
+-----+------+
| nid | vids |
+-----+------+
| 871 | 2 |
+-----+------+
1 row in set (0.00 sec)
mysql> SELECT title FROM node WHERE nid = 871;
Empty set (0.00 sec)
mysql> DELETE FROM book WHERE nid = 871;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
Query OK, 368 rows affected (0.01 sec)
Records: 368 Duplicates: 0 Warnings: 0
Looks like everything is well now. Ran some new benchmarks:
Before upgrade After upgrade With
indices
?q= (main page) 218 ms/request 340 ms/request 336
ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request 1531
ms/request
?q=about (book page) 375 ms/request 5400 ms/request 475
ms/request
Unfortunately, we're still slower than the original code.
------------------------------------------------------------------------
Wed, 18 May 2005 21:53:31 +0000 : killes(a)www.drop.org
Dries, thanks for testing it again.
I do think the broken queries you observer have something to do with
the bad performance after the update. Please log the queries any I will
have a look at them. I've never seen any such queries.
My update script also tries to create the appropriate indices, but it
will of course fail if the database contains cruft. the indices for the
forum are probably missing, too.
I am still convinced that the patch is core worthy.
------------------------------------------------------------------------
Thu, 19 May 2005 04:36:09 +0000 : Dries
It wouldn't hurt if more people would benchmark this patch. The patch's
current performance worries me.
Did you check your watchdog messages after upgrading the drupal.org
database? Depending on your settings, errors might only be shown in
the watchdog. I'll look into the remaining glitches as time permits.
Thanks for your persistence in keeping this patch up-to-date. :)
------------------------------------------------------------------------
Thu, 19 May 2005 11:59:22 +0000 : killes(a)www.drop.org
Dries: Can you please let me have your updated database? I want to have
a look at it and try my own benchmarks with it.
And yes, if I did learn something on this project is how to be
persistant. ;)
------------------------------------------------------------------------
Fri, 24 Jun 2005 16:25:34 +0000 : killes(a)www.drop.org
Here is an idea that occurred to me:
The problem with the upgrade process is that keeping the existing
revisions requires a lot of work to do. This generates a huge amount of
sql queries for a large database and also requires a huge amount of
memory.
My suggestions is to let update.php only handle the basic upgrade, ie
without old revisions. An additional module could be created that would
implement a cron based approach to upgrading old revisions one node at a
time. it could expose a hook to let contrib modules do their own
upgrades.
Dries, what do you think? (I am writing "Dries" because he seems to be
the only one who is interested in getting this into core...)
------------------------------------------------------------------------
Fri, 24 Jun 2005 22:25:11 +0000 : Junyor
Killes:
I'm also interested in seeing this hit core. What about adding
something to legacy.module to do it?
------------------------------------------------------------------------
Sun, 26 Jun 2005 21:14:54 +0000 : chx
This is a sensible approach. Maybe this is the _only_ sensible approach.
I have a little problem though: while the conversion is running somehow
both revision handlers should be available.
------------------------------------------------------------------------
Sun, 26 Jun 2005 22:16:21 +0000 : killes(a)www.drop.org
hehe, one only has to whine and bug enough and one gets some feedback.
;)
@junyor: legacy.module would be a good place. my current idea is to
auto-enable it in update.php and then disable it again in legacy_cron
after all nodes are updated.. ;)
@chx: When somebody wants to look at revisions of a node that node
could be auto-updated.
The only problem are contrib modules: they's need to have some hook in
order to update their own data. When somebody looks at the revisions of
a node than cannot be updated because the contrib module in question has
no such hook, we can optionally let the user discard old revisions I
guess.
Dries, what do you think?
------------------------------------------------------------------------
Mon, 25 Jul 2005 16:48:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_30.patch (53.13 KB)
Sooooo; I've updated this patch once again.
Dries didn't like my idea of legacy updates so we will have an option
to discard old revisions in case the update should prove difficult.
Always keep a backup.
------------------------------------------------------------------------
Mon, 25 Jul 2005 17:31:12 +0000 : Bèr Kessels
can we please accept and commit this patch? We can iron out any issues
later. This patch is just far too big and complex to be
perfect-at-once.
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:22:50 +0000 : Dries
@Ber: no, we can't commit this patch blindly.
Data loss is a much bigger problem than a syntax error or other code
glitch. We can break Drupal but we can't break their data.
I've spent quite a bit of time testing the previous version of this
patch and noticed significant performance degradation.
Tell me, Ber, why can't you test this patch first?
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:40:37 +0000 : Bèr Kessels
I was not referring to not testing it. If it is just the upgrade path
that proves to be cumbersome, then why can we not fix it afterwards,
I.e. when everyone has had a good chance to look at it.
Testing such a huge patch requires a lot of work. something no-one just
does in his spare hours. We had discussions before to deal in a slightly
different way with big changes; to commit them quicker and to leave the
ironing out of any left overs for the community. I hooked into that
discussion here. For two reasons. One is that Gerhard has spent
numerous hours on maintaining this patch. The other one is that the
community can be of help with ironing out issues in such a large
change, much better than that Gerhard can do on his own.
And yes, dataloss is very bad, but no-one should loose data, when
he/she followed the instructions (backing up)...
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:00:38 +0000 : Dries
Ber: applying this patch takes 15 seconds. Whether I apply this patch
for you, or whether you apply it yourself, it will hardly reduce the
'testing cost'. The problem is that data loss can be subtle; it might
go unnoticed for a couple days. Make no mistake, I'd like to see this
patch committed ASAP, but it warrants some testing. Let's
test/benchmark it and commit it.
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:55:03 +0000 : drumm
The upgrade script already takes a long time to execute and does not
provide feedback to the user about how it is doing. I plan on making
the upgrade script able to spread the updates across multiple page
views and give user feedback showing that progress is being made. This
will hopefully make the speed of this update a moot point.
I am working on this for my dayjob, CivicSpace, so it should get done
"real soon now", but it should be expected to take awhile (smallish
number of weeks). Please make comprimises if necessary.
------------------------------------------------------------------------
Wed, 27 Jul 2005 17:38:57 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_31.patch (48.78 KB)
@drumm: the timeout isn't usually the problem. Memory consumption is.
I'll look into doing the updates in chunks of 1000 nodes or so.
Clousseau found a bug in the patch, updated.
------------------------------------------------------------------------
Thu, 28 Jul 2005 18:28:08 +0000 : moshe weitzman
Even if the upgrade issues are resolved, we have to figure out why this
patch is slowing down drupal (according to benchmarks posted here).
------------------------------------------------------------------------
Fri, 29 Jul 2005 07:51:45 +0000 : Dries
We can't proceed without some additional benchmarking. I already
benchmarked it on two machines (see comments above), and I'll benchmark
it again after August 10. The performance impact of this patch worries
me so if two other people could test and benchmark this patch
extensively, that would come a long way. August 10 is close to the
code freeze so some help is necessary.
If Gerhard/killes reviewed/tested one of your patches, it is time to
return the favor ...
------------------------------------------------------------------------
Fri, 29 Jul 2005 15:40:31 +0000 : Jose A Reyero
After looking at the patch I'm not really surprised this slows down
everything. I thought that the reason why we wanted revisions in their
own table was in the first place to have a simpler -and faster- node
table. But this patch:
- Adds fields and complexity not only to the node table but also to all
the main tables for data
- Needs additional joins just to retrieve single nodes (which is done
many times for a typical main page)
- Does away with encapsulation of the revision functionality requiring
other modules to handle revissions related data.
So please, please, please:
- Make revisions table only needed when actually using revisions
- Keep that old nice thing of hiding revisions system from other
modules
- And what's so bad with serializing data anyway? I agree that usually
there are better ways to store data. But this is one of the cases where
serialization does make sense. Not the current huge serialized field,
but only a new table with one serialized node per row.
- In general, do it much simpler. This is way too complex and it
removes more functionality than it adds.
In just four words: keep it simple, please.
------------------------------------------------------------------------
Fri, 29 Jul 2005 21:25:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_32.patch (51.27 KB)
Patch updated (v31 was buggy, and there was an update to updates.inc).
I've had a discussion with Jose and he will maybe do some testing over
the weekend. John is also doing some testing as we speak.
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:06:59 +0000 : Nick Nassar
It has been exactly one year since I first submitted this patch.
You know what that means: *old patch party in IRC!* Virtual beers on
me!
w00t!
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:08:33 +0000 : Jose A Reyero
Hi again.
I had a talk with killes -I have to say he enlightned me about this-
and also did some testing. Sorry but I dont have a good set up to
provide benchmarks.
Here are my conclusions -and also some explanation for the varied
results abt performance:
- This is faster for simple node listings
- This is slower for full node/load
....thus the performance increase/decrease for a given page should
depend on the relation between the number of simple node listings
(usually blocks) and the number of nodes displayed in the main page.
I've also experienced some trouble with attachments -not re-created
when creating a new revision- and book relations, same.
I'm not sure I like this new "feature" of creating a new revision every
time you rollback an old one -in Drupal 4.6 this seems not to happen.
Besides these details, I think this patch is good enough -and quite a
powerful approach actually- though I still feel it's bit too complex
when simply dealing with nodes -without revisions-, which is what we do
most of the time, and complexity will be even higher with contrib
modules using still more tables, not to mention flexinode -versioning
info will have to be scattered everywhere in the db.
So this is not a +1 nor -1. On one side I think this is powerful. On
the other side I feel this is too complex and really, it's more than I
need.
I would be happy with a 'revision' table having: nid, vid,
data(serialized), that you can forget about when not using revisions.
------------------------------------------------------------------------
Sat, 30 Jul 2005 19:16:41 +0000 : Nick Nassar
On a much more serious note that my last comment, unless I'm missing
something, you've introduced a subtle bug into this patch since I left
it.
A and B are both updating revision 5 of a node.
A calls db_next_id() and gets a revision id of 6
B calls db_next_id() and gets a revision id of 7
B updates the node's revision to 7
B adds revision 7 to the revision table
A updates the node's revision to 6
A adds revision 6 to the revision table
Now, you have a situation where the current revision (6) isn't actually
the last revision(7). For some uses, that's not a big deal, but in the
case of wikis, which is what I originally hoped to use this patch for,
it creates problems.
------------------------------------------------------------------------
Mon, 01 Aug 2005 00:04:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_33.patch (51.27 KB)
Updated the patch. updates.inc was changed and chx found a typo.
@Nick: I don't think the issue is due to this patch. Drupal has
currently no mechanism to not let people commit a new version of the
same node, whether with or without revisions. drupal_next_id ensures
that the revision IDs are unique and the later revisions will be the
current one. Which revision is the current one does not depend on order
of the version ID, btw.
@Jose: what exactly were your problems with uploads? what do I need to
do to reproduce?
------------------------------------------------------------------------
Mon, 01 Aug 2005 11:33:56 +0000 : Jose A Reyero
Sorry, I dont know what I did the other day but I cannot reproduce it
either :-)
I applied this new patch. Problems I'm having now (Testing with
"stories")
- Book outline is gone after creating a new revision
- I cannot delete attachments (not sure it is related to your patch or
to something else)
- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab
Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?
------------------------------------------------------------------------
Sun, 14 Aug 2005 16:03:41 +0000 : killes(a)www.drop.org
Jose, thanks a lot for testing!
"- Book outline is gone after creating a new revision"
Sounds like a bug, it should not.
"- I cannot delete attachments (not sure it is related to your patch or
to something else)"
Another bug
"- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab"
Log messages are possible for nodes of type book and page as well as
all nodes that are put into the book. Log messages can be added for any
node type, the module author should implement those.
"Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?"
No, the db update should be the same.
I will try to fix the bugs you found.
WRT the db update I've got an announcement to make: I will not migrate
old revisions.
Why?
1) it is a rather complicated process that requires a lot of memory and
is likely to give people on cheap hosting trouble.
2) testing it takes a lot of time.
3) I don't personally need it
I will still move old revisions to an old_revisions table from where
they can be converted by a later update or a custom script.
------------------------------------------------------------------------
Tue, 16 Aug 2005 21:34:43 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_34.patch (52.92 KB)
Ok, here is incarnation No. 34 of this patch. Dries agrees with me that
we can put the revision rescue operation into a later update, possible
using Drumm's installer.
The patch fixes the issue with the file uploads not being deletable, I
could not reproduce the problem with the outlines.
------------------------------------------------------------------------
Sat, 20 Aug 2005 13:33:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/scripts.tar.gz (4.39 KB)
I've started to do some performance testing.
I created a script that creates nodes that have revisions according to
the new system. A node has a 50% chance to have a revision and then an
equal chance fro between 1 and 25 revisions. I've created a database
with 50 nodes and 500 comments, 7 vocabs and 100 total taxo terms.
I've also created a script to convert these revisions to the old format
(much easier than the other way around). Initial tests didn't show a
significant change for /node. I'll try with more nodes and /tracker.
Attached are the two scripts in a tarball. They need to be run against
the patched install.
Also included are the two preliminary results.
------------------------------------------------------------------------
Sat, 20 Aug 2005 15:16:04 +0000 : killes(a)www.drop.org
I've now created a databse which has 1000 nodes and each node has 50 or
51 revisions. I have no taxo terms assigned and no comments (those
parts didn't change). I am only using stories and pages. I am now
fairly confident that this patch is a performance improvement for this
kind of setup.
for the new patch on the /node page I get
Requests per second: 0.27 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 3592 3648 56.9 3640 3985
Waiting: 3573 3626 55.1 3620 3949
Total: 3592 3648 56.9 3640 3985
for the old one I get:
Requests per second: 0.23 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 4247 4297 64.0 4286 4657
Waiting: 4156 4263 65.3 4253 4624
Total: 4247 4297 64.0 4286 4657
For the tracker the results are even better:
old:
Requests per second: 2.00 [#/sec] (mean)
new:
Requests per second: 3.64 [#/sec] (mean)
Anybody interested in obtaining this database can mail me. It is 60MB
(gzipped).
------------------------------------------------------------------------
Sat, 20 Aug 2005 16:33:26 +0000 : killes(a)www.drop.org
Ok, I've now deleted the revisions and only kept the current one. The
initial tests showed that the performance of the patch was miserable.
then I remembered that I had shreddered the revisions table and did a
mysql optimization.
Now the results are as follows:
/node:
Requests per second: 3.19 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 275 312 44.3 290 499
Waiting: 255 291 44.1 268 479
Total: 275 312 44.3 290 499
Requests per second: 3.18 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 274 314 42.4 292 407
Waiting: 255 294 42.0 273 388
Total: 274 314 42.4 292 407
The lower one is the new system, there is no difference considering
statistics.
/tracker
old:
Requests per second: 3.72 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 238 268 33.8 249 332
Waiting: 219 247 33.3 230 313
Total: 238 268 33.8 249 332
new:
Requests per second: 3.77 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 240 264 29.8 250 324
Waiting: 221 244 29.0 229 305
Total: 240 264 29.8 250 324
Same applies, no real difference.
So, as soon as somebody tests the upgrade path the patch should be
committed.
1
0
Issue status update for
http://drupal.org/node/7582
Post a follow up:
http://drupal.org/project/comments/add/7582
Project: Drupal
Version: cvs
Component: node system
Category: bug reports
Priority: normal
Assigned to: killes(a)www.drop.org
Reported by: killes(a)www.drop.org
Updated by: killes(a)www.drop.org
Status: patch (code needs review)
Ok, I've now deleted the revisions and only kept the current one. The
initial tests showed that the performance of the patch was miserable.
then I remembered that I had shreddered the revisions table and did a
mysql optimization.
Now the results are as follows:
/node:
Requests per second: 3.19 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 275 312 44.3 290 499
Waiting: 255 291 44.1 268 479
Total: 275 312 44.3 290 499
Requests per second: 3.18 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 274 314 42.4 292 407
Waiting: 255 294 42.0 273 388
Total: 274 314 42.4 292 407
The lower one is the new system, there is no difference considering
statistics.
/tracker
old:
Requests per second: 3.72 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 238 268 33.8 249 332
Waiting: 219 247 33.3 230 313
Total: 238 268 33.8 249 332
new:
Requests per second: 3.77 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 240 264 29.8 250 324
Waiting: 221 244 29.0 229 305
Total: 240 264 29.8 250 324
Same applies, no real difference.
So, as soon as somebody tests the upgrade path the patch should be
committed.
killes(a)www.drop.org
Previous comments:
------------------------------------------------------------------------
Wed, 05 May 2004 16:25:27 +0000 : killes(a)www.drop.org
Currently all node revisions are stored in a serialized field in
node.table and retrieved for _each_ page view although they are rarely
needed. However, we have agreed that serializing data is bad and that
we should try to keep the memory foot print pf Drupal small.
Therefore I propose to create a separate revisions table which would be
in principle identical to the node table, only that it could have
several old copies of the same node. Extra data added by other modules
could be added in a serialized field unless we find a better solution.
------------------------------------------------------------------------
Wed, 05 May 2004 17:06:35 +0000 : jhriggs
I too think the serialized approach is less than desirable, but here's
an alternative. This would likely take some considerable rework in
core and contrib, but the following is how we handle similar types of
situations in our databases at work. It is more elegant that a
separate table, and avoids the (almost exact) duplication of a table.
Instead of separate tables, keep all revisions of nodes in the node
table as follows:
* add field: active (0/1 or Y/N)
* add field: revision
* every revision of a node is stored in the node table; however, only
one revision can be active at any given time
* nid can no longer be unique -- primary/unique key becomes (nid,
active)
* any time a node is loaded, updated (without revision), etc., the
active version is used.
Thoughts?
------------------------------------------------------------------------
Wed, 05 May 2004 17:57:48 +0000 : killes(a)www.drop.org
I am not opposed to your scheme, but I want to stress the following:
* Duplicating a table's structure is not bad (IMHO) as long as the
content is different.
* having two tables will allow us to have a rather small node table.
This is (maybe) a performance gain.
------------------------------------------------------------------------
Wed, 05 May 2004 18:37:29 +0000 : jhriggs
I don't necessarily think that duplicating a table's structure is _bad_.
It just seems to be wasteful and a pain to maintain. (Every change to
the node table is made twice...easy to do, but also easy to miss
perhaps.)
As for performance, as long as nid and the active indicator are
indexed, there shouldn't be any performance loss. Also, archiving an
old version when making a new revision will be much simpler: just
change the active indicator rather than copying an entire node to
another table (and ensuring everything gets copied...again a potential
maintainance issue).
To be honest, I would just like to see the serialized data go away,
regarless of what approach is taken.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:49:33 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_07-30-2004.p… (10.47 KB)
I'm interested in using Drupal for a large scale wiki-type project. In
order to do this, I need revisions to be in their own table.
Attached is a patch to do just that. Most of the changes are pretty
self explanatory. Spreading out node data across two tables meant that
I had to add database functions to do locking/transactions. Without
this, race conditions in which the database becomes corrupted are
possible.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:54:36 +0000 : Nick Nassar
Oh yeah... The patch is a diff against Drupal CVS
------------------------------------------------------------------------
Sat, 31 Jul 2004 00:00:08 +0000 : Anonymous
Gerhard speaking.
Nick, thanks a lot for your nice patch! It saves me a great deal of
labour. I looked through it and immediately liked it. You not only put
the old revisions into a new table but also the current one. Do you
have an estimate how much more expensive the additional join is?
Besides a few minor coding style issues I found a major one: Just a few
hours before you uploaded your patch JonBob's node access patch hit
core. That means your patch won't apply anymore as all the queries you
change have been changed. Can I bug you to update your patch?
------------------------------------------------------------------------
Sat, 31 Jul 2004 01:11:59 +0000 : Anonymous
Also I think that your upgrade path loses existing revisions.
------------------------------------------------------------------------
Sat, 31 Jul 2004 02:39:12 +0000 : drumm
I think this is the proper way to do things. No columns are duplicated,
there is no serialized data, and only the fields that are logically
revised are stored. Nothing jumped out at me as a way to have my node
modules be able to keep a table of revisions of additional fields. I'm
guessing this could be done within the confines of _insert and _update.
Assuming the upgrade path works and modules can extend it I give it a
+1.
------------------------------------------------------------------------
Sat, 31 Jul 2004 14:40:15 +0000 : Nick Nassar
It figures that just as I finish a big patch, another patch comes along
and breaks it. Oh well, it should be a pretty easy to fix. I'll work on
it.
Fixing the upgrade path to keep revisions should be fairly painless.
I found another issue that needs to be fixed before this patch gets
merged. There format of a node needs to be stored for each revision.
Otherwise, for modules that store a format for the nodes, such as page
and book, if you write one revision in PHP and the next in HTML, the
PHP revision will be displayed as HTML. This is part of a larger issue
of how node modules should store revisions of additional fields. I
think each module that wants to do this should create another table
with (nid, revid) as the primary key. Just as when they want to add
fields to a node they create another table with nid as the primary key.
As far as performance goes, for sites that make heavy use of revisions,
an extra join on primary keys is going to be a lot faster than grabbing
all of the revisions from that database everytime. We would need to run
benchmarks to determine is the overall difference in speed is for an
average site is a gain or a loss. I'm guessing it's very minor either
way.
------------------------------------------------------------------------
Mon, 23 Aug 2004 13:55:49 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_08-23-2004.p… (10.92 KB)
Here's an updated patch against CVS that puts revisions in their own
table, provides an upgrade path, and fixes the format related bugs in
the last patch.
Hopefully, this can make it into CVS as soon as the freeze is over.
------------------------------------------------------------------------
Mon, 23 Aug 2004 14:10:39 +0000 : moshe weitzman
Interesting patch ... drumm's question is still outstanding. how do
modules store revisions of their fields? Are they expected to manage
this on their own? Thats not how it works today.
As an aside, i am seeing profile_ fields in my node.revisions column.
One could argue that those need not be saved. They pertain to the node
author, not to the node itself.
------------------------------------------------------------------------
Mon, 23 Aug 2004 16:14:39 +0000 : Nick Nassar
Having modules be responsible for storing revisions of their own fields
is a side-effect of storing revision data in tables. There's really no
way around it. However, revisions generally don't make sense for node
types that don't have PHP/HTML content, such as polls. I think it's
going to be a pretty rare scenario for a new node type to want another
field to change per-revision, so it's a pretty good trade-off.
Storing fields that shouldn't be part of revisions, such as the
profile_ fields, is a side-effect of storing revisions as serialized
objects. Applying this patch will free up that wasted space. :)
------------------------------------------------------------------------
Mon, 23 Aug 2004 17:20:57 +0000 : Anonymous
There should be a hook that let's the module choose whether it supports
history. This way a module author can prevent the user from doing
something that may break his module or just cause undefined behavior.
If the module doesn't support history then don't let the user/admin
choose to add history to nodes of that type.
Craig
------------------------------------------------------------------------
Mon, 23 Aug 2004 19:23:29 +0000 : Nick Nassar
I agree, there should be an API change to make specifying support for
revisions easier. In the interests of keeping patches small and keeping
to one change per patch, I think the API change should be a separate
issue.
A sort of ad-hoc API to decide whether or not a module supports
revisions by default already exists. Instead of having a hook, modules
set the default value of the "Create new revision" field in the edit
form. The admin can change this option in
admin/node/configure/defaults. This patch doesn't change that.
Revisions are broken for node types that have their own database
structure, like polls, even when storing them as serialized objects.
This patch doesn't change that, either.
------------------------------------------------------------------------
Tue, 26 Oct 2004 02:35:06 +0000 : moshe weitzman
I'm guessing that someone is going to have to demonstrate that this
patch performs as well as current drupal before it gets comitted. i
think this patch is a few benchmarks from being comitted.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:04:09 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:16 +0000 : Nick Nassar
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:26 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Mon, 15 Nov 2004 05:05:30 +0000 : elias1884
please overthink the revision system default workflow as well. don't
look at the revision system as an isolated system but as a part of the
whole workflow system!
if you combine revisions with the moderation queue the most logic
default workflow would be like that:
auth user creates node (revision #0)
admin approves the node (status = 1, moderation = 0)
=> node publicly available
auth user finds typo and changes node (revision #1, status = 0,
moderation = 1)
-------
what happens at that point at the moment is, that the node is not
accessible anymore at all until the new revision is approved by admin.
of course the new revision should not go online until reviewed and
approved, this is absolutely correct, but there is no reason to not
take the old revision offline, since it was already approved and should
therefore be online until the new revision is approved. it is not
practical if a node disappears only because the author corrected a
typo.
-------
admin approves the node (status = 1, moderation = 0)
eventhough I first thought a plain boolean active field would not be
capable of providing that functionality if finally came to the
conclusion, that it can. The only thing to do is to not set that bit,
when a new revision is created, but when it is approved (in case
moderation is activated under default workflow). Every revision should
have its own moderation, status and active field and on approval they
are set like this (status=1, moderation=0, active=Y).
When you wanna rollback to an old revision, you can chose between all
revisions that already have the moderation bit set back to 0 again and
the published to 1. There should be an extra permission for rollback!
another concern that I have about the default workflow is, that users
can't see the content, they have just created, when moderation is
enabled. Eventhough, there is a big fat "submission accepted" presented
after submissions, unexperienced users tend to question the information
those stupid tincans give them, if they can't find their content
afterwards. Many users are really lazy bastards and they don't even
read the status messages. The best feedback about whether his story was
submitted successfully or not of course is, if he can find the story
somewhere on the site, maybe with a status message on top of it,
mentioning, that the content is currently not publicly available since
it has not been approved yet. there should be a my content section
under my account, like somebody is trying to do with the workspace
module I guess.
so my suggestion is to make (status=0, moderation=1) still available
for the creator under a my content section somewhere!
------------------------------------------------------------------------
Wed, 24 Nov 2004 04:21:18 +0000 : Nick Nassar
I agree. The current workflow for moderation queues and revisions needs
to change, but this patch isn't the place for it. The patch is already
too big, and it only does the backend stuff.
Instead of adding more to this patch and making it take even longer to
get into core, would you mind creating a new issue for your UI
suggestions, so the those changes can be added as a separate patch?
Thanks,
Nick
------------------------------------------------------------------------
Sat, 11 Dec 2004 12:26:02 +0000 : Dries
This patch is _much_ needed so I'd love to see someone revive it. In
order for this patch to be accepted, the following needs to be done:
* Update this patch to CVS HEAD.
* Rename revid to vid.
* Rename node_rev to node_revisions.
* Rename node_rev.changed to node_revisions.timestamp.
* Rename $rnode to $revision.
* Fix the coding style to match Drupal's: proper spacing, single quotes
where possible, proper variable names.
* Benchmark this patch with a large database with enough revisions.
I'd be happy to benchmark this on my local copy of the drupal.org
database.
* The book.log field should probably move to the node_revisions table.
This can be done in a separate patch.
* Investigate whether transactions are well-supported.
------------------------------------------------------------------------
Mon, 13 Dec 2004 00:25:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004-r… (11.02 KB)
I've worked a bit on the patch (coding style issues as mentioned by
Dries). One thing I noticed is that the patch uses REPLACE. IIRC this
needs to be chagned to "UPDATE, if fail INSERT" for pgsql
compatibility.
Nick, are you still interested in working on that patch? I'd like to
know how it works on your site and work on getting it into core.
------------------------------------------------------------------------
Mon, 13 Dec 2004 12:33:08 +0000 : Dries
Gerhard: your patch does not apply.
------------------------------------------------------------------------
Mon, 13 Dec 2004 17:10:12 +0000 : killes(a)www.drop.org
Yes, I know, that was the same version as I mailed to you earlier.
------------------------------------------------------------------------
Mon, 13 Dec 2004 21:02:06 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions.patch (52.96 KB)
Ok, upüdated the patch to cvs.
------------------------------------------------------------------------
Tue, 14 Dec 2004 08:58:36 +0000 : Dries
Some more comments:
* db_begin_transaction() and db_end_transaction() do not belong in
database.inc, but in database.mysql.inc and database.pgsql.inc
respectively.
* The node module calls node_revisionsision_list() which is not
defined. (Fxed that on my local copy.)
* Do db_begin_transaction() and db_end_transaction() deprecate Jeremy's
table locking patch?
* The upgrade path assigns the wrong user ID to each revision.
* The upgrade path assigns the wrong date to each revision (that or a
node's revision page shows the wrong usernames/dates).
* The coding style needs a bit of work, but we can worry about that
later.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:34:44 +0000 : Nick Nassar
If you need any help getting those things fixed, just let me know.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:50:30 +0000 : Nick Nassar
How this relates to Jeremy's node locking patch:
There was lots of discussion, and node locking was decided against
because from an end user point of view you never want a node to be
locked. He's now advocating for a much simpler patch that warns users
if their changes will overwrite someone elses. That patch still has a
race condition, which might be fixed using db_begin_transaction().
http://drupal.org/node/6025
------------------------------------------------------------------------
Tue, 14 Dec 2004 22:26:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_0.patch (55.96 KB)
Here is an updated patch that tries to address Dries concerns.
------------------------------------------------------------------------
Wed, 15 Dec 2004 08:32:50 +0000 : Dries
Attachment: http://drupal.org/files/issues/revisions-bug.png (76.06 KB)
It didn't fix the aforementioned bugs. See attached screenshot.
------------------------------------------------------------------------
Thu, 06 Jan 2005 20:15:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_1.patch (51.77 KB)
Ok, here is a new version. Dries and myself worked hart at it, so please
have a look.
what is still missing
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
------------------------------------------------------------------------
Wed, 19 Jan 2005 21:43:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_2.patch (49.49 KB)
Here is an updated patch. We discussed to keep the current title in node
module and also in the revisiosn table. This is content duplication but
will save many joins as many queries only need the title of a node.
Discussion is welcome.
------------------------------------------------------------------------
Wed, 19 Jan 2005 23:33:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_3.patch (29.93 KB)
I've implemented the aforementioned solution. This makes the patch much
smaller. The patch now also removes taxonomy_node_has_term() which
wasn't used anywhere. I'd really apprciate if some people could test
drive the patch. It will be another huge improvement for 4.6.
------------------------------------------------------------------------
Thu, 20 Jan 2005 00:05:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_4.patch (30 KB)
Another revision. Steven didn't like my literal $node->vid in queries.
------------------------------------------------------------------------
Thu, 20 Jan 2005 01:10:50 +0000 : killes(a)www.drop.org
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
These issues are still open, btw. Especially the first one needs to be
tackled.
------------------------------------------------------------------------
Tue, 25 Jan 2005 20:11:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_5.patch (51.13 KB)
Here is a patch that has the database tables updated for forum, book,
and page module.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:55:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_6.patch (49.18 KB)
Yet another update to keep it working with head. The patch now also
removes the table definitons for the page table.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:57:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_7.patch (55.69 KB)
Sorry, that was the old version, this is the right one.
------------------------------------------------------------------------
Mon, 31 Jan 2005 19:55:03 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_8.patch (55.71 KB)
Updated once more.
------------------------------------------------------------------------
Mon, 31 Jan 2005 20:52:08 +0000 : Dries
Anyone to help review/test this?
------------------------------------------------------------------------
Mon, 31 Jan 2005 21:22:36 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_9.patch (49.29 KB)
Updated again, the update functions occurred twice. Thanks Bart.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:05 +0000 : killes(a)www.drop.org
Don't know if the db I am using is corrupted or what. I still do have
some didficulties.
The latest patch is attached.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_10.patch (49.67 KB)
I am probably slowly going mad ...
------------------------------------------------------------------------
Wed, 02 Feb 2005 01:54:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_11.patch (48.95 KB)
The update issue still needs investigating. This patch is updated for
cvs.
------------------------------------------------------------------------
Wed, 02 Feb 2005 20:20:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_12.patch (49.83 KB)
Ok, here is a new version. I've solved my troubles with book.module.
There are still some issues with forum module. Possibly due to
inconsistent database.
------------------------------------------------------------------------
Wed, 02 Feb 2005 21:31:05 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_13.patch (49.83 KB)
Turns out the drupal.org database had indeed some quirks. Please run
this query in your oldest db and tell me the result:
select nid,type from node where type like '%/%';
If you get a non-zero result we might need to add another security
update.
The patch could use still more testing, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 01:16:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_14.patch (49 KB)
Ok, we are getting somewhere. At a first glance the update is working.
There is a problem remaining: the revisions tab will be shown whether
the node has revisions or not. Not sure we can/need to fix this.
People with a drupal.org account can log in at
http://killes.drupaldevs.org/revision/ and poke around. Your
permissions will be the same as on drupal.org. Feel free to vreak
everything but don't forget to file complaints here. (Note: this is
only a pruned version of the drupal.org database with all project nodes
and nodes with nids > 7000 dropped).
------------------------------------------------------------------------
Thu, 03 Feb 2005 04:19:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_15.patch (52.39 KB)
There was some error in node_save and also the patches to the
database.inc files got lost...
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:07:27 +0000 : robertDouglass
Submitting book pages doesn't work on your test site. It puts the entire
content of the preview inside the body textarea. I wrote a sentence in
the body and the log, and pressing preview put several lines of HTML
containing both sentences in the body textarea on the preview page,
plus the book page wouldn't submit.
-R
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:50:59 +0000 : Junyor
0 results here. I started using Drupal with version 4.4, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 23:56:18 +0000 : killes(a)www.drop.org
@Junyor: Thanks, that's a good sign. Maybe somebody else has an older db
to try.
@robertDouglass: The first effect you describe is due to drupaldevs
running on PHP 5. I am unsure why the second thing does not work. In
node_save() the node object has a nid although there is none in the
form. Very strange.
I've enabled display of db queries on the testsite.
------------------------------------------------------------------------
Fri, 04 Feb 2005 19:17:55 +0000 : dmjossel
No results here on the query:
select nid,type from node where type like '%/%';
On a database that was put in place prior to Drupal 4 and is now
running on 4.5.2.
------------------------------------------------------------------------
Fri, 04 Feb 2005 20:44:23 +0000 : killes(a)www.drop.org
@dmjossel: thanks.
@all. The strange problem I reported was apparently php 5 related.
After applying Steven's php 5 patch it went away. One error is
remaining: If I create a new forum topic it will be shown as part of
the book on preview. Hmm, that was due to a db that got corrupted
during testing so that is fixed too.
Please poke around at the test site and look if you find more errors.
------------------------------------------------------------------------
Sat, 05 Feb 2005 07:16:22 +0000 : Steven
By the way, I just remembered that Drupal.org has some blogs lingering
on in the database even though blog.module is not enabled. Perhaps this
is causing troubles?
------------------------------------------------------------------------
Sat, 05 Feb 2005 11:22:59 +0000 : Anonymous
I can't see why it would. Drupal.org will need extra updates for images
and project nodes because those have their own tables. GK.
------------------------------------------------------------------------
Sun, 06 Feb 2005 12:49:55 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_16.patch (52.49 KB)
Updated to apply to cvs again.
------------------------------------------------------------------------
Tue, 22 Feb 2005 20:15:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_17.patch (49.64 KB)
Updated again.
All we need is a patch to upload module and an upgrade path for it.
------------------------------------------------------------------------
Fri, 04 Mar 2005 04:22:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_18.patch (52.31 KB)
Updated once more. Moved log field from book to node_revisions table as
discussed in Antwerp. upload module still missing.
We need to decide under which circumstances the log field should be
displayed. Should that be added to the workflow? Should it depend on
the revisions setting?
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:27:03 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_20.patch (75.52 KB)
Ok, here it is: Yet another revision of this grrrrreat patch.
Changes from previous versions:
- supports versioning for uploaded files. A problem is that if you
delete a file, it will be gone for all revisions.
- the log field is now in the node_revsions table, but each module has
to decide whether to show it or not.
I've implemented it for the page and the book type odes. Also, the
field can be edited when adding non-book nodes to the book. The log is
displayed on the revisions page and if a node is moderated.
- the revisions are moved to an old_revisions table to a) get the node
table smaller and b) still leave the mavailable for contrib modules
that want to retreive old version data.
The patch has been applied to killes.drupaldevs.org/revision where it
can be tested by anybody (especially people who have "site admin"
rights on drupal.org)
The database is from drupal.org and you should b able to log in with
your pass or simply mail yourself a new one.
Gerhard
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:51:56 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_21.patch (59.42 KB)
BTW, I marked this a bug because atm the revisions field can grow quite
big. Neil has reported problems from some users who were not able to
load some nodes due to to many large revisions.
Also, som unrelated stuff crept into the patch. New version attached.
------------------------------------------------------------------------
Tue, 08 Mar 2005 05:56:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_22.patch (60.29 KB)
Ok, I think I got it.
Changes to last version:
- uploads are no properly versioned.
Missing are still pgsql checks and updates.
------------------------------------------------------------------------
Thu, 10 Mar 2005 16:58:41 +0000 : Anonymous
Was able to get http://drupal.org/files/issues/revisions_21.patch to
work with drupal-cvs.tar.gz (10 March 2005) by:
- includes/database.mysql.inc: Commenting out duplicates for functions
function db_begin_transaction and function db_commit_transaction
- modules: node.module: Removing "'title' => $node->title," from
$node_table_values variable declaration and removing "'title' =>
"'%s'"," from "$node_table_types" variable declaration.
Happy to submit a patch if requested. I'll watch this thread.
------------------------------------------------------------------------
Fri, 11 Mar 2005 23:59:45 +0000 : killes(a)www.drop.org
The duplicate function has been removed in rev 22 of this patch.
Why do you think the changes in node_save are needed? Titles are saved
in both tables for performance reasons.
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:12:21 +0000 : jlerner
Hi - I posted comment #62. The changes to node_save appear to be needed
because recent patches (both 21 and 22) remove the field 'title' from
table 'node'. So without the changes to node_save, node.module is
broken and generates errors.
Joshua
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:29:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_23.patch (61.17 KB)
Thanks, Joshua, for catching this. node:title is there to stay.
------------------------------------------------------------------------
Wed, 13 Apr 2005 16:29:23 +0000 : moshe weitzman
since HEAD is open again, perhaps it is a good time to revisit this
patch.
once this is committed, lets address - http://drupal.org/node/11071
"node_validate does not respect group editing"
------------------------------------------------------------------------
Mon, 18 Apr 2005 15:43:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Mon, 18 Apr 2005 16:16:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24_0.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:19:42 +0000 : Dries
I'll commit this patch later this week! If you haven't checked this
patch already, I urge you to test/check it out because it will have
significant impact on existing code and modules!
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:21:33 +0000 : Dries
Also, what do people think about the n.title being duplicated?
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:26:58 +0000 : chx
I won't lose any sleep because of duplicated titles...
------------------------------------------------------------------------
Tue, 19 Apr 2005 18:35:58 +0000 : killes(a)www.drop.org
Let me explain why I have chosen to duplicate the title (and also the
uid): If you look at all the queries in Drupal, you will find that most
of them only need the title and th uid of a node. That is, by
duplicating it, we save expensive joins on the node_revisions table.
Due to this fact, this patch is actually a performance improvement.
A note about updating contrib module:
Strictly speaking they wouldn't need to be updated. They only need to
if their authors decide that their info should be available for
revisioning. The upgrade path for forum.module in my update.inc patch
(+ the forum patch)
should show you what needs to be done.
I will write a note for the update page once the patch hits core.
------------------------------------------------------------------------
Sun, 24 Apr 2005 21:21:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_25.patch (60.38 KB)
Updated to cvs.
Dries: Based on some remarks in #drupal this is the last update I am
going to do. Apply it or won't fix it.
------------------------------------------------------------------------
Sat, 30 Apr 2005 03:42:39 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/revisions_25.patch.patch (528 bytes)
That's a big patch. I've only started looking through it. I noticed
one little typo, affecting updates. A patch to your last patch is
attached.
I'm running with the revision patch on my dev server now happily. I
like the concept.
What happens if you click 'stop' on your browser in the middle of a
MySQL "transaction"? I assume that kills the connection to MySQL, and
the lock is freed? But this then leaves changes only partially
applied?
What exactly does locking/unlocking the tables buy us in MySQL? I
don't see anywhere that we detect if an apply fails part way through,
and thus roll back...? What am I missing?
------------------------------------------------------------------------
Sat, 30 Apr 2005 07:11:28 +0000 : Dries
Jeremy: many of us are worried about the performance ramifications this
patch introduces. Early experiments showed a small performance
improvement (while a performance regression might be expected). More
performance reports from large sites like kerneltrap.org will certainly
help this patch. Mind to do a quick performance comparision and to
report back with some numbers? Thanks.
------------------------------------------------------------------------
Sat, 30 Apr 2005 12:38:02 +0000 : Jeremy(a)kerneltrap.org
Dries: I'm not running HEAD on kerneltrap, so this really isn't a
possibility. Furthermore, until I understand why we're locking tables,
I don't like it. The idea of revisions in their own tables is great.
The idea of locking tables to get (without any obvious benefit) there
really worries me.
------------------------------------------------------------------------
Sat, 30 Apr 2005 14:16:01 +0000 : killes(a)www.drop.org
@Jeremy: Thanks for looking at the patch! Also for catching the typo. :)
Did you try to upgrade your database? If yes, how did it go? One of
Dries' concerns is the complexity of the upgrade. How many nodes and
revisions did the db have?
About database locking: This part of the patch was created by Nick and
I simply continued to use it.
Maybe the code should rather be:
if(db_begin_transaction(array('{node}', '{node_revisions}',
'{watchdog}', '{sessions}', '{files}'))) {
db_query($node_query, $node_table_values);
db_query($revisions_query, $revisions_table_values);
db_commit_transaction();
...
}
The idea is probably to avoid two updates at the same time. I don't
think the locking helps if you abort the script at an inconvenient
time. Rollbacks aren't implemented in all mysql versions.
We could omit the db locking if deemed inappropriate. Maybe Nick can
explain his ideas behind this.
@Dries: I wonder who the "many of us" are. They certainly haven't
spoken to me. Moshe had some reservations about the upgrade path and
project module, but the time that project module abused revisions to
store issue updates was long ago and his reservations were resolved.
Nobody else (besides you of course and now Jeremy) has voiced
reservations in a way that was audible to me.
If you grep through the patch you will notice that there are only four
queries which have a join on the node_revisions table. Two of them are
in node_load and in the other cases the join replaced a join on the
node table. The two queries in node_load are the only ones that have
both a join on the node and the revisions query. Thus, loading of
individual nodes might become somewhat slower. All other queries will
be faster since the node table is now much smaller. Also, node loading
does not have to be slower, it depends on your node table. If you had
a lot of revisions and thus a large table, then the new scheme will
make your queries actually faster since we do not load the revisions
on each and every node load anymore. If you didn't have many revisions
your node_load migth be somewhat slower.
WRT to the update script Karoly pointed out that we could use multiple
insert queries instead one query per revision. This would probably
make the update somewhat faster. I am willing to work on this iff you
declare that you will commit the patch afterwards. I'd need to know if
this will work on pgsql and on all supported mysql versions before I
work on it.
About locking: Database locking is dog slow, at least on mysql. I was
using locks in an earlier version of the upgrade script but had to
remove it for (serious!) performance reasons.
------------------------------------------------------------------------
Mon, 09 May 2005 15:07:34 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_26.patch (46.45 KB)
Ok, another update, cause I need it myself.
I've left out the transaction stuff for now. It is in principle
unrelated to this patch and should be discussed elsewhere.
This also makes the patch smaller and easier to review (hint, hint).
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:09 +0000 : killes(a)www.drop.org
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_27.patch (39.05 KB)
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 21:23:06 +0000 : Dries
Got one error during the upgrade path:
ALTER TABLE {book} ADD PRIMARY KEY vid (vid)
FAILED
------------------------------------------------------------------------
Mon, 09 May 2005 21:26:19 +0000 : killes(a)www.drop.org
This had happend to me as well, when I tested this patch. The reason is
that for some reason the vid is not unique. Most likely there are some
entries with vid = 0 in there. Can you check which node types those
have? it always was an error in the test database. See:
http://drupal.org/node/7582#comment-20678
------------------------------------------------------------------------
Mon, 09 May 2005 21:27:06 +0000 : Dries
Actually, I got 2850 errors during the upgrade.
Some of these:
sprintf() [function.sprintf]: Too few arguments in
drupal-cvs/includes/database.inc on line 154.
Some of these:
Query was empty query: in drupal-cvs/includes/database.mysql.inc on
line 66.
And this:
Unknown table 'n' in field list query: SELECT n.nid, n.vid FROM node
INNER JOIN files f ON n.nid = f.nid in
drupal-cvs/includes/database.mysql.inc on line 66.
:-)
------------------------------------------------------------------------
Mon, 09 May 2005 21:29:19 +0000 : Dries
Or this:
user error: Unknown column 'log' in 'field list'
query: SELECT parent, weight, log FROM book WHERE nid = 1 in
drupal-cvs/includes/database.mysql.inc on line 66
------------------------------------------------------------------------
Mon, 09 May 2005 21:52:12 +0000 : Dries
The time required to generate my main page went from 902 ms (before
upgrade) to 2139 ms (after upgrade).
The time required to generate a forum listing (?q=forum/x) went from
1872 ms (before upgrade) to 2874 ms (after upgrade).
Maybe this is because my database is not consistent as the result of
the upgrade errors (yet I don't see any errors on the pages I
benchmarked).
------------------------------------------------------------------------
Tue, 10 May 2005 00:24:38 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_28.patch (53.47 KB)
Ok, let me get to this from the bottom to the top:
- my test runs indicated a different development wrt timing. If I had
gotten your results, I had stopped working on this long ago. So your
results are wrong for some reason.
- user error: Unknown column 'log' in 'field list'
Wasn't my day, the book patch got lost. It is contained now. First -R
the old patch, then apply this one.
- Unknown table 'n' in field list query:
Walkah found this, but I forgot to fix it. Fixed now.
- I've no idea where the other queries come from. I am hoping that
either your test db is borken or they are follow ups from the other
ones.
If you let me have your test db, I'll try some debugging.
Thanks for wasting your time, too.
------------------------------------------------------------------------
Tue, 10 May 2005 05:07:31 +0000 : Dries
I double-checked and the numbers don't seem to lie. I'll test some more
after work on another machine to make sure it is not platform-specific.
------------------------------------------------------------------------
Wed, 11 May 2005 03:32:47 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_29.patch (54.83 KB)
Ok, here I am again.
What I did:
1) Ask Dries to let me have drupal.org database
2) get 400MB of SQL inserts...
3) take 23 minutes to import said data
4) Remove all image and project nodes (don't want to install their
modules), 11765 nodes left
5) back up data
6) take tests on non-cached /node page (as anonymous user).
ab results:
-c 1 -n 25:
Requests per second: 1.29 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 663 775 179.7 689 1264
7) Do the same for the tracker page:
Requests per second: 0.83 [#/sec] (mean)
Total: 1182 1199 7.4 1199 1217
8) Apply my patch (rev. 28).
9) run db update and hold breath
10) update times out...
11) play back backup from 5)
12) wait
13) getting annoyed and removing cache, watchdog, and accesslog before
playing in backup.
14) wait again. Understand why Dries doesn't try this patch often.
Maybe a smaller DB would do for testing?
15) wait more. get really annoyed.
16) Set time limit to 18000 in update.php
17) try again
18) fails again before the second update is completed.
19) curse.
20) delete search stuff from db. Ooops, sooo much smaller...
21) import again, below 2 minutes...
22) rewrite to use extended insert. Found a bug.
23) still does not complete. Mysql logging to the rescue!
24) tid = 0? Not good.
25) Well, the update works fine till node 10834. 5595 nodes done, 6136
to go.
26) Writing shell based update script. Discovery: 24MB aren't enough.
Hopefully 64 are. Nope.
extended inserts for revisions are apparently not the brightest idea:
Huge memory consumption.
Hmm, no, all updates got through. Selecting the revisions to put them
into old_revisions table screwed it. Learned about CREATE TABLE
old_revisions SELECT syntax.
Yay! finally. 24 MB are just not enough the update.php script seems to
still break.
27) Benchmarks!
/node
Requests per second: 1.54 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 632 649 40.5 636 791
/tracker
Requests per second: 0.86 [#/sec] (mean)
Total: 1119 1165 65.8 1160 1461
Ok: So we get an improvement for many node_loads, but none for simple
selects from node.
More tests can be done.
28) roll new patch
Ain't Drupal fun?
------------------------------------------------------------------------
Wed, 18 May 2005 13:38:05 +0000 : Dries
I did another round of tests on _another_ machine and it is not looking
good:
Before upgrade After upgrade
?q= (main page) 218 ms/request 340 ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request
?q=about (book page) 375 ms/request 5400 ms/request
The upgrade process itself gave me a number of 'query was empty' and
'sprintf(): too few arguments' reports. Everything seems to work fine
though.
Looking at the ?q=about page, I see that the following query is
executed twice _and_ that each time, it take more than 2 seconds to
complete:
SELECT n.nid, n.title, b.parent, b.weight FROM node n INNER JOIN book b
ON n.vid = b.vid WHERE n.status = 1 AND n.moderate = 0 ORDER BY
b.weight, n.title;
--8 SHOW INDEX FROM book;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| book | 1 | book_parent | 1 | parent | A
| 92 | NULL | NULL | | BTREE | |
| book | 1 | nid | 1 | nid | A
| 369 | NULL | NULL | | BTREE | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)
The book module does not appear to have a primary key? Sounds like a
bad idea so I added one:
mysql> ALTER TABLE book ADD PRIMARY KEY nid (nid);
Query OK, 369 rows affected (0.02 sec)
Records: 369 Duplicates: 0 Warnings: 0
Next, I wanted to make the vid column a unique key in all node tables:
mysql> ALTER TABLE node ADD UNIQUE vid (vid);
Query OK, 20392 rows affected (0.81 sec)
Records: 20392 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
ERROR 1062: Duplicate entry '0' for key 2
mysql> ALTER TABLE forum ADD UNIQUE vid (vid);
Query OK, 10806 rows affected (0.10 sec)
Records: 10806 Duplicates: 0 Warnings: 0
As you can see, it fails for the book table which makes me believe
there is some inconsistent data ... I set out to fix that:
mysql> SELECT nid, COUNT(nid) AS vids FROM book GROUP BY vid HAVING
vids > 1;
+-----+------+
| nid | vids |
+-----+------+
| 871 | 2 |
+-----+------+
1 row in set (0.00 sec)
mysql> SELECT title FROM node WHERE nid = 871;
Empty set (0.00 sec)
mysql> DELETE FROM book WHERE nid = 871;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
Query OK, 368 rows affected (0.01 sec)
Records: 368 Duplicates: 0 Warnings: 0
Looks like everything is well now. Ran some new benchmarks:
Before upgrade After upgrade With
indices
?q= (main page) 218 ms/request 340 ms/request 336
ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request 1531
ms/request
?q=about (book page) 375 ms/request 5400 ms/request 475
ms/request
Unfortunately, we're still slower than the original code.
------------------------------------------------------------------------
Wed, 18 May 2005 21:53:31 +0000 : killes(a)www.drop.org
Dries, thanks for testing it again.
I do think the broken queries you observer have something to do with
the bad performance after the update. Please log the queries any I will
have a look at them. I've never seen any such queries.
My update script also tries to create the appropriate indices, but it
will of course fail if the database contains cruft. the indices for the
forum are probably missing, too.
I am still convinced that the patch is core worthy.
------------------------------------------------------------------------
Thu, 19 May 2005 04:36:09 +0000 : Dries
It wouldn't hurt if more people would benchmark this patch. The patch's
current performance worries me.
Did you check your watchdog messages after upgrading the drupal.org
database? Depending on your settings, errors might only be shown in
the watchdog. I'll look into the remaining glitches as time permits.
Thanks for your persistence in keeping this patch up-to-date. :)
------------------------------------------------------------------------
Thu, 19 May 2005 11:59:22 +0000 : killes(a)www.drop.org
Dries: Can you please let me have your updated database? I want to have
a look at it and try my own benchmarks with it.
And yes, if I did learn something on this project is how to be
persistant. ;)
------------------------------------------------------------------------
Fri, 24 Jun 2005 16:25:34 +0000 : killes(a)www.drop.org
Here is an idea that occurred to me:
The problem with the upgrade process is that keeping the existing
revisions requires a lot of work to do. This generates a huge amount of
sql queries for a large database and also requires a huge amount of
memory.
My suggestions is to let update.php only handle the basic upgrade, ie
without old revisions. An additional module could be created that would
implement a cron based approach to upgrading old revisions one node at a
time. it could expose a hook to let contrib modules do their own
upgrades.
Dries, what do you think? (I am writing "Dries" because he seems to be
the only one who is interested in getting this into core...)
------------------------------------------------------------------------
Fri, 24 Jun 2005 22:25:11 +0000 : Junyor
Killes:
I'm also interested in seeing this hit core. What about adding
something to legacy.module to do it?
------------------------------------------------------------------------
Sun, 26 Jun 2005 21:14:54 +0000 : chx
This is a sensible approach. Maybe this is the _only_ sensible approach.
I have a little problem though: while the conversion is running somehow
both revision handlers should be available.
------------------------------------------------------------------------
Sun, 26 Jun 2005 22:16:21 +0000 : killes(a)www.drop.org
hehe, one only has to whine and bug enough and one gets some feedback.
;)
@junyor: legacy.module would be a good place. my current idea is to
auto-enable it in update.php and then disable it again in legacy_cron
after all nodes are updated.. ;)
@chx: When somebody wants to look at revisions of a node that node
could be auto-updated.
The only problem are contrib modules: they's need to have some hook in
order to update their own data. When somebody looks at the revisions of
a node than cannot be updated because the contrib module in question has
no such hook, we can optionally let the user discard old revisions I
guess.
Dries, what do you think?
------------------------------------------------------------------------
Mon, 25 Jul 2005 16:48:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_30.patch (53.13 KB)
Sooooo; I've updated this patch once again.
Dries didn't like my idea of legacy updates so we will have an option
to discard old revisions in case the update should prove difficult.
Always keep a backup.
------------------------------------------------------------------------
Mon, 25 Jul 2005 17:31:12 +0000 : Bèr Kessels
can we please accept and commit this patch? We can iron out any issues
later. This patch is just far too big and complex to be
perfect-at-once.
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:22:50 +0000 : Dries
@Ber: no, we can't commit this patch blindly.
Data loss is a much bigger problem than a syntax error or other code
glitch. We can break Drupal but we can't break their data.
I've spent quite a bit of time testing the previous version of this
patch and noticed significant performance degradation.
Tell me, Ber, why can't you test this patch first?
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:40:37 +0000 : Bèr Kessels
I was not referring to not testing it. If it is just the upgrade path
that proves to be cumbersome, then why can we not fix it afterwards,
I.e. when everyone has had a good chance to look at it.
Testing such a huge patch requires a lot of work. something no-one just
does in his spare hours. We had discussions before to deal in a slightly
different way with big changes; to commit them quicker and to leave the
ironing out of any left overs for the community. I hooked into that
discussion here. For two reasons. One is that Gerhard has spent
numerous hours on maintaining this patch. The other one is that the
community can be of help with ironing out issues in such a large
change, much better than that Gerhard can do on his own.
And yes, dataloss is very bad, but no-one should loose data, when
he/she followed the instructions (backing up)...
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:00:38 +0000 : Dries
Ber: applying this patch takes 15 seconds. Whether I apply this patch
for you, or whether you apply it yourself, it will hardly reduce the
'testing cost'. The problem is that data loss can be subtle; it might
go unnoticed for a couple days. Make no mistake, I'd like to see this
patch committed ASAP, but it warrants some testing. Let's
test/benchmark it and commit it.
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:55:03 +0000 : drumm
The upgrade script already takes a long time to execute and does not
provide feedback to the user about how it is doing. I plan on making
the upgrade script able to spread the updates across multiple page
views and give user feedback showing that progress is being made. This
will hopefully make the speed of this update a moot point.
I am working on this for my dayjob, CivicSpace, so it should get done
"real soon now", but it should be expected to take awhile (smallish
number of weeks). Please make comprimises if necessary.
------------------------------------------------------------------------
Wed, 27 Jul 2005 17:38:57 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_31.patch (48.78 KB)
@drumm: the timeout isn't usually the problem. Memory consumption is.
I'll look into doing the updates in chunks of 1000 nodes or so.
Clousseau found a bug in the patch, updated.
------------------------------------------------------------------------
Thu, 28 Jul 2005 18:28:08 +0000 : moshe weitzman
Even if the upgrade issues are resolved, we have to figure out why this
patch is slowing down drupal (according to benchmarks posted here).
------------------------------------------------------------------------
Fri, 29 Jul 2005 07:51:45 +0000 : Dries
We can't proceed without some additional benchmarking. I already
benchmarked it on two machines (see comments above), and I'll benchmark
it again after August 10. The performance impact of this patch worries
me so if two other people could test and benchmark this patch
extensively, that would come a long way. August 10 is close to the
code freeze so some help is necessary.
If Gerhard/killes reviewed/tested one of your patches, it is time to
return the favor ...
------------------------------------------------------------------------
Fri, 29 Jul 2005 15:40:31 +0000 : Jose A Reyero
After looking at the patch I'm not really surprised this slows down
everything. I thought that the reason why we wanted revisions in their
own table was in the first place to have a simpler -and faster- node
table. But this patch:
- Adds fields and complexity not only to the node table but also to all
the main tables for data
- Needs additional joins just to retrieve single nodes (which is done
many times for a typical main page)
- Does away with encapsulation of the revision functionality requiring
other modules to handle revissions related data.
So please, please, please:
- Make revisions table only needed when actually using revisions
- Keep that old nice thing of hiding revisions system from other
modules
- And what's so bad with serializing data anyway? I agree that usually
there are better ways to store data. But this is one of the cases where
serialization does make sense. Not the current huge serialized field,
but only a new table with one serialized node per row.
- In general, do it much simpler. This is way too complex and it
removes more functionality than it adds.
In just four words: keep it simple, please.
------------------------------------------------------------------------
Fri, 29 Jul 2005 21:25:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_32.patch (51.27 KB)
Patch updated (v31 was buggy, and there was an update to updates.inc).
I've had a discussion with Jose and he will maybe do some testing over
the weekend. John is also doing some testing as we speak.
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:06:59 +0000 : Nick Nassar
It has been exactly one year since I first submitted this patch.
You know what that means: *old patch party in IRC!* Virtual beers on
me!
w00t!
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:08:33 +0000 : Jose A Reyero
Hi again.
I had a talk with killes -I have to say he enlightned me about this-
and also did some testing. Sorry but I dont have a good set up to
provide benchmarks.
Here are my conclusions -and also some explanation for the varied
results abt performance:
- This is faster for simple node listings
- This is slower for full node/load
....thus the performance increase/decrease for a given page should
depend on the relation between the number of simple node listings
(usually blocks) and the number of nodes displayed in the main page.
I've also experienced some trouble with attachments -not re-created
when creating a new revision- and book relations, same.
I'm not sure I like this new "feature" of creating a new revision every
time you rollback an old one -in Drupal 4.6 this seems not to happen.
Besides these details, I think this patch is good enough -and quite a
powerful approach actually- though I still feel it's bit too complex
when simply dealing with nodes -without revisions-, which is what we do
most of the time, and complexity will be even higher with contrib
modules using still more tables, not to mention flexinode -versioning
info will have to be scattered everywhere in the db.
So this is not a +1 nor -1. On one side I think this is powerful. On
the other side I feel this is too complex and really, it's more than I
need.
I would be happy with a 'revision' table having: nid, vid,
data(serialized), that you can forget about when not using revisions.
------------------------------------------------------------------------
Sat, 30 Jul 2005 19:16:41 +0000 : Nick Nassar
On a much more serious note that my last comment, unless I'm missing
something, you've introduced a subtle bug into this patch since I left
it.
A and B are both updating revision 5 of a node.
A calls db_next_id() and gets a revision id of 6
B calls db_next_id() and gets a revision id of 7
B updates the node's revision to 7
B adds revision 7 to the revision table
A updates the node's revision to 6
A adds revision 6 to the revision table
Now, you have a situation where the current revision (6) isn't actually
the last revision(7). For some uses, that's not a big deal, but in the
case of wikis, which is what I originally hoped to use this patch for,
it creates problems.
------------------------------------------------------------------------
Mon, 01 Aug 2005 00:04:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_33.patch (51.27 KB)
Updated the patch. updates.inc was changed and chx found a typo.
@Nick: I don't think the issue is due to this patch. Drupal has
currently no mechanism to not let people commit a new version of the
same node, whether with or without revisions. drupal_next_id ensures
that the revision IDs are unique and the later revisions will be the
current one. Which revision is the current one does not depend on order
of the version ID, btw.
@Jose: what exactly were your problems with uploads? what do I need to
do to reproduce?
------------------------------------------------------------------------
Mon, 01 Aug 2005 11:33:56 +0000 : Jose A Reyero
Sorry, I dont know what I did the other day but I cannot reproduce it
either :-)
I applied this new patch. Problems I'm having now (Testing with
"stories")
- Book outline is gone after creating a new revision
- I cannot delete attachments (not sure it is related to your patch or
to something else)
- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab
Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?
------------------------------------------------------------------------
Sun, 14 Aug 2005 16:03:41 +0000 : killes(a)www.drop.org
Jose, thanks a lot for testing!
"- Book outline is gone after creating a new revision"
Sounds like a bug, it should not.
"- I cannot delete attachments (not sure it is related to your patch or
to something else)"
Another bug
"- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab"
Log messages are possible for nodes of type book and page as well as
all nodes that are put into the book. Log messages can be added for any
node type, the module author should implement those.
"Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?"
No, the db update should be the same.
I will try to fix the bugs you found.
WRT the db update I've got an announcement to make: I will not migrate
old revisions.
Why?
1) it is a rather complicated process that requires a lot of memory and
is likely to give people on cheap hosting trouble.
2) testing it takes a lot of time.
3) I don't personally need it
I will still move old revisions to an old_revisions table from where
they can be converted by a later update or a custom script.
------------------------------------------------------------------------
Tue, 16 Aug 2005 21:34:43 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_34.patch (52.92 KB)
Ok, here is incarnation No. 34 of this patch. Dries agrees with me that
we can put the revision rescue operation into a later update, possible
using Drumm's installer.
The patch fixes the issue with the file uploads not being deletable, I
could not reproduce the problem with the outlines.
------------------------------------------------------------------------
Sat, 20 Aug 2005 13:33:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/scripts.tar.gz (4.39 KB)
I've started to do some performance testing.
I created a script that creates nodes that have revisions according to
the new system. A node has a 50% chance to have a revision and then an
equal chance fro between 1 and 25 revisions. I've created a database
with 50 nodes and 500 comments, 7 vocabs and 100 total taxo terms.
I've also created a script to convert these revisions to the old format
(much easier than the other way around). Initial tests didn't show a
significant change for /node. I'll try with more nodes and /tracker.
Attached are the two scripts in a tarball. They need to be run against
the patched install.
Also included are the two preliminary results.
------------------------------------------------------------------------
Sat, 20 Aug 2005 15:16:04 +0000 : killes(a)www.drop.org
I've now created a databse which has 1000 nodes and each node has 50 or
51 revisions. I have no taxo terms assigned and no comments (those
parts didn't change). I am only using stories and pages. I am now
fairly confident that this patch is a performance improvement for this
kind of setup.
for the new patch on the /node page I get
Requests per second: 0.27 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 3592 3648 56.9 3640 3985
Waiting: 3573 3626 55.1 3620 3949
Total: 3592 3648 56.9 3640 3985
for the old one I get:
Requests per second: 0.23 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 4247 4297 64.0 4286 4657
Waiting: 4156 4263 65.3 4253 4624
Total: 4247 4297 64.0 4286 4657
For the tracker the results are even better:
old:
Requests per second: 2.00 [#/sec] (mean)
new:
Requests per second: 3.64 [#/sec] (mean)
Anybody interested in obtaining this database can mail me. It is 60MB
(gzipped).
1
0
Issue status update for
http://drupal.org/node/28420
Post a follow up:
http://drupal.org/project/comments/add/28420
Project: Drupal
Version: cvs
Component: comment.module
Category: bug reports
Priority: normal
Assigned to: Jeremy(a)kerneltrap.org
Reported by: Jeremy(a)kerneltrap.org
Updated by: Jeremy(a)kerneltrap.org
Status: patch (code needs review)
Attachment: http://drupal.org/files/issues/comment.module_16.patch (1.11 KB)
drupal_set_message(..., 'error') isn't sufficient to prevent the comment
from being posted. I have instead updated the patch to set the error on
the hidden 'token' form field.
I have updated the message to read:
"Unable to validate your comment, please try again. If this error
persists, please contact the site administrator."
If you don't like the error message, better suggestions are welcome.
Jeremy(a)kerneltrap.org
Previous comments:
------------------------------------------------------------------------
Mon, 08 Aug 2005 01:55:34 +0000 : Jeremy(a)kerneltrap.org
Setting "Preview comment" to "Required" does not strictly require that
the comment be previewed first. This is being abused by spammers to
quickly and efficiently post spam comments.
I discovered this after I added a new feature to my new spam module [1]
to auto-blacklist spammer IP addresses, allowing me to block comment
spammers when they preview a comment and thus preventing them from ever
inserting their spam into my database. I configured my comment module
to "require" comment previews, and yet found that the comments were
slipping past my filter. I finally realized what the spammer is doing
is setting $_POST['op'] to 'Post comment', effectively bypassing the
preview phase.
I'm currently looking for a clean solution to this. At the moment the
only idea I have is to generate a token at the preview phase, and
validate the token at the post phase. Unfortunately the token would
have to be stored in the databse between the preview and the post,
which adds overhead.
Alternatively, I've considered using a time-based hash which would
constantly update depending on the time of day. This could easily be
validated without storing anything in the database. If too long has
gone between the preview and the post, an additional preview step would
be required... The down side here is that the time-based hash would be
publically available, and thus the spammer could easily duplicate it in
their script. A private key could solve for that, but increases the
complexity as it adds a configuration step.
I have the feeling I'm missing a simpler, cleaner solution.
Suggestions?
[1] http://kerneltrap.org/jeremy/drupal/spam/
------------------------------------------------------------------------
Mon, 08 Aug 2005 02:26:21 +0000 : moshe weitzman
even if you get this fixed, won't these bots just add a preview step?
this 'preview required' feature is designed to maintain high quality
submissions by forcing users to proof read. it isn;t designed for
security.
i think you want to hook into comment_validate(). just add a hook here
- there is already a hook_comment() waiting for you to add an
operation.
------------------------------------------------------------------------
Mon, 08 Aug 2005 08:49:30 +0000 : Eaton
I posted a patch a few days ago (http://drupal.org/node/28255) that adds
validation and form construction hooks for comments. It's similar to the
one that the captcha module uses, though it adds comment form_pre and
form_post hooks instead of a single comment form hook.
------------------------------------------------------------------------
Mon, 08 Aug 2005 13:30:34 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/comment.module_11.patch (2.5 KB)
> even if you get this fixed, won't these bots just add a preview step?
Eventually, yes, but it drastically changes their ability to fling spam
at a site. As is, they simply have a script that shoots data out at
high speed without having to wait for messages to return from the
server. It is the server that is doing all the work, thus making it
simple for a spammer to DoS a site.
If "preview required" really meant "preview required", they would be
forced to first automate clicking "preview", and then wait for a
response before clicking "submit". This requires more resources on
their side, and allows us to add delays after clicking "preview" (if we
detect that they are a spammer) further using their resources.
> this 'preview required' feature is designed to maintain high quality
> submissions by forcing users to proof read. it isn;t designed for
security.
Regardless of the intention, I was misled to believe that configuring
my site to require previews would require that all comments were first
previewed. As a site administrator, I would prefer to know that
"required" really means "required".
> i think you want to hook into comment_validate(). just add a hook
here -
> there is already a hook_comment() waiting for you to add an
operation.
Yes and no. Ultimately yes that will work and will allow my spam
module to prevent the spam from ever being posted. But it still leaves
the greatest burden on the web server, instead of on the spammer. The
spammer can still use a very simple script that only pushes data, and
thus can generate spam at an unbelievable rapid rate.
Here is an example patch to enforce "preview required". It's one idea,
I'm sure there are better ones.
------------------------------------------------------------------------
Mon, 08 Aug 2005 14:01:38 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/comment.module_12.patch (1.4 KB)
Here's a second version of the patch that doesn't require any manual
configuration.
------------------------------------------------------------------------
Mon, 08 Aug 2005 16:27:27 +0000 : Jose A Reyero
I like this idea, and the patch looks good
Still, I think it misses something, like some timestamp related hash,
because once you get the hash code you can post multiple comments with
that.
Another problem I can think of is, what happens when a cron run happens
between the preview and the post?? I'm afraid comments would get lost
For this second problem, I think a key generated only once after module
activation could do. About the first one....mmm... I'll sit down for a
while and think.....
------------------------------------------------------------------------
Tue, 09 Aug 2005 12:27:20 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/comment.module_13.patch (1.33 KB)
> I think it misses something, like some timestamp related hash, because
> once you get the hash code you can post multiple comments with that.
Using a timestamp will mean that the comment form "expires". That is,
if you wait too long to preview your comment, it will generate an error
when you try to post.
Yes, technically a spammer could post one real comment, then based on
what was in the session from that they could post the same identical
comment over and over, so long as it was attached to the same node.
But this is not what they do, they try and spread their spam throughout
your webpage. Furthermore, the spam module is perfectly capable of
detecting and preventing this.
> Another problem I can think of is, what happens when a cron run
happens
> between the preview and the post?? I'm afraid comments would get lost
The key is only generated once, that's what the first test is about.
In any case, in the unlikely event that the key were to change between
preview and post they would simply have to post a second time.
My earlier patch wasn't quite right, I was testing the token in the
wrong place. This patch fixes that.
BTW: This is beneficial for maintaining high quality submissions too,
as prior to this change someone could:
1) enter a comment
2) press preview
3) completely change their comment (introducing a mistake)
4) press submit and the comment (mistake and all) would go into the
database unpreviewed
After this change:
1) enter a comment
2) press preview
3) completely change their comment (introducing a mistake)
4) press submit and they get an error because they didn't preview
their changes - forcing them to preview once after any change
------------------------------------------------------------------------
Wed, 10 Aug 2005 03:50:33 +0000 : Jeremy(a)kerneltrap.org
FWIW: I've been getting slammed by spam attacks this whole week.
Installing this patch has made a huge difference. Well over 100 spam
attempts per minute (sometimes two and three times that) and I hardly
notice the spammer, whereas before it was choking my database.
(Granted, the spammer has not yet upgraded his script to first preview,
then submit. But even if he did it wouldn't help him as testing has
verified that the new spam module would prevent the comments from ever
getting to the database.)
Additionally, user and anonymous (nonspam) comments continue to show up
at a normal rate.
------------------------------------------------------------------------
Tue, 16 Aug 2005 14:08:04 +0000 : Jeremy(a)kerneltrap.org
I would love to see _any_ discussion on this. Drupal is currently too
easy to spam, with little effort on the spammer's side, and lots of
resources wasted on the Drupal side. A patch like this will greatly
increase the spammer's burden, and make it possible to effectively
block even the most aggressive spammer attacks.
------------------------------------------------------------------------
Wed, 17 Aug 2005 16:24:04 +0000 : Jose A Reyero
Well, this patch is definitely better than what we have, and would save
some spam for sure.
But maybe keeping track, at the session level, of generated hashes for
a user, and then removing them when the comment is sent, could do the
work.
This way we can forget about previewing comments or not, and also the
"permission" to post the comment would expire when the session expires.
Any randomly generated value could do for this, no need for complex
hashes, but having nid and pid in the hash would add some extra
security.
------------------------------------------------------------------------
Wed, 17 Aug 2005 19:58:02 +0000 : breyten
Jeremy, a big +1 on the idea, but why not generate the private key when
it is actually needed (Ie, when displaying the comment form), instead
of wasting a _cron() hook on it?
------------------------------------------------------------------------
Thu, 18 Aug 2005 03:20:02 +0000 : Jeremy(a)kerneltrap.org
> Well, this patch is definitely better than what we have, and would
save some spam for sure.
It is continuing to work very well on my site, which seems to be under
nearly perpetual spam attacks from multiple sources.
> But maybe keeping track, at the session level, of generated hashes
for a user, and then
> removing them when the comment is sent, could do the work.
The catch is: the key has to be something unique to the server, not
guessable or learnable from the outside Simply storing the hash data
in the session alone is not enough, as then the spammer could create
any random data and store it in the session.
That said, the hash could be generated off something other than the
text of the comment as it is now, so that a preview is not required.
I'll look at doing something like that and submit another patch.
> This way we can forget about previewing comments or not, and also the
"permission" to
> post the comment would expire when the session expires. Any randomly
generated value
> could do for this, no need for complex hashes, but having nid and pid
in the hash would
> add some extra security.
nid and pid alone are worthless, as they are easy to learn. The pid
can always be 0 (spam is rarely attached to a pre-existing comment).
The nid is obtained in the path of where the spam is being posted.
The solution is a "private-key", which is what my patch adds. Then
sure, hash the private key plus the nid and the pid, and you've got
enough protection to prevent most spammers. To make it even more
secure, automatic rey-keying could be easily accomplished.
------------------------------------------------------------------------
Thu, 18 Aug 2005 04:09:10 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/comment.module_15.patch (1.18 KB)
The attached patch:
1) gets rid of the _cron() hook
2) no longer requires that comments be previewed
Prior to this patch, comment spammers were able to send data to a
Drupal server acting as though they'd filled out a comment form and
pressed submit. As they didn't actually use the form, they could
submit spam comments at an obscene rate.
With this patch, comment spammers will have to actually load the form,
enter text, and press submit. Yes, that can still be automated, but it
takes much more work and slows them down, as they have to wait for the
entry form to load each time.
Unfortunately a spammer could manually submit one comment, then re-use
that same session info over and over to attach repeated spam comments
to the same node. Such an attempt would be detected and blocked by the
spam module if enabled, but again such a session re-use attack could be
done without loading the form each time. Fortunately there is much
less gain for a spammer to submit 100 spam comments on the same page,
versus submitting 100 spam comments each on a different page as they do
now.
Ideas to improve upon this concept include:
- re-key every day or week, changing the private key regularly to be
sure it couldn't ever be permanently cracked
- add a key table, and generate a unique key for every comment form.
essentially, upon comment form creation generate a random key which is
stored both in a database table and in the session. when a comment is
submited, look for the key from the session in the database table, if a
match is found delete it from the database table and post the comment.
this would prevent session re-use, but adds overhead. i don't know if
it's worth it, perhaps as an external module if the hooks were
available.
------------------------------------------------------------------------
Fri, 19 Aug 2005 18:55:11 +0000 : drumm
<?php
form_set_error('error', t('Validation error, please be sure cookies are
enabled on your browser.'));
?>
form_set_error [2]()'s fist argument should be the name of a form
field, not 'error.' Using (..., 'error') would be better in this case.
And the actual message needs work. Since this is a hidden field I don't
think it has anything to do with cookies.
[2] http://drupaldocs.org/api/head/function/form_set_error
------------------------------------------------------------------------
Fri, 19 Aug 2005 18:56:41 +0000 : drumm
The unclosed link in my last update was supposed to say
drupal_set_message(..., 'error')
1
0
Issue status update for
http://drupal.org/node/7582
Post a follow up:
http://drupal.org/project/comments/add/7582
Project: Drupal
Version: cvs
Component: node system
Category: bug reports
Priority: normal
Assigned to: killes(a)www.drop.org
Reported by: killes(a)www.drop.org
Updated by: killes(a)www.drop.org
Status: patch (code needs review)
I've now created a databse which has 1000 nodes and each node has 50 or
51 revisions. I have no taxo terms assigned and no comments (those
parts didn't change). I am only using stories and pages. I am now
fairly confident that this patch is a performance improvement for this
kind of setup.
for the new patch on the /node page I get
Requests per second: 0.27 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 3592 3648 56.9 3640 3985
Waiting: 3573 3626 55.1 3620 3949
Total: 3592 3648 56.9 3640 3985
for the old one I get:
Requests per second: 0.23 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
Processing: 4247 4297 64.0 4286 4657
Waiting: 4156 4263 65.3 4253 4624
Total: 4247 4297 64.0 4286 4657
For the tracker the results are even better:
old:
Requests per second: 2.00 [#/sec] (mean)
new:
Requests per second: 3.64 [#/sec] (mean)
Anybody interested in obtaining this database can mail me. It is 60MB
(gzipped).
killes(a)www.drop.org
Previous comments:
------------------------------------------------------------------------
Wed, 05 May 2004 16:25:27 +0000 : killes(a)www.drop.org
Currently all node revisions are stored in a serialized field in
node.table and retrieved for _each_ page view although they are rarely
needed. However, we have agreed that serializing data is bad and that
we should try to keep the memory foot print pf Drupal small.
Therefore I propose to create a separate revisions table which would be
in principle identical to the node table, only that it could have
several old copies of the same node. Extra data added by other modules
could be added in a serialized field unless we find a better solution.
------------------------------------------------------------------------
Wed, 05 May 2004 17:06:35 +0000 : jhriggs
I too think the serialized approach is less than desirable, but here's
an alternative. This would likely take some considerable rework in
core and contrib, but the following is how we handle similar types of
situations in our databases at work. It is more elegant that a
separate table, and avoids the (almost exact) duplication of a table.
Instead of separate tables, keep all revisions of nodes in the node
table as follows:
* add field: active (0/1 or Y/N)
* add field: revision
* every revision of a node is stored in the node table; however, only
one revision can be active at any given time
* nid can no longer be unique -- primary/unique key becomes (nid,
active)
* any time a node is loaded, updated (without revision), etc., the
active version is used.
Thoughts?
------------------------------------------------------------------------
Wed, 05 May 2004 17:57:48 +0000 : killes(a)www.drop.org
I am not opposed to your scheme, but I want to stress the following:
* Duplicating a table's structure is not bad (IMHO) as long as the
content is different.
* having two tables will allow us to have a rather small node table.
This is (maybe) a performance gain.
------------------------------------------------------------------------
Wed, 05 May 2004 18:37:29 +0000 : jhriggs
I don't necessarily think that duplicating a table's structure is _bad_.
It just seems to be wasteful and a pain to maintain. (Every change to
the node table is made twice...easy to do, but also easy to miss
perhaps.)
As for performance, as long as nid and the active indicator are
indexed, there shouldn't be any performance loss. Also, archiving an
old version when making a new revision will be much simpler: just
change the active indicator rather than copying an entire node to
another table (and ensuring everything gets copied...again a potential
maintainance issue).
To be honest, I would just like to see the serialized data go away,
regarless of what approach is taken.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:49:33 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_07-30-2004.p… (10.47 KB)
I'm interested in using Drupal for a large scale wiki-type project. In
order to do this, I need revisions to be in their own table.
Attached is a patch to do just that. Most of the changes are pretty
self explanatory. Spreading out node data across two tables meant that
I had to add database functions to do locking/transactions. Without
this, race conditions in which the database becomes corrupted are
possible.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:54:36 +0000 : Nick Nassar
Oh yeah... The patch is a diff against Drupal CVS
------------------------------------------------------------------------
Sat, 31 Jul 2004 00:00:08 +0000 : Anonymous
Gerhard speaking.
Nick, thanks a lot for your nice patch! It saves me a great deal of
labour. I looked through it and immediately liked it. You not only put
the old revisions into a new table but also the current one. Do you
have an estimate how much more expensive the additional join is?
Besides a few minor coding style issues I found a major one: Just a few
hours before you uploaded your patch JonBob's node access patch hit
core. That means your patch won't apply anymore as all the queries you
change have been changed. Can I bug you to update your patch?
------------------------------------------------------------------------
Sat, 31 Jul 2004 01:11:59 +0000 : Anonymous
Also I think that your upgrade path loses existing revisions.
------------------------------------------------------------------------
Sat, 31 Jul 2004 02:39:12 +0000 : drumm
I think this is the proper way to do things. No columns are duplicated,
there is no serialized data, and only the fields that are logically
revised are stored. Nothing jumped out at me as a way to have my node
modules be able to keep a table of revisions of additional fields. I'm
guessing this could be done within the confines of _insert and _update.
Assuming the upgrade path works and modules can extend it I give it a
+1.
------------------------------------------------------------------------
Sat, 31 Jul 2004 14:40:15 +0000 : Nick Nassar
It figures that just as I finish a big patch, another patch comes along
and breaks it. Oh well, it should be a pretty easy to fix. I'll work on
it.
Fixing the upgrade path to keep revisions should be fairly painless.
I found another issue that needs to be fixed before this patch gets
merged. There format of a node needs to be stored for each revision.
Otherwise, for modules that store a format for the nodes, such as page
and book, if you write one revision in PHP and the next in HTML, the
PHP revision will be displayed as HTML. This is part of a larger issue
of how node modules should store revisions of additional fields. I
think each module that wants to do this should create another table
with (nid, revid) as the primary key. Just as when they want to add
fields to a node they create another table with nid as the primary key.
As far as performance goes, for sites that make heavy use of revisions,
an extra join on primary keys is going to be a lot faster than grabbing
all of the revisions from that database everytime. We would need to run
benchmarks to determine is the overall difference in speed is for an
average site is a gain or a loss. I'm guessing it's very minor either
way.
------------------------------------------------------------------------
Mon, 23 Aug 2004 13:55:49 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_08-23-2004.p… (10.92 KB)
Here's an updated patch against CVS that puts revisions in their own
table, provides an upgrade path, and fixes the format related bugs in
the last patch.
Hopefully, this can make it into CVS as soon as the freeze is over.
------------------------------------------------------------------------
Mon, 23 Aug 2004 14:10:39 +0000 : moshe weitzman
Interesting patch ... drumm's question is still outstanding. how do
modules store revisions of their fields? Are they expected to manage
this on their own? Thats not how it works today.
As an aside, i am seeing profile_ fields in my node.revisions column.
One could argue that those need not be saved. They pertain to the node
author, not to the node itself.
------------------------------------------------------------------------
Mon, 23 Aug 2004 16:14:39 +0000 : Nick Nassar
Having modules be responsible for storing revisions of their own fields
is a side-effect of storing revision data in tables. There's really no
way around it. However, revisions generally don't make sense for node
types that don't have PHP/HTML content, such as polls. I think it's
going to be a pretty rare scenario for a new node type to want another
field to change per-revision, so it's a pretty good trade-off.
Storing fields that shouldn't be part of revisions, such as the
profile_ fields, is a side-effect of storing revisions as serialized
objects. Applying this patch will free up that wasted space. :)
------------------------------------------------------------------------
Mon, 23 Aug 2004 17:20:57 +0000 : Anonymous
There should be a hook that let's the module choose whether it supports
history. This way a module author can prevent the user from doing
something that may break his module or just cause undefined behavior.
If the module doesn't support history then don't let the user/admin
choose to add history to nodes of that type.
Craig
------------------------------------------------------------------------
Mon, 23 Aug 2004 19:23:29 +0000 : Nick Nassar
I agree, there should be an API change to make specifying support for
revisions easier. In the interests of keeping patches small and keeping
to one change per patch, I think the API change should be a separate
issue.
A sort of ad-hoc API to decide whether or not a module supports
revisions by default already exists. Instead of having a hook, modules
set the default value of the "Create new revision" field in the edit
form. The admin can change this option in
admin/node/configure/defaults. This patch doesn't change that.
Revisions are broken for node types that have their own database
structure, like polls, even when storing them as serialized objects.
This patch doesn't change that, either.
------------------------------------------------------------------------
Tue, 26 Oct 2004 02:35:06 +0000 : moshe weitzman
I'm guessing that someone is going to have to demonstrate that this
patch performs as well as current drupal before it gets comitted. i
think this patch is a few benchmarks from being comitted.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:04:09 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:16 +0000 : Nick Nassar
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:26 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Mon, 15 Nov 2004 05:05:30 +0000 : elias1884
please overthink the revision system default workflow as well. don't
look at the revision system as an isolated system but as a part of the
whole workflow system!
if you combine revisions with the moderation queue the most logic
default workflow would be like that:
auth user creates node (revision #0)
admin approves the node (status = 1, moderation = 0)
=> node publicly available
auth user finds typo and changes node (revision #1, status = 0,
moderation = 1)
-------
what happens at that point at the moment is, that the node is not
accessible anymore at all until the new revision is approved by admin.
of course the new revision should not go online until reviewed and
approved, this is absolutely correct, but there is no reason to not
take the old revision offline, since it was already approved and should
therefore be online until the new revision is approved. it is not
practical if a node disappears only because the author corrected a
typo.
-------
admin approves the node (status = 1, moderation = 0)
eventhough I first thought a plain boolean active field would not be
capable of providing that functionality if finally came to the
conclusion, that it can. The only thing to do is to not set that bit,
when a new revision is created, but when it is approved (in case
moderation is activated under default workflow). Every revision should
have its own moderation, status and active field and on approval they
are set like this (status=1, moderation=0, active=Y).
When you wanna rollback to an old revision, you can chose between all
revisions that already have the moderation bit set back to 0 again and
the published to 1. There should be an extra permission for rollback!
another concern that I have about the default workflow is, that users
can't see the content, they have just created, when moderation is
enabled. Eventhough, there is a big fat "submission accepted" presented
after submissions, unexperienced users tend to question the information
those stupid tincans give them, if they can't find their content
afterwards. Many users are really lazy bastards and they don't even
read the status messages. The best feedback about whether his story was
submitted successfully or not of course is, if he can find the story
somewhere on the site, maybe with a status message on top of it,
mentioning, that the content is currently not publicly available since
it has not been approved yet. there should be a my content section
under my account, like somebody is trying to do with the workspace
module I guess.
so my suggestion is to make (status=0, moderation=1) still available
for the creator under a my content section somewhere!
------------------------------------------------------------------------
Wed, 24 Nov 2004 04:21:18 +0000 : Nick Nassar
I agree. The current workflow for moderation queues and revisions needs
to change, but this patch isn't the place for it. The patch is already
too big, and it only does the backend stuff.
Instead of adding more to this patch and making it take even longer to
get into core, would you mind creating a new issue for your UI
suggestions, so the those changes can be added as a separate patch?
Thanks,
Nick
------------------------------------------------------------------------
Sat, 11 Dec 2004 12:26:02 +0000 : Dries
This patch is _much_ needed so I'd love to see someone revive it. In
order for this patch to be accepted, the following needs to be done:
* Update this patch to CVS HEAD.
* Rename revid to vid.
* Rename node_rev to node_revisions.
* Rename node_rev.changed to node_revisions.timestamp.
* Rename $rnode to $revision.
* Fix the coding style to match Drupal's: proper spacing, single quotes
where possible, proper variable names.
* Benchmark this patch with a large database with enough revisions.
I'd be happy to benchmark this on my local copy of the drupal.org
database.
* The book.log field should probably move to the node_revisions table.
This can be done in a separate patch.
* Investigate whether transactions are well-supported.
------------------------------------------------------------------------
Mon, 13 Dec 2004 00:25:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004-r… (11.02 KB)
I've worked a bit on the patch (coding style issues as mentioned by
Dries). One thing I noticed is that the patch uses REPLACE. IIRC this
needs to be chagned to "UPDATE, if fail INSERT" for pgsql
compatibility.
Nick, are you still interested in working on that patch? I'd like to
know how it works on your site and work on getting it into core.
------------------------------------------------------------------------
Mon, 13 Dec 2004 12:33:08 +0000 : Dries
Gerhard: your patch does not apply.
------------------------------------------------------------------------
Mon, 13 Dec 2004 17:10:12 +0000 : killes(a)www.drop.org
Yes, I know, that was the same version as I mailed to you earlier.
------------------------------------------------------------------------
Mon, 13 Dec 2004 21:02:06 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions.patch (52.96 KB)
Ok, upüdated the patch to cvs.
------------------------------------------------------------------------
Tue, 14 Dec 2004 08:58:36 +0000 : Dries
Some more comments:
* db_begin_transaction() and db_end_transaction() do not belong in
database.inc, but in database.mysql.inc and database.pgsql.inc
respectively.
* The node module calls node_revisionsision_list() which is not
defined. (Fxed that on my local copy.)
* Do db_begin_transaction() and db_end_transaction() deprecate Jeremy's
table locking patch?
* The upgrade path assigns the wrong user ID to each revision.
* The upgrade path assigns the wrong date to each revision (that or a
node's revision page shows the wrong usernames/dates).
* The coding style needs a bit of work, but we can worry about that
later.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:34:44 +0000 : Nick Nassar
If you need any help getting those things fixed, just let me know.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:50:30 +0000 : Nick Nassar
How this relates to Jeremy's node locking patch:
There was lots of discussion, and node locking was decided against
because from an end user point of view you never want a node to be
locked. He's now advocating for a much simpler patch that warns users
if their changes will overwrite someone elses. That patch still has a
race condition, which might be fixed using db_begin_transaction().
http://drupal.org/node/6025
------------------------------------------------------------------------
Tue, 14 Dec 2004 22:26:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_0.patch (55.96 KB)
Here is an updated patch that tries to address Dries concerns.
------------------------------------------------------------------------
Wed, 15 Dec 2004 08:32:50 +0000 : Dries
Attachment: http://drupal.org/files/issues/revisions-bug.png (76.06 KB)
It didn't fix the aforementioned bugs. See attached screenshot.
------------------------------------------------------------------------
Thu, 06 Jan 2005 20:15:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_1.patch (51.77 KB)
Ok, here is a new version. Dries and myself worked hart at it, so please
have a look.
what is still missing
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
------------------------------------------------------------------------
Wed, 19 Jan 2005 21:43:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_2.patch (49.49 KB)
Here is an updated patch. We discussed to keep the current title in node
module and also in the revisiosn table. This is content duplication but
will save many joins as many queries only need the title of a node.
Discussion is welcome.
------------------------------------------------------------------------
Wed, 19 Jan 2005 23:33:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_3.patch (29.93 KB)
I've implemented the aforementioned solution. This makes the patch much
smaller. The patch now also removes taxonomy_node_has_term() which
wasn't used anywhere. I'd really apprciate if some people could test
drive the patch. It will be another huge improvement for 4.6.
------------------------------------------------------------------------
Thu, 20 Jan 2005 00:05:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_4.patch (30 KB)
Another revision. Steven didn't like my literal $node->vid in queries.
------------------------------------------------------------------------
Thu, 20 Jan 2005 01:10:50 +0000 : killes(a)www.drop.org
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
These issues are still open, btw. Especially the first one needs to be
tackled.
------------------------------------------------------------------------
Tue, 25 Jan 2005 20:11:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_5.patch (51.13 KB)
Here is a patch that has the database tables updated for forum, book,
and page module.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:55:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_6.patch (49.18 KB)
Yet another update to keep it working with head. The patch now also
removes the table definitons for the page table.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:57:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_7.patch (55.69 KB)
Sorry, that was the old version, this is the right one.
------------------------------------------------------------------------
Mon, 31 Jan 2005 19:55:03 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_8.patch (55.71 KB)
Updated once more.
------------------------------------------------------------------------
Mon, 31 Jan 2005 20:52:08 +0000 : Dries
Anyone to help review/test this?
------------------------------------------------------------------------
Mon, 31 Jan 2005 21:22:36 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_9.patch (49.29 KB)
Updated again, the update functions occurred twice. Thanks Bart.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:05 +0000 : killes(a)www.drop.org
Don't know if the db I am using is corrupted or what. I still do have
some didficulties.
The latest patch is attached.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_10.patch (49.67 KB)
I am probably slowly going mad ...
------------------------------------------------------------------------
Wed, 02 Feb 2005 01:54:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_11.patch (48.95 KB)
The update issue still needs investigating. This patch is updated for
cvs.
------------------------------------------------------------------------
Wed, 02 Feb 2005 20:20:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_12.patch (49.83 KB)
Ok, here is a new version. I've solved my troubles with book.module.
There are still some issues with forum module. Possibly due to
inconsistent database.
------------------------------------------------------------------------
Wed, 02 Feb 2005 21:31:05 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_13.patch (49.83 KB)
Turns out the drupal.org database had indeed some quirks. Please run
this query in your oldest db and tell me the result:
select nid,type from node where type like '%/%';
If you get a non-zero result we might need to add another security
update.
The patch could use still more testing, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 01:16:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_14.patch (49 KB)
Ok, we are getting somewhere. At a first glance the update is working.
There is a problem remaining: the revisions tab will be shown whether
the node has revisions or not. Not sure we can/need to fix this.
People with a drupal.org account can log in at
http://killes.drupaldevs.org/revision/ and poke around. Your
permissions will be the same as on drupal.org. Feel free to vreak
everything but don't forget to file complaints here. (Note: this is
only a pruned version of the drupal.org database with all project nodes
and nodes with nids > 7000 dropped).
------------------------------------------------------------------------
Thu, 03 Feb 2005 04:19:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_15.patch (52.39 KB)
There was some error in node_save and also the patches to the
database.inc files got lost...
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:07:27 +0000 : robertDouglass
Submitting book pages doesn't work on your test site. It puts the entire
content of the preview inside the body textarea. I wrote a sentence in
the body and the log, and pressing preview put several lines of HTML
containing both sentences in the body textarea on the preview page,
plus the book page wouldn't submit.
-R
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:50:59 +0000 : Junyor
0 results here. I started using Drupal with version 4.4, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 23:56:18 +0000 : killes(a)www.drop.org
@Junyor: Thanks, that's a good sign. Maybe somebody else has an older db
to try.
@robertDouglass: The first effect you describe is due to drupaldevs
running on PHP 5. I am unsure why the second thing does not work. In
node_save() the node object has a nid although there is none in the
form. Very strange.
I've enabled display of db queries on the testsite.
------------------------------------------------------------------------
Fri, 04 Feb 2005 19:17:55 +0000 : dmjossel
No results here on the query:
select nid,type from node where type like '%/%';
On a database that was put in place prior to Drupal 4 and is now
running on 4.5.2.
------------------------------------------------------------------------
Fri, 04 Feb 2005 20:44:23 +0000 : killes(a)www.drop.org
@dmjossel: thanks.
@all. The strange problem I reported was apparently php 5 related.
After applying Steven's php 5 patch it went away. One error is
remaining: If I create a new forum topic it will be shown as part of
the book on preview. Hmm, that was due to a db that got corrupted
during testing so that is fixed too.
Please poke around at the test site and look if you find more errors.
------------------------------------------------------------------------
Sat, 05 Feb 2005 07:16:22 +0000 : Steven
By the way, I just remembered that Drupal.org has some blogs lingering
on in the database even though blog.module is not enabled. Perhaps this
is causing troubles?
------------------------------------------------------------------------
Sat, 05 Feb 2005 11:22:59 +0000 : Anonymous
I can't see why it would. Drupal.org will need extra updates for images
and project nodes because those have their own tables. GK.
------------------------------------------------------------------------
Sun, 06 Feb 2005 12:49:55 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_16.patch (52.49 KB)
Updated to apply to cvs again.
------------------------------------------------------------------------
Tue, 22 Feb 2005 20:15:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_17.patch (49.64 KB)
Updated again.
All we need is a patch to upload module and an upgrade path for it.
------------------------------------------------------------------------
Fri, 04 Mar 2005 04:22:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_18.patch (52.31 KB)
Updated once more. Moved log field from book to node_revisions table as
discussed in Antwerp. upload module still missing.
We need to decide under which circumstances the log field should be
displayed. Should that be added to the workflow? Should it depend on
the revisions setting?
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:27:03 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_20.patch (75.52 KB)
Ok, here it is: Yet another revision of this grrrrreat patch.
Changes from previous versions:
- supports versioning for uploaded files. A problem is that if you
delete a file, it will be gone for all revisions.
- the log field is now in the node_revsions table, but each module has
to decide whether to show it or not.
I've implemented it for the page and the book type odes. Also, the
field can be edited when adding non-book nodes to the book. The log is
displayed on the revisions page and if a node is moderated.
- the revisions are moved to an old_revisions table to a) get the node
table smaller and b) still leave the mavailable for contrib modules
that want to retreive old version data.
The patch has been applied to killes.drupaldevs.org/revision where it
can be tested by anybody (especially people who have "site admin"
rights on drupal.org)
The database is from drupal.org and you should b able to log in with
your pass or simply mail yourself a new one.
Gerhard
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:51:56 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_21.patch (59.42 KB)
BTW, I marked this a bug because atm the revisions field can grow quite
big. Neil has reported problems from some users who were not able to
load some nodes due to to many large revisions.
Also, som unrelated stuff crept into the patch. New version attached.
------------------------------------------------------------------------
Tue, 08 Mar 2005 05:56:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_22.patch (60.29 KB)
Ok, I think I got it.
Changes to last version:
- uploads are no properly versioned.
Missing are still pgsql checks and updates.
------------------------------------------------------------------------
Thu, 10 Mar 2005 16:58:41 +0000 : Anonymous
Was able to get http://drupal.org/files/issues/revisions_21.patch to
work with drupal-cvs.tar.gz (10 March 2005) by:
- includes/database.mysql.inc: Commenting out duplicates for functions
function db_begin_transaction and function db_commit_transaction
- modules: node.module: Removing "'title' => $node->title," from
$node_table_values variable declaration and removing "'title' =>
"'%s'"," from "$node_table_types" variable declaration.
Happy to submit a patch if requested. I'll watch this thread.
------------------------------------------------------------------------
Fri, 11 Mar 2005 23:59:45 +0000 : killes(a)www.drop.org
The duplicate function has been removed in rev 22 of this patch.
Why do you think the changes in node_save are needed? Titles are saved
in both tables for performance reasons.
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:12:21 +0000 : jlerner
Hi - I posted comment #62. The changes to node_save appear to be needed
because recent patches (both 21 and 22) remove the field 'title' from
table 'node'. So without the changes to node_save, node.module is
broken and generates errors.
Joshua
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:29:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_23.patch (61.17 KB)
Thanks, Joshua, for catching this. node:title is there to stay.
------------------------------------------------------------------------
Wed, 13 Apr 2005 16:29:23 +0000 : moshe weitzman
since HEAD is open again, perhaps it is a good time to revisit this
patch.
once this is committed, lets address - http://drupal.org/node/11071
"node_validate does not respect group editing"
------------------------------------------------------------------------
Mon, 18 Apr 2005 15:43:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Mon, 18 Apr 2005 16:16:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24_0.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:19:42 +0000 : Dries
I'll commit this patch later this week! If you haven't checked this
patch already, I urge you to test/check it out because it will have
significant impact on existing code and modules!
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:21:33 +0000 : Dries
Also, what do people think about the n.title being duplicated?
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:26:58 +0000 : chx
I won't lose any sleep because of duplicated titles...
------------------------------------------------------------------------
Tue, 19 Apr 2005 18:35:58 +0000 : killes(a)www.drop.org
Let me explain why I have chosen to duplicate the title (and also the
uid): If you look at all the queries in Drupal, you will find that most
of them only need the title and th uid of a node. That is, by
duplicating it, we save expensive joins on the node_revisions table.
Due to this fact, this patch is actually a performance improvement.
A note about updating contrib module:
Strictly speaking they wouldn't need to be updated. They only need to
if their authors decide that their info should be available for
revisioning. The upgrade path for forum.module in my update.inc patch
(+ the forum patch)
should show you what needs to be done.
I will write a note for the update page once the patch hits core.
------------------------------------------------------------------------
Sun, 24 Apr 2005 21:21:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_25.patch (60.38 KB)
Updated to cvs.
Dries: Based on some remarks in #drupal this is the last update I am
going to do. Apply it or won't fix it.
------------------------------------------------------------------------
Sat, 30 Apr 2005 03:42:39 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/revisions_25.patch.patch (528 bytes)
That's a big patch. I've only started looking through it. I noticed
one little typo, affecting updates. A patch to your last patch is
attached.
I'm running with the revision patch on my dev server now happily. I
like the concept.
What happens if you click 'stop' on your browser in the middle of a
MySQL "transaction"? I assume that kills the connection to MySQL, and
the lock is freed? But this then leaves changes only partially
applied?
What exactly does locking/unlocking the tables buy us in MySQL? I
don't see anywhere that we detect if an apply fails part way through,
and thus roll back...? What am I missing?
------------------------------------------------------------------------
Sat, 30 Apr 2005 07:11:28 +0000 : Dries
Jeremy: many of us are worried about the performance ramifications this
patch introduces. Early experiments showed a small performance
improvement (while a performance regression might be expected). More
performance reports from large sites like kerneltrap.org will certainly
help this patch. Mind to do a quick performance comparision and to
report back with some numbers? Thanks.
------------------------------------------------------------------------
Sat, 30 Apr 2005 12:38:02 +0000 : Jeremy(a)kerneltrap.org
Dries: I'm not running HEAD on kerneltrap, so this really isn't a
possibility. Furthermore, until I understand why we're locking tables,
I don't like it. The idea of revisions in their own tables is great.
The idea of locking tables to get (without any obvious benefit) there
really worries me.
------------------------------------------------------------------------
Sat, 30 Apr 2005 14:16:01 +0000 : killes(a)www.drop.org
@Jeremy: Thanks for looking at the patch! Also for catching the typo. :)
Did you try to upgrade your database? If yes, how did it go? One of
Dries' concerns is the complexity of the upgrade. How many nodes and
revisions did the db have?
About database locking: This part of the patch was created by Nick and
I simply continued to use it.
Maybe the code should rather be:
if(db_begin_transaction(array('{node}', '{node_revisions}',
'{watchdog}', '{sessions}', '{files}'))) {
db_query($node_query, $node_table_values);
db_query($revisions_query, $revisions_table_values);
db_commit_transaction();
...
}
The idea is probably to avoid two updates at the same time. I don't
think the locking helps if you abort the script at an inconvenient
time. Rollbacks aren't implemented in all mysql versions.
We could omit the db locking if deemed inappropriate. Maybe Nick can
explain his ideas behind this.
@Dries: I wonder who the "many of us" are. They certainly haven't
spoken to me. Moshe had some reservations about the upgrade path and
project module, but the time that project module abused revisions to
store issue updates was long ago and his reservations were resolved.
Nobody else (besides you of course and now Jeremy) has voiced
reservations in a way that was audible to me.
If you grep through the patch you will notice that there are only four
queries which have a join on the node_revisions table. Two of them are
in node_load and in the other cases the join replaced a join on the
node table. The two queries in node_load are the only ones that have
both a join on the node and the revisions query. Thus, loading of
individual nodes might become somewhat slower. All other queries will
be faster since the node table is now much smaller. Also, node loading
does not have to be slower, it depends on your node table. If you had
a lot of revisions and thus a large table, then the new scheme will
make your queries actually faster since we do not load the revisions
on each and every node load anymore. If you didn't have many revisions
your node_load migth be somewhat slower.
WRT to the update script Karoly pointed out that we could use multiple
insert queries instead one query per revision. This would probably
make the update somewhat faster. I am willing to work on this iff you
declare that you will commit the patch afterwards. I'd need to know if
this will work on pgsql and on all supported mysql versions before I
work on it.
About locking: Database locking is dog slow, at least on mysql. I was
using locks in an earlier version of the upgrade script but had to
remove it for (serious!) performance reasons.
------------------------------------------------------------------------
Mon, 09 May 2005 15:07:34 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_26.patch (46.45 KB)
Ok, another update, cause I need it myself.
I've left out the transaction stuff for now. It is in principle
unrelated to this patch and should be discussed elsewhere.
This also makes the patch smaller and easier to review (hint, hint).
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:09 +0000 : killes(a)www.drop.org
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_27.patch (39.05 KB)
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 21:23:06 +0000 : Dries
Got one error during the upgrade path:
ALTER TABLE {book} ADD PRIMARY KEY vid (vid)
FAILED
------------------------------------------------------------------------
Mon, 09 May 2005 21:26:19 +0000 : killes(a)www.drop.org
This had happend to me as well, when I tested this patch. The reason is
that for some reason the vid is not unique. Most likely there are some
entries with vid = 0 in there. Can you check which node types those
have? it always was an error in the test database. See:
http://drupal.org/node/7582#comment-20678
------------------------------------------------------------------------
Mon, 09 May 2005 21:27:06 +0000 : Dries
Actually, I got 2850 errors during the upgrade.
Some of these:
sprintf() [function.sprintf]: Too few arguments in
drupal-cvs/includes/database.inc on line 154.
Some of these:
Query was empty query: in drupal-cvs/includes/database.mysql.inc on
line 66.
And this:
Unknown table 'n' in field list query: SELECT n.nid, n.vid FROM node
INNER JOIN files f ON n.nid = f.nid in
drupal-cvs/includes/database.mysql.inc on line 66.
:-)
------------------------------------------------------------------------
Mon, 09 May 2005 21:29:19 +0000 : Dries
Or this:
user error: Unknown column 'log' in 'field list'
query: SELECT parent, weight, log FROM book WHERE nid = 1 in
drupal-cvs/includes/database.mysql.inc on line 66
------------------------------------------------------------------------
Mon, 09 May 2005 21:52:12 +0000 : Dries
The time required to generate my main page went from 902 ms (before
upgrade) to 2139 ms (after upgrade).
The time required to generate a forum listing (?q=forum/x) went from
1872 ms (before upgrade) to 2874 ms (after upgrade).
Maybe this is because my database is not consistent as the result of
the upgrade errors (yet I don't see any errors on the pages I
benchmarked).
------------------------------------------------------------------------
Tue, 10 May 2005 00:24:38 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_28.patch (53.47 KB)
Ok, let me get to this from the bottom to the top:
- my test runs indicated a different development wrt timing. If I had
gotten your results, I had stopped working on this long ago. So your
results are wrong for some reason.
- user error: Unknown column 'log' in 'field list'
Wasn't my day, the book patch got lost. It is contained now. First -R
the old patch, then apply this one.
- Unknown table 'n' in field list query:
Walkah found this, but I forgot to fix it. Fixed now.
- I've no idea where the other queries come from. I am hoping that
either your test db is borken or they are follow ups from the other
ones.
If you let me have your test db, I'll try some debugging.
Thanks for wasting your time, too.
------------------------------------------------------------------------
Tue, 10 May 2005 05:07:31 +0000 : Dries
I double-checked and the numbers don't seem to lie. I'll test some more
after work on another machine to make sure it is not platform-specific.
------------------------------------------------------------------------
Wed, 11 May 2005 03:32:47 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_29.patch (54.83 KB)
Ok, here I am again.
What I did:
1) Ask Dries to let me have drupal.org database
2) get 400MB of SQL inserts...
3) take 23 minutes to import said data
4) Remove all image and project nodes (don't want to install their
modules), 11765 nodes left
5) back up data
6) take tests on non-cached /node page (as anonymous user).
ab results:
-c 1 -n 25:
Requests per second: 1.29 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 663 775 179.7 689 1264
7) Do the same for the tracker page:
Requests per second: 0.83 [#/sec] (mean)
Total: 1182 1199 7.4 1199 1217
8) Apply my patch (rev. 28).
9) run db update and hold breath
10) update times out...
11) play back backup from 5)
12) wait
13) getting annoyed and removing cache, watchdog, and accesslog before
playing in backup.
14) wait again. Understand why Dries doesn't try this patch often.
Maybe a smaller DB would do for testing?
15) wait more. get really annoyed.
16) Set time limit to 18000 in update.php
17) try again
18) fails again before the second update is completed.
19) curse.
20) delete search stuff from db. Ooops, sooo much smaller...
21) import again, below 2 minutes...
22) rewrite to use extended insert. Found a bug.
23) still does not complete. Mysql logging to the rescue!
24) tid = 0? Not good.
25) Well, the update works fine till node 10834. 5595 nodes done, 6136
to go.
26) Writing shell based update script. Discovery: 24MB aren't enough.
Hopefully 64 are. Nope.
extended inserts for revisions are apparently not the brightest idea:
Huge memory consumption.
Hmm, no, all updates got through. Selecting the revisions to put them
into old_revisions table screwed it. Learned about CREATE TABLE
old_revisions SELECT syntax.
Yay! finally. 24 MB are just not enough the update.php script seems to
still break.
27) Benchmarks!
/node
Requests per second: 1.54 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 632 649 40.5 636 791
/tracker
Requests per second: 0.86 [#/sec] (mean)
Total: 1119 1165 65.8 1160 1461
Ok: So we get an improvement for many node_loads, but none for simple
selects from node.
More tests can be done.
28) roll new patch
Ain't Drupal fun?
------------------------------------------------------------------------
Wed, 18 May 2005 13:38:05 +0000 : Dries
I did another round of tests on _another_ machine and it is not looking
good:
Before upgrade After upgrade
?q= (main page) 218 ms/request 340 ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request
?q=about (book page) 375 ms/request 5400 ms/request
The upgrade process itself gave me a number of 'query was empty' and
'sprintf(): too few arguments' reports. Everything seems to work fine
though.
Looking at the ?q=about page, I see that the following query is
executed twice _and_ that each time, it take more than 2 seconds to
complete:
SELECT n.nid, n.title, b.parent, b.weight FROM node n INNER JOIN book b
ON n.vid = b.vid WHERE n.status = 1 AND n.moderate = 0 ORDER BY
b.weight, n.title;
--8 SHOW INDEX FROM book;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| book | 1 | book_parent | 1 | parent | A
| 92 | NULL | NULL | | BTREE | |
| book | 1 | nid | 1 | nid | A
| 369 | NULL | NULL | | BTREE | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)
The book module does not appear to have a primary key? Sounds like a
bad idea so I added one:
mysql> ALTER TABLE book ADD PRIMARY KEY nid (nid);
Query OK, 369 rows affected (0.02 sec)
Records: 369 Duplicates: 0 Warnings: 0
Next, I wanted to make the vid column a unique key in all node tables:
mysql> ALTER TABLE node ADD UNIQUE vid (vid);
Query OK, 20392 rows affected (0.81 sec)
Records: 20392 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
ERROR 1062: Duplicate entry '0' for key 2
mysql> ALTER TABLE forum ADD UNIQUE vid (vid);
Query OK, 10806 rows affected (0.10 sec)
Records: 10806 Duplicates: 0 Warnings: 0
As you can see, it fails for the book table which makes me believe
there is some inconsistent data ... I set out to fix that:
mysql> SELECT nid, COUNT(nid) AS vids FROM book GROUP BY vid HAVING
vids > 1;
+-----+------+
| nid | vids |
+-----+------+
| 871 | 2 |
+-----+------+
1 row in set (0.00 sec)
mysql> SELECT title FROM node WHERE nid = 871;
Empty set (0.00 sec)
mysql> DELETE FROM book WHERE nid = 871;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
Query OK, 368 rows affected (0.01 sec)
Records: 368 Duplicates: 0 Warnings: 0
Looks like everything is well now. Ran some new benchmarks:
Before upgrade After upgrade With
indices
?q= (main page) 218 ms/request 340 ms/request 336
ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request 1531
ms/request
?q=about (book page) 375 ms/request 5400 ms/request 475
ms/request
Unfortunately, we're still slower than the original code.
------------------------------------------------------------------------
Wed, 18 May 2005 21:53:31 +0000 : killes(a)www.drop.org
Dries, thanks for testing it again.
I do think the broken queries you observer have something to do with
the bad performance after the update. Please log the queries any I will
have a look at them. I've never seen any such queries.
My update script also tries to create the appropriate indices, but it
will of course fail if the database contains cruft. the indices for the
forum are probably missing, too.
I am still convinced that the patch is core worthy.
------------------------------------------------------------------------
Thu, 19 May 2005 04:36:09 +0000 : Dries
It wouldn't hurt if more people would benchmark this patch. The patch's
current performance worries me.
Did you check your watchdog messages after upgrading the drupal.org
database? Depending on your settings, errors might only be shown in
the watchdog. I'll look into the remaining glitches as time permits.
Thanks for your persistence in keeping this patch up-to-date. :)
------------------------------------------------------------------------
Thu, 19 May 2005 11:59:22 +0000 : killes(a)www.drop.org
Dries: Can you please let me have your updated database? I want to have
a look at it and try my own benchmarks with it.
And yes, if I did learn something on this project is how to be
persistant. ;)
------------------------------------------------------------------------
Fri, 24 Jun 2005 16:25:34 +0000 : killes(a)www.drop.org
Here is an idea that occurred to me:
The problem with the upgrade process is that keeping the existing
revisions requires a lot of work to do. This generates a huge amount of
sql queries for a large database and also requires a huge amount of
memory.
My suggestions is to let update.php only handle the basic upgrade, ie
without old revisions. An additional module could be created that would
implement a cron based approach to upgrading old revisions one node at a
time. it could expose a hook to let contrib modules do their own
upgrades.
Dries, what do you think? (I am writing "Dries" because he seems to be
the only one who is interested in getting this into core...)
------------------------------------------------------------------------
Fri, 24 Jun 2005 22:25:11 +0000 : Junyor
Killes:
I'm also interested in seeing this hit core. What about adding
something to legacy.module to do it?
------------------------------------------------------------------------
Sun, 26 Jun 2005 21:14:54 +0000 : chx
This is a sensible approach. Maybe this is the _only_ sensible approach.
I have a little problem though: while the conversion is running somehow
both revision handlers should be available.
------------------------------------------------------------------------
Sun, 26 Jun 2005 22:16:21 +0000 : killes(a)www.drop.org
hehe, one only has to whine and bug enough and one gets some feedback.
;)
@junyor: legacy.module would be a good place. my current idea is to
auto-enable it in update.php and then disable it again in legacy_cron
after all nodes are updated.. ;)
@chx: When somebody wants to look at revisions of a node that node
could be auto-updated.
The only problem are contrib modules: they's need to have some hook in
order to update their own data. When somebody looks at the revisions of
a node than cannot be updated because the contrib module in question has
no such hook, we can optionally let the user discard old revisions I
guess.
Dries, what do you think?
------------------------------------------------------------------------
Mon, 25 Jul 2005 16:48:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_30.patch (53.13 KB)
Sooooo; I've updated this patch once again.
Dries didn't like my idea of legacy updates so we will have an option
to discard old revisions in case the update should prove difficult.
Always keep a backup.
------------------------------------------------------------------------
Mon, 25 Jul 2005 17:31:12 +0000 : Bèr Kessels
can we please accept and commit this patch? We can iron out any issues
later. This patch is just far too big and complex to be
perfect-at-once.
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:22:50 +0000 : Dries
@Ber: no, we can't commit this patch blindly.
Data loss is a much bigger problem than a syntax error or other code
glitch. We can break Drupal but we can't break their data.
I've spent quite a bit of time testing the previous version of this
patch and noticed significant performance degradation.
Tell me, Ber, why can't you test this patch first?
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:40:37 +0000 : Bèr Kessels
I was not referring to not testing it. If it is just the upgrade path
that proves to be cumbersome, then why can we not fix it afterwards,
I.e. when everyone has had a good chance to look at it.
Testing such a huge patch requires a lot of work. something no-one just
does in his spare hours. We had discussions before to deal in a slightly
different way with big changes; to commit them quicker and to leave the
ironing out of any left overs for the community. I hooked into that
discussion here. For two reasons. One is that Gerhard has spent
numerous hours on maintaining this patch. The other one is that the
community can be of help with ironing out issues in such a large
change, much better than that Gerhard can do on his own.
And yes, dataloss is very bad, but no-one should loose data, when
he/she followed the instructions (backing up)...
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:00:38 +0000 : Dries
Ber: applying this patch takes 15 seconds. Whether I apply this patch
for you, or whether you apply it yourself, it will hardly reduce the
'testing cost'. The problem is that data loss can be subtle; it might
go unnoticed for a couple days. Make no mistake, I'd like to see this
patch committed ASAP, but it warrants some testing. Let's
test/benchmark it and commit it.
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:55:03 +0000 : drumm
The upgrade script already takes a long time to execute and does not
provide feedback to the user about how it is doing. I plan on making
the upgrade script able to spread the updates across multiple page
views and give user feedback showing that progress is being made. This
will hopefully make the speed of this update a moot point.
I am working on this for my dayjob, CivicSpace, so it should get done
"real soon now", but it should be expected to take awhile (smallish
number of weeks). Please make comprimises if necessary.
------------------------------------------------------------------------
Wed, 27 Jul 2005 17:38:57 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_31.patch (48.78 KB)
@drumm: the timeout isn't usually the problem. Memory consumption is.
I'll look into doing the updates in chunks of 1000 nodes or so.
Clousseau found a bug in the patch, updated.
------------------------------------------------------------------------
Thu, 28 Jul 2005 18:28:08 +0000 : moshe weitzman
Even if the upgrade issues are resolved, we have to figure out why this
patch is slowing down drupal (according to benchmarks posted here).
------------------------------------------------------------------------
Fri, 29 Jul 2005 07:51:45 +0000 : Dries
We can't proceed without some additional benchmarking. I already
benchmarked it on two machines (see comments above), and I'll benchmark
it again after August 10. The performance impact of this patch worries
me so if two other people could test and benchmark this patch
extensively, that would come a long way. August 10 is close to the
code freeze so some help is necessary.
If Gerhard/killes reviewed/tested one of your patches, it is time to
return the favor ...
------------------------------------------------------------------------
Fri, 29 Jul 2005 15:40:31 +0000 : Jose A Reyero
After looking at the patch I'm not really surprised this slows down
everything. I thought that the reason why we wanted revisions in their
own table was in the first place to have a simpler -and faster- node
table. But this patch:
- Adds fields and complexity not only to the node table but also to all
the main tables for data
- Needs additional joins just to retrieve single nodes (which is done
many times for a typical main page)
- Does away with encapsulation of the revision functionality requiring
other modules to handle revissions related data.
So please, please, please:
- Make revisions table only needed when actually using revisions
- Keep that old nice thing of hiding revisions system from other
modules
- And what's so bad with serializing data anyway? I agree that usually
there are better ways to store data. But this is one of the cases where
serialization does make sense. Not the current huge serialized field,
but only a new table with one serialized node per row.
- In general, do it much simpler. This is way too complex and it
removes more functionality than it adds.
In just four words: keep it simple, please.
------------------------------------------------------------------------
Fri, 29 Jul 2005 21:25:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_32.patch (51.27 KB)
Patch updated (v31 was buggy, and there was an update to updates.inc).
I've had a discussion with Jose and he will maybe do some testing over
the weekend. John is also doing some testing as we speak.
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:06:59 +0000 : Nick Nassar
It has been exactly one year since I first submitted this patch.
You know what that means: *old patch party in IRC!* Virtual beers on
me!
w00t!
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:08:33 +0000 : Jose A Reyero
Hi again.
I had a talk with killes -I have to say he enlightned me about this-
and also did some testing. Sorry but I dont have a good set up to
provide benchmarks.
Here are my conclusions -and also some explanation for the varied
results abt performance:
- This is faster for simple node listings
- This is slower for full node/load
....thus the performance increase/decrease for a given page should
depend on the relation between the number of simple node listings
(usually blocks) and the number of nodes displayed in the main page.
I've also experienced some trouble with attachments -not re-created
when creating a new revision- and book relations, same.
I'm not sure I like this new "feature" of creating a new revision every
time you rollback an old one -in Drupal 4.6 this seems not to happen.
Besides these details, I think this patch is good enough -and quite a
powerful approach actually- though I still feel it's bit too complex
when simply dealing with nodes -without revisions-, which is what we do
most of the time, and complexity will be even higher with contrib
modules using still more tables, not to mention flexinode -versioning
info will have to be scattered everywhere in the db.
So this is not a +1 nor -1. On one side I think this is powerful. On
the other side I feel this is too complex and really, it's more than I
need.
I would be happy with a 'revision' table having: nid, vid,
data(serialized), that you can forget about when not using revisions.
------------------------------------------------------------------------
Sat, 30 Jul 2005 19:16:41 +0000 : Nick Nassar
On a much more serious note that my last comment, unless I'm missing
something, you've introduced a subtle bug into this patch since I left
it.
A and B are both updating revision 5 of a node.
A calls db_next_id() and gets a revision id of 6
B calls db_next_id() and gets a revision id of 7
B updates the node's revision to 7
B adds revision 7 to the revision table
A updates the node's revision to 6
A adds revision 6 to the revision table
Now, you have a situation where the current revision (6) isn't actually
the last revision(7). For some uses, that's not a big deal, but in the
case of wikis, which is what I originally hoped to use this patch for,
it creates problems.
------------------------------------------------------------------------
Mon, 01 Aug 2005 00:04:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_33.patch (51.27 KB)
Updated the patch. updates.inc was changed and chx found a typo.
@Nick: I don't think the issue is due to this patch. Drupal has
currently no mechanism to not let people commit a new version of the
same node, whether with or without revisions. drupal_next_id ensures
that the revision IDs are unique and the later revisions will be the
current one. Which revision is the current one does not depend on order
of the version ID, btw.
@Jose: what exactly were your problems with uploads? what do I need to
do to reproduce?
------------------------------------------------------------------------
Mon, 01 Aug 2005 11:33:56 +0000 : Jose A Reyero
Sorry, I dont know what I did the other day but I cannot reproduce it
either :-)
I applied this new patch. Problems I'm having now (Testing with
"stories")
- Book outline is gone after creating a new revision
- I cannot delete attachments (not sure it is related to your patch or
to something else)
- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab
Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?
------------------------------------------------------------------------
Sun, 14 Aug 2005 16:03:41 +0000 : killes(a)www.drop.org
Jose, thanks a lot for testing!
"- Book outline is gone after creating a new revision"
Sounds like a bug, it should not.
"- I cannot delete attachments (not sure it is related to your patch or
to something else)"
Another bug
"- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab"
Log messages are possible for nodes of type book and page as well as
all nodes that are put into the book. Log messages can be added for any
node type, the module author should implement those.
"Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?"
No, the db update should be the same.
I will try to fix the bugs you found.
WRT the db update I've got an announcement to make: I will not migrate
old revisions.
Why?
1) it is a rather complicated process that requires a lot of memory and
is likely to give people on cheap hosting trouble.
2) testing it takes a lot of time.
3) I don't personally need it
I will still move old revisions to an old_revisions table from where
they can be converted by a later update or a custom script.
------------------------------------------------------------------------
Tue, 16 Aug 2005 21:34:43 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_34.patch (52.92 KB)
Ok, here is incarnation No. 34 of this patch. Dries agrees with me that
we can put the revision rescue operation into a later update, possible
using Drumm's installer.
The patch fixes the issue with the file uploads not being deletable, I
could not reproduce the problem with the outlines.
------------------------------------------------------------------------
Sat, 20 Aug 2005 13:33:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/scripts.tar.gz (4.39 KB)
I've started to do some performance testing.
I created a script that creates nodes that have revisions according to
the new system. A node has a 50% chance to have a revision and then an
equal chance fro between 1 and 25 revisions. I've created a database
with 50 nodes and 500 comments, 7 vocabs and 100 total taxo terms.
I've also created a script to convert these revisions to the old format
(much easier than the other way around). Initial tests didn't show a
significant change for /node. I'll try with more nodes and /tracker.
Attached are the two scripts in a tarball. They need to be run against
the patched install.
Also included are the two preliminary results.
1
0
20 Aug '05
Issue status update for
http://drupal.org/node/29082
Post a follow up:
http://drupal.org/project/comments/add/29082
Project: Drupal
Version: cvs
Component: postgresql database
Category: tasks
Priority: critical
Assigned to: Cvbge
Reported by: Cvbge
Updated by: Cvbge
Status: patch (code needs review)
I need people to test mysql updates, as I changed mysql part too.
Cvbge
Previous comments:
------------------------------------------------------------------------
Wed, 17 Aug 2005 21:25:43 +0000 : Cvbge
Postgres, since 6.4, has version() function so you can check db version
from SQL. This can be used in updates.inc for example for conditional
dropping of the column (which is available since 7.3, not 7.4 as stated
in updates.inc).
I need to check how the version string is returned in different
versions before making changes.
------------------------------------------------------------------------
Thu, 18 Aug 2005 10:08:25 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version2.diff (1.01 KB)
Here's a function returning postgres db version. I'm not sure if it fits
in updates.inc, maybe it should be somewhere else.
------------------------------------------------------------------------
Fri, 19 Aug 2005 14:33:32 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version_and_other.diff (2.76 KB)
Added 2 new functions to add or change column definition. The take care
about mysql/pgsql. Code not tested, will test and update old code when
I hear that this has a chanse to go in ;)
------------------------------------------------------------------------
Fri, 19 Aug 2005 17:03:39 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/updates.inc-pgsql_fix.diff (9.97 KB)
Still more work needed.
------------------------------------------------------------------------
Sat, 20 Aug 2005 14:28:13 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/screenshot_3.jpg (119.33 KB)
With attached patch I could make all updates without errors (my LOCKs
patch was needed - http://drupal.org/node/22911) There are still some
issues that should be addressed (see the patch). Also mysql update is
not tested - please test.
There are some problems when updating from 4.6 to cvs. Maybe the update
should be made by 4.6 base itself?
So I've started with fresh 4.6.3.
After changing the base to cvs, patching with the LOCKs patch and
simple debug helper patch, on update.php?op=update I get following
errors:
warning: pg_query(): Query failed: ERROR: column "cache" of relation
"sessions" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71.
Warning: pg_query(): Query failed: ERROR: column "referer" of relation
"watchdog" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124545039 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039)
)
Fatal error: ERROR: column "referer" of relation "watchdog" does not
exist query: INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039) in /var/www/dt/d/includes/database.pgsql.inc
on line 94
Warning: Unknown(): A session is active. You cannot change the session
module's ini settings at this time. in Unknown on line 0
As we can see the problem is that there is no cache column is session
table - that's update_130. Also there's no column 'referer' in watchdog
table - that's update_142. So I've commented out update_130 and 142,
added the columns by hand in the db and started again from the
begining.
After that I got another error:
warning: pg_query(): Query failed: ERROR: column "access" of relation
"users" does not exist in /var/www/dt/d/includes/database.pgsql.inc on
line 71.
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124547147 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => UPDATE users SET access = 1124547147 WHERE uid = 1
[9] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "access" of relation "users" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124547147)
)
user error:
query: UPDATE users SET access = 1124547147 WHERE uid = 1 in
/var/www/dt/d/includes/database.pgsql.inc on line 94.
That's update_136. So, disable update_136 and do it by hand and start
again.
After all that, when accessing update.php?op=update I noticed no
errors. After pressing Update all updates were applied without any
errors. But when accessed main page I got a screen as on attached
screenshot.
------------------------------------------------------------------------
Sat, 20 Aug 2005 14:32:27 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/updates.inc-pgsql_fix4.diff (12.86 KB)
And the mentioned patch.
Critical because otherwise upgrades (at least for postgres) are
impossibile.
1
0
20 Aug '05
Issue status update for
http://drupal.org/node/29082
Post a follow up:
http://drupal.org/project/comments/add/29082
Project: Drupal
Version: cvs
Component: postgresql database
Category: tasks
-Priority: normal
+Priority: critical
Assigned to: Cvbge
Reported by: Cvbge
Updated by: Cvbge
Status: patch (code needs review)
Attachment: http://drupal.org/files/issues/updates.inc-pgsql_fix4.diff (12.86 KB)
And the mentioned patch.
Critical because otherwise upgrades (at least for postgres) are
impossibile.
Cvbge
Previous comments:
------------------------------------------------------------------------
Wed, 17 Aug 2005 21:25:43 +0000 : Cvbge
Postgres, since 6.4, has version() function so you can check db version
from SQL. This can be used in updates.inc for example for conditional
dropping of the column (which is available since 7.3, not 7.4 as stated
in updates.inc).
I need to check how the version string is returned in different
versions before making changes.
------------------------------------------------------------------------
Thu, 18 Aug 2005 10:08:25 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version2.diff (1.01 KB)
Here's a function returning postgres db version. I'm not sure if it fits
in updates.inc, maybe it should be somewhere else.
------------------------------------------------------------------------
Fri, 19 Aug 2005 14:33:32 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version_and_other.diff (2.76 KB)
Added 2 new functions to add or change column definition. The take care
about mysql/pgsql. Code not tested, will test and update old code when
I hear that this has a chanse to go in ;)
------------------------------------------------------------------------
Fri, 19 Aug 2005 17:03:39 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/updates.inc-pgsql_fix.diff (9.97 KB)
Still more work needed.
------------------------------------------------------------------------
Sat, 20 Aug 2005 14:28:13 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/screenshot_3.jpg (119.33 KB)
With attached patch I could make all updates without errors (my LOCKs
patch was needed - http://drupal.org/node/22911) There are still some
issues that should be addressed (see the patch). Also mysql update is
not tested - please test.
There are some problems when updating from 4.6 to cvs. Maybe the update
should be made by 4.6 base itself?
So I've started with fresh 4.6.3.
After changing the base to cvs, patching with the LOCKs patch and
simple debug helper patch, on update.php?op=update I get following
errors:
warning: pg_query(): Query failed: ERROR: column "cache" of relation
"sessions" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71.
Warning: pg_query(): Query failed: ERROR: column "referer" of relation
"watchdog" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124545039 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039)
)
Fatal error: ERROR: column "referer" of relation "watchdog" does not
exist query: INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039) in /var/www/dt/d/includes/database.pgsql.inc
on line 94
Warning: Unknown(): A session is active. You cannot change the session
module's ini settings at this time. in Unknown on line 0
As we can see the problem is that there is no cache column is session
table - that's update_130. Also there's no column 'referer' in watchdog
table - that's update_142. So I've commented out update_130 and 142,
added the columns by hand in the db and started again from the
begining.
After that I got another error:
warning: pg_query(): Query failed: ERROR: column "access" of relation
"users" does not exist in /var/www/dt/d/includes/database.pgsql.inc on
line 71.
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124547147 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => UPDATE users SET access = 1124547147 WHERE uid = 1
[9] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "access" of relation "users" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124547147)
)
user error:
query: UPDATE users SET access = 1124547147 WHERE uid = 1 in
/var/www/dt/d/includes/database.pgsql.inc on line 94.
That's update_136. So, disable update_136 and do it by hand and start
again.
After all that, when accessing update.php?op=update I noticed no
errors. After pressing Update all updates were applied without any
errors. But when accessed main page I got a screen as on attached
screenshot.
1
0
20 Aug '05
Issue status update for
http://drupal.org/node/29082
Post a follow up:
http://drupal.org/project/comments/add/29082
Project: Drupal
Version: cvs
Component: postgresql database
Category: tasks
Priority: normal
Assigned to: Cvbge
Reported by: Cvbge
Updated by: Cvbge
-Status: patch (code needs work)
+Status: patch (code needs review)
Attachment: http://drupal.org/files/issues/screenshot_3.jpg (119.33 KB)
With attached patch I could make all updates without errors (my LOCKs
patch was needed - http://drupal.org/node/22911) There are still some
issues that should be addressed (see the patch). Also mysql update is
not tested - please test.
There are some problems when updating from 4.6 to cvs. Maybe the update
should be made by 4.6 base itself?
So I've started with fresh 4.6.3.
After changing the base to cvs, patching with the LOCKs patch and
simple debug helper patch, on update.php?op=update I get following
errors:
warning: pg_query(): Query failed: ERROR: column "cache" of relation
"sessions" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71.
Warning: pg_query(): Query failed: ERROR: column "referer" of relation
"watchdog" does not exist in /var/www/dt/d/includes/database.pgsql.inc
on line 71
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124545039 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039)
)
Fatal error: ERROR: column "referer" of relation "watchdog" does not
exist query: INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "cache" of relation "sessions" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124545039) in /var/www/dt/d/includes/database.pgsql.inc
on line 94
Warning: Unknown(): A session is active. You cannot change the session
module's ini settings at this time. in Unknown on line 0
As we can see the problem is that there is no cache column is session
table - that's update_130. Also there's no column 'referer' in watchdog
table - that's update_142. So I've commented out update_130 and 142,
added the columns by hand in the db and started again from the
begining.
After that I got another error:
warning: pg_query(): Query failed: ERROR: column "access" of relation
"users" does not exist in /var/www/dt/d/includes/database.pgsql.inc on
line 71.
Bad query:
Array
(
[0] => SELECT u.*, s.* FROM users u INNER JOIN sessions s ON u.uid
= s.uid WHERE s.sid = 'd5b1aa940eaeab37fe690ce9cc500396' AND u.status <
3 LIMIT 1 OFFSET 0
[1] => SELECT r.rid, r.name FROM role r INNER JOIN users_roles ur
ON ur.rid = r.rid WHERE ur.uid = 1
[2] => SELECT * FROM access WHERE status = 1 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[3] => SELECT * FROM access WHERE status = 0 AND type = 'host' AND
LOWER('127.0.0.1') LIKE LOWER(mask)
[4] => SELECT data, created, headers, expire FROM cache WHERE cid =
'variables'
[5] => SELECT COUNT(pid) FROM url_alias
[6] => SELECT name, filename, throttle, bootstrap FROM system WHERE
type = 'module' AND status = 1
[7] => UPDATE sessions SET uid = 1, cache = 0, hostname =
'127.0.0.1', session = 'messages|a:0:{}', timestamp = 1124547147 WHERE
sid = 'd5b1aa940eaeab37fe690ce9cc500396'
[8] => UPDATE users SET access = 1124547147 WHERE uid = 1
[9] => INSERT INTO watchdog (uid, type, message, severity, link,
location, referer, hostname, timestamp) VALUES (1, 'php', 'pg_query():
Query failed: ERROR: column "access" of relation "users" does not
exist in /var/www/dt/d/includes/database.pgsql.inc on line 71.', 2, '',
'/dt/d/update.php?op=update', 'http://localhost/dt/d/update.php',
'127.0.0.1', 1124547147)
)
user error:
query: UPDATE users SET access = 1124547147 WHERE uid = 1 in
/var/www/dt/d/includes/database.pgsql.inc on line 94.
That's update_136. So, disable update_136 and do it by hand and start
again.
After all that, when accessing update.php?op=update I noticed no
errors. After pressing Update all updates were applied without any
errors. But when accessed main page I got a screen as on attached
screenshot.
Cvbge
Previous comments:
------------------------------------------------------------------------
Wed, 17 Aug 2005 21:25:43 +0000 : Cvbge
Postgres, since 6.4, has version() function so you can check db version
from SQL. This can be used in updates.inc for example for conditional
dropping of the column (which is available since 7.3, not 7.4 as stated
in updates.inc).
I need to check how the version string is returned in different
versions before making changes.
------------------------------------------------------------------------
Thu, 18 Aug 2005 10:08:25 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version2.diff (1.01 KB)
Here's a function returning postgres db version. I'm not sure if it fits
in updates.inc, maybe it should be somewhere else.
------------------------------------------------------------------------
Fri, 19 Aug 2005 14:33:32 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/drupal_pg_version_and_other.diff (2.76 KB)
Added 2 new functions to add or change column definition. The take care
about mysql/pgsql. Code not tested, will test and update old code when
I hear that this has a chanse to go in ;)
------------------------------------------------------------------------
Fri, 19 Aug 2005 17:03:39 +0000 : Cvbge
Attachment: http://drupal.org/files/issues/updates.inc-pgsql_fix.diff (9.97 KB)
Still more work needed.
1
0
Issue status update for
http://drupal.org/node/7582
Post a follow up:
http://drupal.org/project/comments/add/7582
Project: Drupal
Version: cvs
Component: node system
Category: bug reports
Priority: normal
Assigned to: killes(a)www.drop.org
Reported by: killes(a)www.drop.org
Updated by: killes(a)www.drop.org
Status: patch (code needs review)
Attachment: http://drupal.org/files/issues/scripts.tar.gz (4.39 KB)
I've started to do some performance testing.
I created a script that creates nodes that have revisions according to
the new system. A node has a 50% chance to have a revision and then an
equal chance fro between 1 and 25 revisions. I've created a database
with 50 nodes and 500 comments, 7 vocabs and 100 total taxo terms.
I've also created a script to convert these revisions to the old format
(much easier than the other way around). Initial tests didn't show a
significant change for /node. I'll try with more nodes and /tracker.
Attached are the two scripts in a tarball. They need to be run against
the patched install.
Also included are the two preliminary results.
killes(a)www.drop.org
Previous comments:
------------------------------------------------------------------------
Wed, 05 May 2004 16:25:27 +0000 : killes(a)www.drop.org
Currently all node revisions are stored in a serialized field in
node.table and retrieved for _each_ page view although they are rarely
needed. However, we have agreed that serializing data is bad and that
we should try to keep the memory foot print pf Drupal small.
Therefore I propose to create a separate revisions table which would be
in principle identical to the node table, only that it could have
several old copies of the same node. Extra data added by other modules
could be added in a serialized field unless we find a better solution.
------------------------------------------------------------------------
Wed, 05 May 2004 17:06:35 +0000 : jhriggs
I too think the serialized approach is less than desirable, but here's
an alternative. This would likely take some considerable rework in
core and contrib, but the following is how we handle similar types of
situations in our databases at work. It is more elegant that a
separate table, and avoids the (almost exact) duplication of a table.
Instead of separate tables, keep all revisions of nodes in the node
table as follows:
* add field: active (0/1 or Y/N)
* add field: revision
* every revision of a node is stored in the node table; however, only
one revision can be active at any given time
* nid can no longer be unique -- primary/unique key becomes (nid,
active)
* any time a node is loaded, updated (without revision), etc., the
active version is used.
Thoughts?
------------------------------------------------------------------------
Wed, 05 May 2004 17:57:48 +0000 : killes(a)www.drop.org
I am not opposed to your scheme, but I want to stress the following:
* Duplicating a table's structure is not bad (IMHO) as long as the
content is different.
* having two tables will allow us to have a rather small node table.
This is (maybe) a performance gain.
------------------------------------------------------------------------
Wed, 05 May 2004 18:37:29 +0000 : jhriggs
I don't necessarily think that duplicating a table's structure is _bad_.
It just seems to be wasteful and a pain to maintain. (Every change to
the node table is made twice...easy to do, but also easy to miss
perhaps.)
As for performance, as long as nid and the active indicator are
indexed, there shouldn't be any performance loss. Also, archiving an
old version when making a new revision will be much simpler: just
change the active indicator rather than copying an entire node to
another table (and ensuring everything gets copied...again a potential
maintainance issue).
To be honest, I would just like to see the serialized data go away,
regarless of what approach is taken.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:49:33 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_07-30-2004.p… (10.47 KB)
I'm interested in using Drupal for a large scale wiki-type project. In
order to do this, I need revisions to be in their own table.
Attached is a patch to do just that. Most of the changes are pretty
self explanatory. Spreading out node data across two tables meant that
I had to add database functions to do locking/transactions. Without
this, race conditions in which the database becomes corrupted are
possible.
------------------------------------------------------------------------
Fri, 30 Jul 2004 19:54:36 +0000 : Nick Nassar
Oh yeah... The patch is a diff against Drupal CVS
------------------------------------------------------------------------
Sat, 31 Jul 2004 00:00:08 +0000 : Anonymous
Gerhard speaking.
Nick, thanks a lot for your nice patch! It saves me a great deal of
labour. I looked through it and immediately liked it. You not only put
the old revisions into a new table but also the current one. Do you
have an estimate how much more expensive the additional join is?
Besides a few minor coding style issues I found a major one: Just a few
hours before you uploaded your patch JonBob's node access patch hit
core. That means your patch won't apply anymore as all the queries you
change have been changed. Can I bug you to update your patch?
------------------------------------------------------------------------
Sat, 31 Jul 2004 01:11:59 +0000 : Anonymous
Also I think that your upgrade path loses existing revisions.
------------------------------------------------------------------------
Sat, 31 Jul 2004 02:39:12 +0000 : drumm
I think this is the proper way to do things. No columns are duplicated,
there is no serialized data, and only the fields that are logically
revised are stored. Nothing jumped out at me as a way to have my node
modules be able to keep a table of revisions of additional fields. I'm
guessing this could be done within the confines of _insert and _update.
Assuming the upgrade path works and modules can extend it I give it a
+1.
------------------------------------------------------------------------
Sat, 31 Jul 2004 14:40:15 +0000 : Nick Nassar
It figures that just as I finish a big patch, another patch comes along
and breaks it. Oh well, it should be a pretty easy to fix. I'll work on
it.
Fixing the upgrade path to keep revisions should be fairly painless.
I found another issue that needs to be fixed before this patch gets
merged. There format of a node needs to be stored for each revision.
Otherwise, for modules that store a format for the nodes, such as page
and book, if you write one revision in PHP and the next in HTML, the
PHP revision will be displayed as HTML. This is part of a larger issue
of how node modules should store revisions of additional fields. I
think each module that wants to do this should create another table
with (nid, revid) as the primary key. Just as when they want to add
fields to a node they create another table with nid as the primary key.
As far as performance goes, for sites that make heavy use of revisions,
an extra join on primary keys is going to be a lot faster than grabbing
all of the revisions from that database everytime. We would need to run
benchmarks to determine is the overall difference in speed is for an
average site is a gain or a loss. I'm guessing it's very minor either
way.
------------------------------------------------------------------------
Mon, 23 Aug 2004 13:55:49 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_08-23-2004.p… (10.92 KB)
Here's an updated patch against CVS that puts revisions in their own
table, provides an upgrade path, and fixes the format related bugs in
the last patch.
Hopefully, this can make it into CVS as soon as the freeze is over.
------------------------------------------------------------------------
Mon, 23 Aug 2004 14:10:39 +0000 : moshe weitzman
Interesting patch ... drumm's question is still outstanding. how do
modules store revisions of their fields? Are they expected to manage
this on their own? Thats not how it works today.
As an aside, i am seeing profile_ fields in my node.revisions column.
One could argue that those need not be saved. They pertain to the node
author, not to the node itself.
------------------------------------------------------------------------
Mon, 23 Aug 2004 16:14:39 +0000 : Nick Nassar
Having modules be responsible for storing revisions of their own fields
is a side-effect of storing revision data in tables. There's really no
way around it. However, revisions generally don't make sense for node
types that don't have PHP/HTML content, such as polls. I think it's
going to be a pretty rare scenario for a new node type to want another
field to change per-revision, so it's a pretty good trade-off.
Storing fields that shouldn't be part of revisions, such as the
profile_ fields, is a side-effect of storing revisions as serialized
objects. Applying this patch will free up that wasted space. :)
------------------------------------------------------------------------
Mon, 23 Aug 2004 17:20:57 +0000 : Anonymous
There should be a hook that let's the module choose whether it supports
history. This way a module author can prevent the user from doing
something that may break his module or just cause undefined behavior.
If the module doesn't support history then don't let the user/admin
choose to add history to nodes of that type.
Craig
------------------------------------------------------------------------
Mon, 23 Aug 2004 19:23:29 +0000 : Nick Nassar
I agree, there should be an API change to make specifying support for
revisions easier. In the interests of keeping patches small and keeping
to one change per patch, I think the API change should be a separate
issue.
A sort of ad-hoc API to decide whether or not a module supports
revisions by default already exists. Instead of having a hook, modules
set the default value of the "Create new revision" field in the edit
form. The admin can change this option in
admin/node/configure/defaults. This patch doesn't change that.
Revisions are broken for node types that have their own database
structure, like polls, even when storing them as serialized objects.
This patch doesn't change that, either.
------------------------------------------------------------------------
Tue, 26 Oct 2004 02:35:06 +0000 : moshe weitzman
I'm guessing that someone is going to have to demonstrate that this
patch performs as well as current drupal before it gets comitted. i
think this patch is a few benchmarks from being comitted.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:04:09 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:16 +0000 : Nick Nassar
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Wed, 27 Oct 2004 01:05:26 +0000 : Nick Nassar
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004.p… (11 KB)
I ran some really unscientific benchmarks, and it looks like this patch
has a negligible affect on performance.
I used apache bench and the database from theregular.org, which doesn't
contain any revisions (worst case scenario for this patch) and contains
several hundred nodes. Both the patched and unpatched versions hovered
between 2.36 and 2.38 requests per second.
The command I used was:
ab -n50 -C 'PHPSESSID=b01a9f92880ef215b0ed6f1314a5eba2'
http://192.168.0.100/
An updated patch that should apply to CVS is attached.
------------------------------------------------------------------------
Mon, 15 Nov 2004 05:05:30 +0000 : elias1884
please overthink the revision system default workflow as well. don't
look at the revision system as an isolated system but as a part of the
whole workflow system!
if you combine revisions with the moderation queue the most logic
default workflow would be like that:
auth user creates node (revision #0)
admin approves the node (status = 1, moderation = 0)
=> node publicly available
auth user finds typo and changes node (revision #1, status = 0,
moderation = 1)
-------
what happens at that point at the moment is, that the node is not
accessible anymore at all until the new revision is approved by admin.
of course the new revision should not go online until reviewed and
approved, this is absolutely correct, but there is no reason to not
take the old revision offline, since it was already approved and should
therefore be online until the new revision is approved. it is not
practical if a node disappears only because the author corrected a
typo.
-------
admin approves the node (status = 1, moderation = 0)
eventhough I first thought a plain boolean active field would not be
capable of providing that functionality if finally came to the
conclusion, that it can. The only thing to do is to not set that bit,
when a new revision is created, but when it is approved (in case
moderation is activated under default workflow). Every revision should
have its own moderation, status and active field and on approval they
are set like this (status=1, moderation=0, active=Y).
When you wanna rollback to an old revision, you can chose between all
revisions that already have the moderation bit set back to 0 again and
the published to 1. There should be an extra permission for rollback!
another concern that I have about the default workflow is, that users
can't see the content, they have just created, when moderation is
enabled. Eventhough, there is a big fat "submission accepted" presented
after submissions, unexperienced users tend to question the information
those stupid tincans give them, if they can't find their content
afterwards. Many users are really lazy bastards and they don't even
read the status messages. The best feedback about whether his story was
submitted successfully or not of course is, if he can find the story
somewhere on the site, maybe with a status message on top of it,
mentioning, that the content is currently not publicly available since
it has not been approved yet. there should be a my content section
under my account, like somebody is trying to do with the workspace
module I guess.
so my suggestion is to make (status=0, moderation=1) still available
for the creator under a my content section somewhere!
------------------------------------------------------------------------
Wed, 24 Nov 2004 04:21:18 +0000 : Nick Nassar
I agree. The current workflow for moderation queues and revisions needs
to change, but this patch isn't the place for it. The patch is already
too big, and it only does the backend stuff.
Instead of adding more to this patch and making it take even longer to
get into core, would you mind creating a new issue for your UI
suggestions, so the those changes can be added as a separate patch?
Thanks,
Nick
------------------------------------------------------------------------
Sat, 11 Dec 2004 12:26:02 +0000 : Dries
This patch is _much_ needed so I'd love to see someone revive it. In
order for this patch to be accepted, the following needs to be done:
* Update this patch to CVS HEAD.
* Rename revid to vid.
* Rename node_rev to node_revisions.
* Rename node_rev.changed to node_revisions.timestamp.
* Rename $rnode to $revision.
* Fix the coding style to match Drupal's: proper spacing, single quotes
where possible, proper variable names.
* Benchmark this patch with a large database with enough revisions.
I'd be happy to benchmark this on my local copy of the drupal.org
database.
* The book.log field should probably move to the node_revisions table.
This can be done in a separate patch.
* Investigate whether transactions are well-supported.
------------------------------------------------------------------------
Mon, 13 Dec 2004 00:25:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/Drupal-Improved_Revision_Schema_10-26-2004-r… (11.02 KB)
I've worked a bit on the patch (coding style issues as mentioned by
Dries). One thing I noticed is that the patch uses REPLACE. IIRC this
needs to be chagned to "UPDATE, if fail INSERT" for pgsql
compatibility.
Nick, are you still interested in working on that patch? I'd like to
know how it works on your site and work on getting it into core.
------------------------------------------------------------------------
Mon, 13 Dec 2004 12:33:08 +0000 : Dries
Gerhard: your patch does not apply.
------------------------------------------------------------------------
Mon, 13 Dec 2004 17:10:12 +0000 : killes(a)www.drop.org
Yes, I know, that was the same version as I mailed to you earlier.
------------------------------------------------------------------------
Mon, 13 Dec 2004 21:02:06 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions.patch (52.96 KB)
Ok, upüdated the patch to cvs.
------------------------------------------------------------------------
Tue, 14 Dec 2004 08:58:36 +0000 : Dries
Some more comments:
* db_begin_transaction() and db_end_transaction() do not belong in
database.inc, but in database.mysql.inc and database.pgsql.inc
respectively.
* The node module calls node_revisionsision_list() which is not
defined. (Fxed that on my local copy.)
* Do db_begin_transaction() and db_end_transaction() deprecate Jeremy's
table locking patch?
* The upgrade path assigns the wrong user ID to each revision.
* The upgrade path assigns the wrong date to each revision (that or a
node's revision page shows the wrong usernames/dates).
* The coding style needs a bit of work, but we can worry about that
later.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:34:44 +0000 : Nick Nassar
If you need any help getting those things fixed, just let me know.
------------------------------------------------------------------------
Tue, 14 Dec 2004 17:50:30 +0000 : Nick Nassar
How this relates to Jeremy's node locking patch:
There was lots of discussion, and node locking was decided against
because from an end user point of view you never want a node to be
locked. He's now advocating for a much simpler patch that warns users
if their changes will overwrite someone elses. That patch still has a
race condition, which might be fixed using db_begin_transaction().
http://drupal.org/node/6025
------------------------------------------------------------------------
Tue, 14 Dec 2004 22:26:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_0.patch (55.96 KB)
Here is an updated patch that tries to address Dries concerns.
------------------------------------------------------------------------
Wed, 15 Dec 2004 08:32:50 +0000 : Dries
Attachment: http://drupal.org/files/issues/revisions-bug.png (76.06 KB)
It didn't fix the aforementioned bugs. See attached screenshot.
------------------------------------------------------------------------
Thu, 06 Jan 2005 20:15:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_1.patch (51.77 KB)
Ok, here is a new version. Dries and myself worked hart at it, so please
have a look.
what is still missing
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
------------------------------------------------------------------------
Wed, 19 Jan 2005 21:43:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_2.patch (49.49 KB)
Here is an updated patch. We discussed to keep the current title in node
module and also in the revisiosn table. This is content duplication but
will save many joins as many queries only need the title of a node.
Discussion is welcome.
------------------------------------------------------------------------
Wed, 19 Jan 2005 23:33:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_3.patch (29.93 KB)
I've implemented the aforementioned solution. This makes the patch much
smaller. The patch now also removes taxonomy_node_has_term() which
wasn't used anywhere. I'd really apprciate if some people could test
drive the patch. It will be another huge improvement for 4.6.
------------------------------------------------------------------------
Thu, 20 Jan 2005 00:05:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_4.patch (30 KB)
Another revision. Steven didn't like my literal $node->vid in queries.
------------------------------------------------------------------------
Thu, 20 Jan 2005 01:10:50 +0000 : killes(a)www.drop.org
- database upgrades for the core modules with an own table
- contrib modules need an upgrade too.
- do we need nid and vid in both the node and the node_revisions table?
- the amount of sql queries means a good stress testing for large
databases.
These issues are still open, btw. Especially the first one needs to be
tackled.
------------------------------------------------------------------------
Tue, 25 Jan 2005 20:11:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_5.patch (51.13 KB)
Here is a patch that has the database tables updated for forum, book,
and page module.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:55:59 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_6.patch (49.18 KB)
Yet another update to keep it working with head. The patch now also
removes the table definitons for the page table.
------------------------------------------------------------------------
Sat, 29 Jan 2005 22:57:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_7.patch (55.69 KB)
Sorry, that was the old version, this is the right one.
------------------------------------------------------------------------
Mon, 31 Jan 2005 19:55:03 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_8.patch (55.71 KB)
Updated once more.
------------------------------------------------------------------------
Mon, 31 Jan 2005 20:52:08 +0000 : Dries
Anyone to help review/test this?
------------------------------------------------------------------------
Mon, 31 Jan 2005 21:22:36 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_9.patch (49.29 KB)
Updated again, the update functions occurred twice. Thanks Bart.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:05 +0000 : killes(a)www.drop.org
Don't know if the db I am using is corrupted or what. I still do have
some didficulties.
The latest patch is attached.
------------------------------------------------------------------------
Wed, 02 Feb 2005 00:27:49 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_10.patch (49.67 KB)
I am probably slowly going mad ...
------------------------------------------------------------------------
Wed, 02 Feb 2005 01:54:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_11.patch (48.95 KB)
The update issue still needs investigating. This patch is updated for
cvs.
------------------------------------------------------------------------
Wed, 02 Feb 2005 20:20:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_12.patch (49.83 KB)
Ok, here is a new version. I've solved my troubles with book.module.
There are still some issues with forum module. Possibly due to
inconsistent database.
------------------------------------------------------------------------
Wed, 02 Feb 2005 21:31:05 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_13.patch (49.83 KB)
Turns out the drupal.org database had indeed some quirks. Please run
this query in your oldest db and tell me the result:
select nid,type from node where type like '%/%';
If you get a non-zero result we might need to add another security
update.
The patch could use still more testing, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 01:16:54 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_14.patch (49 KB)
Ok, we are getting somewhere. At a first glance the update is working.
There is a problem remaining: the revisions tab will be shown whether
the node has revisions or not. Not sure we can/need to fix this.
People with a drupal.org account can log in at
http://killes.drupaldevs.org/revision/ and poke around. Your
permissions will be the same as on drupal.org. Feel free to vreak
everything but don't forget to file complaints here. (Note: this is
only a pruned version of the drupal.org database with all project nodes
and nodes with nids > 7000 dropped).
------------------------------------------------------------------------
Thu, 03 Feb 2005 04:19:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_15.patch (52.39 KB)
There was some error in node_save and also the patches to the
database.inc files got lost...
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:07:27 +0000 : robertDouglass
Submitting book pages doesn't work on your test site. It puts the entire
content of the preview inside the body textarea. I wrote a sentence in
the body and the log, and pressing preview put several lines of HTML
containing both sentences in the body textarea on the preview page,
plus the book page wouldn't submit.
-R
------------------------------------------------------------------------
Thu, 03 Feb 2005 07:50:59 +0000 : Junyor
0 results here. I started using Drupal with version 4.4, though.
------------------------------------------------------------------------
Thu, 03 Feb 2005 23:56:18 +0000 : killes(a)www.drop.org
@Junyor: Thanks, that's a good sign. Maybe somebody else has an older db
to try.
@robertDouglass: The first effect you describe is due to drupaldevs
running on PHP 5. I am unsure why the second thing does not work. In
node_save() the node object has a nid although there is none in the
form. Very strange.
I've enabled display of db queries on the testsite.
------------------------------------------------------------------------
Fri, 04 Feb 2005 19:17:55 +0000 : dmjossel
No results here on the query:
select nid,type from node where type like '%/%';
On a database that was put in place prior to Drupal 4 and is now
running on 4.5.2.
------------------------------------------------------------------------
Fri, 04 Feb 2005 20:44:23 +0000 : killes(a)www.drop.org
@dmjossel: thanks.
@all. The strange problem I reported was apparently php 5 related.
After applying Steven's php 5 patch it went away. One error is
remaining: If I create a new forum topic it will be shown as part of
the book on preview. Hmm, that was due to a db that got corrupted
during testing so that is fixed too.
Please poke around at the test site and look if you find more errors.
------------------------------------------------------------------------
Sat, 05 Feb 2005 07:16:22 +0000 : Steven
By the way, I just remembered that Drupal.org has some blogs lingering
on in the database even though blog.module is not enabled. Perhaps this
is causing troubles?
------------------------------------------------------------------------
Sat, 05 Feb 2005 11:22:59 +0000 : Anonymous
I can't see why it would. Drupal.org will need extra updates for images
and project nodes because those have their own tables. GK.
------------------------------------------------------------------------
Sun, 06 Feb 2005 12:49:55 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_16.patch (52.49 KB)
Updated to apply to cvs again.
------------------------------------------------------------------------
Tue, 22 Feb 2005 20:15:40 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_17.patch (49.64 KB)
Updated again.
All we need is a patch to upload module and an upgrade path for it.
------------------------------------------------------------------------
Fri, 04 Mar 2005 04:22:58 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_18.patch (52.31 KB)
Updated once more. Moved log field from book to node_revisions table as
discussed in Antwerp. upload module still missing.
We need to decide under which circumstances the log field should be
displayed. Should that be added to the workflow? Should it depend on
the revisions setting?
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:27:03 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_20.patch (75.52 KB)
Ok, here it is: Yet another revision of this grrrrreat patch.
Changes from previous versions:
- supports versioning for uploaded files. A problem is that if you
delete a file, it will be gone for all revisions.
- the log field is now in the node_revsions table, but each module has
to decide whether to show it or not.
I've implemented it for the page and the book type odes. Also, the
field can be edited when adding non-book nodes to the book. The log is
displayed on the revisions page and if a node is moderated.
- the revisions are moved to an old_revisions table to a) get the node
table smaller and b) still leave the mavailable for contrib modules
that want to retreive old version data.
The patch has been applied to killes.drupaldevs.org/revision where it
can be tested by anybody (especially people who have "site admin"
rights on drupal.org)
The database is from drupal.org and you should b able to log in with
your pass or simply mail yourself a new one.
Gerhard
------------------------------------------------------------------------
Sat, 05 Mar 2005 19:51:56 +0000 : Anonymous
Attachment: http://drupal.org/files/issues/revisions_21.patch (59.42 KB)
BTW, I marked this a bug because atm the revisions field can grow quite
big. Neil has reported problems from some users who were not able to
load some nodes due to to many large revisions.
Also, som unrelated stuff crept into the patch. New version attached.
------------------------------------------------------------------------
Tue, 08 Mar 2005 05:56:01 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_22.patch (60.29 KB)
Ok, I think I got it.
Changes to last version:
- uploads are no properly versioned.
Missing are still pgsql checks and updates.
------------------------------------------------------------------------
Thu, 10 Mar 2005 16:58:41 +0000 : Anonymous
Was able to get http://drupal.org/files/issues/revisions_21.patch to
work with drupal-cvs.tar.gz (10 March 2005) by:
- includes/database.mysql.inc: Commenting out duplicates for functions
function db_begin_transaction and function db_commit_transaction
- modules: node.module: Removing "'title' => $node->title," from
$node_table_values variable declaration and removing "'title' =>
"'%s'"," from "$node_table_types" variable declaration.
Happy to submit a patch if requested. I'll watch this thread.
------------------------------------------------------------------------
Fri, 11 Mar 2005 23:59:45 +0000 : killes(a)www.drop.org
The duplicate function has been removed in rev 22 of this patch.
Why do you think the changes in node_save are needed? Titles are saved
in both tables for performance reasons.
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:12:21 +0000 : jlerner
Hi - I posted comment #62. The changes to node_save appear to be needed
because recent patches (both 21 and 22) remove the field 'title' from
table 'node'. So without the changes to node_save, node.module is
broken and generates errors.
Joshua
------------------------------------------------------------------------
Sun, 13 Mar 2005 16:29:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_23.patch (61.17 KB)
Thanks, Joshua, for catching this. node:title is there to stay.
------------------------------------------------------------------------
Wed, 13 Apr 2005 16:29:23 +0000 : moshe weitzman
since HEAD is open again, perhaps it is a good time to revisit this
patch.
once this is committed, lets address - http://drupal.org/node/11071
"node_validate does not respect group editing"
------------------------------------------------------------------------
Mon, 18 Apr 2005 15:43:42 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Mon, 18 Apr 2005 16:16:32 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_24_0.patch (60.39 KB)
Updated.
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:19:42 +0000 : Dries
I'll commit this patch later this week! If you haven't checked this
patch already, I urge you to test/check it out because it will have
significant impact on existing code and modules!
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:21:33 +0000 : Dries
Also, what do people think about the n.title being duplicated?
------------------------------------------------------------------------
Tue, 19 Apr 2005 05:26:58 +0000 : chx
I won't lose any sleep because of duplicated titles...
------------------------------------------------------------------------
Tue, 19 Apr 2005 18:35:58 +0000 : killes(a)www.drop.org
Let me explain why I have chosen to duplicate the title (and also the
uid): If you look at all the queries in Drupal, you will find that most
of them only need the title and th uid of a node. That is, by
duplicating it, we save expensive joins on the node_revisions table.
Due to this fact, this patch is actually a performance improvement.
A note about updating contrib module:
Strictly speaking they wouldn't need to be updated. They only need to
if their authors decide that their info should be available for
revisioning. The upgrade path for forum.module in my update.inc patch
(+ the forum patch)
should show you what needs to be done.
I will write a note for the update page once the patch hits core.
------------------------------------------------------------------------
Sun, 24 Apr 2005 21:21:19 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_25.patch (60.38 KB)
Updated to cvs.
Dries: Based on some remarks in #drupal this is the last update I am
going to do. Apply it or won't fix it.
------------------------------------------------------------------------
Sat, 30 Apr 2005 03:42:39 +0000 : Jeremy(a)kerneltrap.org
Attachment: http://drupal.org/files/issues/revisions_25.patch.patch (528 bytes)
That's a big patch. I've only started looking through it. I noticed
one little typo, affecting updates. A patch to your last patch is
attached.
I'm running with the revision patch on my dev server now happily. I
like the concept.
What happens if you click 'stop' on your browser in the middle of a
MySQL "transaction"? I assume that kills the connection to MySQL, and
the lock is freed? But this then leaves changes only partially
applied?
What exactly does locking/unlocking the tables buy us in MySQL? I
don't see anywhere that we detect if an apply fails part way through,
and thus roll back...? What am I missing?
------------------------------------------------------------------------
Sat, 30 Apr 2005 07:11:28 +0000 : Dries
Jeremy: many of us are worried about the performance ramifications this
patch introduces. Early experiments showed a small performance
improvement (while a performance regression might be expected). More
performance reports from large sites like kerneltrap.org will certainly
help this patch. Mind to do a quick performance comparision and to
report back with some numbers? Thanks.
------------------------------------------------------------------------
Sat, 30 Apr 2005 12:38:02 +0000 : Jeremy(a)kerneltrap.org
Dries: I'm not running HEAD on kerneltrap, so this really isn't a
possibility. Furthermore, until I understand why we're locking tables,
I don't like it. The idea of revisions in their own tables is great.
The idea of locking tables to get (without any obvious benefit) there
really worries me.
------------------------------------------------------------------------
Sat, 30 Apr 2005 14:16:01 +0000 : killes(a)www.drop.org
@Jeremy: Thanks for looking at the patch! Also for catching the typo. :)
Did you try to upgrade your database? If yes, how did it go? One of
Dries' concerns is the complexity of the upgrade. How many nodes and
revisions did the db have?
About database locking: This part of the patch was created by Nick and
I simply continued to use it.
Maybe the code should rather be:
if(db_begin_transaction(array('{node}', '{node_revisions}',
'{watchdog}', '{sessions}', '{files}'))) {
db_query($node_query, $node_table_values);
db_query($revisions_query, $revisions_table_values);
db_commit_transaction();
...
}
The idea is probably to avoid two updates at the same time. I don't
think the locking helps if you abort the script at an inconvenient
time. Rollbacks aren't implemented in all mysql versions.
We could omit the db locking if deemed inappropriate. Maybe Nick can
explain his ideas behind this.
@Dries: I wonder who the "many of us" are. They certainly haven't
spoken to me. Moshe had some reservations about the upgrade path and
project module, but the time that project module abused revisions to
store issue updates was long ago and his reservations were resolved.
Nobody else (besides you of course and now Jeremy) has voiced
reservations in a way that was audible to me.
If you grep through the patch you will notice that there are only four
queries which have a join on the node_revisions table. Two of them are
in node_load and in the other cases the join replaced a join on the
node table. The two queries in node_load are the only ones that have
both a join on the node and the revisions query. Thus, loading of
individual nodes might become somewhat slower. All other queries will
be faster since the node table is now much smaller. Also, node loading
does not have to be slower, it depends on your node table. If you had
a lot of revisions and thus a large table, then the new scheme will
make your queries actually faster since we do not load the revisions
on each and every node load anymore. If you didn't have many revisions
your node_load migth be somewhat slower.
WRT to the update script Karoly pointed out that we could use multiple
insert queries instead one query per revision. This would probably
make the update somewhat faster. I am willing to work on this iff you
declare that you will commit the patch afterwards. I'd need to know if
this will work on pgsql and on all supported mysql versions before I
work on it.
About locking: Database locking is dog slow, at least on mysql. I was
using locks in an earlier version of the upgrade script but had to
remove it for (serious!) performance reasons.
------------------------------------------------------------------------
Mon, 09 May 2005 15:07:34 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_26.patch (46.45 KB)
Ok, another update, cause I need it myself.
I've left out the transaction stuff for now. It is in principle
unrelated to this patch and should be discussed elsewhere.
This also makes the patch smaller and easier to review (hint, hint).
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:09 +0000 : killes(a)www.drop.org
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 20:32:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_27.patch (39.05 KB)
The patch contained the update functions twice.
------------------------------------------------------------------------
Mon, 09 May 2005 21:23:06 +0000 : Dries
Got one error during the upgrade path:
ALTER TABLE {book} ADD PRIMARY KEY vid (vid)
FAILED
------------------------------------------------------------------------
Mon, 09 May 2005 21:26:19 +0000 : killes(a)www.drop.org
This had happend to me as well, when I tested this patch. The reason is
that for some reason the vid is not unique. Most likely there are some
entries with vid = 0 in there. Can you check which node types those
have? it always was an error in the test database. See:
http://drupal.org/node/7582#comment-20678
------------------------------------------------------------------------
Mon, 09 May 2005 21:27:06 +0000 : Dries
Actually, I got 2850 errors during the upgrade.
Some of these:
sprintf() [function.sprintf]: Too few arguments in
drupal-cvs/includes/database.inc on line 154.
Some of these:
Query was empty query: in drupal-cvs/includes/database.mysql.inc on
line 66.
And this:
Unknown table 'n' in field list query: SELECT n.nid, n.vid FROM node
INNER JOIN files f ON n.nid = f.nid in
drupal-cvs/includes/database.mysql.inc on line 66.
:-)
------------------------------------------------------------------------
Mon, 09 May 2005 21:29:19 +0000 : Dries
Or this:
user error: Unknown column 'log' in 'field list'
query: SELECT parent, weight, log FROM book WHERE nid = 1 in
drupal-cvs/includes/database.mysql.inc on line 66
------------------------------------------------------------------------
Mon, 09 May 2005 21:52:12 +0000 : Dries
The time required to generate my main page went from 902 ms (before
upgrade) to 2139 ms (after upgrade).
The time required to generate a forum listing (?q=forum/x) went from
1872 ms (before upgrade) to 2874 ms (after upgrade).
Maybe this is because my database is not consistent as the result of
the upgrade errors (yet I don't see any errors on the pages I
benchmarked).
------------------------------------------------------------------------
Tue, 10 May 2005 00:24:38 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_28.patch (53.47 KB)
Ok, let me get to this from the bottom to the top:
- my test runs indicated a different development wrt timing. If I had
gotten your results, I had stopped working on this long ago. So your
results are wrong for some reason.
- user error: Unknown column 'log' in 'field list'
Wasn't my day, the book patch got lost. It is contained now. First -R
the old patch, then apply this one.
- Unknown table 'n' in field list query:
Walkah found this, but I forgot to fix it. Fixed now.
- I've no idea where the other queries come from. I am hoping that
either your test db is borken or they are follow ups from the other
ones.
If you let me have your test db, I'll try some debugging.
Thanks for wasting your time, too.
------------------------------------------------------------------------
Tue, 10 May 2005 05:07:31 +0000 : Dries
I double-checked and the numbers don't seem to lie. I'll test some more
after work on another machine to make sure it is not platform-specific.
------------------------------------------------------------------------
Wed, 11 May 2005 03:32:47 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_29.patch (54.83 KB)
Ok, here I am again.
What I did:
1) Ask Dries to let me have drupal.org database
2) get 400MB of SQL inserts...
3) take 23 minutes to import said data
4) Remove all image and project nodes (don't want to install their
modules), 11765 nodes left
5) back up data
6) take tests on non-cached /node page (as anonymous user).
ab results:
-c 1 -n 25:
Requests per second: 1.29 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 663 775 179.7 689 1264
7) Do the same for the tracker page:
Requests per second: 0.83 [#/sec] (mean)
Total: 1182 1199 7.4 1199 1217
8) Apply my patch (rev. 28).
9) run db update and hold breath
10) update times out...
11) play back backup from 5)
12) wait
13) getting annoyed and removing cache, watchdog, and accesslog before
playing in backup.
14) wait again. Understand why Dries doesn't try this patch often.
Maybe a smaller DB would do for testing?
15) wait more. get really annoyed.
16) Set time limit to 18000 in update.php
17) try again
18) fails again before the second update is completed.
19) curse.
20) delete search stuff from db. Ooops, sooo much smaller...
21) import again, below 2 minutes...
22) rewrite to use extended insert. Found a bug.
23) still does not complete. Mysql logging to the rescue!
24) tid = 0? Not good.
25) Well, the update works fine till node 10834. 5595 nodes done, 6136
to go.
26) Writing shell based update script. Discovery: 24MB aren't enough.
Hopefully 64 are. Nope.
extended inserts for revisions are apparently not the brightest idea:
Huge memory consumption.
Hmm, no, all updates got through. Selecting the revisions to put them
into old_revisions table screwed it. Learned about CREATE TABLE
old_revisions SELECT syntax.
Yay! finally. 24 MB are just not enough the update.php script seems to
still break.
27) Benchmarks!
/node
Requests per second: 1.54 [#/sec] (mean)
Connection Times (ms)
min mean[+/-sd] median max
Total: 632 649 40.5 636 791
/tracker
Requests per second: 0.86 [#/sec] (mean)
Total: 1119 1165 65.8 1160 1461
Ok: So we get an improvement for many node_loads, but none for simple
selects from node.
More tests can be done.
28) roll new patch
Ain't Drupal fun?
------------------------------------------------------------------------
Wed, 18 May 2005 13:38:05 +0000 : Dries
I did another round of tests on _another_ machine and it is not looking
good:
Before upgrade After upgrade
?q= (main page) 218 ms/request 340 ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request
?q=about (book page) 375 ms/request 5400 ms/request
The upgrade process itself gave me a number of 'query was empty' and
'sprintf(): too few arguments' reports. Everything seems to work fine
though.
Looking at the ?q=about page, I see that the following query is
executed twice _and_ that each time, it take more than 2 seconds to
complete:
SELECT n.nid, n.title, b.parent, b.weight FROM node n INNER JOIN book b
ON n.vid = b.vid WHERE n.status = 1 AND n.moderate = 0 ORDER BY
b.weight, n.title;
--8 SHOW INDEX FROM book;
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name |
Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| book | 1 | book_parent | 1 | parent | A
| 92 | NULL | NULL | | BTREE | |
| book | 1 | nid | 1 | nid | A
| 369 | NULL | NULL | | BTREE | |
+-------+------------+-------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
2 rows in set (0.00 sec)
The book module does not appear to have a primary key? Sounds like a
bad idea so I added one:
mysql> ALTER TABLE book ADD PRIMARY KEY nid (nid);
Query OK, 369 rows affected (0.02 sec)
Records: 369 Duplicates: 0 Warnings: 0
Next, I wanted to make the vid column a unique key in all node tables:
mysql> ALTER TABLE node ADD UNIQUE vid (vid);
Query OK, 20392 rows affected (0.81 sec)
Records: 20392 Duplicates: 0 Warnings: 0
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
ERROR 1062: Duplicate entry '0' for key 2
mysql> ALTER TABLE forum ADD UNIQUE vid (vid);
Query OK, 10806 rows affected (0.10 sec)
Records: 10806 Duplicates: 0 Warnings: 0
As you can see, it fails for the book table which makes me believe
there is some inconsistent data ... I set out to fix that:
mysql> SELECT nid, COUNT(nid) AS vids FROM book GROUP BY vid HAVING
vids > 1;
+-----+------+
| nid | vids |
+-----+------+
| 871 | 2 |
+-----+------+
1 row in set (0.00 sec)
mysql> SELECT title FROM node WHERE nid = 871;
Empty set (0.00 sec)
mysql> DELETE FROM book WHERE nid = 871;
Query OK, 1 row affected (0.00 sec)
mysql> ALTER TABLE book ADD UNIQUE vid (vid);
Query OK, 368 rows affected (0.01 sec)
Records: 368 Duplicates: 0 Warnings: 0
Looks like everything is well now. Ran some new benchmarks:
Before upgrade After upgrade With
indices
?q= (main page) 218 ms/request 340 ms/request 336
ms/request
?q=forum (forum overview) 754 ms/request 1520 ms/request 1531
ms/request
?q=about (book page) 375 ms/request 5400 ms/request 475
ms/request
Unfortunately, we're still slower than the original code.
------------------------------------------------------------------------
Wed, 18 May 2005 21:53:31 +0000 : killes(a)www.drop.org
Dries, thanks for testing it again.
I do think the broken queries you observer have something to do with
the bad performance after the update. Please log the queries any I will
have a look at them. I've never seen any such queries.
My update script also tries to create the appropriate indices, but it
will of course fail if the database contains cruft. the indices for the
forum are probably missing, too.
I am still convinced that the patch is core worthy.
------------------------------------------------------------------------
Thu, 19 May 2005 04:36:09 +0000 : Dries
It wouldn't hurt if more people would benchmark this patch. The patch's
current performance worries me.
Did you check your watchdog messages after upgrading the drupal.org
database? Depending on your settings, errors might only be shown in
the watchdog. I'll look into the remaining glitches as time permits.
Thanks for your persistence in keeping this patch up-to-date. :)
------------------------------------------------------------------------
Thu, 19 May 2005 11:59:22 +0000 : killes(a)www.drop.org
Dries: Can you please let me have your updated database? I want to have
a look at it and try my own benchmarks with it.
And yes, if I did learn something on this project is how to be
persistant. ;)
------------------------------------------------------------------------
Fri, 24 Jun 2005 16:25:34 +0000 : killes(a)www.drop.org
Here is an idea that occurred to me:
The problem with the upgrade process is that keeping the existing
revisions requires a lot of work to do. This generates a huge amount of
sql queries for a large database and also requires a huge amount of
memory.
My suggestions is to let update.php only handle the basic upgrade, ie
without old revisions. An additional module could be created that would
implement a cron based approach to upgrading old revisions one node at a
time. it could expose a hook to let contrib modules do their own
upgrades.
Dries, what do you think? (I am writing "Dries" because he seems to be
the only one who is interested in getting this into core...)
------------------------------------------------------------------------
Fri, 24 Jun 2005 22:25:11 +0000 : Junyor
Killes:
I'm also interested in seeing this hit core. What about adding
something to legacy.module to do it?
------------------------------------------------------------------------
Sun, 26 Jun 2005 21:14:54 +0000 : chx
This is a sensible approach. Maybe this is the _only_ sensible approach.
I have a little problem though: while the conversion is running somehow
both revision handlers should be available.
------------------------------------------------------------------------
Sun, 26 Jun 2005 22:16:21 +0000 : killes(a)www.drop.org
hehe, one only has to whine and bug enough and one gets some feedback.
;)
@junyor: legacy.module would be a good place. my current idea is to
auto-enable it in update.php and then disable it again in legacy_cron
after all nodes are updated.. ;)
@chx: When somebody wants to look at revisions of a node that node
could be auto-updated.
The only problem are contrib modules: they's need to have some hook in
order to update their own data. When somebody looks at the revisions of
a node than cannot be updated because the contrib module in question has
no such hook, we can optionally let the user discard old revisions I
guess.
Dries, what do you think?
------------------------------------------------------------------------
Mon, 25 Jul 2005 16:48:51 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_30.patch (53.13 KB)
Sooooo; I've updated this patch once again.
Dries didn't like my idea of legacy updates so we will have an option
to discard old revisions in case the update should prove difficult.
Always keep a backup.
------------------------------------------------------------------------
Mon, 25 Jul 2005 17:31:12 +0000 : Bèr Kessels
can we please accept and commit this patch? We can iron out any issues
later. This patch is just far too big and complex to be
perfect-at-once.
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:22:50 +0000 : Dries
@Ber: no, we can't commit this patch blindly.
Data loss is a much bigger problem than a syntax error or other code
glitch. We can break Drupal but we can't break their data.
I've spent quite a bit of time testing the previous version of this
patch and noticed significant performance degradation.
Tell me, Ber, why can't you test this patch first?
------------------------------------------------------------------------
Mon, 25 Jul 2005 19:40:37 +0000 : Bèr Kessels
I was not referring to not testing it. If it is just the upgrade path
that proves to be cumbersome, then why can we not fix it afterwards,
I.e. when everyone has had a good chance to look at it.
Testing such a huge patch requires a lot of work. something no-one just
does in his spare hours. We had discussions before to deal in a slightly
different way with big changes; to commit them quicker and to leave the
ironing out of any left overs for the community. I hooked into that
discussion here. For two reasons. One is that Gerhard has spent
numerous hours on maintaining this patch. The other one is that the
community can be of help with ironing out issues in such a large
change, much better than that Gerhard can do on his own.
And yes, dataloss is very bad, but no-one should loose data, when
he/she followed the instructions (backing up)...
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:00:38 +0000 : Dries
Ber: applying this patch takes 15 seconds. Whether I apply this patch
for you, or whether you apply it yourself, it will hardly reduce the
'testing cost'. The problem is that data loss can be subtle; it might
go unnoticed for a couple days. Make no mistake, I'd like to see this
patch committed ASAP, but it warrants some testing. Let's
test/benchmark it and commit it.
------------------------------------------------------------------------
Mon, 25 Jul 2005 20:55:03 +0000 : drumm
The upgrade script already takes a long time to execute and does not
provide feedback to the user about how it is doing. I plan on making
the upgrade script able to spread the updates across multiple page
views and give user feedback showing that progress is being made. This
will hopefully make the speed of this update a moot point.
I am working on this for my dayjob, CivicSpace, so it should get done
"real soon now", but it should be expected to take awhile (smallish
number of weeks). Please make comprimises if necessary.
------------------------------------------------------------------------
Wed, 27 Jul 2005 17:38:57 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_31.patch (48.78 KB)
@drumm: the timeout isn't usually the problem. Memory consumption is.
I'll look into doing the updates in chunks of 1000 nodes or so.
Clousseau found a bug in the patch, updated.
------------------------------------------------------------------------
Thu, 28 Jul 2005 18:28:08 +0000 : moshe weitzman
Even if the upgrade issues are resolved, we have to figure out why this
patch is slowing down drupal (according to benchmarks posted here).
------------------------------------------------------------------------
Fri, 29 Jul 2005 07:51:45 +0000 : Dries
We can't proceed without some additional benchmarking. I already
benchmarked it on two machines (see comments above), and I'll benchmark
it again after August 10. The performance impact of this patch worries
me so if two other people could test and benchmark this patch
extensively, that would come a long way. August 10 is close to the
code freeze so some help is necessary.
If Gerhard/killes reviewed/tested one of your patches, it is time to
return the favor ...
------------------------------------------------------------------------
Fri, 29 Jul 2005 15:40:31 +0000 : Jose A Reyero
After looking at the patch I'm not really surprised this slows down
everything. I thought that the reason why we wanted revisions in their
own table was in the first place to have a simpler -and faster- node
table. But this patch:
- Adds fields and complexity not only to the node table but also to all
the main tables for data
- Needs additional joins just to retrieve single nodes (which is done
many times for a typical main page)
- Does away with encapsulation of the revision functionality requiring
other modules to handle revissions related data.
So please, please, please:
- Make revisions table only needed when actually using revisions
- Keep that old nice thing of hiding revisions system from other
modules
- And what's so bad with serializing data anyway? I agree that usually
there are better ways to store data. But this is one of the cases where
serialization does make sense. Not the current huge serialized field,
but only a new table with one serialized node per row.
- In general, do it much simpler. This is way too complex and it
removes more functionality than it adds.
In just four words: keep it simple, please.
------------------------------------------------------------------------
Fri, 29 Jul 2005 21:25:26 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_32.patch (51.27 KB)
Patch updated (v31 was buggy, and there was an update to updates.inc).
I've had a discussion with Jose and he will maybe do some testing over
the weekend. John is also doing some testing as we speak.
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:06:59 +0000 : Nick Nassar
It has been exactly one year since I first submitted this patch.
You know what that means: *old patch party in IRC!* Virtual beers on
me!
w00t!
------------------------------------------------------------------------
Sat, 30 Jul 2005 16:08:33 +0000 : Jose A Reyero
Hi again.
I had a talk with killes -I have to say he enlightned me about this-
and also did some testing. Sorry but I dont have a good set up to
provide benchmarks.
Here are my conclusions -and also some explanation for the varied
results abt performance:
- This is faster for simple node listings
- This is slower for full node/load
....thus the performance increase/decrease for a given page should
depend on the relation between the number of simple node listings
(usually blocks) and the number of nodes displayed in the main page.
I've also experienced some trouble with attachments -not re-created
when creating a new revision- and book relations, same.
I'm not sure I like this new "feature" of creating a new revision every
time you rollback an old one -in Drupal 4.6 this seems not to happen.
Besides these details, I think this patch is good enough -and quite a
powerful approach actually- though I still feel it's bit too complex
when simply dealing with nodes -without revisions-, which is what we do
most of the time, and complexity will be even higher with contrib
modules using still more tables, not to mention flexinode -versioning
info will have to be scattered everywhere in the db.
So this is not a +1 nor -1. On one side I think this is powerful. On
the other side I feel this is too complex and really, it's more than I
need.
I would be happy with a 'revision' table having: nid, vid,
data(serialized), that you can forget about when not using revisions.
------------------------------------------------------------------------
Sat, 30 Jul 2005 19:16:41 +0000 : Nick Nassar
On a much more serious note that my last comment, unless I'm missing
something, you've introduced a subtle bug into this patch since I left
it.
A and B are both updating revision 5 of a node.
A calls db_next_id() and gets a revision id of 6
B calls db_next_id() and gets a revision id of 7
B updates the node's revision to 7
B adds revision 7 to the revision table
A updates the node's revision to 6
A adds revision 6 to the revision table
Now, you have a situation where the current revision (6) isn't actually
the last revision(7). For some uses, that's not a big deal, but in the
case of wikis, which is what I originally hoped to use this patch for,
it creates problems.
------------------------------------------------------------------------
Mon, 01 Aug 2005 00:04:14 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_33.patch (51.27 KB)
Updated the patch. updates.inc was changed and chx found a typo.
@Nick: I don't think the issue is due to this patch. Drupal has
currently no mechanism to not let people commit a new version of the
same node, whether with or without revisions. drupal_next_id ensures
that the revision IDs are unique and the later revisions will be the
current one. Which revision is the current one does not depend on order
of the version ID, btw.
@Jose: what exactly were your problems with uploads? what do I need to
do to reproduce?
------------------------------------------------------------------------
Mon, 01 Aug 2005 11:33:56 +0000 : Jose A Reyero
Sorry, I dont know what I did the other day but I cannot reproduce it
either :-)
I applied this new patch. Problems I'm having now (Testing with
"stories")
- Book outline is gone after creating a new revision
- I cannot delete attachments (not sure it is related to your patch or
to something else)
- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab
Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?
------------------------------------------------------------------------
Sun, 14 Aug 2005 16:03:41 +0000 : killes(a)www.drop.org
Jose, thanks a lot for testing!
"- Book outline is gone after creating a new revision"
Sounds like a bug, it should not.
"- I cannot delete attachments (not sure it is related to your patch or
to something else)"
Another bug
"- Usage of the log message: Is that mensage intended for revisions or
only for book outlines? Currently the field is displayed only for
outlines but shown in the revision tab"
Log messages are possible for nodes of type book and page as well as
all nodes that are put into the book. Log messages can be added for any
node type, the module author should implement those.
"Btw, the db update failed for me in patch 32, so I did it manually. I
didnt do a db update again after rolling back v32 and applying v33
-should I?"
No, the db update should be the same.
I will try to fix the bugs you found.
WRT the db update I've got an announcement to make: I will not migrate
old revisions.
Why?
1) it is a rather complicated process that requires a lot of memory and
is likely to give people on cheap hosting trouble.
2) testing it takes a lot of time.
3) I don't personally need it
I will still move old revisions to an old_revisions table from where
they can be converted by a later update or a custom script.
------------------------------------------------------------------------
Tue, 16 Aug 2005 21:34:43 +0000 : killes(a)www.drop.org
Attachment: http://drupal.org/files/issues/revisions_34.patch (52.92 KB)
Ok, here is incarnation No. 34 of this patch. Dries agrees with me that
we can put the revision rescue operation into a later update, possible
using Drumm's installer.
The patch fixes the issue with the file uploads not being deletable, I
could not reproduce the problem with the outlines.
1
0