Drupal 7 Entities - view random node
I have a need for a block that will display a random quote. Node type = 'random_quote' and only 1 random record should be displayed In D7 the body contents are in field_data_body and I have it working using the DB API but it would probably be better using the entityController class methods like 'entity_load' I see that you can pass in conditions but is it possible to return a single random record? Does anyone with more experience with the new EntityController have any recommendations? Thanks!
I did that all the time in Drupal 6 using Views. It's dead simple. (Order random limit 1, done.) I haven't tried doing it in Drupal 7, but I'm sure the same method still works. --Larry Garfield On Sunday, January 23, 2011 1:33:34 pm Blaine Lang wrote:
I have a need for a block that will display a random quote. Node type = 'random_quote' and only 1 random record should be displayed
In D7 the body contents are in field_data_body and I have it working using the DB API but it would probably be better using the entityController class methods like 'entity_load'
I see that you can pass in conditions but is it possible to return a single random record?
Does anyone with more experience with the new EntityController have any recommendations?
Thanks!
Larry, I am curious. Being that you are a database guru, this fits in with an issue for my Quotes module, as does the original post. [BTW, Blaine, rather than write you own, you might want to look at Quotes.] One of the users who has an unusual number of quotes (~12K) says that ORDER RANDOM LIMIT 1 is not very efficient and gets fairly slow at that size. There are discussions elsewhere on the web to back up that assertion. Do you find that to be true? Do you have a simpler way around it than keeping a separate index of the applicable nodes? Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr. ________________________________ From: Larry Garfield <larry@garfieldtech.com> To: development@drupal.org Sent: Sun, January 23, 2011 7:41:34 PM Subject: Re: [development] Drupal 7 Entities - view random node I did that all the time in Drupal 6 using Views. It's dead simple. (Order random limit 1, done.) I haven't tried doing it in Drupal 7, but I'm sure the same method still works. --Larry Garfield On Sunday, January 23, 2011 1:33:34 pm Blaine Lang wrote:
I have a need for a block that will display a random quote. Node type = 'random_quote' and only 1 random record should be displayed
I don't know that I've ever tried it with 12k nodes so I can't say from experience. Randomization is a tricky subject. The common method, which Views uses as does the ->orderRandom() method of DBTNG, boils down to adding a random number as a new column for each record, ordering by that, and then (in this case) discarding all but the first record. That involves a linear-growth creation for the new column (a fixed additional cost per record) and then an integer ordering. Assuming the sorting function in the SQL database is sane (which if it isn't you need a new database), that should be an n*log n algorithm. (That's as fast as a sorting algorithm can get.) So yes, more records means randomization is slower. How slow is too slow depends on too many factors for me to make any general statement. There are other mechanisms for randomization that can be faster, even O(1), but those rely on other assumption such as a primary key that is always 1- based and contiguous. That would work great... if you're searching all nodes, not just nodes of a certain type, and you've never deleted a node. The less those two caveats are true, the less well that method works. In this case, randomizing via Views and randomizing via DBTNG will be equally performant since they're using the same underlying SQL (at least the same approach, even if not the exact same syntax). I don't know of a faster, generic random-order approach in SQL. (If someone does, please file an issue so we can get it into core. <g>) --Larry Garfield On Sunday, January 23, 2011 6:53:58 pm nan wich wrote:
Larry, I am curious. Being that you are a database guru, this fits in with an issue for my Quotes module, as does the original post. [BTW, Blaine, rather than write you own, you might want to look at Quotes.]
One of the users who has an unusual number of quotes (~12K) says that ORDER RANDOM LIMIT 1 is not very efficient and gets fairly slow at that size. There are discussions elsewhere on the web to back up that assertion.
Do you find that to be true? Do you have a simpler way around it than keeping a separate index of the applicable nodes?
Nancy
Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________ From: Larry Garfield <larry@garfieldtech.com> To: development@drupal.org Sent: Sun, January 23, 2011 7:41:34 PM Subject: Re: [development] Drupal 7 Entities - view random node
I did that all the time in Drupal 6 using Views. It's dead simple. (Order random limit 1, done.) I haven't tried doing it in Drupal 7, but I'm sure the same method still works.
--Larry Garfield
On Sunday, January 23, 2011 1:33:34 pm Blaine Lang wrote:
I have a need for a block that will display a random quote. Node type = 'random_quote' and only 1 random record should be displayed
Thanks, Larry, that make me feel a bit better. The suggestion is to keep a separate table of applicable nodes, generate a random number from the number of records, and get that record. Your explanation seems to indicate that one is simply spreading the "overhead" out over time - and probably increasing it in total, maybe. Given that the block can be refreshed on every page load, and a new node created relatively infrequently, I can also see that there may be some merit when the number of nodes gets fairly large. Maybe something like "SELECT n.* FROM {quotes_random} r INNER JOIN {node} n ON n.nid = r.nid WHERE r.row_num = 1 + RAND() * (SELECT MAX(row_num) FROM {quotes_random})". (Sorry for the D6 version; it would take me a much longer time to construct in D7.) Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr. ________________________________ From: Larry Garfield I don't know that I've ever tried it with 12k nodes so I can't say from experience. Randomization is a tricky subject. The common method, which Views uses as does the ->orderRandom() method of DBTNG, boils down to adding a random number as a new column for each record, ordering by that, and then (in this case) discarding all but the first record. That involves a linear-growth creation for the new column (a fixed additional cost per record) and then an integer ordering. Assuming the sorting function in the SQL database is sane (which if it isn't you need a new database), that should be an n*log n algorithm. (That's as fast as a sorting algorithm can get.)
The only way I can see for a separate lookup table to help would be to prepopulate it with your desired filters and an arbitrary autoinc field. So you would do something like: INSERT INTO {mytemp} SELECT nid FROM {node} WHERE type='quote'; And then rebuild that (incrementally or full) every time you add or remove a quote node. Then to select, you would select one int at random between 1 and COUNT(*) in mytemp, then pull that record. The above makes the actual "fetch one at random" operation almost O(1), but you need to create and maintain that Actually, I just found this in the MySQL manual in user comments: SELECT FLOOR(RAND() * COUNT(*)) AS rand_row FROM {mytemp}; SELECT nid FROM {mytemp} LIMIT $rand_row, 1; That could actually be faster, at the cost of being two queries so there are technically potential race conditions. (Unlikely to be a problem in practice, but still there.) I've not tried the above; I just saw it and thought it was cool. :-) (It's also MySQL-specific. I don't know if the same thing would work well on other databases.) The MySQL manual has a couple of other suggestions in user comments, too: http://dev.mysql.com/doc/refman/5.0/en/select.html --Larry Garfield On Sunday, January 23, 2011 8:12:08 pm nan wich wrote:
Thanks, Larry, that make me feel a bit better.
The suggestion is to keep a separate table of applicable nodes, generate a random number from the number of records, and get that record. Your explanation seems to indicate that one is simply spreading the "overhead" out over time - and probably increasing it in total, maybe. Given that the block can be refreshed on every page load, and a new node created relatively infrequently, I can also see that there may be some merit when the number of nodes gets fairly large. Maybe something like "SELECT n.* FROM {quotes_random} r INNER JOIN {node} n ON n.nid = r.nid WHERE r.row_num = 1 + RAND() * (SELECT MAX(row_num) FROM {quotes_random})". (Sorry for the D6 version; it would take me a much longer time to construct in D7.)
Nancy
Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________
From: Larry Garfield I don't know that I've ever tried it with 12k nodes so I can't say from experience.
Randomization is a tricky subject. The common method, which Views uses as does the ->orderRandom() method of DBTNG, boils down to adding a random number as a new column for each record, ordering by that, and then (in this case) discarding all but the first record. That involves a linear-growth creation for the new column (a fixed additional cost per record) and then an integer ordering. Assuming the sorting function in the SQL database is sane (which if it isn't you need a new database), that should be an n*log n algorithm. (That's as fast as a sorting algorithm can get.)
Yes, I saw that - and many others. One variation uses a LOCK to eliminate the race potential, but I am reluctant to do that. Unfortunately, that has a problem if there are key gaps. This looks a bit more fail-proof: http://edrackham.com/featured/get-random-row-with-mysql-without-order-by-ran... My issue with these techniques is that the vast majority of the users of the Quotes module don't have anywhere near 12K quotes nodes (I personally have 163), so do I let the one suffer, or let everyone use the "improved" technique (meaning an additional table), or choose some point at which to switch techniques (code bloat). And, of course, making sure it works on Postgres as well. I'd be happy to move this discussion to another venue if anyone is upset. Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr. ________________________________ From: Larry Garfield Actually, I just found this in the MySQL manual in user comments: SELECT FLOOR(RAND() * COUNT(*)) AS rand_row FROM {mytemp}; SELECT nid FROM {mytemp} LIMIT $rand_row, 1; That could actually be faster, at the cost of being two queries so there are technically potential race conditions. (Unlikely to be a problem in practice, but still there.) I've not tried the above; I just saw it and thought it was cool. :-) (It's also MySQL-specific. I don't know if the same thing would work well on other databases.)
On Jan 23, 2011, at 8:56 PM, nan wich wrote:
I'd be happy to move this discussion to another venue if anyone is upset.
I've been following with interest. Whether it's quotes or photos or something else, there's often a use for presenting a single random item. -Carl
My 2 cents, Use something like this: SELECT * FROM node WHERE type = 'quote' AND nid <= FLOOR(RAND() * (SELECT MAX(nid) FROM node)) LIMIT 1; It isn't truly random, because it depends on how the nodes you're looking for are distributed, but it is faster than sort by rand().You can always extend the subquery with the same WHERE as the parent, but it will slow down the whole thing. Best regards, Peter Droogmans Attiks Ketsstraat 94 2140 Borgerhout Belgium 32 3 288 61 17 32 497 44 44 77 -----Original Message----- From: development-bounces@drupal.org [mailto:development-bounces@drupal.org] On Behalf Of Larry Garfield Sent: maandag 24 januari 2011 3:33 AM To: development@drupal.org Subject: Re: [development] Drupal 7 Entities - view random node The only way I can see for a separate lookup table to help would be to prepopulate it with your desired filters and an arbitrary autoinc field. So you would do something like: INSERT INTO {mytemp} SELECT nid FROM {node} WHERE type='quote'; And then rebuild that (incrementally or full) every time you add or remove a quote node. Then to select, you would select one int at random between 1 and COUNT(*) in mytemp, then pull that record. The above makes the actual "fetch one at random" operation almost O(1), but you need to create and maintain that Actually, I just found this in the MySQL manual in user comments: SELECT FLOOR(RAND() * COUNT(*)) AS rand_row FROM {mytemp}; SELECT nid FROM {mytemp} LIMIT $rand_row, 1; That could actually be faster, at the cost of being two queries so there are technically potential race conditions. (Unlikely to be a problem in practice, but still there.) I've not tried the above; I just saw it and thought it was cool. :-) (It's also MySQL-specific. I don't know if the same thing would work well on other databases.) The MySQL manual has a couple of other suggestions in user comments, too: http://dev.mysql.com/doc/refman/5.0/en/select.html --Larry Garfield On Sunday, January 23, 2011 8:12:08 pm nan wich wrote:
Thanks, Larry, that make me feel a bit better.
The suggestion is to keep a separate table of applicable nodes, generate a random number from the number of records, and get that record. Your explanation seems to indicate that one is simply spreading the "overhead" out over time - and probably increasing it in total, maybe. Given that the block can be refreshed on every page load, and a new node created relatively infrequently, I can also see that there may be some merit when the number of nodes gets fairly large. Maybe something like "SELECT n.* FROM {quotes_random} r INNER JOIN {node} n ON n.nid = r.nid WHERE r.row_num = 1 + RAND() * (SELECT MAX(row_num) FROM {quotes_random})". (Sorry for the D6 version; it would take me a much longer time to construct in D7.)
Nancy
Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________
From: Larry Garfield I don't know that I've ever tried it with 12k nodes so I can't say from experience.
Randomization is a tricky subject. The common method, which Views uses as does the ->orderRandom() method of DBTNG, boils down to adding a random number as a new column for each record, ordering by that, and then (in this case) discarding all but the first record. That involves a linear-growth creation for the new column (a fixed additional cost per record) and then an integer ordering. Assuming the sorting function in the SQL database is sane (which if it isn't you need a new database), that should be an n*log n algorithm. (That's as fast as a sorting algorithm can get.)
Thanks, Peter. That's pretty much the same query as in Ed Rackham's post. My concern that selecting directly from the node table is that randomness would be compromised by the generally low volume of the one content type for most users. I'm going to try to create a simple installation to test this. Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr. ________________________________ From: Peter Droogmans <Peter@attiks.com> To: "development@drupal.org" <development@drupal.org> Sent: Mon, January 24, 2011 3:28:32 AM Subject: Re: [development] Drupal 7 Entities - view random node My 2 cents, Use something like this: SELECT * FROM node WHERE type = 'quote' AND nid <= FLOOR(RAND() * (SELECT MAX(nid) FROM node)) LIMIT 1; It isn't truly random, because it depends on how the nodes you're looking for are distributed, but it is faster than sort by rand().You can always extend the subquery with the same WHERE as the parent, but it will slow down the whole thing. Best regards, Peter Droogmans Attiks Ketsstraat 94 2140 Borgerhout Belgium 32 3 288 61 17 32 497 44 44 77 -----Original Message----- From: development-bounces@drupal.org [mailto:development-bounces@drupal.org] On Behalf Of Larry Garfield Sent: maandag 24 januari 2011 3:33 AM To: development@drupal.org Subject: Re: [development] Drupal 7 Entities - view random node The only way I can see for a separate lookup table to help would be to prepopulate it with your desired filters and an arbitrary autoinc field. So you would do something like: INSERT INTO {mytemp} SELECT nid FROM {node} WHERE type='quote'; And then rebuild that (incrementally or full) every time you add or remove a quote node. Then to select, you would select one int at random between 1 and COUNT(*) in mytemp, then pull that record. The above makes the actual "fetch one at random" operation almost O(1), but you need to create and maintain that Actually, I just found this in the MySQL manual in user comments: SELECT FLOOR(RAND() * COUNT(*)) AS rand_row FROM {mytemp}; SELECT nid FROM {mytemp} LIMIT $rand_row, 1; That could actually be faster, at the cost of being two queries so there are technically potential race conditions. (Unlikely to be a problem in practice, but still there.) I've not tried the above; I just saw it and thought it was cool. :-) (It's also MySQL-specific. I don't know if the same thing would work well on other databases.) The MySQL manual has a couple of other suggestions in user comments, too: http://dev.mysql.com/doc/refman/5.0/en/select.html --Larry Garfield On Sunday, January 23, 2011 8:12:08 pm nan wich wrote:
Thanks, Larry, that make me feel a bit better.
The suggestion is to keep a separate table of applicable nodes, generate a random number from the number of records, and get that record. Your explanation seems to indicate that one is simply spreading the "overhead" out over time - and probably increasing it in total, maybe. Given that the block can be refreshed on every page load, and a new node created relatively infrequently, I can also see that there may be some merit when the number of nodes gets fairly large. Maybe something like "SELECT n.* FROM {quotes_random} r INNER JOIN {node} n ON n.nid = r.nid WHERE r.row_num = 1 + RAND() * (SELECT MAX(row_num) FROM {quotes_random})". (Sorry for the D6 version; it would take me a much longer time to construct in D7.)
Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________
From: Larry Garfield I don't know that I've ever tried it with 12k nodes so I can't say from experience.
Randomization is a tricky subject. The common method, which Views uses as does the ->orderRandom() method of DBTNG, boils down to adding a random number as a new column for each record, ordering by that, and then (in this case) discarding all but the first record. That involves a linear-growth creation for the new column (a fixed additional cost per record) and then an integer ordering. Assuming the sorting function in the SQL database is sane (which if it isn't you need a new database), that should be an n*log n algorithm. (That's as fast as a sorting algorithm can get.)
Hi Blaine You should probably not be using the DB API to load the field value. Since (in theory) we are able to have our actual field data stored in various back-ends, you should only really load the data via node or entity loading functions. The node table itself is still reliably in Sql though, so you can select which node to display via the DB API. Peter On 23 January 2011 19:33, Blaine Lang <Blaine.Lang@nextide.ca> wrote:
I have a need for a block that will display a random quote. Node type = 'random_quote' and only 1 random record should be displayed
In D7 the body contents are in field_data_body and I have it working using the DB API but it would probably be better using the entityController class methods like 'entity_load'
I see that you can pass in conditions but is it possible to return a single random record?
Does anyone with more experience with the new EntityController have any recommendations?
Thanks!
participants (6)
-
Blaine Lang -
Carl Jester -
Greenman -
Larry Garfield -
nan wich -
Peter Droogmans