Variables, lookups, and memory usage
Hey folks,

I've written a lookup module to store and manage generic name/value pairs, organized by "realms". For example:

REALM    NAME  VALUE
state    MN    Minnesota
state    TX    Texas
country  US    United States

The basic premise is to keep this type of data in one place rather than all over the map (e.g. a states table for ecommerce, a states table for something else, and so on). With the lookup API, any module can stash and get its codes in the table, and it supports importing from a text file. I got used to working this way at my last job (a big Oracle shop; I think this model is called "one true lookup table").

I'm considering using this for module settings to save memory. Some of my modules (e.g. send) have lots of settings for email templates, default message text, etc. It's silly to shove all of this into the variables table and haul it around on each page request when 1 out of 5,000 page views will actually require it. So, I'd add:

REALM  NAME      VALUE
send   template  <html><body>...
send   message   Here's some content from ...

I still have a consistent place to stash these settings (lookup_set(), lookup_get()), but I only pull them as needed. An overall memory savings, and handy as well.

The lookup_get() function is functionally similar to variable_get(), except that it requires a $realm parameter:

lookup_get($realm, $name=null); // without a name, it returns a keyed array of the whole realm

My Questions:

1) There's potential overlap with variables and probably other stuff too. Is there a better way?

2) Is it weird to create a confusing-sounding component module like this and require it for other modules? Should this (or a derivative of this) functionality be available in core instead?

Allie Micka
pajunas interactive, inc.
http://www.pajunas.com
scalable web hosting and open source strategies
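The API described in this message could look roughly like the sketch below. It is only an illustration: the in-memory array stands in for the lookup table, the three-argument signature of lookup_set() is an assumption (the message names the function but not its parameters), and returning null for a missing name is likewise a guess.

```php
<?php
// Hypothetical in-memory stand-in for the lookup table: realm => name => value.
// A real module would read from and write to a database table instead.
$GLOBALS['lookup_store'] = array();

// Assumed signature; the original message only names lookup_set().
function lookup_set($realm, $name, $value) {
  $GLOBALS['lookup_store'][$realm][$name] = $value;
}

// Without a name, returns a keyed array of the whole realm.
function lookup_get($realm, $name = null) {
  $store = $GLOBALS['lookup_store'];
  $entries = isset($store[$realm]) ? $store[$realm] : array();
  if ($name === null) {
    return $entries;
  }
  // Assumption: a missing name yields null.
  return isset($entries[$name]) ? $entries[$name] : null;
}

lookup_set('state', 'MN', 'Minnesota');
lookup_set('state', 'TX', 'Texas');
lookup_set('country', 'US', 'United States');

echo lookup_get('state', 'MN'), "\n"; // prints "Minnesota"
```

Calling lookup_get('state') with no name returns both state rows keyed by their codes, which is what makes the realm usable as a ready-made options list.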
The lookup_get() is functionally similar to the variable_get() function, except that it requires a $realm parameter:
lookup_get($realm, $name=null); // without a name, it returns a keyed array of the whole realm
My Questions:
1) There's potential overlap with variables and probably other stuff too. Is there a better way?
I think adding "domains" to the variable table was discussed a while back on this list. I can't remember who it was (Adrian?), but the code was very similar to what you wrote, except that it was organized by module instead of realm. I like realm better since it is more generic.
On 15 Dec 2005, at 8:31 PM, Khalid B wrote:
I think discussed adding "domains" to the variable table a while back on this list.
Yes. It was actually the first test DEP that I wrote. You can probably search the mailing list for it. I'd be happy to discuss it with you, Allie.

I look forward to seeing you in February. I am seeing you in Feb, right?

--
Adrian Rossouw
Drupal developer and Bryght Guy
http://drupal.org | http://bryght.com
I think discussed adding "domains" to the variable table a while back on this list.
Yes,. It was actually the first test DEP that I wrote. You can probably search the mailing list for it.
I admit I didn't take the time to absorb it fully, but it seems that the cascading variable system is about values and overriding them (think vertical), whereas the lookup stuff is more about segmenting/grouping data (think horizontal).

I could be convinced that they're the same thing, but right now I feel like they're different. Storing states/countries/etc. in lookup makes it a completely separate function from variables. But I considered storing expensive module settings here, got into a grey area, and came here for advice.

To clarify, my main goal is to store settings while reducing memory consumption by not loading values on every page load. I'm inclined to go with a component module for now as a proof of concept, but input on making my strategy more generic/core-friendly is welcome.

Mostly, I need to finish up these dependent modules soon, and I'd rather move forward than discuss it ad nauseam. With this list's input, I hope to move forward in a more informed/holistic way.
I look forward to seeing you in february. I am seeing you in febs ,right ?
I'm sure gonna try!

Allie Micka
pajunas interactive, inc.
http://www.pajunas.com
scalable web hosting and open source strategies
With CiviCRM we've gone back and forth on which model to use for various settings. We store all settings in the DB (which makes them easier to configure, tweak, translate, etc.). However, we ended up going down the road of creating a different table for each setting (quite a few of the tables share the same code). Probably the big factor was the minor variation each setting had, which needed to be accommodated rather than shoe-horned into one generic table.

A good example: most settings just need name, is_active, is_reserved. A few, like state/country, also need additional attributes such as abbreviate, country_id (for state), idd_prefix and ndd_prefix (dialing code prefixes for country), iso_code, etc.

Ultimately we decided to go with separate tables. In retrospect, I would probably combine the tables for the generic cases, where one table matches the need exactly (we'll probably do this in 1.4), and keep the specific ones in separate tables.

lobo
Like Allie, I've worked on 2 large projects (80 to 100 tables) which used a table like this (using Sybase, not Oracle). We called that one table the codelist table. It contained a variety of shortish lists (typically fewer than 50 rows) of various id-name-value triplets. Each row was prefixed by a 4th column, which Allie calls "realm" but which we called something else (I don't remember what, but something more generic like "list").

Doing this was much more efficient than having separate tables for all of those various lists of triplets. In the rare case where we really needed a separate code list table, it was generally named so as to make its function clear: foobar_ct, where ct stood for code table.

We definitely need to get everything out of the variables table -- in its current incarnation! -- that is NOT needed on each and every page. It makes no sense to load all kinds of cruft that is only needed once in a while.

Whether the solution is Adrian's or Allie's will be based on other criteria, as both appear to solve the problem of loading unneeded data on each page.

..chrisxj
On 16 Dec 2005, at 12:12 AM, Chris Johnson wrote:
Whether the solution is Adrian's or Allie's will be based on other criteria, as both appear to solve the problem of loading unneeded data on each page.
My approach has that as a side benefit (and I think it's pretty much the same approach).

My target is to allow us to flexibly assign variables to different realms, something we now have to do by hand almost every time. I would also like to clean up the way we override things and give us more flexibility, so that we can do interesting new things in the upper layers without getting bogged down in the back end.

--
Adrian Rossouw
Drupal developer and Bryght Guy
http://drupal.org | http://bryght.com
On 15 Dec 2005, at 9:34 PM, Allie Micka wrote:
I think discussed adding "domains" to the variable table a while back on this list.
Yes,. It was actually the first test DEP that I wrote. You can probably search the mailing list for it.
I admit I didn't take the time to absorb it fully, but It seems that the cascading variable system is about values and overriding them (think vertical), where the lookup stuff is more about segmenting/grouping data (think horizontal)
It's both. You can get the top variable on the stack, or you can get the variable out of just one layer. Without the last parameter to variable_get, it just gets it from the top of the stack.

What it comes down to is that it's a saner data storage mechanism than, for instance, the data column in the users table. There is, though, a definite schism between variables (which change how the system works) and properties (i.e. profile fields and the like). And once we step down that line, we are on CCK's turf.

I envision being able to set $user->settings as an associative array, with the back end using the variables table to store it. Think of user as one of these realms. Same with views, or email, or whatever. It only loads one of the realms when you specify it in your code, and then any lookups against it are cheap.

--
Adrian Rossouw
Drupal developer and Bryght Guy
http://drupal.org | http://bryght.com
On 15 Dec 2005, at 8:24 PM, Allie Micka wrote:
I still have a consistent place to stash these settings ( lookup_set () , lookup_get() ), but I only pull them as needed. An overall memory savings, and handy as well.
In my case it would be: variable_load('realm', $id);
So the user module would, during init, call: variable_load('user', $user->uid); which would load that user's settings into the stack.
The lookup_get() is functionally similar to the variable_get() function, except that it requires a $realm parameter:
Exactly. I am proposing we extend variable_get(), not write a new function.
lookup_get($realm, $name=null); // without a name, it returns a keyed array of the whole realm
Basically, in my proposal it would be:

variable_get($name, $default, $realm = null, $id = null); // without a realm, it returns the first variable at the top of the stack

--
Adrian Rossouw
Drupal developer and Bryght Guy
http://drupal.org | http://bryght.com
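One possible reading of this proposal, sketched under several assumptions: variable_load() pushes one realm/id layer onto a stack, and variable_get() walks the stack from the top down unless a realm (and optionally an id) restricts it to matching layers. The storage array, the 'site' realm, and the sample values are all hypothetical; only the two function signatures come from the message above.

```php
<?php
// Hypothetical storage standing in for the variables table:
// realm => id => name => value.
$GLOBALS['variable_storage'] = array(
  'site' => array(0 => array('theme' => 'default', 'site_name' => 'Example')),
  'user' => array(7 => array('theme' => 'custom')),
);

// Stack of loaded layers; the last element is the top of the stack.
$GLOBALS['variable_stack'] = array();

// Load one realm/id pair from storage and push it onto the stack.
function variable_load($realm, $id) {
  $storage = $GLOBALS['variable_storage'];
  $values = isset($storage[$realm][$id]) ? $storage[$realm][$id] : array();
  $GLOBALS['variable_stack'][] = array(
    'realm' => $realm,
    'id' => $id,
    'values' => $values,
  );
}

// Assumed semantics: search from the top of the stack down, skipping
// layers that do not match the requested realm or id.
function variable_get($name, $default, $realm = null, $id = null) {
  foreach (array_reverse($GLOBALS['variable_stack']) as $layer) {
    if ($realm !== null && $layer['realm'] !== $realm) {
      continue;
    }
    if ($id !== null && $layer['id'] != $id) {
      continue;
    }
    if (array_key_exists($name, $layer['values'])) {
      return $layer['values'][$name];
    }
  }
  return $default;
}

variable_load('site', 0); // bottom layer
variable_load('user', 7); // top layer, overrides 'theme'

echo variable_get('theme', 'none'), "\n";         // prints "custom"
echo variable_get('theme', 'none', 'site'), "\n"; // prints "default"
```

In this sketch a name missing from the top layer falls through to lower layers, which is one way to get the "cascading" override behaviour discussed in the thread; a stricter version could consult only the topmost layer.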
One thing I'm curious about, Allie, is the number of database lookups (which loading all variables intends to reduce). How does your module handle this? Are all of your lookups done via a centralized function? Or do you sometimes do custom queries on the table itself? Or both? How many queries are the result?

I once suggested that the variable table should get a "realm" column and that variable_get should load and static cache all the variables one realm at a time. This was still in the context of "realm" being synonymous with "module".

Does your module have a strategy for caching and reducing db queries? Would there be any use in loading all the variables in a realm at once, with the thought that the others in the same realm would probably be needed and the subsequent queries should be avoided?

-Robert
On Dec 16, 2005, at 6:45 AM, Robert Douglass wrote:
I once suggested that the variable table should get a "realm" column and that variable_get should load and static cache all the variables one realm at a time.
That's what it does.

lookup_get( realm, name=null ) {
  static lookup = array()
  if (! isset(lookup[realm])) {
    populate lookup[realm] array
  }
  if (name)
    return lookup[realm][name]
  else
    return lookup[realm]
}

Only in real code :)

Again, the original intention of the lookup module was not to use it for variables, but to store arbitrary name/value entries. One could argue for or against using it to store variables/settings/preferences information. Personally, I'm on the fence. When I considered doing that, I realized I was creating ambiguity between variables and lookups, so I asked here.

What I'm taking from this conversation is that there's work in progress that makes variable setting more flexible (and possibly more efficient), but it will almost certainly not make it into 4.7 and is not part of a short-term solution.

I'll have to decide what's best for my needs at this time: sucking up the memory bloat or setting a weird dependency for modules.

Allie Micka
pajunas interactive, inc.
http://www.pajunas.com
scalable web hosting and open source strategies
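The per-realm static cache described in this message might be written as follows. This is a sketch rather than the module's actual code: lookup_load_realm() is a hypothetical stand-in for the database query, and the global counter exists only to show how many loads occur.

```php
<?php
// Counts how many times a realm is fetched from the (simulated) database.
$GLOBALS['lookup_query_count'] = 0;

// Hypothetical stand-in for a "SELECT ... WHERE realm = ?" query.
function lookup_load_realm($realm) {
  static $table = array(
    'state' => array('MN' => 'Minnesota', 'TX' => 'Texas'),
    'country' => array('US' => 'United States'),
  );
  $GLOBALS['lookup_query_count']++;
  return isset($table[$realm]) ? $table[$realm] : array();
}

function lookup_get($realm, $name = null) {
  // The cache lives for the rest of the request, keyed by realm.
  static $lookup = array();
  if (!isset($lookup[$realm])) {
    // First request for this realm: load every row of it in one go.
    $lookup[$realm] = lookup_load_realm($realm);
  }
  if ($name === null) {
    return $lookup[$realm];
  }
  return isset($lookup[$realm][$name]) ? $lookup[$realm][$name] : null;
}

lookup_get('state', 'MN');
lookup_get('state', 'TX');
echo $GLOBALS['lookup_query_count'], "\n"; // prints "1": both calls share one load
```

A realm that is never requested is never loaded, which is the memory saving under discussion, and repeated lookups within a realm cost one query in total.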
On 16 Dec 2005, at 4:59 PM, Allie Micka wrote:
Again, the original intention of the lookup module was not to use it for variables, but to store arbitrary name/value entries.
Variables are arbitrary name/value entries. =)

--
Adrian Rossouw
Drupal developer and Bryght Guy
http://drupal.org | http://bryght.com
participants (6):
- Adrian Rossouw
- Allie Micka
- Chris Johnson
- Donald A. Lobo
- Khalid B
- Robert Douglass