[drupal-devel] [feature] 'View terms' for vocabularies with > 25 terms
Chris Johnson
drupal-devel at drupal.org
Thu May 19 23:02:02 UTC 2005
Issue status update for http://drupal.org/node/20505
Project: Drupal
Version: cvs
Component: taxonomy.module
Category: feature requests
Priority: normal
Assigned to: Morbus Iff
Reported by: Morbus Iff
Updated by: Chris Johnson
Status: patch
I think nedjo states the problem clearly.
I've got a site where, when it is fully implemented, it could have
thousands of terms. Imagine dividing the U.S. by state, then county,
then location (city) name. Dividing by state and then location gives as
many as 12,084 locations in the state of Pennsylvania, for example. Dividing
the same data into counties, I still have 9 counties with more than 500
locations (terms).
I'm sure others have similar large datasets they wish to make use of in
their Drupal-based web sites.
Chris Johnson
Previous comments:
------------------------------------------------------------------------
April 14, 2005 - 10:31 : Morbus Iff
Attachment: http://drupal.org/files/issues/_p_25termvocab.patch (1.28 KB)
There was some small talk about this here and there during my
folksonomy/free tagging patch, but I held off until now to actually
implement it. Much like we handle free tagging vocabularies ("this is a
free tagging vocabulary: view terms", which leads the admin into the
taxonomy pager system), this patch does the same thing for NON-free
tagging vocabularies, but ONLY if the vocabulary has more than 25
terms.
These patches were made during the exploration and customization of
Drupal by http://www.NHPR.org. In loving support of open source
software, http://www.NHPR.org will continue to contribute patches they
feel the community will benefit from. Questions about this patch should
be directed to morbus at disobey.com.
------------------------------------------------------------------------
May 2, 2005 - 05:32 : Dries
It would make sense if "free tags" were also shown when there are fewer
than 25 terms, no?
------------------------------------------------------------------------
May 2, 2005 - 12:53 : Morbus Iff
My concern with that is how to actually get the term count quickly for a
vocabulary that has 8000 terms. On my installation (with 8000+ terms),
it takes 360ms (per devel.module) for get_tree(). Following through
with the same approach as the code in this patch (using
get_tree() to return an array of countable terms), we'd be wasting
360ms just to get a count of free tagging terms (more or less - I've
been generally assuming that the very smallest free tagging vocab would
be 100 terms, and the largest I've worked with being 8000). The only
other countable equivalent would be a brand new SQL statement that'd
just return the number of terms in a vocabulary (similar to
term_count_nodes, only vocabulary_count_terms). I could certainly add
this in, but for consistency / pattern sake, I'd want to use it IN
PLACE of the get_tree side-effect/counting as demonstrated in this
patch (so, count first, then get_tree when necessary). This would cause
NUM_VOCABULARIES more queries than currently, but would certainly add up
to less than returning an entire tree of 8000 terms just to throw it away
again. Thoughts?
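A minimal sketch of the "count first, then get_tree when necessary" idea described above. It is written in Python against an in-memory SQLite database only so that it is self-contained; it is not Drupal code. The simplified term_data table, the load_terms() stand-in for get_tree(), and the demo data are assumptions for illustration, and vocabulary_count_terms is the name proposed in this comment, not an existing function.

```python
import sqlite3


def vocabulary_count_terms(conn, vid):
    """Count the terms of one vocabulary with a single aggregate query."""
    row = conn.execute(
        "SELECT COUNT(*) FROM term_data WHERE vid = ?", (vid,)
    ).fetchone()
    return row[0]


def load_terms(conn, vid):
    """Hypothetical stand-in for get_tree(): fetch every term row."""
    return conn.execute(
        "SELECT tid, name FROM term_data WHERE vid = ? ORDER BY name", (vid,)
    ).fetchall()


def terms_for_overview(conn, vid, threshold=25):
    """Return the term list, or None when the vocabulary is too large.

    The cheap count runs first; the full term list is only loaded when
    the vocabulary is small enough to be listed inline.
    """
    if vocabulary_count_terms(conn, vid) > threshold:
        return None  # caller would show a "view terms" link instead
    return load_terms(conn, vid)


def demo_connection():
    """Build an in-memory database with assumed, simplified columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT)"
    )
    rows = [(1, "term %d" % i) for i in range(30)]
    rows += [(2, "term %d" % i) for i in range(10)]
    conn.executemany("INSERT INTO term_data (vid, name) VALUES (?, ?)", rows)
    return conn
```

With the demo data, vocabulary 1 (30 terms) would get only the link, while vocabulary 2 (10 terms) would still be listed in full. The cost is one extra small query per vocabulary, as noted above.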
------------------------------------------------------------------------
May 3, 2005 - 08:26 : Jaza
Two suggestions:
1. Why >25? Anything special about the number 25? +1 For the idea of
having 'view terms' for vocabularies with a large number of terms, but
I don't see why 25 is necessarily where 'large' begins. This number
should be a setting somewhere (not sure where, since taxonomy has no
options page of its own). 25 can be the default option - it sounds
reasonable to me (but maybe not to all webmasters). Perhaps make it so
that if '0' is entered as the number, all terms are always displayed
for that vocabulary (unless free tagging is enabled).
2. Taking this concept a bit further, it might be good to have an
option: "enable free tagging for vocabularies when the number of terms
reaches x". x could be the same variable as the one above, but I
suggest that it be stored separately, for a more flexible
configuration. As with suggestion 1, it may be hard finding a settings
page on which this option belongs. Also, this might be taking it too
far - could be one setting too many? In terms of performance, there
should be no problem, since free tagging vocabularies do not need to
have their terms counted - only others do.
Number 2 could also be implemented on a per-vocabulary basis (number 1
also could, but that would be silly), but that would mean another
column in the vocabulary table, especially for free tagging. And if
that can be avoided, it should.
Jaza
------------------------------------------------------------------------
May 3, 2005 - 08:39 : Morbus Iff
25 is an arbitrary number based on the assumption that there'll be at
least two vocabularies in a site (one for the site itself, and one for
the forums). A single page with a maximum of 48 items on it is quite
large already, and that doesn't take into consideration other
user-created vocabularies that could increase the size of the page (2
other vocabularies that have 10 items each, for instance, could create
a 68-item sized page). I'm against a user-settable option just for the
sake of a user-settable option. #2 is a whole 'nother feature request
and should be its own separate Issue. I won't respond to it here. In
short, however, I don't like it.
------------------------------------------------------------------------
May 4, 2005 - 03:20 : Bèr Kessels
-1 on yet another option. We must decide what 'big' is. If someone
really has objections about our idea of 'big' she should change that in
the code. Another option is not an option, IMO.
Morbus, what about using variable_get/set? If you detect the number
only for very special cases, and find that there are more than 25, you
do variable_set('taxonomy_is_large', array($vid => TRUE)); if it's
detected < 25, variable_set('taxonomy_is_large', array($vid => FALSE)).
Where exactly you can set that, I do not know, but from that moment on
you only need to check against that variable, on all the places where a
tree would be rendered.
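A rough sketch of the cached-flag idea above, in Python rather than Drupal's PHP so that it runs on its own. The plain dict stands in for the variable_get/variable_set store, and update_large_flag and is_large are hypothetical helper names. One assumption differs from the fragment in the comment: the flag map is merged rather than replaced, so that recording one vocabulary does not discard the flags already stored for the others.

```python
LARGE_THRESHOLD = 25  # assumed cut-off, taken from the patch title


def update_large_flag(variables, vid, term_count, threshold=LARGE_THRESHOLD):
    """Record whether vocabulary `vid` is large, keeping other flags."""
    # Copy the existing map so flags for other vocabularies survive.
    flags = dict(variables.get("taxonomy_is_large", {}))
    flags[vid] = term_count > threshold
    variables["taxonomy_is_large"] = flags
    return flags[vid]


def is_large(variables, vid):
    """Cheap lookup used wherever a term tree would be rendered."""
    return variables.get("taxonomy_is_large", {}).get(vid, False)
```

The flag would have to be refreshed whenever terms are added or removed; where that hook belongs is the open question raised above.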
------------------------------------------------------------------------
May 18, 2005 - 12:20 : Morbus Iff
Attachment: http://drupal.org/files/issues/_p_25termvocab_0.patch (1.32 KB)
Does anyone else have comments on berkes' suggestion in #5? Dries?
Perhaps the more adequate question is: is a "free tagging" vocabulary
really useful if it has 25 terms or less? Isn't that more in the realm
of a controlled vocabulary? If the user really is typing in the same 25
terms over and over again, that's gotta be an indication that they can
save time with a dropdown, right? And if the free tagging vocabulary is
just starting out, isn't there an assumption that it will eventually
grow to be larger than 25 terms, in almost every use case? I guess my
definition of a "free tagging" vocabulary ALSO includes the assumption
that it'll grow to be $large, where $large is AT LEAST 100 terms. I've
never seen any folksonomy smaller than that, save for those just
starting out. Is that a valid assumption? If it is, should it be
treated as a "speshul" type of vocabulary, one which doesn't need a
check on how many terms are actually in it, especially when said check
becomes more prohibitive (in added code as well as processing time) as
the vocabulary grows?
(Attached an updated, but unchanged, patch to sync with HEAD).
------------------------------------------------------------------------
May 19, 2005 - 13:31 : Dries
Personally, I'm not tempted to commit this patch. First, an 8,000-term
vocabulary is not very realistic. It simply doesn't work. While
this patch 'fixes' the administration page in such a scenario, it
doesn't fix any of the other pages (like those having to render a
selection menu with 8,000 terms). Morbus, is this _really_ needed?
------------------------------------------------------------------------
May 19, 2005 - 14:20 : Morbus Iff
Note that the original intent of this patch was ONLY for NON-FREETAGGING
vocabularies, and was intended SOLELY to stop multiple, large,
NON-FREETAGGING vocabularies from making the vocabulary page abnormally
long. An 8000 term NON-FREETAGGING vocabulary is quite unrealistic, yes
- that wasn't the "baseline" for this patch. The baseline for this
patch was two or more NON-FREETAGGING vocabularies that had more than
25 terms each, of which I have three in the NHPR site. 75 terms on one
page create a rather long vertical scroll that I didn't really want or
like, especially when I had a fourth vocabulary that was alphabetized
last, and thus, was four or five pages below the fold.
So, again, the patch was catered for NON-FREETAGGING vocabularies. An
8000 term FREETAGGING vocabulary is quite possible actually (witness
flickr, delicious and technorati - all community sites that use
folksonomies and all have a rather wild number of terms, and my NHPR
conversion has, now, 9000 terms). Whether you think 8000 terms in a
FREETAGGING vocabulary is realistic or not, that's the reality in four
sites, three of which created this whole "folksonomy" meme. But, again,
this patch wasn't catered to that - the existing code in core ALREADY
ASSUMES that folksonomies will grow large, and thus, doesn't attempt to
show them without the "view terms" pagers. This patch was SOLELY created
to prevent vertical disruption of the vocabulary page on the off chance
that NON-FREETAGGING vocabularies grew larger than "normal".
------------------------------------------------------------------------
May 19, 2005 - 15:34 : nedjo
The general question would be, how do we differentially display object
references (nodes, terms, users) depending on the number available? As
things stand, we don't do this, but we do take different approaches
based on object type. Consider two examples from form elements.
(a) Users. When we want the user to designate a user, we present a
text box (rather than a select), assuming there will be too many users
to present in a select.
(b) Terms. We present a select, assuming there will be few enough
terms.
But neither of these assumptions holds for all sites. I've got several
sites with few enough users to fit handily in a select, whereas we've
heard there are plenty of sites with too many terms for a select.
+1 for the general answer of thresholds and differential display. The
proposed patch is a good start. I'd prefer to have the threshold
number pulled out into (yes) a configuration option, ideally a general
one that is for all object types. This would enable future work on
lists of other types.
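As an illustration only, threshold-based differential display could look something like the following Python sketch. The function name choose_widget, the widget labels, and the default threshold of 25 are assumptions made for the example, not part of the proposed patch.

```python
def choose_widget(item_count, threshold=25):
    """Pick a form element for referencing users, terms, or nodes.

    Up to `threshold` items fit in a select list; beyond that, fall
    back to a text box, regardless of the object type.
    """
    return "select" if item_count <= threshold else "textfield"
```

Under this sketch a site with 10 users would get a select for users, and a vocabulary with thousands of terms would get a text box, which reverses both of the per-type assumptions described in (a) and (b).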