[drupal-devel] [feature] Add Folksonomy, or "Free Tagging", to Taxonomy

Morbus Iff drupal-devel at drupal.org
Mon Apr 11 21:09:41 UTC 2005

Issue status update for http://drupal.org/node/19697

 Project:      Drupal
 Version:      cvs
 Component:    taxonomy.module
 Category:     feature requests
-Priority:     normal
+Priority:     critical
 Assigned to:  Morbus Iff
 Reported by:  Morbus Iff
 Updated by:   Morbus Iff
-Status:       fixed
+Status:       patch
 Attachment:   http://drupal.org/files/issues/quickie.patch (583 bytes)

Somehow, taxonomy_node_delete got removed in my last patch (I think I
was in the midst of testing another patch directly related to
taxonomy_node_delete). This puts it back in. Critical patch - without
it, people can't edit nodes they've posted (as the DB will complain
about duplicate indexes).
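
To make the failure mode above concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not Drupal code: the `term_node` schema (a unique key over `tid` and `nid`), `make_db` and this `save_node_terms` are assumptions made for illustration. It shows why re-saving a node's terms without first deleting the old rows trips the unique index.

```python
import sqlite3


def make_db():
    """Create a hypothetical term_node table with a unique (tid, nid) key."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE term_node (nid INTEGER, tid INTEGER, PRIMARY KEY (tid, nid))"
    )
    return db


def save_node_terms(db, nid, tids, delete_first=True):
    """Store the term ids for one node.

    With delete_first=True the old associations are removed before the new
    ones are inserted, which is the role the restored delete call plays here.
    """
    if delete_first:
        db.execute("DELETE FROM term_node WHERE nid = ?", (nid,))
    db.executemany(
        "INSERT INTO term_node (nid, tid) VALUES (?, ?)",
        [(nid, tid) for tid in tids],
    )
```

Calling `save_node_terms(db, 1, [10], delete_first=False)` on a node that already carries term 10 raises `sqlite3.IntegrityError`, the analogue of the duplicate-index complaint on edit.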

Morbus Iff

Previous comments:

March 30, 2005 - 10:16 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all.patch (19.53 KB)

This patch adds folksonomy support to Drupal (named internally as "Free
tagging"). In a nutshell, the core difference is the input method:
unlike normal taxonomies which are administratively controlled, a "free
tagging" vocabulary allows tag creation when the node is submitted. It
does this through a text input box, as opposed to a dropdown or
selectbox. This patch:

Removes the useless "Preview form" of a vocabulary.
Alters the vocabulary table to include a new "tags" column.
Adds a new "Free tagging" preference on vocabulary creation/editing.
Modifies the vocabulary overview to support pagers for free tagging

The new code integrates tightly with the existing taxonomy code. The
only additional processing occurs on node save and edit, where we parse
through the tags associated with a node. All other display (and thus,
code) remains the same. The following screenshots illustrate the
changes, integration, and workflow:

Create/edit vocabulary screen. [1]
Create/edit a node. [2]
Result of previous screen. [3]
The new admin/taxonomy. [4]
Clicking on "view terms". [5]
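
The processing on node save described above can be sketched roughly as follows. This is a hedged Python illustration, not the patch itself: the `Vocabulary` class, `find_or_create_term` and `tids_for_input` are hypothetical names, and the plain comma split ignores any quoting rules. It only shows the idea that a free-tagging vocabulary creates missing terms on submit, and that those terms end up stored like any other term.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """Hypothetical in-memory stand-in for a vocabulary and its terms."""
    name: str
    tags: bool = False                          # mirrors the new "tags" column
    terms: dict = field(default_factory=dict)   # term name -> term id
    next_tid: int = 1


def find_or_create_term(vocab, name):
    """Return the id of a term, creating it only in free-tagging vocabularies."""
    if name in vocab.terms:
        return vocab.terms[name]
    if not vocab.tags:
        raise ValueError(f"{name!r} is not a term in vocabulary {vocab.name!r}")
    tid = vocab.next_tid
    vocab.next_tid += 1
    vocab.terms[name] = tid
    return tid


def tids_for_input(vocab, text):
    """Map a comma-separated tag string to term ids, skipping empty entries."""
    names = [part.strip() for part in text.split(",")]
    return [find_or_create_term(vocab, name) for name in names if name]
```

Because new tags land in the same `terms` mapping as administratively created ones, nothing downstream needs to know how a term was entered.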

These patches were made during the exploration and customization of
Drupal by http://www.NHPR.org. In loving support of open source
software, http://www.NHPR.org will continue to contribute patches they
feel the community will benefit from. Questions about this patch should
be directed to morbus at disobey.com.
[1] http://disobey.com/detergent/2005/drupal_folkdef.jpg
[2] http://disobey.com/detergent/2005/drupal_folknodeedit.jpg
[3] http://disobey.com/detergent/2005/drupal_folknodesubmit.jpg
[4] http://disobey.com/detergent/2005/drupal_folkpager1.jpg
[5] http://disobey.com/detergent/2005/drupal_folkpager2.jpg


March 30, 2005 - 10:38 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_0.patch (19.52 KB)

Updated patch to fix some errors in the update.inc change.


March 31, 2005 - 10:27 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_1.patch (19.54 KB)

New patch for check_plain and HEAD. Also removed the term indent under
vocabularies - there was an extra-space issue in regards to
_taxonomy_depth, and I felt it was better to just remove the
(non-standard, non-semantic) indent I originally added during the move
to tabular display.


March 31, 2005 - 13:19 : Anonymous

A big +1 from me! This patch is going to be *extremely* useful for
modules like "image" that involve frequent creation of new taxonomy

I've been testing this patch extensively for a couple of days. I love
the fact that it is totally non-intrusive onto existing sites if the
admin doesn't want to use it when creating vocabularies, and that the
free-tagged terms become "ordinary terms" in the database structure,
with no special-case table.

Morbus Iff has done a great job of adding a powerful feature without
breaking anything, as far as I can tell.

I can see lots of ways in which this can evolve in the future, such as
more fine-grained security so that some users can add a given node type
without being able to add new free-tagged terms (i.e., that class of
users would have to pick from existing terms only, even if the
vocabulary allows higher-privileged users to add free terms). I can
also see a place for user-owned free-tag vocabularies that are
dedicated to their personal image albums. But these features could be
added in a future release and still be backward-compatible with what
Morbus has done now. That being the case, I suggest that this patch be
accepted into core.


March 31, 2005 - 13:20 : syscrusher

Comment #3 was from me (syscrusher). Sorry I forgot to login first.


March 31, 2005 - 15:12 : Dries

I'll commit this patch to core as soon as CVS HEAD is opened up for
development.  For now, I'm awaiting feedback from the usability folks. 
I'm also left wondering how this would affect taxonomy-based permissions
-- I don't think that should be a problem but it is somewhat

I haven't tested the patch yet, but I glanced at the code quickly:

1. Don't use the word 'node' in user output.  Use 'post'.

2. The words 'term' and 'tags' are both used in user output.  This
might be confusing, but I don't see an easy way around it. The way it
is used makes sense, so it might be a non-issue. 

3. Some extra documentation might be in order.  The explanation of
'free tagging' is quite technical.  For example, I don't understand the
following bit: "Allows the creation of a vocabulary during content
creation, as well as through the normal administrative means.".  I'd
rather see it explain the difference/advantage/drawbacks to help me
decide whether to enable 'free tagging' or not.  Make the documentation
more task-minded.

4. I don't like the way you manipulate the pager's global variables. 
The paging code is a bit of a hack, it seems.

5. Spacing: we write 'foreach (' not 'foreach('.

6. We usually write code comments above the code, not after the code on
the same line.  This is really minor as I'm sure we don't do this
consistently.  Some code comments are rather cryptic and didn't help me
much.  Maybe give your code comments some love.

That's all for now.


March 31, 2005 - 15:19 : FactoryJoe at civicspacelabs.org

Attachment: http://drupal.org/files/issues/drupal_folkdef.png (21.7 KB)

In setting up the folksonomy, you present far too many options to the
user. I tried to cut these extraneous options out but then decided to
redo the whole Vocabulary creation workflow. :)

Go figure.


March 31, 2005 - 15:27 : Morbus Iff

I'll address what I can. A new patch will be forthcoming.

#2: I agree - I previously wanted to keep everything as "term", and
spent a good bit of time on #drupal getting jbond (who has since
disappeared from all discussion) to agree that "term == tag" and the
only difference between the two was their method of input. Eventually,
I felt that "tag" was not only a "term" (and term), but also an action.
I'm not "terming a node", but I'm "tagging it", which is a common sorta
phrase in other folksonomy implementations. Thus, the mixing of the

#4: Yeah, I know it's a hack. I had a comment in there (since removed
due to moshe's suggestion) that I knew it was a hack, and that
integrating with the existing pager code would be uber-difficult based
on hierarchies and the recursive nature of the taxonomy_get_tree.

#6: Heh, heh. Boy oh boy. Wrong thing to say. I often find myself
overdoing it on comments (for example [6]), and the patch as is was a
concentrated effort to /reduce/ the amount of comments I had originally
put in there (see also [7]). I'll take another look at 'em.
[7] http://lists.drupal.org/archives/drupal-devel/2005-03/msg01010.html


March 31, 2005 - 15:36 : Morbus Iff

Regarding #7, it appears factoryjoe wants a far grander rewrite of the
taxonomy UI than this patch purports to do.


March 31, 2005 - 15:59 : Dries

If you can take on such a rewrite (or parts thereof) based on Chris'
suggestions, by all means.  

Depending on the required UI changes, such overhaul might automagically
deal with the pager implementation issues (if the pagers get nuked that


March 31, 2005 - 16:19 : Morbus Iff

Not in this patch. His plans are for 4.7, and include wizards, removing
most all of the checkboxes on that page, and so on and so forth.
Likewise, he's only about 30% of the way there (per #drupal) in his UI
mockups, so not to be considered with this folksonomy patch at all.


March 31, 2005 - 16:22 : Morbus Iff

(Er.. which isn't to say that I think this folksonomy patch is for 4.6 -
I know 4.7 will be its earliest release. But, he's just not ready with
the workflow in his head yet for me to address any of his issues. And,
based on discussion in #drupal, I'm not sure /I/ want to be the one
implementing the changes, much less agreeing with them, he has planned
[g]). No offense to him, of course - he's (admittedly) too early in his
thinking to take on all affronts.)


March 31, 2005 - 16:23 : Bèr Kessels

Chris's ideas are great, but should IMO not be confused with this
folksonomy issue.

Can we not focus on getting folksonomy in, be it with a
less-than-perfect UI, and then open a new issue to improve the UI of



March 31, 2005 - 16:44 : Uwe Hermann

+1 from me. While I haven't tested the patch, yet, it looks very good
and I'd really love to see this in 4.6. I hope it's not too late to get
it in...


March 31, 2005 - 17:31 : grohk

I will add my +1 to the pile.  This patch is working well for me.  I
like Chris' mockup, but I also agree with Morbus that his taxonomy UI
ideas are probably beyond the scope of this patch.  I actually like
that Morbus has added this functionality without drastically altering
the taxonomy admin interface.  Bravo.


March 31, 2005 - 17:58 : FactoryJoe at civicspacelabs.org

As far as my workflow changes, yeah, they won't be ready for 4.6. I do
think that getting this into the next release is important though, so
that we have the general functionality.

I have concerns about "multi-select" and related terms though. I mean,
those shouldn't be user options -- those should be allowed by default.

Also, I talked to Drumm at lunch about flat-lists vs tree-hierarchy and
we seem to agree that it's a needless distinction. It's better to have a
"tree-like hierarchy" (or controlled vocabulary/outline list) and "free
tagging" as the main distinctions. Because it might make sense to make
your flat-list a hierarchy later on, but not so much your tags. (even
though Morbus tends to see a use for making a hierarchy out of free

But this later discussion probably belongs somewhere else...

So I'm pretty much okay with this moving forward with the understanding
that the categories UI needs a wizard-like overhaul for 4.7.


March 31, 2005 - 22:34 : Jaza

+1 from me for this patch. The interface is so simple - just one text
box - and yet so powerful. Being able to add terms at node creation is
a critical feature for the next release, IMO, but I never envisioned it
being implemented so cleanly and intuitively. Great work, Morbus!

However, three problems/shortcomings that I found with the patch:

1. You cannot create new sub-terms using free tagging (yet). Say I have
an existing term called 'sporting news', and I am writing an article
about soccer and baseball. The term 'soccer' already exists as a
subterm of 'sporting news'. I have no existing term for baseball. I
would like to be able to enter into the text box:
"sporting news->soccer, sporting news->baseball"
And it would create 'baseball' as a new subterm (and assign the
existing term 'soccer'). With the current patch, the only way to do
this is to create the terms with free tagging, and then to go into the
admin interface and make them subterms.

2. I don't like the new "view terms" link. I prefer having my vocabs
and terms all listed together. In fact, when I made my first term using
free tagging, I went into the admin interface, and couldn't work out
where my term was, until I saw the "view terms" link. So this is a
usability issue - Drupal admins are accustomed to the current layout of
the categories page, and may find the change cumbersome. Perhaps make
this a setting?

3. The free tagging text box should IMO be displayed AS WELL AS, not
instead of, the regular 'select term(s)' list box. Surely users will
want to select an existing term, rather than typing it? And if they do
type it, they should be able to check (as they type) that they're
spelling it correctly. Also, users should be able to see what existing
terms there are, so that they don't create new terms that are virtually

I also agree (with the person that already said it) that new
permissions are needed as part of this patch.


March 31, 2005 - 23:12 : Morbus Iff

Addressing Jaza:

#1: There was some discussion about this, and it was eventually put on
the backburner [8].

#2: The goal of the new "view terms" was to address the immense size
that a tag vocabulary can grow to. It is quite easy to have these
vocabularies become 100+, continuing to grow ever bigger. Having the
vocabulary admin screen contain 1000+ terms all at once is a problem.
This is less of an issue with a controlled vocabulary, which is why
those terms are still displayed inline (while it IS possible to create
a controlled vocabulary of 1000+ terms, it is a bit more unlikely than
with a folksonomy). There was /some/ talk of "if vocabulary term count
is less than 50, always display inline", but I couldn't readily solve
the UI issue of what happened between term #49 and term #50 (49 terms
are displayed inline, you add a 50th one, go back to the overview
screen and "wtf?! all my tags are deleted! arrrRRgh!"). The nearest fix
to that was a "There are more than 50 terms in this category. View all
terms." message, similar to the "No terms" one [9]. If people feel
this is a proper way of handling it, then I'll roll it into this patch.
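
The display rule floated above can be sketched as follows. This is a hedged Python illustration only: `INLINE_LIMIT`, `overview_lines` and the exact "No terms." wording are assumptions, and the threshold was only discussed, not implemented in the patch.

```python
INLINE_LIMIT = 50  # threshold mentioned in the discussion, not part of the patch


def overview_lines(name, terms, free_tagging):
    """Return the text lines an overview page might show for one vocabulary."""
    lines = [name]
    if not terms:
        lines.append("  No terms.")  # wording assumed for illustration
    elif not free_tagging or len(terms) < INLINE_LIMIT:
        # Controlled vocabularies, and small tag vocabularies, list terms inline.
        lines.extend("  " + term for term in terms)
    else:
        # Message text as quoted in the comment above; at exactly INLINE_LIMIT
        # terms "more than" is slightly off, which a real version would adjust.
        lines.append(
            f"  There are more than {INLINE_LIMIT} terms in this category. "
            "View all terms."
        )
    return lines
```

With this rule, going from 49 to 50 tags swaps the inline list for a single explanatory line, so the terms do not appear to have vanished.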

#3: For the same reason as #2: a vocab with 1000+ terms makes a
dropdown prohibitive. As for similar keywords, I plan on making a
third-party module that provides this sort of "similar keywords" GUI
[10] for tag-based vocabularies, using the "Related terms" feature of a
taxonomy term.
[8] http://lists.drupal.org/archives/drupal-devel/2005-03/msg00733.html
[9] http://disobey.com/detergent/2005/drupal_folkpager1.jpg
[10] http://disobey.com/detergent/2005/similar_keywords.jpg


April 1, 2005 - 10:19 : Dries

Postponing the UI changes is OK, though I'd still like to see if we can
make the pager stuff less of a hack.

If you have 1000+ terms you probably don't want to manage them as
regular terms.  You'll want to do things like searching terms (eg.
"search for term "*John*"), merging terms (eg. "merge term 'Governer
John Lynch' into term 'John Lynch'"), sort terms by popularity (eg.
"what terms are used only once?") and act upon them in batch mode. 
Eventually, that might also impact the UI.
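
The batch operations listed above (searching, merging, and ranking by popularity) can be sketched over a plain set of (term, node id) pairs. This is a hedged Python illustration; the function names and the data layout are hypothetical and do not reflect Drupal's schema.

```python
from collections import Counter
from fnmatch import fnmatchcase


def search_terms(term_names, pattern):
    """Return the term names matching a shell-style wildcard such as '*John*'."""
    return [name for name in term_names if fnmatchcase(name, pattern)]


def merge_terms(term_node, source, target):
    """Return a new set of (term, node id) pairs with source folded into target."""
    return {(target if term == source else term, nid) for term, nid in term_node}


def terms_by_popularity(term_node):
    """Return (term, usage count) pairs, most used first, ties broken by name."""
    counts = Counter(term for term, _ in term_node)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def terms_used_once(term_node):
    """Return the sorted names of terms attached to exactly one node."""
    return sorted(t for t, count in terms_by_popularity(term_node) if count == 1)
```

Because `merge_terms` works on a set, a node that carried both the source and the target term ends up with a single association after the merge.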

I also wonder how the folksonomy module affects the various taxonomy_*
modules as well as the content filter on the 'admin/content' page.

This is going to be interesting. ;-)


April 1, 2005 - 10:33 : Morbus Iff

Dries: the big problem with the pager() stuff is vocabulary hierarchies,
which is what _get_tree gives us (along with additional depth and parent
attributes). I could probably reduce the hack's size by using
pager_query() with a throwaway SQL statement, but I'd also have to
throw away the returned database $result, since it wouldn't be useful
(no hierarchy information). On the order of hacktitude, though, this is
probably as equal a hack, only smaller (and possibly more expensive,
since it'd be another db pull). Thoughts?


April 1, 2005 - 10:38 : moshe weitzman

personally, i think a pager rewrite is out of scope for this patch. it
is also a very minor part of the patch. during the next release cycle
someone can go in and improve pager's API for handling collections that
are not SQL query result sets.
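
As a rough idea of what paging over a collection that is not an SQL result set could look like, here is a hedged Python sketch. `flatten_tree` and `page_of` are hypothetical names and the nested-dict tree is an assumption; Drupal's pager code works differently.

```python
def flatten_tree(tree, depth=0):
    """Yield (depth, name) pairs for a nested dict of terms, parents first."""
    for name, children in tree.items():
        yield depth, name
        yield from flatten_tree(children, depth + 1)


def page_of(items, page, per_page):
    """Return (rows, total_pages) for a zero-based page of an in-memory list.

    Out-of-range page numbers are clamped to the nearest valid page.
    """
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    page = min(max(page, 0), total_pages - 1)
    start = page * per_page
    return items[start:start + per_page], total_pages
```

Flattening first keeps the depth information that the hierarchy needs, and slicing the flattened list afterwards avoids the extra throwaway query discussed above.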


April 1, 2005 - 11:02 : syscrusher

Some comments on the tree hierarchy issue and also release schedules...

Comprehensive tree hierarchy support is nice if feasible, but I don't
think it's essential for the patch to begin being very useful. As a
simple, interim workaround, I would suggest using a slash or backslash
(accept either of them, for maximum user-friendliness) as a delimiter
between levels, and handle the hierarchy internally. For example, if
the tree now looks like this:
-one A
-one B
-one C
-two A
--two A one
--two A two
-two B

then I could put "one/one A/newterm ABC, two/two A/two A one/newterm
IJK,newterm XYZ" into the tags field to make it look like this
-one A
--newterm ABC
-one B
-one C
-two A
--two A one
---newterm IJK
--two A two
-two B
newterm XYZ

(In my examples, I didn't use any backslashes because I wasn't sure how
they would render in the issue, but the idea is that the two punctuation
marks are treated as equivalent. Also, as a Linux maven I found myself
instinctively putting a trailing backslash on things that "felt" like a
directory path, so it would be wise to trivially trim trailing
punctuation from each term when validating, because I'll bet I'm not
the only one who would do that.)
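
The delimiter handling proposed above could look roughly like this. It is a hedged Python sketch with hypothetical function names: slashes and backslashes are treated as equivalent, and empty components are dropped, which also takes care of trailing delimiters.

```python
import re


def split_tag_path(tag):
    """Split one tag on '/' or '\\' and drop empty components."""
    parts = [part.strip() for part in re.split(r"[/\\]", tag)]
    return [part for part in parts if part]


def parse_tag_field(text):
    """Turn a comma-separated tags field into a list of term paths."""
    paths = (split_tag_path(tag) for tag in text.split(","))
    return [path for path in paths if path]
```

A tag without any delimiter simply becomes a one-element path, so users who never type a slash see no change in behaviour.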


* Novice users can just ignore this admittedly-advanced feature, and
  still use Morbus' most elegantly simple UI with no changes.
* Syntactically, looks like disk directories, which makes it a little
  more intuitive to users than other delimiters like "->" or "::" that I
  have seen similarly used in other software (and which are "intuitive"
  only to programmers).
* Relatively modest code addition to the existing patch.
* Adds, but doesn't /change/ anything in the existing UI, nor require
  database schema changes, so it can be added /after/ initial release of
  the module without requiring user-retraining or upgrade scripts.


What to do if the user mistypes an ancestor path component?

* Add it as-is, as if doing "mkdir -p /pathname/" in Linux /et al/?
* Issue an error message and ignore that term? (Node gets created with
  one free-tag term missing.)
* Issue an error message and force correction? (Annoying to novices, but
  probably only advanced users would use this hierarchy feature
  anyway...and the message could helpfully list all of the /existing/
  terms at the failing tree level. The user would still manually type
  what they want when correcting the entry, but now they'd have a guide
  in the error message text to help them know what they mistyped.)
* Try to intelligently match to a near-miss? (Complex code! SOUNDEX might
  help here.)
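
Three of the options above can be compared side by side in a short sketch. This is hedged Python with a hypothetical `add_path` function over a nested-dict tree; note that the near-miss branch uses `difflib.get_close_matches` from the Python standard library as a stand-in, not SOUNDEX.

```python
import difflib


def add_path(tree, path, on_missing="create"):
    """Add the last component of a non-empty path under its ancestors.

    on_missing controls what happens when an ancestor does not exist:
      "create"  - add it as-is, in the spirit of mkdir -p
      "error"   - raise, listing the existing terms at that level
      "suggest" - fall back to the closest existing name, else raise
    """
    *ancestors, leaf = path
    node = tree
    for name in ancestors:
        if name not in node:
            close = difflib.get_close_matches(name, list(node), n=1)
            if on_missing == "create":
                node[name] = {}
            elif on_missing == "suggest" and close:
                name = close[0]
            else:
                raise ValueError(
                    f"unknown term {name!r}; existing terms here: {sorted(node)}"
                )
        node = node[name]
    node.setdefault(leaf, {})
    return tree
```

The error message carries the existing terms at the failing level, which matches the idea of giving the user a guide for correcting the entry.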

I still feel that the patch deserves consideration as-is, with only bug
fixes before it is available as a contrib. Morbus has done a great job
of laying a foundation to which we can add all this nifty stuff later
without ripping out his existing UI. Let's leverage that elegant
thinking and get this tool into the hands of site owners. My suggestion
would be a published contrib patch available for 4.6, then let Dries do
what he sees fit with regard to core for 4.7.



April 1, 2005 - 12:24 : FactoryJoe at civicspacelabs.org

At the risk of sounding obtuse, I really think that this module should
be as simple as possible, and then let other modules add functionality
to the new free tagging vocabulary type. And by that I mean that
tree-hierarchies seem to me to be beyond the scope of this patch, even
though you get it "for free".

I know that a good number of developers are going to come out against
this idea, but in all the popular folksonomic systems that I've seen
gain popular following (delicious, flickr, gmail and so on), they only
offer flat tagging and that seems to be sufficient. If you've used a
tree-enabled folksonomy and can point me to a demo, please do -- I
simply have never seen such a thing attempted before in a popular

As a matter of fact, Jaza makes the perfect case for me when he
comments that "You cannot create new sub-terms using free tagging
(yet). Say I have an existing term called 'sporting news', and I am
writing an article about soccer and baseball.... I would like to be
able to enter into the text box: "sporting news->soccer, sporting
news->baseball" And it would create 'baseball' as a new subterm (and
assign the existing term 'soccer')."

This is what a tree-like hierarchy is for and the limitation here is
not going to, nor should it, be fixed with free tagging. Rather, this
is a UI issue with the current implementation of tree-like hierarchies
in Drupal. What Jaza wants is a tree -- so instead of free-tagging, he
should be able to use his existing vocabulary "sporting news" and then
*be able to add a term to the tree inline* instead of having to go all
the way through the round-trip of adding the term in the admin UI. This
is *not* free tagging; this is adding a new leaf to a tree -- free
tagging, as Jaza suggests, would /simply make it more convenient/
without actually solving the real problem.

Honestly, I think putting too much specific implementation stuff into
this patch before we've had time to see how it's going to be used in
the wild is a bad idea. I do, however, think that it's important to
focus on the synonomic, spelling and merging issues that are inherent
problems in folksonomic systems. Related terms are intrinsic in
folksonomies, so I would rather see more work put into that UI issue
than trying to recreate tree-hierarchies.

And I know that, just as I say, "we don't know how this is going to be
used in the wild" I'll get the response that "well then we shouldn't
limit its functionality until we know how it will be used" but I
fervently disagree. That's one of Drupal's out-of-the-box problems. You
get /too much/ stuff! Give people things that are manageable -- things
that I can fit in my pea-brain. And if you need more functionality --
by all means build it! And stick it in a module that I can download
later! But folksonomy is important enough -- and Morbus has done a
really great job keeping it fairly minimal so far -- that I think we
can possibly go just a little further, pull it back some, and make free
tagging work the way it should work instead of being just a more
convenient version of the other types of vocabulary.


April 1, 2005 - 12:36 : moshe weitzman

factoryjoe - all that text and i can't figure out what you are
proposing. you proposing we don't use the vocabulary/taxonomy system
for tags? if so, what is the better option? or maybe you are just
emphasizing the importance of getting the 'relations' stuff right. we
will get there, and taxonomy provides excellent tools for doing so (see
'related terms' and 'synonyms')

folks, lets use *action* statements when we comment on patches. if you
don't like something, make a counter proposal. patch reviews are not
places for chatter.

i apologize for my grumpiness these days. i am seeing way more chatter
on the devel list than focused collaboration on core functionality.


April 1, 2005 - 13:21 : Morbus Iff

I think his points were:

* hierarchy in folksonomy = bad; don't work on it.
* minimal patch to start = good; let's see what people do.
* related terms = good; (my screenshot [11]).

[11] http://disobey.com/detergent/2005/similar_keywords.jpg


April 1, 2005 - 13:40 : FactoryJoe at civicspacelabs.org

Yes, Morbus summarized it pretty well (sorry for sounding chatterish). I
just don't want to see this excellent functionality muddled in its
initial release, so the question of hierarchies, I feel, cannot be
answered with the first release and so should not be included.

Tags should represent flat lists -- labels -- and nothing more (for now
-- let other modules add to that later).


April 1, 2005 - 13:45 : Anonymous

Q: "If you've used a tree-enabled folksonomy and can point me to a demo,
please do".
A: Category in Wikipedia.
Which probably doesn't help us at all here.
I'm back. I'll take a look at this over the weekend, but my guess is
I'm going to
do +1 and then niggle about specific code detail.


April 1, 2005 - 15:25 : moshe weitzman

/ so the question of hierarchies, I feel, cannot be answered with the
first release and so should not be included/

OK, but take this a step further. what are the implications of this
course of action? Should Drupal actively force all free tag vocabs to
be non hierarchical? An admin can choose that configuration already.
And that configuration is the default. Why write more code to disallow
something that is optional and useful for some people? Or perhaps you
are proposing to divorce free tags from taxonomy entirely.

As you can see, I don't think this course of action stands up to


April 1, 2005 - 21:59 : Jaza

My response to FactoryJoe's strong rejection of hierarchies in

1. *grumble grumble*... *mutters under breath*... *curses audibly* :(

2. Yeah, well, I guess I can see where you're coming from, in terms of
usability. You're right, hierarchical tagging is something that most
users won't be interested in. And for that vast majority, introducing
this advanced feature will confuse rather than conduce (productivity,
that is). You have a point, Drupal's current out-of-the-box features
are already daunting for many, and we want to reduce this problem, not
aggravate it. So although I would love this feature, I do admit that
I'm probably in the minority. As the leader of CivicSpace - which is
targeting non-tech-savvy users more aggressively than Drupal ATM - you
are one of the best placed people around here to comment on usability.

So maybe I'll just have to hack the extra functionality into a separate
module, as you suggested ;-).


April 2, 2005 - 02:07 : jbond

The only hierarchical free tagging system I know of is Categories in
Wikipedia. del.icio.us does actually make it possible and some people
did try using tags like this "branch1/branch2/leaf" but there's no
specific UI to support it. My objection to trying to implement this now
in Drupal is that nobody knows what the UI should look like. If we use a
simple text field with comma delimited terms, how would the user
indicate that a term they enter is actually part of a tree? Wikipedia
gets away with this by having *no* UI, Categories are just another


April 2, 2005 - 02:43 : jbond

Thoughts on the patch.

1) The regex to support quotes around terms that include a comma is
really good. Explaining this to the user in the description of the term
field may be hard but needs to be added.

2) In del.icio.us and flickr, related tags is an important part of the
navigation. This is simpler and  different from Morbus' example
screenshot. The obvious place to cache this information is in
{term_relation}. The question is when to cache it. For the sake of
keeping the taxonomy patch simple, it could be done by a contrib module
using cron. But the obvious place to do it is just after
taxonomy_node_save() has finished and the obvious input is an array of
tids or ideally an array of terms. That way the {term_relation} entries
are immediately up to date rather than some cron time later. Since
taxonomy.module doesn't do much of anything with {term_relation} I
think this should be a hook for contribs rather than forced in. So I'd
suggest a hook right at the end of taxonomy_node_save() that passes an
array of tids. In this case, I think it would make sense to split
taxonomy_node_save() logically into 2 parts. Process free terms first
creating any that are not found. Add the found and created tids to
$terms. Then use the existing code to iterate through $terms and save
the  {term_node} entries.
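
The two points above, quoted terms containing commas and a post-save hook that receives the term ids, can be sketched together. This is a hedged Python illustration: the regular expression is not the one in the patch, and `HOOKS`, `parse_tags` and `save_tags_then_notify` are hypothetical names standing in for whatever hook mechanism would be used.

```python
import re

# Either a double-quoted tag (which may contain commas) or a bare run of
# characters containing neither a comma nor a quote.
TAG_RE = re.compile(r'"([^"]*)"|([^",]+)')

HOOKS = []  # hypothetical registry of contrib callbacks


def parse_tags(text):
    """Split a tags field on commas, honouring double-quoted terms."""
    found = (quoted or bare for quoted, bare in TAG_RE.findall(text))
    return [tag.strip() for tag in found if tag.strip()]


def save_tags_then_notify(nid, selected_tids, tag_text, term_ids, term_node):
    """Two-phase save: resolve free tags to ids, store, then call the hooks."""
    # Phase 1: look up each free tag, creating any term that is not found.
    tids = list(selected_tids)
    for name in parse_tags(tag_text):
        if name not in term_ids:
            term_ids[name] = max(term_ids.values(), default=0) + 1
        tids.append(term_ids[name])
    # Phase 2: store the node's associations exactly as for ordinary terms.
    term_node[nid] = sorted(set(tids))
    # Hook: hand the final list of ids to any registered contrib callback.
    for callback in HOOKS:
        callback(nid, term_node[nid])
    return term_node[nid]
```

A related-terms module could register a callback in `HOOKS` and update its own cache immediately, rather than waiting for cron.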


1) I'd like to see a hook so that contrib modules can add functionality
to the text input field for free terms. Specifically, adding one click,
suggested terms for this node.

2) In developing UI for navigating free terms, I frequently want to get
usage counts for a term, or to order queries on usage counts. This can
get slow where there are lots of joins. So where should term.count be
cached? A new field on {term_data} ?
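
One way to picture such a cached count is a per-term counter that is adjusted whenever a node's terms change, instead of being recomputed with joins. This is a hedged Python sketch; `apply_node_terms` and `most_used` are hypothetical names, and the dict stands in for whatever column or table would hold the count.

```python
def apply_node_terms(counts, old_tids, new_tids):
    """Adjust cached per-term usage counts when one node's terms change."""
    old, new = set(old_tids), set(new_tids)
    for tid in old - new:
        counts[tid] = max(0, counts.get(tid, 0) - 1)
    for tid in new - old:
        counts[tid] = counts.get(tid, 0) + 1
    return counts


def most_used(counts, limit):
    """Return up to limit term ids, highest cached count first."""
    return sorted(counts, key=lambda tid: (-counts[tid], tid))[:limit]
```

The trade-off is the usual one for denormalised counters: reads become cheap, but every code path that changes a node's terms has to keep the counter in step.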

Oh, and that's a big +1 Morbus, This is great stuff. I really want to
get it in so I can take it further.


April 2, 2005 - 07:16 : syscrusher

jbond wrote:

"The only hierarchical free tagging system I know of is Categories in
Wikipedia. del.icio.us does actually make it possible and some people
did try using tags like this "branch1/branch2/leaf" but there's no
specific UI to support it. My objection to trying to implement this now
in Drupal is that nobody knows what the UI should look like. If we use a
simple text field with comma delimited terms, how would the user
indicate that a term they enter is actually part of a tree?


How would the user indicate the tree? Easy -- just as you did. {grin}

Use slashes/backslashes, as I proposed in my earlier post on this
issue. There is no UI change required for the user at all, just
additional instructions that they're allowed to use the slashes to
indicate tree levels. Novice users can just ignore this feature; in
fact, it could easily be a role-based "security" feature along the
lines of input formats, where the "advanced user" role (for example)
gets the permission "create nested free tags" under the permissions for

It's a trivial documentation change from the user's perspective, and
within the code, just a little more regexp and array-splitting magic.
No need for complex UI form elements.



April 2, 2005 - 15:31 : factoryjoe

I will have some other comments about the hierarchy thing later, but I
wanted to pose a quick off-hand question (unrelated to the tree
issue!): if folksonomy gets added to core, would I be able to add tags
with an XML-RPC app like MarsEdit? Eventually I would really like to
have a desktop app that I can use to post to Drupal -- does the design
or implementation of this patch support this activity for the future?


April 2, 2005 - 16:40 : Morbus Iff

Regarding #33, the determination for when to "create term, then
associate" (as opposed to a normal, non-tagging vocabulary of
"associate") is whether a [tags] array is included in the data sent to
taxonomy_node_save. I know nothing about MarsEdit, only a tiny bit of
the Blogger API, and absolutely nothing about Drupal's XML-RPC
interface, but it would seem that the assumption is that the category
already exists in the backend. As such, I'm going to assume that this
patch does not support addition of new categories from the Blogger API.
Nor would I know how to do so.


April 3, 2005 - 05:25 : Bèr Kessels

my $0.02 on the hierarchies:
I want them! *I* will be able to use them.

We should not forget about two things when handling that usability

1) Drupal is not only used by pea-brained people. WE use them too, and
IMO we, as developers, *always* come on #1. scratch your own itch above
all.
2) If no-one does something, that is no reason not to do it. If no-one
(virtually) uses linux and a mac, that does not mean we should all be
working on a windows machine!

Remember, factoryjoe, and all the others: free tagging goes far beyond
the scope of that thing de.li.cio.us (or wherever they put these dots)

I, for example am testing to use it as a very simple
keywords/taxonomy-on the fly/quick-filer system on a weblog and on a
big photosite. Both do not use folksonomy in a community way (yet)!


April 3, 2005 - 07:37 : syscrusher

Bèr Kessels writes:
> Drupal is not only used by pea-brained people. WE use them too, and
> IMO we, as developers, *always* come on #1. scratch your own itch above
> all.

This is a point of vital importance, in my opinion. As long as a novice
user is not actively impeded, having advanced features valuable to
sophisticated users is a /good thing/, not a problem.

The analog is the use of a GUI for Linux, UNIX, or BSD operating
systems, which makes them more approachable to a novice -- a very good
thing. But few people would argue that we should remove the command
shell from Linux just because most novice users don't understand it.

We need to make Drupal accessible for beginners, true, but as Bèr
wisely points out, that need not mean "dumbing it down" for the rest of
us. If Drupal loses its appeal to advanced users and hard-core techies,
they will not only leave the user community but also the developer



April 3, 2005 - 17:07 : factoryjoe

It’s really very important to understand that my goal of improving
Drupal’s usability isn’t a process of “dumbing down” anything.
The goal is to help people get things done in clear, logical ways. If that
means simplifying a complex interface so that more people can get more
done faster, I will remove extraneous, forward-facing UI elements to
achieve that. I seriously have no interest in holding back developers
from getting what they want so long as it doesn’t come at a
significant mental effort cost to the rest of the potential
user-base—Drupal’s wider adoption depends on it.

With that said, and applying that approach to the issue at hand, I have
serious concerns about hierarchic folksonomies, especially with the
numerous syntaxes that have been suggested (slashes, colons, arrows,
etc). I do not believe, in other than a handful of cases, that you will
be able to design a syntax for folksonomic hierarchies that really makes
building said hierarchies easier, faster or more enjoyable. I do think
that you can kludge a solution on after the fact through a separate
module, like Morbus’ related terms module. Or perhaps you could
/infer/ a hierarchy from the structure of tags, but unless you can
provide a Google-suggest-like feature that helps you /build/ your
hierarchy at the time of tagging, you’re going to end up with a mess.

Consider the example of tagging a sports news story (as suggested [12]
by Jaza). Start with this:

sporting news->baseball

Now let’s add teams to this:

baseball->teams->“red sox”

And add an individual player:

baseball->teams->“red sox”->roster->“trott nixon”
baseball->players->“trott nixon”
“trott nixon”

Now here’s where this becomes unwieldy… How many layers of tags do
we need? Shouldn’t these “tags” come from a fixed vocabulary?
Isn’t there a reason why companies spend tons of money [13]
developing such taxonomies? The point is, if you start using
folksonomies for hierarchic organization, all you’re really achieving
is /convenience but not accuracy/. What if, for example, I ended up
tagging the story with these tags?

basebal->players->“mike piaza”
baseball->player->“trott nixon”

Though my intention was to have a unified hierarchy like so:

+--+  baseball
   +--+ players
      +-- trott nixon
      +-- mike piaza

I actually ended up with two completely different hierarchies because
of two simple typos (/basebal/ and /player/). A Google-suggest [14]
feature might have helped me avoid that mistake, but we’re not going
to be shipping with anything like that so we’re instead putting a
huge burden on taggers to remember the correct tags, their spelling and
in what order they should be applied.
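
To make the typo problem above concrete, here is a minimal PHP sketch. It
assumes a hypothetical helper, add_tag_path(), that files each "a->b->c"
path into a nested array; nothing like it is in the patch, and the "->"
separator is just one of the syntaxes suggested in this thread.

```php
<?php
// Hypothetical helper (not part of the patch): file one "a->b->c" tag path
// into a nested array, creating each level the first time it is seen.
function add_tag_path(array &$tree, string $path): void {
  $node = &$tree;
  foreach (explode('->', $path) as $part) {
    // Strip surrounding spaces and double quotes from each tag.
    $name = trim($part, " \"");
    if (!isset($node[$name])) {
      $node[$name] = [];
    }
    $node = &$node[$name];
  }
}

$tree = [];
add_tag_path($tree, 'basebal->players->"mike piaza"');
add_tag_path($tree, 'baseball->player->"trott nixon"');

// Two top-level branches instead of one shared "baseball" branch.
echo implode(', ', array_keys($tree)), "\n"; // prints: basebal, baseball
```

Because every level is matched by exact string, a single misspelling anywhere
in the path starts a new branch, which is the failure described above.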

Again, I don’t necessarily mind leaving in this feature for
power-users… I don’t think I’m going to “win” this discussion
anyway. But I really think that this problem is *much more* complicated
than it’s been made out to be so far. And even though we don’t have
to imitate everyone else out there, there may be a reason why others
have not attempted folksonomic hierarchies yet. I personally [15] use
hierarchies in del.icio.us (at least semantic pairings, like
person:boris_mann [16]) but I actually prefer that del.icio.us hasn’t
tried to dictate a syntax for this yet, preferring to offer an interface
for building ad-hoc lists of related tags.

I still say that this module should be as loose as possible out of the
box and that creating folksonomic vocabularies should be stupidly
simple. I might recommend Steve Krug’s book on this topic… Don’t
Make Me Think: A Common Sense Approach to Web Usability [17]

[12] http://drupal.org/node/19697#comment-25532
[13] http://www.hyperorg.com/blogger/mtarchive/003836.html
[14] http://www.google.com/webhp?complete=1&hl=en
[15] http://del.icio.us/factoryjoe/
[16] http://del.icio.us/factoryjoe/person:boris_mann


April 4, 2005 - 11:55 : jibbajabbaboy

I was asked to comment on this (sorry if it adds chatter). Problem with
the term "folksonomy" is that it implies hierarchy, when in actuality,
free tagging implies the application of discrete descriptions (e.g. of
one concept), not for hierarchical ones. That said, if people /can/ add
hierarchy into your term, they will. Likelihood of this happening is
probably less than 20% in any system I would argue. Probably less than
5% even.

You probably have several issues here: At the very least, you have
administrator's configuration of hierarchy and end-user's ability to
tag hierarchically. Free tagging is, in my opinion, a non-hierarchical
task. Things get put into hierarchies after they're first described.
Then they become folksonomies. The real problem is with finding ways to
deal with synonyms after the fact. I like Morbus' ideas about helping
the system find similar keywords. By the way, I haven't yet used/seen a
module that actually takes advantage of relationships in a taxonomy. I'd
love to see that aspect of controlled vocabularies be utilized in a
meaningful way here.

For an example of a free-tagging system that allows for post-tagging
hierarchy, see what James Spahr designed for the Pratt Talent site
[18]. The process is 1) let users enter free tags, 2) let administrator
drop tags into taxonomy, 3) use different methods of display, i.e. flat
lists and hierarchies.

How this gets used will largely depend on the user environment. In
multi-user environments, I agree that you want to keep the design as
simple as possible. I would target that group for the design.

1) Simplify this from the end-user perspective by making free tagging a
flat activity as much as possible (don't encourage hierarchy by default)
2) Consider ways to organize hierarchies as a function of
administration (after-the-fact classification).
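
The "tag flat first, classify later" workflow described above could look
roughly like the following PHP sketch. The storage format (tag name mapped to
a parent name, null for top level) and both function names are assumptions
made up for illustration; they are not how taxonomy.module stores terms.

```php
<?php
// Step 1 (assumed storage): users enter free tags, all stored flat.
// Each tag maps to its parent tag name; null means "top level".
$tags = ['baseball' => null, 'red sox' => null, 'players' => null];

// Step 2 (assumed admin action): file tags under a parent after the fact.
$tags['red sox'] = 'baseball';
$tags['players'] = 'baseball';

// Step 3a: flat display, ignoring any hierarchy.
function flat_list(array $tags): array {
  $names = array_keys($tags);
  sort($names);
  return $names;
}

// Step 3b: hierarchical display, indenting children under their parent.
// Assumes the parent links contain no cycles.
function hierarchy_lines(array $tags, ?string $parent = null, int $depth = 0): array {
  $lines = [];
  foreach ($tags as $name => $tag_parent) {
    if ($tag_parent === $parent) {
      $lines[] = str_repeat('  ', $depth) . $name;
      $lines = array_merge($lines, hierarchy_lines($tags, $name, $depth + 1));
    }
  }
  return $lines;
}

echo implode("\n", hierarchy_lines($tags)), "\n";
```

The end user only ever sees step 1; the hierarchy exists purely as an
administrative layer on top of the same tags.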

[18] http://designweenie.com/blog/index.php/1230


April 4, 2005 - 15:23 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_2.patch (20.16 KB)

I've attached a new patch:

per #6,1: removed "node" from all (new) public help text.
per #6,3: expanded the (new) documentation in help/taxonomy.
per #6,5: fixed the foreach spacing error.
per #6,6: added/revised code commenting.
per #31,1: added "Company, Inc." to Example on the input box.

This patch does not address any hierarchy (none of my patches for this
Issue will) or the pager() discussion. I will follow this comment
shortly with the exact same patch EXCEPT for an alternative means of
doing the pager() (as per comments #6,4 and #20). This alternative
means has its own pros/cons.


April 4, 2005 - 15:29 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_altpager.patch (20.07 KB)

I've attached an alternate patch that implements a slightly different
way of handling pager() pages for the "view terms" administration
screen. This alternative makes the display look exactly like the
previous patch, so no new screenshot is needed. Ultimately, this patch
reduces the lines of code needed, but at the expense of an extra SQL
query which we do nothing with (which is the prime CON of this patch,
complemented by a PRO that suggests the throwaway SQL is used on a
rarely-visited page anyways). Per my comment in #20:

"Dries: the big problem with the pager() stuff is vocabulary
hierarchies, which is what _get_tree gives us (along with additional
depth and parent attributes). I could probably reduce the hack's size
by using pager_query() with a throwaway SQL statement, but I'd also
have to throw away the returned database $result, since it wouldn't be
useful (no hierarchy information)."

Regarding features, workflow, and interface, this patch IS NO DIFFERENT
than the one in #39.


April 6, 2005 - 10:09 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_3.patch (20.95 KB)

Mmkay. Another patch based on more comments. No new features, just
fixes, usability, etc.

Dries: don't commit #40, the altpager version. It's broken. It does,
however, still have merit as "another way to handle pagers", but if you
want me to pursue it, I'll need to make ya a new patch. It won't,
however, be much "better" than the original pager hack, mainly because
I'll still have to handle the increment and "from" manually (which is
why #40 is broken - I forgot to handle "from").
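
For readers unfamiliar with what handling the increment and "from" manually
involves, here is a minimal PHP sketch of paging through an already-flattened
term list by hand. The function names, the sample terms, and the page size
are all hypothetical; this is not the code from either patch.

```php
<?php
// Hypothetical helper: return one page of an already-flattened term list.
// $from is the zero-based offset of the first term to show.
function page_of_terms(array $terms, int $from, int $per_page): array {
  $from = max(0, min($from, count($terms)));
  return array_slice($terms, $from, $per_page);
}

// Offset to use for the "next page" link, or null when there is none.
function next_from(array $terms, int $from, int $per_page): ?int {
  $next = $from + $per_page;
  return $next < count($terms) ? $next : null;
}

$terms = ['apple', 'banana', 'cherry', 'date', 'elderberry'];
$page = page_of_terms($terms, 2, 2);

echo implode(', ', $page), "\n"; // prints: cherry, date
```

Forgetting to read the offset at all would make every page show the first
terms again, regardless of which page link was followed.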

This patch's improvements:

per UnConeD: admin/taxonomy/# displayed same help as admin/taxonomy.
per UnConeD: admin/taxonomy/#/add/term showed "view terms" not "add
terms". Fixed.
per UnConeD/#18,2: admin/taxonomy shows "This is a free tagging
vocabulary: view terms."

Two other changes that are more substantial:

A recursion "bug" was found in taxonomy_get_tree that would cause the
function to be called once for every term in the desired $vid. This is
unnoticable for small vocabularies but, on a vocabulary with 8000
terms, caused any code that used that function to stall indefinitely.
Since this would/could happen with an 8000 term controlled vocabulary,
it was decided that this was a 4.6 bug, and UnConeD has already
committed the fix. [19]

admin/node shows a dropdown of terms in its filter dialogs. With 8000
terms, building this dropdown slowed the page down immensely (it only
worked at all after raising PHP's memory limit from the default 8M to
64M) and left the dropdown pretty well unusable.
node.module uses taxonomy_form_all to create this dropdown, and is the
only (core) module that makes use of that function. _form_all,
descended from _form, had a bunch of erroneous cut-and-paste parameters
from _form that were never used in the actual code. I've removed these
parameters and added a single new one, called $free_tags, that defaults
to 0. When 0, it won't return any free tag vocabulary data, which
removes the need to patch node.module (or any other contrib code that
uses taxonomy_form_all). Developers who want $free_tags displayed will
be able to pass 1 on their module_invoke.
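
The opt-in behaviour of $free_tags described above can be sketched as
follows. form_all_sketch() is a hypothetical stand-in written for this
example, not the real taxonomy_form_all, and the array layout of a vocabulary
is assumed; only the "tags" flag and the default of 0 come from the patch
description.

```php
<?php
// Hypothetical stand-in for the idea, NOT the real taxonomy_form_all().
// Each vocabulary is assumed to be an array with 'name', 'tags' (1 for a
// free tagging vocabulary, 0 otherwise), and 'terms'.
function form_all_sketch(array $vocabularies, int $free_tags = 0): array {
  $options = [];
  foreach ($vocabularies as $vocabulary) {
    // Skip free tagging vocabularies unless the caller opted in.
    if ($vocabulary['tags'] && !$free_tags) {
      continue;
    }
    $options[$vocabulary['name']] = $vocabulary['terms'];
  }
  return $options;
}

$vocabularies = [
  ['name' => 'Topics', 'tags' => 0, 'terms' => ['news', 'sports']],
  ['name' => 'Free tags', 'tags' => 1, 'terms' => ['red sox']],
];

// Existing callers pass nothing and never see the free tagging vocabulary.
echo implode(', ', array_keys(form_all_sketch($vocabularies))), "\n"; // prints: Topics
```

Defaulting to 0 is what keeps existing callers unchanged: they keep their
current call and simply stop receiving the potentially huge free tag lists.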



April 6, 2005 - 12:55 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_4.patch (21.78 KB)

I really really hope this is my last one for a while. This patch adds
the indenting back in (removed per comment #2, and sorta shown in this
screenshot [20]). Whereas the screenshot used non-breaking spaces for
indenting, this new code uses CSS, piggybacking off the indent used on
"Permissions" at admin/access. I've reduced the indent slightly after
lamenting to UnConeD about the 2em being too much - he suggested 1.5em
as the absolute minimum.

[20] http://disobey.com/detergent/2005/drupal_folkpager1.jpg


April 6, 2005 - 23:57 : jjeff

Very cool. Think I found a bug though...

Until the first 'free tags' were created, I was getting the following:
*Warning: Invalid argument supplied for foreach() in
/Users/jeff/Sites/drupal/modules/taxonomy.module on line 758*

that line is this:
 foreach ($children[$vid][$parent] as $child) { 

Once I created some tags, the error went away... and I am in free
tagging heaven!
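
A warning like this is what PHP emits when foreach is handed something that
is not an array, for example an index that was never set because no terms
exist yet. The PHP sketch below shows one common guard under that assumption.
collect_children() is a hypothetical function for illustration; it is not
the code in taxonomy.module, nor the fix that was actually applied.

```php
<?php
// Hypothetical illustration of guarding a foreach over a nested index that
// may not exist yet (e.g. a vocabulary with no terms).
function collect_children(array $children, int $vid, int $parent): array {
  $found = [];
  // empty() is safe on missing keys, so no warning is raised when the
  // vocabulary or parent has no entry.
  if (!empty($children[$vid][$parent])) {
    foreach ($children[$vid][$parent] as $child) {
      $found[] = $child;
    }
  }
  return $found;
}

// No terms created yet: returns an empty list instead of warning.
var_dump(collect_children([], 1, 0) === []); // prints: bool(true)
```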



April 8, 2005 - 09:20 : Morbus Iff

Attachment: http://drupal.org/files/issues/taxonomy_all_5.patch (21.66 KB)

#43 has been previously addressed.

Final attached patch per Dries' comments in #drupal.


April 8, 2005 - 09:59 : Dries

Committed to HEAD.  Great job!
