[drupal-devel] Folksonomy: No taxonomy.module Changes / URL Design
Ok, apparently. I'm starting on this "now", even more sooner than before. I've named it folksonomy_shared so as not to interfere with the existing folksonomy.module API by Moshe. I've dubbed it "shared" as it is the key term that distinguishes it from the future direction that Moshe was thinking (per-user-vocabularies). I confess to not really knowing the direction that jbond wants to head in. And, unlike awTags (which, at face value, seems to do most everything I need), folksonomy_shared tries to integrate as tightly as possible with the existing Drupal taxonomy. This gives me the benefit of RSS feeds (an absolute requirement), as well as using synonyms to store similar keywords.

 * I've started a sandbox for this:
   sandbox/morbus/folksonomy_shared/

 * Only one database change is needed: adding a uid to term_node.
   http://lists.drupal.org/archives/drupal-devel/2005-03/msg00422.html

This morning, I took a look at all the places in taxonomy.module that used term_node, so as to properly handle the new database addition above. At first glance, I don't see /any/ place that requires changing:

 * The most important thing that distinguishes a folksonomy from a taxonomy ISN'T the I of UI, it's the U. As such, when looking over term_node mentions in taxonomy.module, there were no places that I felt REQUIRED the addition or understanding of the uid. For a "taxonomy" (or, really, a "controlled vocab"), the uid is irrelevant. All normal taxonomies would log the uid as 0, with no change to the existing code.

 * taxonomy_node_get_terms_by_vocabulary "finds all terms associated to the given node, within one vocabulary." This would seem to be the most immediate and obvious place to include the uid code ("here are all the terms applied to this node, and here are the people who applied them"). But doing so here would be a bad idea: it'd unnecessarily complicate the code (especially when you think about the $key), and ultimately overload the meaning of the function name. Same with the regular _get_terms.

 * taxonomy_node_save doesn't need to be modified. Without the uid specified, it'll default to 0, which is a proper value for non-folksonomies. folksonomy_shared would save its values with folksonomy_shared_node_save on a nodeapi 'save' hook.

 * taxonomy_term_count_nodes wouldn't need to be touched, because the number of nodes with a term cares little for uid assignment.

 * taxonomy_select_nodes would be handled with a folksonomy equivalent version which would allow tids and uids to be passed as params.

One of the next usability issues I've been thinking of is URLs, and I've been having some conceptual issues of how to get them right. Your comments will be heartily appreciated. The following are known possibilities:

   drupal.org/taxonomy/term/34
   drupal.org/taxonomy/term/$tid

   del.icio.us/jagbot
   del.icio.us/jagbot/origami
   del.icio.us/origami
   del.icio.us/$uname
   del.icio.us/$uname/$term
   del.icio.us/tag/$term

   autowitch.org/tags/861
   autowitch.org/usertags/1
   autowitch.org/tags/$tid
   autowitch.org/usertags/$uid

I heartily agree with jbond: the URL should use $term, not $tid. The problem is that it becomes an either/or situation. Drupal prefers using the $tid (see the first example above), but there's no way to distinguish a $term of "1984" or "666" or "101" from the $tids of the same value. So, using $term is arguably "more" folksonomish, but breaks Drupal tradition.

The current format I'm thinking of implementing is as follows:

   /folksonomy/1             # show all tags in vid 1
   /folksonomy/1/term/hello  # show all tags in vid 1 for "hello"
   /folksonomy/1/term/1984   # show all tags in vid 1 for "1984"
   /folksonomy/term/hello    # show all tags in any vid for "hello"
                             # see below for my concerns on this one.

The problem with the above is two-fold:

 * if we want to support multiple folksonomies, the $vid needs to be part of the URL. Otherwise, there'd be no way to distinguish which "hello" from which $vid we'd like to look at. The current taxonomy doesn't do this because we always have the unique $tid. But, if we're going to put $term in the URL, we need a further bit of information to properly assert it. Using the path module would have a similar problem.

 * if we're using $term and not $vid, why not $vname?

As for users, I'm leaning toward:

   /folksonomy/user/41/1             # show all vid 1 tags for user 41
   /folksonomy/user/41/1/term/hello  # and so on and so forth. I feel
   /folksonomy/user/41/1/term/1984   # like these URLs are horrendous.
   /folksonomy/user/41/term/hello

The most immediate problem is, again, the lack of standards. If we're using $term and not $tid, why shouldn't we use $username and not $uid? The URLs become even worse and sometimes broked:

   /folksonomy/user/morbus/1
   /folksonomy/user/rogue_githyanki/1/term/hello
   /folksonomy/user/killes@www.drop.org/1/term/1984
   /folksonomy/user/ohmygodicantdecidemyusername/term/hello

Help! This design is the next big hurdle in my head.

-- 
Morbus Iff ( you are nothing without your robot car, NOTHING! )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
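The proposed /folksonomy/... layout can be sketched as a small parser. This is a standalone Python illustration, not Drupal code: `FolksonomyQuery` and `parse_folksonomy_path` are hypothetical names, and the sketch assumes numeric $uid and $vid segments as in the examples above. It shows how the literal "term" segment keeps a numeric tag such as "1984" from being mistaken for an id.

```python
from typing import NamedTuple, Optional


class FolksonomyQuery(NamedTuple):
    """Parsed form of a /folksonomy/... path (hypothetical structure)."""
    uid: Optional[int]   # user filter, or None for all users
    vid: Optional[int]   # vocabulary filter, or None for any vocabulary
    term: Optional[str]  # tag name, always kept as a string


def parse_folksonomy_path(path: str) -> FolksonomyQuery:
    """Parse the URL layout proposed in the message above."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "folksonomy":
        raise ValueError(f"not a folksonomy path: {path!r}")
    parts = parts[1:]

    uid = vid = term = None

    # Optional "user/$uid" prefix.
    if parts[:1] == ["user"]:
        if len(parts) < 2 or not parts[1].isdigit():
            raise ValueError(f"expected a numeric uid in {path!r}")
        uid = int(parts[1])
        parts = parts[2:]

    # Optional "$vid" segment; anything before "term" must be numeric.
    if parts and parts[0] != "term":
        if not parts[0].isdigit():
            raise ValueError(f"expected a numeric vid in {path!r}")
        vid = int(parts[0])
        parts = parts[1:]

    # Optional "term/$term" suffix; the value is never converted to a number.
    if parts:
        if parts[0] != "term" or len(parts) != 2:
            raise ValueError(f"expected 'term/<name>' in {path!r}")
        term = parts[1]

    return FolksonomyQuery(uid, vid, term)


if __name__ == "__main__":
    for example in ("/folksonomy/1/term/1984", "/folksonomy/user/41/term/hello"):
        print(example, "->", parse_folksonomy_path(example))
```

Because the tag is only ever read from the segment after "term", the parser never has to guess whether "1984" is a $tid. Swapping $uid for $username would need a different rule for the user segment, which is where the "broked" examples come from.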
This morning, I took a look at all the places in taxonomy.module that used term_node, so as to properly handle the new database addition above. At first glance, I don't see /any/ place that requires changing:
In chatting with killes on IRC, there is one place. Every time a node with taxo is saved, all term_node data for that nid is deleted:

   db_query('DELETE FROM {term_node} WHERE nid = %d', $nid);

This would also obliterate all of the folksonomy data (which wouldn't be in a taxonomy input field, and thus nothing ever passed in as $terms to taxonomy_node_save()). It would be entirely possible to do a dirty ugly evil hack and name the module zzz_folksonomy_shared, but god, strike me down for even suggesting the idea.

The quick and immediate fix is:

   db_query('DELETE FROM {term_node} WHERE nid = %d AND uid = 0', $nid);

Since a normal taxonomy would receive all relations as uid 0, and a folksonomy would save proper folksonomy tags with the proper uid, the above would work.

But, I'm now fighting back and forth mentally about how I should move forward with this. One of my design decisions was "as tightly bound to taxonomy as possible." The above would be a required patch to core. Besides the previous SQL change, the minuscule size of the above change visually lessens its importance. It's easy to forget the reasoning WHY it is necessary, especially since it'd be the only uid mention in the entire taxonomy module.

On the other hand, I could give up the tight binding, and reproduce term_node as folksonomy_term_node. Unfortunately, that'd mean I'd have to reproduce a lot of taxonomy retrieval code in the folksonomy_shared module.

I'm not sure which direction to go here.

-- 
Morbus Iff
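The effect of the two DELETE statements can be sketched outside Drupal. This is a minimal Python/sqlite3 illustration, assuming a simplified term_node table with only nid, tid and the proposed uid column; `make_db`, `save_taxonomy_terms` and `rows` are hypothetical helpers, not functions from taxonomy.module.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a simplified term_node table with the proposed uid column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term_node ("
        " nid INTEGER NOT NULL,"
        " tid INTEGER NOT NULL,"
        " uid INTEGER NOT NULL DEFAULT 0)"
    )
    # Node 10 has one controlled-vocabulary term (uid 0) and two user tags.
    conn.executemany(
        "INSERT INTO term_node (nid, tid, uid) VALUES (?, ?, ?)",
        [(10, 1, 0), (10, 7, 41), (10, 8, 42)],
    )
    return conn


def save_taxonomy_terms(conn, nid, tids, preserve_user_tags):
    """Mimic the delete-then-insert pattern described in the message."""
    if preserve_user_tags:
        # Proposed fix: only clear the rows a normal taxonomy owns.
        conn.execute("DELETE FROM term_node WHERE nid = ? AND uid = 0", (nid,))
    else:
        # Current behaviour: clear every row for the node.
        conn.execute("DELETE FROM term_node WHERE nid = ?", (nid,))
    # uid is omitted on purpose, so it falls back to the default of 0.
    conn.executemany(
        "INSERT INTO term_node (nid, tid) VALUES (?, ?)",
        [(nid, tid) for tid in tids],
    )


def rows(conn, nid):
    """Return (tid, uid) pairs for a node in a stable order."""
    query = "SELECT tid, uid FROM term_node WHERE nid = ? ORDER BY tid, uid"
    return conn.execute(query, (nid,)).fetchall()


if __name__ == "__main__":
    for preserve in (False, True):
        db = make_db()
        save_taxonomy_terms(db, 10, [2], preserve_user_tags=preserve)
        print("preserve_user_tags =", preserve, "->", rows(db, 10))
```

With the unrestricted DELETE, the tags from users 41 and 42 disappear on every save; with `AND uid = 0` they survive, and only the controlled-vocabulary rows are replaced.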
Morbus Iff wrote:
db_query('DELETE FROM {term_node} WHERE nid = %d AND uid = 0', $nid);
Since a normal taxonomy would receive all relations as uid 0, and a folksonomy would save proper folksonomy tags with the proper uid, the above would work.
But, I'm now fighting back and forth mentally about how I should move forward with this. One of my design decisions was "as tightly bound to taxonomy as possible." The above would be a required patch to core. Besides the previous SQL change, the minuscule size of the above change visually lessens its importance. It's easy to forget the reasoning WHY it is necessary, especially since it'd be the only uid mention in the entire taxonomy module.
I lean towards staying tightly bound with taxonomy and making the change to core. A comment saying why this is the only uid mention should be sufficient for anyone modifying core in the future. I see a lot of advantages to staying tightly bound with taxonomy. The only disadvantage I worry about is ultimate taxo/folkso performance when both are in heavy use on a site. And that's just a knee-jerk reaction; have not looked at the code and SQL. -- Chris Johnson
Chris Johnson <chris@tinpixel.com> Thu, 10 Mar 2005 18:56:49
I see a lot of advantages to staying tightly bound with taxonomy. The only disadvantage I worry about is ultimate taxo/folkso performance when both are in heavy use on a site. And that's just a knee-jerk reaction; have not looked at the code and SQL.
The nasty SQL is:

 - Get all terms ordered by usage count desc for this vocabulary. This is "Popular" terms and might be further limited by user.

 - Get all terms ordered by usage count desc for this vocabulary that appear alongside this term on nodes. This is related terms for this term.

I think the rest of it is not too bad, even with a few thousand terms on a few tens of thousands of nodes. I suspect that some work ought to be done on term_data and term_node indexes.

-- 
Julian Bond
Email&MSM: julian.bond at voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433
S: callto://julian.bond/
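The two queries described above might look roughly like the following. This is a hedged Python/sqlite3 sketch, assuming simplified term_data (tid, vid, name) and term_node (nid, tid) tables; the function names and the index are illustrative guesses, not taken from any of the modules discussed.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create simplified term_data / term_node tables with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT);
        CREATE TABLE term_node (nid INTEGER NOT NULL, tid INTEGER NOT NULL);
        -- Assumed index: both queries filter or join on term_node.tid.
        CREATE INDEX term_node_tid ON term_node (tid);
        """
    )
    conn.executemany(
        "INSERT INTO term_data (tid, vid, name) VALUES (?, ?, ?)",
        [(1, 1, "drupal"), (2, 1, "php"), (3, 1, "origami")],
    )
    conn.executemany(
        "INSERT INTO term_node (nid, tid) VALUES (?, ?)",
        [(10, 1), (10, 2), (11, 1), (11, 2), (12, 1), (12, 3), (13, 3)],
    )
    return conn


def popular_terms(conn, vid):
    """All terms in a vocabulary, ordered by usage count descending."""
    return conn.execute(
        """
        SELECT d.name, COUNT(*) AS uses
          FROM term_node n
          JOIN term_data d ON d.tid = n.tid
         WHERE d.vid = ?
         GROUP BY d.tid, d.name
         ORDER BY uses DESC, d.name ASC
        """,
        (vid,),
    ).fetchall()


def related_terms(conn, vid, tid):
    """Terms in a vocabulary that share at least one node with `tid`."""
    return conn.execute(
        """
        SELECT d.name, COUNT(*) AS uses
          FROM term_node a
          JOIN term_node b ON b.nid = a.nid AND b.tid <> a.tid
          JOIN term_data d ON d.tid = b.tid
         WHERE a.tid = ? AND d.vid = ?
         GROUP BY d.tid, d.name
         ORDER BY uses DESC, d.name ASC
        """,
        (tid, vid),
    ).fetchall()


if __name__ == "__main__":
    db = make_db()
    print("popular:", popular_terms(db, 1))
    print("related to 'drupal':", related_terms(db, 1, 1))
```

The related-terms query is a self-join on term_node, which is probably the part that would benefit most from the index work mentioned above. Limiting "Popular" by user would need either the proposed term_node.uid column or a join to node.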
I'm swinging back and forth on the game plan here.

The first approach is to use vocab and term tables and start hacking patches into taxonomy. This scares me as I don't know enough about taxonomy and how it's used by core and contrib modules.

The second approach is to use vocab but create parallel term tables. This only requires a single small patch to taxonomy to stop it generating form edit fields for modules that are not taxonomy or forum. This lets us keep folksonomy.module as a separate and usable contrib module and thrash out user interface and requirements without screwing up taxonomy in the process. The long term plan would still be to roll this back into taxonomy once we know exactly how it should work.

-- 
Julian Bond
Morbus Iff <morbus@disobey.com> Thu, 10 Mar 2005 15:36:06
In chatting with killes on IRC, there is one place. Every time a node with taxo is saved, all term_node data for that nid is deleted:
db_query('DELETE FROM {term_node} WHERE nid = %d', $nid);
This would also obliterate all of the folksonomy data (which wouldn't be in a taxonomy input field, and thus nothing ever passed in as $terms to taxonomy_node_save()). It would be entirely possible to do a dirty ugly evil hack and name the module zzz_folksonomy_shared, but god, strike me down for even suggesting the idea.
I hit exactly the same problem. My equally nasty hack is to use hook_nodeapi('update') to merge the folksonomy tids into the taxonomy tids and let taxonomy save them. This relies on F coming before T in module_invoke_all. Ugh!

-- 
Julian Bond
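Why this hack depends on module names can be shown with a toy model. The following is a plain Python simulation, not Drupal code: it assumes, as the message implies, that hooks are invoked in alphabetical order by module name, and every function and dictionary key in it is a hypothetical stand-in.

```python
def new_node():
    """A toy node: one taxonomy tid from the form, two stored user tags."""
    return {"taxonomy": [1], "folksonomy": [7, 8], "saved_tids": []}


def folksonomy_update(node):
    """Merge the folksonomy tids into the list taxonomy is about to save."""
    node["taxonomy"] = sorted(set(node["taxonomy"]) | set(node["folksonomy"]))


def taxonomy_update(node):
    """Stand-in for the taxonomy save: replace all stored tids for the node."""
    node["saved_tids"] = list(node["taxonomy"])


def invoke_all(hooks, node):
    """Call every module's hook in alphabetical order of module name."""
    for module_name in sorted(hooks):
        hooks[module_name](node)
    return node


if __name__ == "__main__":
    good = invoke_all(
        {"taxonomy": taxonomy_update, "folksonomy": folksonomy_update}, new_node()
    )
    print("folksonomy before taxonomy:", good["saved_tids"])

    bad = invoke_all(
        {"taxonomy": taxonomy_update, "zzz_folksonomy": folksonomy_update}, new_node()
    )
    print("folksonomy after taxonomy: ", bad["saved_tids"])
```

In this model the merge only works while the folksonomy module sorts ahead of taxonomy; rename it so it sorts later and the user tags are silently dropped from what gets saved.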
Ooops. I missed this discussion. Please see
http://www.voidstar.com/node.php?id=2262
and
http://drupal.org/node/18826

I think I'd better read up about sandbox. ;-)

Morbus Iff <morbus@disobey.com> Thu, 10 Mar 2005 14:06:06
I confess to not really knowing the direction that jbond wants to head in.
Very similar to awtags.
* I've started a sandbox for this: sandbox/morbus/folksonomy_shared/ * Only one database change is needed: adding a uid to term_node. http://lists.drupal.org/archives/drupal-devel/2005-03/msg00422.html
Are you sure you need uid on term_node? It's inevitably available with a join to node. There's a slight conceptual difference in that term_node.uid would be all the terms I've created whereas term_node join node.uid is all the terms I've used.
I heartily agree with jbond: the URL should use $term, not $tid. The problem is that it becomes an either/or situation. Drupal prefers using the $tid (see the first example above), but there's no way to distinguish a $term of "1984" or "666" or "101" from the $tids of the same value. So, using $term is arguably "more" folksonomish, but breaks Drupal tradition.
The current format I'm thinking of implementing is as follows:
/folksonomy/1 # show all tags in vid 1 /folksonomy/1/term/hello # show all tags in vid 1 for "hello" /folksonomy/1/term/1984 # show all tags in vid 1 for "1984" /folksonomy/term/hello # show all tags in any vid for "hello" # see below for my concerns on this one.
   taxonomy/tname/hello          any tag from any vocab named "hello"
   taxonomy/tvname/mytags/hello  hello tag from vocab mytags
   taxonomy/tvname/mytags        anything tagged with any term from vocab mytags

-- 
Julian Bond
On Sun, 13 Mar 2005, Julian Bond wrote:
Morbus Iff <morbus@disobey.com> Thu, 10 Mar 2005 14:06:06
* I've started a sandbox for this: sandbox/morbus/folksonomy_shared/ * Only one database change is needed: adding a uid to term_node. http://lists.drupal.org/archives/drupal-devel/2005-03/msg00422.html
Are you sure you need uid on term_node? It's inevitably available with a join to node.
Morbus took some time to explain this to me. He wants users to be able to tag any content that they might want to tag, or at least he wants admins to be able to enable that functionality. That is, also nodes they haven't authored. That's why he needs (tid, nid, uid) in term_node.
There's a slight conceptual difference in that term_node.uid would be all the terms I've created
"all the terms I've tagged a node with". The terms I've created would be term_data.uid.
whereas term_node join node.uid is all the terms I've used.
... for nodes I've created myself. Cheers, Gerhard
The terms I've created would be term_data.uid.
In the latest schema change I've proposed, there is nowhere to catalog who has created a term, only who has used them. I talk a bit about not tracking ownership here, and the only comment I received was a +1. http://lists.drupal.org/archives/drupal-devel/2005-03/msg00421.html -- Morbus Iff ( accept no prostitutes ) Technical: http://www.oreillynet.com/pub/au/779 Culture: http://www.disobey.com/ and http://www.gamegrene.com/ icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
Are you sure you need uid on term_node? It's inevitably available with a join to node. There's a slight conceptual difference in that term_node.uid would be all the terms I've created whereas term_node join node.uid is all the terms I've used.
I had a discussion very similar to this with killes, before you showed up in IRC. Depending on node.uid doesn't give us the flexibility to (smartly) create a delicious-style system.

 * delicious presumably stores a URL only once.

 * when someone adds that URL for them, delicious makes a relation between the existing URL and the existing uid.

 * now, think of delicious in drupal.

   * user creates a node representing a URL.
   * his node.uid asserts he created it.
   * second user now wants to assert something about that URL.
   * since there's no place to do it (as node.uid is already used), a second, duplicate node must now be created. 50 users asserting different things about 1 URL means 50 duplicate nodes.

 * with term_node.uid, you're able to store two pieces of information: the user who originally created the node (node.uid), and the user that is asserting tids associated with the node. this stops data duplication, but also opens up the folksonomy to allow all users to assert something about anything, whether they created it or not.

-- 
Morbus Iff
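The difference between term_node.uid and a join to node.uid can be made concrete with a small example. This is a standalone Python/sqlite3 sketch, assuming heavily simplified node and term_node tables; `terms_applied_by` and `terms_on_nodes_authored_by` are hypothetical names chosen for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """One node authored by user 1, tagged by both user 1 and user 2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER, title TEXT);
        CREATE TABLE term_node (
            nid INTEGER NOT NULL,
            tid INTEGER NOT NULL,
            uid INTEGER NOT NULL DEFAULT 0
        );
        """
    )
    conn.execute("INSERT INTO node (nid, uid, title) VALUES (10, 1, 'a URL')")
    conn.executemany(
        "INSERT INTO term_node (nid, tid, uid) VALUES (?, ?, ?)",
        [(10, 5, 1), (10, 5, 2), (10, 6, 2)],
    )
    return conn


def terms_applied_by(conn, uid):
    """tids this user has tagged any node with (uses term_node.uid)."""
    query = "SELECT DISTINCT tid FROM term_node WHERE uid = ? ORDER BY tid"
    return [row[0] for row in conn.execute(query, (uid,))]


def terms_on_nodes_authored_by(conn, uid):
    """tids found on nodes this user authored (join to node.uid)."""
    query = (
        "SELECT DISTINCT tn.tid FROM term_node tn"
        " JOIN node n ON n.nid = tn.nid"
        " WHERE n.uid = ? ORDER BY tn.tid"
    )
    return [row[0] for row in conn.execute(query, (uid,))]


if __name__ == "__main__":
    db = make_db()
    for user in (1, 2):
        print("user", user, "applied:", terms_applied_by(db, user))
        print("user", user, "authored nodes carry:", terms_on_nodes_authored_by(db, user))
```

User 2 never authored a node, so the join to node.uid returns nothing for them, while term_node.uid still records both of their tags against the single shared node. The join also credits user 1 with tid 6, which only user 2 applied.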
Morbus Iff <morbus@disobey.com> Sun, 13 Mar 2005 09:50:58
* with term_node.uid, you're able to store two pieces of information: the user who originally created the node (node.uid), and the user that is asserting tids associated with the node. this stops data duplication, but also opens up the folksonomy to allow all users to assert something about anything, whether they created it or not.
I've been shying away from this because it introduces a whole new set of issues that I don't think the community of folksonomy commentators has really addressed yet. (grin).

In a wiki style I can quite imagine a group of people editing the node terms, but to make that really work, you need to keep a history so changes can be undone. If you let anyone tag a node, then I think you get into areas where some people like "Author" count for more than other "commentators".

There again maybe there's no problem here. Get the data-model right and the solution may just fall out.

-- 
Julian Bond
Morbus Iff <morbus@disobey.com> Sun, 13 Mar 2005 09:50:58
* with term_node.uid, you're able to store two pieces of information: the user who originally created the node (node.uid), and the user that is asserting tids associated with the node. this stops data duplication, but also opens up the folksonomy to allow all users to assert something about anything, whether they created it or not.
Do you have a picture in your head of how this might look? Is there a much cut-down edit page for people who don't have full edit rights?

-- 
Julian Bond
* with term_node.uid, you're able to store two pieces of information: the user who originally created the node (node.uid), and the user that is asserting tids associated with the node. this stops data duplication, but also opens up the folksonomy to allow all users to assert something about anything, whether they created it or not.
Do you have a picture in your head of how this might look? Is there a much cut down edit page for people who don't have full edit rights?
I don't, no. I was just thinking ahead regarding data storage and the most flexible (while at the same time being "easy") approach, but have applied no thought at all to the UI for the possible feature.

-- 
Morbus Iff
http://drupal.org/node/18826
http://www.voidstar.com/node.php?id=2262

Here's the short version of the readme.

Overview

This set consists of three items:

 - taxonomy.module.patch
   A patch to taxonomy to stop it generating form elements for vocabularies that are from modules taxonomy or forum.

 - folksonomy.module
   A module to provide folksonomy support to contrib modules. Requires taxonomy.

 - folkterms.module
   A module that demonstrates adding folksonomies to nodes using folksonomy.module.

Approach

 - Uses taxonomy vocabulary and term tables to store all data.
 - Creates term entries on the fly.

Game Plan

I think this is the right long term approach. So the sequence is:

 1) Build proof of concept folksonomy.modules
 2) Hack patches into taxonomy as we need them
 3) Merge the finished folksonomy.module back into taxonomy

-------------------------------------

I've stopped working on this because I keep needing changes to taxonomy to go further. So I'm back to the other game plan.

 1) Build proof of concept folksonomy.modules
 2) Ignore Taxo for the moment and use separate folksonomy_term tables
 3) Once the Proof of concept is far enough along go back to Taxonomy and work out what needs to change
 4) Merge the finished folksonomy.module back into taxonomy

-- 
Julian Bond
participants (4): Chris Johnson, Gerhard Killesreiter, Julian Bond, Morbus Iff