[drupal-devel] Schema Changes for a Shared Taxonomy OTF/Folksonomy
I'll be working on this folksonomy thing a lot quicker than I expected. I'm not sure if that's a good thing or a bad thing, sadly - there seem to be four (mine, moshe's, jbond's, and atf's) current thoughts on how it should work. Anyways, I was doing some thinkwork on "my vision" today which, to recap, includes:

* multiple folksonomies ("realms" per jbond; "vocabulary" per Drupal).
* tags are unique to a unique folksonomy (but, see below).
* users share tags per a unique folksonomy/vocabulary.

(note, however, that the admin could create two folksonomies, assign them to different node types, or even to the same type, etc., and "funny" could be a unique value in two different folksonomies.)

With that said, it appears that two taxonomy tables would change:

* The first would be "term_data". It would include a new "uid" field which would be the uid of the user who created the term. In vocabularies that were not folksonomies, uid would be 0.
* The second would be "term_node", which would need a new "uid" column as well. This would represent the uid that assigned the tid to the nid. This is the change that concerns me:
  * instead of just one tid (3)/nid (5) relation in a normal vocabulary, this change would potentially allow an infinite number, as multiple uids apply the same tid to the same nid.
  * theoretically, if the tid's vid (this is fun to say!) is not a folksonomy, then the uid would be 0, as before. but, I can see the uid being useful regardless. If multiple admins are editing a nid, then it could be quite useful to see that uid (45) added tid (12), and then, a day later, uid (23) decided that tid (15) was equally relevant.

With the above changes, the following would be possible:

* a "my users" page would be easily implemented by just grabbing all term_node rows that match the $uid (presumably, we wouldn't need a join here as the $user object would be loaded normally).
* ownership of tags is possible by checking for the uid of the tid in the term_data table.
* All relations/synonyms would utilize the same old code.
* We don't have to rely on the taxonomy UI, because the vocabulary.module would be "folksonomy". In 4.6, it'd be hidden in the standard "categories" menu, and we can focus on just the UI and added features in folksonomy.module.

Thoughts?

--
Morbus Iff ( you are nothing without your robot car, NOTHING! )
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
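
For readers following along, the two proposed "uid" columns can be sketched roughly as below. This is an illustration only, not Drupal's real schema: the tables are cut down to the columns named in the message, SQLite stands in for the site database, and the composite primary key, the sample rows, and the helper names (build_demo, nodes_tagged_by, term_owner) are assumptions made for the example.

```python
import sqlite3

# Simplified stand-ins for the two tables; only the columns discussed
# in the thread are modelled. The uid columns are the proposed additions.
SCHEMA = """
CREATE TABLE term_data (
    tid  INTEGER PRIMARY KEY,
    vid  INTEGER NOT NULL,
    name TEXT    NOT NULL,
    uid  INTEGER NOT NULL DEFAULT 0  -- proposed: creator; 0 outside folksonomies
);
CREATE TABLE term_node (
    nid INTEGER NOT NULL,
    tid INTEGER NOT NULL,
    uid INTEGER NOT NULL DEFAULT 0,  -- proposed: who applied tid to nid
    PRIMARY KEY (nid, tid, uid)      -- assumption: one row per user per pairing
);
"""


def build_demo():
    """Create an in-memory database with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO term_data (tid, vid, name, uid) VALUES (3, 1, 'funny', 45)"
    )
    # Two users apply the same tid to the same nid: two rows instead of one.
    conn.executemany(
        "INSERT INTO term_node (nid, tid, uid) VALUES (?, ?, ?)",
        [(5, 3, 45), (5, 3, 23), (7, 3, 45)],
    )
    return conn


def nodes_tagged_by(conn, uid):
    """Per-user listing: every (nid, tid) pairing this uid made, no join needed."""
    rows = conn.execute(
        "SELECT nid, tid FROM term_node WHERE uid = ? ORDER BY nid, tid", (uid,)
    )
    return rows.fetchall()


def term_owner(conn, tid):
    """Tag ownership: the uid stored on the term itself, or None if unknown."""
    row = conn.execute("SELECT uid FROM term_data WHERE tid = ?", (tid,)).fetchone()
    return row[0] if row else None
```

The row growth that the message worries about shows up directly here: the same tid/nid pairing appears once per user who applied it.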
I slept on this. I'm wondering if the following change:
* The first would be "term_data". It would include a new "uid" field which would be the uid of the user who created the term. In vocabularies that were not folksonomies, uid would be 0.
is even necessary. Is it important to keep track of who created a tid?

* in non-folksonomy cases, typically only admins control the creation of tids. generally speaking, there are never so many admins that there'd be confusion over who created a term - an inquiring admin can just ask around and find out the culprit.
* knowing who created a tid (folksonomy or not) seems only important when it becomes a negative issue: someone made "dumbshit" and you want to know who to yell at. here, you'd have to assume:
  * other users would find no use for "dumbshit", reducing its use so that the number of uids using the tid (via term_node, see previous email) would be small enough to ask around "manually".
  * if "dumbshit" is a bad tag, /any/ uid using it is at fault - it's less an issue of who created the bad tag than of who is actually /using/ the bad tag.

Thoughts? These questions apply to all the folksonomy implementations.

--
Morbus Iff ( if god is in me, he is a tumor )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
Morbus Iff wrote:
I slept on this. I'm wondering if the following change:
* The first would be "term_data". It would include a new "uid" field which would be the uid of the user who created the term. In vocabularies that were not folksonomies, uid would be 0.
is even necessary. Is it important to keep track of who created a tid?
* in non-folksonomy cases, typically only admins control the creation of tids. generally speaking, there are never "more than enough admins" to cause confusion over who created a term - an inquiring admin can just ask around and find out the culprit.
* knowing who created a tid (folksonomy or not) seems only important when it becomes a negative issue: someone made "dumbshit" and you want to know who to yell at. here, you'd have to assume:
* other users would find no use for "dumbshit", reducing its use so that the number of uids using the tid (via term_node, see previous email) would be small enough to ask around "manually".
* if "dumbshit" is a bad tag, /any/ uid using it is at fault - it's less an issue of who created the bad tag than of who is actually /using/ the bad tag.
Thoughts? These questions apply to all the folksonomy implementations.
I agree with this post completely. I don't imagine that del.icio.us records who created each term, just who's using which term for what content. The whole thing with a folksonomy is that it's not 'controlled'; good terms float to the top because of popularity. I don't think it's a great idea to mix 'controlled' vocabularies with 'uncontrolled' ones, as they are completely different approaches to organising content. If it's possible to get them working alongside each other then great.
Ross Kendall <drupal@rosskendall.com> Thu, 10 Mar 2005 12:54:29
The whole thing with a folksonomy is that it's not 'controlled'; good terms float to the top because of popularity. I don't think it's a great idea to mix 'controlled' vocabularies with 'uncontrolled' ones, as they are completely different approaches to organising content. If it's possible to get them working alongside each other then great.
Two questions arise.

- What should happen to terms that are no longer used anywhere? Should there be automatic garbage collection or perhaps Admin pruning?
- There's a fundamental difference between doing folksonomy within Drupal and on del.icio.us. Probably, only the author (and admins) can apply terms to a node, whereas lots of people can apply terms to a bookmark. So bad terms don't get submerged in quite the same way as in del.icio.us.

There's a potential issue here with "tag spam". A malicious user might tag a node with all the most popular terms. That node will then appear (perhaps briefly) at the top in the pages for all those terms. There are quite a few scenarios where this will encourage the stupid to repost with thousands of tags repeatedly to keep their node at the top of every list.

My solution to this is an admin variable, "Max terms per post per vocab", of say 10, combined with spam controls to prevent one or more users from creating identical or nearly identical nodes.

--
Julian Bond
Email&MSM: julian.bond at voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433
S: callto://julian.bond/
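
As a rough illustration of the "Max terms per post per vocab" idea, the check could look something like the sketch below. The constant, the function name, and the input shape (one vocabulary id per term applied to the post) are all hypothetical; the message only proposes the setting itself and a value of about 10.

```python
from collections import Counter

# Hypothetical admin setting; the thread suggests "say, 10".
MAX_TERMS_PER_POST_PER_VOCAB = 10


def over_limit_vocabs(term_vids, limit=MAX_TERMS_PER_POST_PER_VOCAB):
    """Return the vocabulary ids that exceed the per-post term limit.

    term_vids holds one vocabulary id per term applied to a single post,
    so a post carrying three terms from vocabulary 1 passes [1, 1, 1].
    """
    counts = Counter(term_vids)
    return sorted(vid for vid, count in counts.items() if count > limit)
```

A submission handler could reject or trim the post whenever this returns a non-empty list. The duplicate-node spam controls mentioned alongside it are a separate problem and are not sketched here.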
- What should happen to terms that are no longer used anywhere? Should there be automatic garbage collection or perhaps Admin pruning?
The same question could be asked of a normal taxonomy, in which case the answer would be "absolutely nothing." Deleting a term would also delete its URL, and I'd much rather serve a "there is currently no content tagged with this term" than a 404.

You establish:
- There's a fundamental difference doing folksonomy within Drupal from del.icio.us. Probably, only the author (and admins) can apply terms to a
And then suggest:
There's a potential issue here with "tag spam". A malicious user might tag a node with all the most popular terms. That node will then appear
While I certainly don't want to suggest this discussion is irrelevant, I think it's fair to say that every node of a Drupal site has a much better chance of being viewed by some sort of admin than one on an open-ended system like delicious. I think the biggest problem with "tag spam" and a Drupal site would be if we allowed commenters to insert tags - ignoring that, it's a safe bet that tag spam in a community-oriented site like Drupal (as opposed to, IMO, a service-oriented site like delicious) would be more easily seen and addressed.

--
Morbus Iff ( cheese and rice saves )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
Morbus Iff <morbus@disobey.com> Sun, 13 Mar 2005 11:17:53
There's a potential issue here with "tag spam". A malicious user might tag a node with all the most popular terms. That node will then appear
While I certainly don't want to suggest this discussion is irrelevant, I think it's fair to say that every node of a Drupal site has a much better chance of being viewed by some sort of admin than one on an open-ended system like delicious. I think the biggest problem with "tag spam" and a Drupal site would be if we allowed commenters to insert tags - ignoring that, it's a safe bet that tag spam in a community-oriented site like Drupal (as opposed to, IMO, a service-oriented site like delicious) would be more easily seen and addressed.
I'd agree until money is involved. If you have a section of the site where people are allowed to advertise services for money, someone *will* spam it. The classic example is a Craigslist-style "Listings" service.

--
Julian Bond
Email&MSM: julian.bond at voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433
S: callto://julian.bond/
Morbus Iff <morbus@disobey.com> Wed, 9 Mar 2005 17:41:44
* ownership of tags is possible by checking for the uid of the tid in the term_data table.
As on another post. Why not just use a join to node.uid to see terms I've used rather than terms I've created?
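
A minimal sketch of that join, assuming cut-down "node" and "term_node" tables that carry only the columns needed here (the real tables have more). SQLite, the sample rows, and the names terms_used_by_author and demo_connection are illustrative assumptions, not code from any module.

```python
import sqlite3


def terms_used_by_author(conn, uid):
    """Distinct tids attached to nodes authored by uid, via node.uid."""
    rows = conn.execute(
        "SELECT DISTINCT tn.tid FROM term_node tn "
        "JOIN node n ON n.nid = tn.nid "
        "WHERE n.uid = ? ORDER BY tn.tid",
        (uid,),
    )
    return [tid for (tid,) in rows]


def demo_connection():
    """In-memory database holding a handful of hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE node (nid INTEGER PRIMARY KEY, uid INTEGER NOT NULL);
        CREATE TABLE term_node (nid INTEGER NOT NULL, tid INTEGER NOT NULL);
        INSERT INTO node VALUES (5, 45), (6, 45), (7, 23);
        INSERT INTO term_node VALUES (5, 3), (5, 12), (6, 3), (7, 15);
        """
    )
    return conn
```

This needs no new column, but it only answers "which terms sit on nodes I authored". It cannot tell who applied a term when several people tag the same node, which is the case the proposed term_node.uid column is meant to cover.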
* We don't have to rely on the taxonomy UI, because the vocabulary.module would be "folksonomy". In 4.6, it'd be hidden in the standard "categories" menu, and we can focus on just the UI and added features in folksonomy.module.
Again, as on another post. Taxonomy wants to generate form edit fields for all vocabularies. I think it should only do this for vocabs of module taxonomy and forum.

--
Julian Bond
Email&MSM: julian.bond at voidstar.com
Webmaster: http://www.ecademy.com/
Personal WebLog: http://www.voidstar.com/
M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433
S: callto://julian.bond/
On Sun, 13 Mar 2005, Julian Bond wrote:
Morbus Iff <morbus@disobey.com> Wed, 9 Mar 2005 17:41:44
* We don't have to rely on the taxonomy UI, because the vocabulary.module would be "folksonomy". In 4.6, it'd be hidden in the standard "categories" menu, and we can focus on just the UI and added features in folksonomy.module.
Again as on another post. Taxonomy wants to generate form edit fields for all vocabularies. I think it should only do this for vocabs of module taxonomy and forum.
Right, one might consider this a taxonomy bug. The module really needs some attention. Cheers, Gerhard
Again as on another post. Taxonomy wants to generate form edit fields for all vocabularies. I think it should only do this for vocabs of module taxonomy and forum.
My current thought process has been dwelling on this. Currently, I'm thinking about adding a new BOOLEAN to the vocabulary table called "default_ui". When enabled, the display would be whatever taxonomy thinks it should be. When disabled, taxonomy assumes that another module is going to handle the GUI setting.

Or, "default_ui" is the selectbox. When disabled, taxonomy.module just spits out an input box (and then some new taxonomy.module code would split the input on commas or spaces and run the terms through the normal add/relation process).

A BOOLEAN addition seems cleaner than:

* hardcoding known modules that require the selectbox.
* a new serialized setting/preference.

On the other hand, I'm not sure if "default_ui" is the right conceptual approach. Should it instead be "is_folksonomy", with the UI and workflow of taxonomy changing based on that flag? The danger in this is future growth, of course. What other "types" of taxonomy besides controlled vocabs and folksonomies are there? An "is_folksonomy" flag seems an admission that it'd be "ok" to add "is_mediocracy", "is_mobonomy", "is_morbonomy" flags should they ever be needed, and that's not good.

--
Morbus Iff ( i assault your sensibilities! )
Technical: http://www.oreillynet.com/pub/au/779
Culture: http://www.disobey.com/ and http://www.gamegrene.com/
icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
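
The "split them on commas or spaces" step for the plain input box might look roughly like this. It is a sketch only: the function name is made up, and dropping case-insensitive duplicates is an added assumption that the message does not ask for.

```python
import re


def split_tags(raw):
    """Split free-text tag input on commas and/or whitespace.

    Empty pieces are dropped. Case-insensitive duplicates are dropped too
    (an assumption for this sketch), keeping the first spelling seen.
    """
    seen = set()
    tags = []
    for tag in re.split(r"[,\s]+", raw.strip()):
        key = tag.lower()
        if tag and key not in seen:
            seen.add(key)
            tags.append(tag)
    return tags
```

Each returned tag would then go through the normal add/relation process. Note that splitting on spaces rules out multi-word tags; a comma-only split would be the alternative if those are wanted.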
participants (4)
- Gerhard Killesreiter
- Julian Bond
- Morbus Iff
- Ross Kendall