[drupal-devel] A Folksonomy module
I've been thinking a lot about, and writing code to experiment with, a Folksonomy module. The long-term goal is a parallel to Taxonomy that can be used alongside or instead of Taxonomy. It would provide free-form text field(s) on nodes that let the user enter comma-separated keyword tags in the del.icio.us style.

There are two use cases for this.
1) General classification and navigation of internal Drupal objects. Presumably this means exclusively nodes.
2) Various group classifications of external objects to duplicate del.icio.us functionality for a Drupal site's community. I think, but I'm not sure, that this is exactly the same as 1).

There are a number of key ideas taken from current folksonomy projects on the web. These tend to be database intensive.
- Using clustering analysis to derive "popular" tags.
- Similarly to derive "related" tags.
- Navigation via an information block about the current tag page containing: search; my tags; popular tags; tags related to this one.
- Faceted search to drill down on multiple parameters.
- Innovative data entry support, e.g. one-click adding of existing tags from a list; Google Suggest style on-the-fly updating of suggested tags based on what's been entered so far.

There are also quite a lot of knotty questions around things like access rights.
- Should only the author be able to add tags to a node?
- Can the administrator prune and clean the developing tag vocabulary?
- Does it need synonym support?
- Does "tag spam" need controlling to stop a user polluting the vocabulary by adding 250 tags to a node?

So how to move forwards with this?

Moshe's folksonomy module is currently really just a placeholder (no offence Moshe!). I think there's an assumption in there that a node would only have a single tag from a particular Realm (folksonomy->realm ~= taxonomy-vocabulary). This doesn't look right to me. I would expect most nodes to have 2 or 3 and perhaps 10 tags from a realm. I've found it impossible to add the next iteration to Moshe's code without at least doubling the amount of code and reworking the database table(s). The next step is a biggish one.

The Taxonomy on the fly module really isn't the same. The key bit of del.icio.us style tagging is the free-form entry of multiple keywords. There should be no onus on the user to select from a list and create new entries in the list if one is missing. It should just happen.

Some of the parallels with taxonomy mean that you end up cutting, pasting and reworking large amounts of Taxonomy code. And because Taxonomy has been around for a long time, there are quite a few places where Taxonomy is treated as core and modules have specific support for it. That is not a problem, except that a folksonomy module would want to be similarly treated as special. This suggests that perhaps Folksonomy should be a development of Taxonomy. Perhaps there's a new vocabulary type "folks" which has free-form tags instead of a tree. Working against that, it's sufficiently different that a lot of Taxonomy code would have to be duplicated anyway. I also think this is too unformed yet, so at least to start with it should be done in the sandbox of a contributed module. Maybe later it can be merged into Taxonomy.

So anyway, my second attempt at this is much more Drupal style and extends folksonomy.module. But it extends it enough, and makes enough new assumptions, that I really need to talk about this before careering off into left field.

ps. The hidden agenda in all this is a tag-based Craigslist clone built in Drupal. But that's another story. ;-)

-- Julian Bond Email&MSM: julian.bond at voidstar.com Webmaster: http://www.ecademy.com/ Personal WebLog: http://www.voidstar.com/ M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433 S: callto://julian.bond/
I've been thinking a lot and writing code to experiment with a
Hey Julian. Nice to see you again. Been a few years, eh? Within a month or so, I'll be working on converting a homebrew site to Drupal, and they've got about 7000 terms in a similar-enough-to-be-a-folksonomy situation. I've recently discussed this on the list, including a) the GUI they're used to using, b) the workflow they expect, and c) the "here are some better tags" UI and code. Unfortunately, the list archives are not updating as frequently as normal so, if you'd like copies, lemme know and I'll forward them on. For now, an example of c): http://www.disobey.com/detergent/2005/similar_keywords.jpg
- Innovative data entry support. eg; one click adding of existing tags from a list; Google Suggest style on the fly updating of suggested tags
From my standpoint, "one click adding of existing tags" would be impossible - 7000 terms creates a rather disastrous selectbox, and even something like "top 50 terms that have been used the most" creates such a small subset of the whole as to be useless.
- Should only the author be able to add tags to a node?
This should be a per-site preference, IMO, but ultimately devolves into why /I/ personally despise folksonomies: my "funny" is not yours, my "sexy" is your "disgusting", and your "rock" is my "crap". I'd never allow someone to change my tags, solely for this reason. Perhaps append only, but that's just a different type of war.
- Can the administrator prune and clean the developing tag vocabulary?
An administrator should be able to do everything, yes, and that permission should be grantable to other users via access privs.
- Does it need synonym support?
My envisioning of folksonomy support within Drupal is tightly keyed into the existing taxonomy stuff. Likewise, my need for the "similar" code (in the URL above) prompted a great idea from killes: if a user chooses a better keyword, then their original choice becomes a synonym of their new choice. This automatic synonym linkage will help limit the number of tags, as well as satisfy your "related tags" feature request.

Largely, my vision of folksonomy support in Drupal:

* user creates a taxonomy via the normal means.
* user flags taxonomy as a "folksonomy".
* user flags which node types to apply the "folksonomy" to.
* a "folksonomy" input box is appended to selected node types.
* user types tags in comma-spliced format.
* upon submission of the node form:
  * each tag is checked against taxonomy flagged as "folksonomy"
  * new tags are added automatically, and user id gets "created" status.
  * tags are applied to nodes (via normal taxonomy) and user (new table)
* after submission of the node form:
  * optional "improve your tags" code comes up (see UI URL above).
  * re-chosen tags become synonyms of master tag.
  * re-chosen created tags are then deleted.

The above uses the existing taxonomy support, and seems to only require one additional table to remember what user is creating what term:

tid | uid | created
-------------------
  3 |  12 |       0
  4 |  17 |       1

With this, a "My Tags" thing is one or two JOINs away.
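The submission steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not anyone's actual module code: it uses Python with an in-memory SQLite database rather than Drupal's PHP, the table names term_data and term_node are modelled loosely on the taxonomy tables, term_user is a made-up name for the "new table", and lower-casing tags is an extra assumption the thread does not settle.

```python
import sqlite3

def make_db():
    """Create hypothetical stand-ins for the taxonomy tables plus the new uid table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE term_data (tid INTEGER PRIMARY KEY, vid INTEGER, name TEXT,
                                UNIQUE (vid, name));
        CREATE TABLE term_node (nid INTEGER, tid INTEGER, PRIMARY KEY (nid, tid));
        CREATE TABLE term_user (tid INTEGER, uid INTEGER, created INTEGER,
                                PRIMARY KEY (tid, uid));
    """)
    return db

def parse_tags(raw):
    """Split a comma-separated string; trim, lower-case (assumption), drop blanks and repeats."""
    seen, tags = set(), []
    for piece in raw.split(","):
        tag = " ".join(piece.split()).lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags

def save_tags(db, vid, nid, uid, raw):
    """Check each tag against the flagged vocabulary, add missing ones, link node and user."""
    for name in parse_tags(raw):
        row = db.execute(
            "SELECT tid FROM term_data WHERE vid = ? AND name = ?", (vid, name)
        ).fetchone()
        if row is None:
            # New tag: add it automatically and remember that this user created it.
            tid = db.execute(
                "INSERT INTO term_data (vid, name) VALUES (?, ?)", (vid, name)
            ).lastrowid
            created = 1
        else:
            tid, created = row[0], 0
        db.execute("INSERT OR IGNORE INTO term_node (nid, tid) VALUES (?, ?)", (nid, tid))
        db.execute(
            "INSERT OR IGNORE INTO term_user (tid, uid, created) VALUES (?, ?, ?)",
            (tid, uid, created),
        )

def my_tags(db, uid):
    """The 'My Tags' lookup: one JOIN from the uid table to the term names."""
    rows = db.execute(
        "SELECT d.name FROM term_user u JOIN term_data d ON d.tid = u.tid "
        "WHERE u.uid = ? ORDER BY d.name",
        (uid,),
    )
    return [r[0] for r in rows]
```

The "improve your tags" step and the synonym handling are left out; they would need the similar-keywords code, which is not shown in this thread.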
- Does "tag spam" need controlling to stop a user polluting the vocabulary by adding 250 tags to a node?
Sure. Default to 10?
be similarly treated as special. Which suggests that perhaps Folksonomy should be a development of Taxonomy. Perhaps there's a new vocabulary
I agree. -- Morbus Iff ( i've eaten fruity pebbles with p-diddy ) Technical: http://www.oreillynet.com/pub/au/779 Culture: http://www.disobey.com/ and http://www.gamegrene.com/ icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
Morbus Iff <morbus@disobey.com> Sun, 6 Mar 2005 11:35:23
- Innovative data entry support. eg; one click adding of existing tags from a list; Google Suggest style on the fly updating of suggested tags
From my standpoint, "one click adding of existing tags" would be impossible - 7000 terms creates a rather disastrous selectbox, and even something like "top 50 terms that have been used the most" creates such a small subset of the whole as to be useless.
The trick is clearly to get down to a small enough subset (say 30-50). Some approaches:
- top 30 of my tags
- top 30 of all tags
- top 30 tags related to the tags already there. This obviously works best in edit rather than add.

The Google Suggest approach would be to have the 30 most likely in a box below the field and have that box update as the user types in entries. nutr.icio.us and the latest del.icio.us bookmarklet give guidance here. This is icing on the cake, but something to drive convergence and consistent tagging is important.
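As a rough illustration of the three shortlists and the update-as-you-type box, here is a small Python sketch. Everything in it is an assumption made for the example: the function name suggest, the (nid, uid, tag) tuples, and the choice to rank "related" tags by simple co-occurrence on the same nodes rather than by any real clustering.

```python
from collections import Counter

def suggest(taggings, uid=None, current=(), prefix="", limit=30):
    """Return up to `limit` suggested tags.

    taggings: list of (nid, uid, tag) tuples (hypothetical layout).
    current:  tags already entered; if given, suggest tags that share nodes with them.
    uid:      if given (and no current tags), rank only that user's tags.
    prefix:   what the user has typed so far, for suggest-as-you-type filtering.
    """
    current = set(current)
    if current:
        # "Related" here just means: used on a node that carries a current tag.
        nodes = {nid for nid, _, tag in taggings if tag in current}
        pool = [tag for nid, _, tag in taggings if nid in nodes]
    elif uid is not None:
        pool = [tag for _, owner, tag in taggings if owner == uid]
    else:
        pool = [tag for _, _, tag in taggings]
    counts = Counter(t for t in pool if t not in current and t.startswith(prefix))
    # Most used first; ties broken alphabetically so the list is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:limit]]
```

Called with no arguments beyond the data it gives the "all tags" list, with uid the "my tags" list, and with current the "related" list; a real version would push the counting into the database.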
- Should only the author be able to add tags to a node? Perhaps append only, but that's just a different type of war.
In a wiki style, this would make complete sense. But then you need history. Otherwise you get into either flickr style "only my friends can change the tags on my nodes" or author tags and viewer tags with different weight for each. Either way this is icing and can be ignored for the moment. Just Author and Administrator is fine.
The above uses the existing taxonomy support, and seems to only require one additional table to remember what user is creating what term:
tid | uid | created
-------------------
  3 |  12 |       0
  4 |  17 |       1
With this, a "My Tags" thing is one or two JOINs away.
I'm undecided about the data model.

One option is a node_tag table and that's it: node_tag: nid, uid, tag. This isn't normalised but it makes the queries simple and fast. Add count and related fields and almost all the access routes become really simple, at the cost of calculating count and related during inserts, updates and deletes.

The normalised approach is:
tag: tid, tag, count
node_tag: nid, tid, uid
tag_tag: tid1, tid2
Easier updates, more joins, roughly the same calculation. Remove that uid and rely on node:uid and you have almost the same thing for another join.

Arguably tags only exist in relation to nodes and not on their own. So "My tags" means "Tags I've used on nodes".
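To make the trade-off concrete, here is a minimal sketch of a "My tags" lookup under both layouts, using Python and in-memory SQLite purely for illustration. The column names follow the message above; the count column and the tag_tag table are left out to keep it short, and the function names are invented for the example.

```python
import sqlite3

def flat_db():
    """Single denormalised table: node_tag(nid, uid, tag)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE node_tag (nid INTEGER, uid INTEGER, tag TEXT)")
    return db

def normalised_db():
    """Normalised layout: tag(tid, tag) plus node_tag(nid, tid, uid)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE tag (tid INTEGER PRIMARY KEY, tag TEXT UNIQUE);
        CREATE TABLE node_tag (nid INTEGER, tid INTEGER, uid INTEGER);
    """)
    return db

def my_tags_flat(db, uid):
    """No join needed: the tag text lives in the link table itself."""
    return db.execute(
        "SELECT tag, COUNT(*) FROM node_tag WHERE uid = ? GROUP BY tag ORDER BY tag",
        (uid,),
    ).fetchall()

def my_tags_normalised(db, uid):
    """Same answer, at the price of one join to resolve tid to the tag text."""
    return db.execute(
        "SELECT t.tag, COUNT(*) FROM node_tag nt JOIN tag t ON t.tid = nt.tid "
        "WHERE nt.uid = ? GROUP BY t.tag ORDER BY t.tag",
        (uid,),
    ).fetchall()
```

Both functions return (tag, times used) pairs, which matches the reading of "My tags" as "Tags I've used on nodes".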
- Does "tag spam" need controlling to stop a user polluting the vocabulary by adding 250 tags to a node?
Sure. Default to 10?
be similarly treated as special. Which suggests that perhaps Folksonomy should be a development of Taxonomy. Perhaps there's a new vocabulary
I agree.
From a development POV, this makes it hard to do purely as an external module. -- Julian Bond
Julian Bond wrote:
The trick is clearly to get down to a small enough subset (say 30-50). Some approaches. - top 30 of my tags - top 30 all tags - top 30 tags related to the tags already there. This obviously works best in edit rather than add. The google suggest approach would be to have a 30 most likely in a box below the field and have that box update as the user types in entries. nutr.icio.us and the latest del.icio.us bookmarklet give guidance here. This is icing on the cake, but something to drive convergence and consistent tagging is important.
At the usability sprint we talked about introducing a form_ function which adds support for 'Google suggest'-style textfields. We felt it was useful for the various username textfields. Folksonomy might be another use case.

Anyway, integrating folksonomy support in the existing taxonomy framework would be very powerful. Being able to combine 'strict vocabularies' and folksonomies has the potential of being a killer feature. We could do advanced queries like: retrieve all forum topics in the 'News and announcement' forum (i.e. a term in a 'strict vocabulary') that have been tagged with the word 'Ecademy' or 'SpreadFirefox' (i.e. folksonomy tags). Or, a second example: retrieve all forum topics with the 'image.module' term, regardless of whether 'image.module' is a fixed term as we know it or a folksonomy tag. Or, for your particular use case: return me all 'IT jobs' (a fixed taxonomy term) in the 'Bay area' (a fixed taxonomy term) which have been tagged with either 'CMS' or 'Fortune 500' (two folksonomy terms).

While I haven't given the implementation much thought, in my mind, folksonomies are a natural extension of the taxonomy module. It is a special kind of vocabulary that shares the underlying data structures (terms) with the conventional taxonomy system.

-- Dries Buytaert :: http://www.buytaert.net/
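The combined queries described above come down to "all of these terms, and at least one of those". A tiny Python sketch, assuming (hypothetically) that fixed terms and folksonomy tags attached to a node can be read back as one set of names per node; the function name find_nodes and the dict layout are invented for the example:

```python
def find_nodes(node_terms, all_of=(), any_of=()):
    """Return node ids carrying every term in all_of and, if given, at least one in any_of.

    node_terms: dict mapping nid -> set of term names, where fixed taxonomy terms
    and free-form tags sit side by side (the shared-data-structure assumption).
    """
    all_of, any_of = set(all_of), set(any_of)
    return sorted(
        nid
        for nid, terms in node_terms.items()
        if all_of <= terms and (not any_of or terms & any_of)
    )
```

For the jobs example this would be find_nodes(nodes, all_of=["IT jobs", "Bay area"], any_of=["CMS", "Fortune 500"]). The point of the sketch is only that the query does not need to know which names came from a strict vocabulary and which were typed in as tags.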
Dries Buytaert <dries@buytaert.net> Sun, 6 Mar 2005 20:06:01
At the usability sprint we talked about introducing a form_ function which adds support for 'Google suggest'-style textfields. We felt it was useful for the various username textfields. Folksonomy might be another use case.
If you haven't seen it via the numerous blogdex links see http://www.modernmethod.com/sajax/index.phtml "Sajax is a tool to make programming websites using the Ajax framework — also known as XMLHTTPRequest or remote scripting — as easy as possible. Sajax makes it easy to call PHP functions from your webpages via JavaScript without performing a browser refresh. The toolkit does 99% of the work for you so you have no excuse to not use it."
While I haven't given the implementation much thought, in my mind, folksonomies are a natural extension of the taxonomy module. It is a special kind of vocabulary that shares the underlying data structures (terms) with the conventional taxonomy system.
Maybe we should think about the absolute minimum change to Taxonomy to allow this and to allow an add on module to be written. It might even be as simple as one additional status value for vocabulary:hierarchy. And all the associated code changes to hide that value. -- Julian Bond
Julian Bond wrote:
| Maybe we should think about the absolute minimum change to Taxonomy to
| allow this and to allow an add on module to be written. It might even be
| as simple as one additional status value for vocabulary:hierarchy. And
| all the associated code changes to hide that value.

Just wanted to point out that a folksonomy API module does already exist. It can be found here: http://drupal.org/project/folksonomy

There is also another module that is fairly far along that is rather nice and has some promise on this front. It is called awTags and is being developed by autowitch over here: http://www.autowitch.org/node/4091

Don't know if either of these projects might be moving in the direction you are looking for, but I figured I would mention them.

Brady
Last year there was a post here suggesting a "system" type of taxonomy vocabulary that would use the taxonomy tables but provide no user interface.

I've been looking at a similar idea to re-use the taxonomy module for a folksonomy. It looks to me like almost every function in taxonomy would have to be modified or replaced. And I don't think taxonomy would cope well with the target of 1000 tags in a folksonomy vocabulary.

So where I'm going now is to extend Moshe's folksonomy.module and progressively add cut, paste and modify versions of the taxo administrative functions, to provide utility functions but not to include any user interface. This is to stay with what I think is Moshe's original intent, which is to support other modules that provide innovative UI to the data.

Right now I've got 3 tables.
- folksonomy // tag node links
- folksonomy_realm // corresponds to vocabulary
- folksonomy_realm_node_types // corresponds to vocabulary_node_types

In parallel, I'm working on a folktags.module that uses folksonomy and provides the UI.

-- Julian Bond
Last year there was a post here suggesting a "system" type of taxonomy vocabulary that would use the taxonomy tables but provide no user
For a similar implementation, take a look at forums within 4.6 - they use a special taxonomy that is not modifiable through "normal" taxo means.
I've been looking at a similar idea to re-use the taxonomy module for a folksonomy. It looks to me like almost every function in taxonomy would have to be modified or replaced. And I don't think taxonomy would cope well with the target of 1000 tags in a folksonomy vocabulary.
Could you explain to me why you feel this way? I'm not seeing it. -- Morbus Iff ( you are nothing without your robot car, NOTHING! ) Culture: http://www.disobey.com/ and http://www.gamegrene.com/ Spidering Hacks: http://amazon.com/exec/obidos/ASIN/0596005776/disobeycom icq: 2927491 / aim: akaMorbus / yahoo: morbus_iff / jabber.org: morbus
Julian Bond wrote:
Last year there was a post here suggesting a "system" type of taxonomy vocabulary that would use the taxonomy tables but provide no user interface.
I recently decided to reimplement folksonomy using system vocabularies. As mentioned, these are successfully being used for image galleries in the upcoming image.module and already being used for forum.module in 4.6. My thought was to create a vocabulary for each user. The vocab would be created the first time the user tagged an item. That way, we don't unnecessarily create vocabs for every single user ... One disadvantage of this approach is that you will only be able to tag nodes. So the option of tagging users would be gone (without reworking taxonomy). One nice application of tagging users is in buddylists where a given user can be a member of 'work', 'friends', etc. Think IM buddy list groups ... But overall, that's my thinking.
I've been looking at a similar idea to re-use the taxonomy module for a folksonomy. It looks to me like almost every function in taxonomy would have to be modified or replaced. And I don't think taxonomy would cope well with the target of 1000 tags in a folksonomy vocabulary.
do you have evidence for this claim?
So where I'm going now is to extend Moshe's folksonomy.module and progressively add cut, paste and modify versions of the taxo administrative function, to provide utility functions, but not to include any user interface. This to stay in what I think is Moshe's original intent which is to support other modules that provide innovative UI to the data.
I am hoping to duplicate very little of taxo functions. Not sure what you are up to here. But yes, some management pages are helpful for tags.
Right now I've got 3 tables.
- folksonomy // tag node links
- folksonomy_realm // corresponds to vocabulary
- folksonomy_realm_node_types // corresponds to vocabulary_node_types
With system vocabs, you don't make your own tables. That's my goal. I'm not doing much to get there, though. Perhaps Julian and others want to move this along.
nodes. So the option of tagging users would be gone (without reworking taxonomy). One nice application of tagging users is in buddylists where
Then some rework is due. There was also talk about nested sets. More and more things to do with taxonomy are piling up... I just had a problem where users needed to be tagged by terms. There are conferences in each region (or state or whatever you call it), and every user lives in a region. The easiest way to display the same-region conferences is to tag users with the appropriate term. Regards Karoly Negyesi
I recently decided to reimplement folksonomy using system vocabularies. As mentioned, these are successfully being used for image galleries in the upcoming image.module and already being used for forum.module in 4.6. My thought was to create a vocabulary for each user. The vocab would be created the first time the user tagged an item. That way, we don't unnecessarily create vocabs for every single user ... One disadvantage of this approach is that you will only be able to tag nodes. So the option of tagging users would be gone (without reworking taxonomy). One nice application of tagging users is in buddylists where a given user can be a member of 'work', 'friends', etc. Think IM buddy list groups ... But overall, that's my thinking.
Hrm. Isn't a duplication of data (all users having "cat", "cats", "funny", etc.) "bad"? The above seems focussed on solving the problem of "My Tags", but also seems to make relevant and synonymous tags more difficult, and makes it nearly impossible for me to do the "similar keywords" I need for the NHPR conversion. Was your design decision "no new tables, come hell or high water?" Could we use hidden profile.module fields to associate uids with tids? -- Morbus Iff
Hrm. Isn't a duplication of data (all users having "cat", "cats", "funny", etc.) "bad"? The above seems focussed on solving the problem of "My Tags", but also seems to make relevant and synonymous tags more difficult, and makes it nearly impossible for me to do the "similar keywords" I need for the NHPR conversion.
The whole point of folksonomy is that users make up their own tags. Otherwise, you have one central vocab like today. At first blush, it seems you want taxonomy on the fly where users add terms to the central vocabulary.
Was your design decision "no new tables, come hell or high water?"
That is not part of my design decision at all. If new tables are needed to implement the best solution, we use new tables.
Could we use hidden profile.module fields to associate uids with tids?
Certainly we will need a table or profile field to associate a uid to a vocab.
The whole point of folksonomy is that users make up their own tags.
Quite so. But, I'm more concerned with duplicate data (a technical implementation) and how it should, or shouldn't, affect "users making up their own tags" (an, arguably, UI implementation). If one user creates "cat" and another user creates "cat", aren't they the same thing? Why should there be two unique tags (and thus tids, RSS feeds, etc.)? Is this going to make equivalency code more difficult? If there are two different vocabs, and one user says that "feline" is a "synonym" of "cat", is that information that we want to throw away for every other user, simply because they didn't care or know to create the same (arguably correct in this innocent example) association? And, if not, why should I need to do a join on the user table? Does /no one else/ want any sort of "similar keywords" UI, per the UI at http://disobey.com/d/2005/similar_keywords.jpg? If yes, then would similar keywords be across all folksonomy vocabs, or just an individual user? If there are 1000 users and each user creates 100 terms in their vocab, with a perfect match of (even) 10%, aren't we wasting a lot of resources storing those duplicates? (Note: I suck at math. I have no clue if that calculation makes some sort of scary figure. But, uh, I meant it to <g>).
Otherwise, you have one central vocab like today. At first blush, it seems you want taxonomy on the fly where users add terms to the central
Correct, but an additional table (whether it be new or profile.module) would associate uids to tids. I'm having a tough time comprehending or believing that:

* technorati is creating $copies amount of terms called "funny", as opposed to just storing one instance of it and uid joining.
* that my innocent gamegrene site, with two controlled vocabs, is suddenly going to blossom into 900+ vocabs that I have absolutely no control over, nor can ever delete.
* that the NHPR site, with 7000 unique terms, is going to store at least 10 times that many (I'd say there are about 80 users/ reporters/ submitters, all who use about 3-10 tags per entry) in your current proposal/design.

-- Morbus Iff
On Tue, 8 Mar 2005, Morbus Iff wrote:
The whole point of folksonomy is that users make up their own tags.
Quite so. But, I'm more concerned with duplicate data (a technical implementation) and how it should, or shouldn't, affect "users making up their own tags" (an, arguably, UI implementation). If one user creates "cat" and another user creates "cat", aren't they the same thing? Why should there be two unique tags (and thus tids, RSS feeds, etc.)? Is this going to make equivalency code more difficult? If there are two different vocabs, and one user says that "feline" is a "synonym" of "cat", is that information that we want to throw away for every other user, simply because they didn't care or know to create the same (arguably correct in this innocent example) association? And, if not, why should I need to do a join on the user table?
I think what you want really depends on the use case. If you have a large anonymous site, users will want their own tags. If you have a community site, users are more likely to want to share their tags.
Does /no one else/ want any sort of "similar keywords" UI, per the UI at http://disobey.com/d/2005/similar_keywords.jpg? If yes, then would similar
I do want it. :)
keywords be across all folksonomy vocabs, or just an individual user? If
I would only use one vocab for all users. But that is for my current use case.
there are 1000 users and each user creates 100 terms in their vocab, with a perfect match of (even) 10%, aren't we wasting a lot of resources storing those duplicates? (Note: I suck at math. I have no clue if that calculation makes some sort of scary figure. But, uh, I meant it to <g>).
Terms don't take much space.
Otherwise, you have one central vocab like today. At first blush, it seems you want taxonomy on the fly where users add terms to the central
Correct, but an additional table (whether it be new or profile.module) would associate uids to tids.
I used to do that for groups.module. ;)
I'm having a tough time comprehending or believing that:
* technorati is creating $copies amount of terms called "funny", as opposed to just storing one instance of it and uid joining.
* that my innocent gamegrene site, with two controlled vocabs, is suddenly going to blossom into 900+ vocabs that I have absolutely no control over, nor can ever delete.
What kind of control would you need or want to have? As long as they are not clogging your /admin/taxo page (which private vocabs do not) I do not see a problem.
* that the NHPR site, with 7000 unique terms, is going to store at least 10 times that many (I'd say there are about 80 users/ reporters/ submitters, all who use about 3-10 tags per entry) in your current proposal/design.
For that type of site, sharing terms probably makes more sense. Cheers, Gerhard
"Taxonomy won't work well with 1000 tags". "Taxonomy needs too much work to support folksonomy modules.". "Use system vocabs" I looked at this page /admin/taxonomy and some of the Taxo edit controls and couldn't see it working with very large numbers of terms. I think *all* the user interface for a tag vocabulary would need reworking. So it's hard to see what actually gets reused. The form control for use on nodes is going to be different so preview form would need a hook. Although some Taxo functions for saving and loading could be re-used, the output from a free form tag entry will need converting before it fits into the Taxo scheme. So yes, it would be possible to use the vocabulary and vocabulary_node_types tables (at least I haven't yet needed to extend them) but taxonomy admin couldn't show them usefully and almost none of the vocab manipulation code can be used as is. So what's the point? However, Moshe's last post convinces me I need to go and look at 4.6 forum and prove to myself again that Realm can't use the vocabulary table and there's nothing in taxonomy that can be used as is. Data model. I don't believe that users own their tags. I think they only use tags on specific data. So "My Tags" means "The Tags I've used". In which case either node_tag has a uid field or we derive it from node.author. There's a fundamental choice that affects everything else. And it all depends on if you think a tag exists independent of the nodes or other objects it's attached to. Or that tags only exist 1) Independent tag Realm table; rid, name Realm_type table; rid, type Tag table; tid, rid, name optionally add cached data like count node_tag table; nid, tid Optionally rid to make queries faster and reduce joins. 2) Dependent tags Realm table; rid, name Realm_type table; rid, type node_tag table; nid, rid, name 1) Is more normalised. It will make adding tag_tag synonym and related tables easier to build later. It makes admin editing a tag easier. It needs more joins. 
The saving process may get slow as each tag has to be found or added before the node_tag link can be created. In 2) most of the queries are easier, faster and use less joins. The downside is admin editing a tag is a little harder. Arguably there's data duplication; potentially X*Y*Z entries in node_tag where X is nodes,Y is tags per node, Z is average tag length. In practice that's just not an issue. Note: All the data model stuff assumes a single link between nodes and tags. If you want to do users as well I think there has to be a second link table user_tag. Or maybe table_tag_type; id,tid,type but then the joins are going to get very nasty. -- Julian Bond Email&MSM: julian.bond at voidstar.com Webmaster: http://www.ecademy.com/ Personal WebLog: http://www.voidstar.com/ M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433 S: callto://julian.bond/
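The two costs named above (find-or-add on save in option 1, harder editing in option 2) can be seen in a short sketch. This is an illustration in Python with in-memory SQLite, not module code; the suffixed table names node_tag_1 and node_tag_2 exist only so both options can sit in one database, and the realm tables are omitted.

```python
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Option 1: independent tags, linked to nodes by tid
        CREATE TABLE tag (tid INTEGER PRIMARY KEY, rid INTEGER, name TEXT,
                          UNIQUE (rid, name));
        CREATE TABLE node_tag_1 (nid INTEGER, tid INTEGER, PRIMARY KEY (nid, tid));
        -- Option 2: dependent tags, the name is stored on every link row
        CREATE TABLE node_tag_2 (nid INTEGER, rid INTEGER, name TEXT,
                                 PRIMARY KEY (nid, rid, name));
    """)
    return db

def save_independent(db, rid, nid, names):
    """Each tag must be found or added before the link row can be written."""
    for name in names:
        db.execute("INSERT OR IGNORE INTO tag (rid, name) VALUES (?, ?)", (rid, name))
        tid = db.execute(
            "SELECT tid FROM tag WHERE rid = ? AND name = ?", (rid, name)
        ).fetchone()[0]
        db.execute("INSERT OR IGNORE INTO node_tag_1 (nid, tid) VALUES (?, ?)", (nid, tid))

def save_dependent(db, rid, nid, names):
    """No lookup step: the link rows carry the tag text directly."""
    db.executemany(
        "INSERT OR IGNORE INTO node_tag_2 (nid, rid, name) VALUES (?, ?, ?)",
        [(nid, rid, name) for name in names],
    )

def rename_independent(db, rid, old, new):
    """Admin edit touches one row in tag; returns the number of rows changed."""
    return db.execute(
        "UPDATE tag SET name = ? WHERE rid = ? AND name = ?", (new, rid, old)
    ).rowcount

def rename_dependent(db, rid, old, new):
    """Admin edit touches one row per tagged node; returns the number of rows changed."""
    return db.execute(
        "UPDATE node_tag_2 SET name = ? WHERE rid = ? AND name = ?", (new, rid, old)
    ).rowcount
```

The dependent rename as written would fail if a node already carried the new name; handling that merge case is part of what makes editing "a little harder" in option 2.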
*all* the user interface for a tag vocabulary would need reworking. So it's hard to see what actually gets reused. The form control for use on nodes is going to be different so preview form would need a hook.
Taxonomy shall use the form_combo, an autocomplete JS textfield. We discussed this at the usability session. A good autocomplete code: http://www.webreference.com/programming/javascript/gr/column5/3.html And XMLHttpRequest could come from Sajax. Regards NK
On Tuesday 8 March 2005 23:31, Morbus Iff wrote:
Does /no one else/ want any sort of "similar keywords" UI, per the UI at http://disobey.com/d/2005/similar_keywords.jpg?
Yay for this one! And if it works, it could be expanded to nodes too (recipe pizza-margharita similar to margharita-pizza). And how far could the Bayesian code from the spam module help take work out of users' hands here? So, learn what sorts of duplicates there are and present them as dupes for confirmation (or autocorrect). Regards, Bèr -- [ Bèr Kessels | Drupal services www.webschuur.com ]
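For the pizza-margharita / margharita-pizza case specifically, a far simpler technique than Bayesian learning would already surface candidates: compare tags by their sorted words. The sketch below does only that; it is not the spam module's approach, the function names are made up, and it will miss misspellings and true synonyms.

```python
import re
from collections import defaultdict

def tag_key(tag):
    """Reduce a tag to its lower-cased words in sorted order, ignoring punctuation."""
    words = re.findall(r"[a-z0-9]+", tag.lower())
    return " ".join(sorted(words))

def duplicate_candidates(tags):
    """Group tags that share a key; each group is a list to show the user for confirmation."""
    groups = defaultdict(list)
    for tag in tags:
        groups[tag_key(tag)].append(tag)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

Anything it finds would still need confirming by a person, as suggested above, since word order sometimes carries meaning.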
On Tue, 2005-03-08 at 17:31 -0500, Morbus Iff wrote:
The whole point of folksonomy is that users make up their own tags. Why should there be two unique tags (and thus tids, RSS feeds, etc.)? Is this going to make equivalency code more difficult?
Does /no one else/ want any sort of "similar keywords" UI, per the UI at http://disobey.com/d/2005/similar_keywords.jpg? If yes, then would similar keywords be across all folksonomy vocabs, or just an individual user? If there are 1000 users and each user creates 100 terms in their vocab, with a perfect match of (even) 10%, aren't we wasting a lot of resources storing those duplicates?

Everything depends on the context or meaning the user puts into cats, dogs, foxes, etc... How are you going to find out if a dog means the canine friend or ugly woman - both are common use terms and humans might understand dependent on context. Splitting the different contexts into different vocabularies is not a panacea either.
The folksonomy and friends approach is betting on an old Marxist dogma coming true: that if you pile up a lot of quantity it will often change into a new quality. And there is a point to it. But some work down the line will be needed to actually get at the underlying 'real' taxonomy behind the folksonomy. The taxonomies as they stand are, or should be, carefully designed controlled vocabularies; the folksonomies represent parts of the BIG vocabulary, yes, with duplicates, synonyms, discrepancies and outright errors, but they have a strong point in actually capturing the individual views and classification systems of people, and possibly allowing the system to speculate on behalf of the group.
* technorati is creating $copies amount of terms called "funny", as ... * that my innocent gamegrene site, with two controlled vocabs, is ... * that the NHPR site, with 7000 unique terms, is going to store ... in your current proposal/design.
You have a point here, but it depends what behaviour you want: early or late speculation. Another way is to keep the labels of the taxonomy terms in a separate table and keep unique tids for each user. One kind of aggregate speculation might be "which tids have the same label". Then use the answer to get all nodes with these tids. That is one or two queries, depending on what you want to optimise for. Other, more complex approaches are possible, but I don't want to get into the maths here. The post is long and boring enough :) Cheers, Vlado
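The separate label table idea above can be sketched roughly as follows. This is only an illustration in Python with SQLite, not Drupal code: the table names term_label, user_term and term_node, their columns, and the sample rows are all assumptions made up for the example. It shows the "which tids have the same label" step and the single-join variant that goes straight from a label to nodes.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical three-table layout."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- One row per distinct label, shared by all users (assumed layout).
        CREATE TABLE term_label (lid INTEGER PRIMARY KEY, label TEXT UNIQUE);
        -- Each user keeps their own tid that points at a shared label.
        CREATE TABLE user_term (tid INTEGER PRIMARY KEY, uid INTEGER, lid INTEGER);
        -- Which nodes carry which tid.
        CREATE TABLE term_node (tid INTEGER, nid INTEGER);
    """)
    db.executemany("INSERT INTO term_label VALUES (?, ?)",
                   [(1, "funny"), (2, "dogs")])
    db.executemany("INSERT INTO user_term VALUES (?, ?, ?)",
                   [(10, 1, 1), (11, 2, 1), (12, 2, 2)])
    db.executemany("INSERT INTO term_node VALUES (?, ?)",
                   [(10, 100), (11, 101), (12, 102)])
    return db


def tids_for_label(db, label):
    """Late speculation, step one: which tids share this label?"""
    rows = db.execute(
        "SELECT ut.tid FROM user_term ut "
        "JOIN term_label tl ON tl.lid = ut.lid "
        "WHERE tl.label = ? ORDER BY ut.tid", (label,))
    return [row[0] for row in rows]


def nodes_for_label(db, label):
    """The same question answered in one query: label -> tids -> nodes."""
    rows = db.execute(
        "SELECT DISTINCT tn.nid FROM term_node tn "
        "JOIN user_term ut ON ut.tid = tn.tid "
        "JOIN term_label tl ON tl.lid = ut.lid "
        "WHERE tl.label = ? ORDER BY tn.nid", (label,))
    return [row[0] for row in rows]
```

With this layout the label text is stored once however many users use it, while each user's tid (and so each user's feed or page) stays distinct.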
On Tuesday 8 March 2005 23:06, Morbus Iff wrote:
Could we use hidden profile.module fields to associate uids with tids?
A node-to-user system is really simple using both nodeapi() and hook_user() and a simple extra table, user-node. Maybe this should be an API-only module with two form elements: one in profile to attach a new node to a user, and one in node forms to attach new users to a node. Regards, Bèr -- [ Bèr Kessels | Drupal services www.webschuur.com ]
Moshe Weitzman wrote:
Last year there was a post here suggesting a "system" type of taxonomy vocabulary that would use the taxonomy tables but provide no user interface.
I recently decided to reimplement folksonomy using system vocabularies. As mentioned, these are successfully being used for image galleries in the upcoming image.module and already being used for forum.module in 4.6. My thought was to create a vocabulary for each user. The vocab would be created the first time the user tagged an item. That way, we don't unnecessarily create vocabs for every single user ... One disadvantage of this approach is that you will only be able to tag nodes. So the option of tagging users would be gone (without reworking taxonomy). One nice application of tagging users is in buddylists, where a given user can be a member of 'work', 'friends', etc. Think IM buddy list groups ... But overall, that's my thinking.
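The "create the vocabulary the first time the user tags something" idea above could look roughly like this. It is a minimal in-memory sketch in Python, not Drupal code: the TagStore class, its attribute names, and the choice to lowercase and trim tags are assumptions for illustration only. The comma-separated input follows the del.icio.us style entry described at the start of the thread.

```python
class TagStore:
    """Hypothetical in-memory stand-in for per-user system vocabularies."""

    def __init__(self):
        self.vocab_by_uid = {}    # uid -> vid, filled in lazily
        self.terms = {}           # (vid, tag name) -> tid
        self.term_nodes = set()   # (tid, nid) pairs
        self._next_vid = 1
        self._next_tid = 1

    def vocabulary_for(self, uid):
        """Return the user's vocabulary id, creating it on first use only."""
        if uid not in self.vocab_by_uid:
            self.vocab_by_uid[uid] = self._next_vid
            self._next_vid += 1
        return self.vocab_by_uid[uid]

    def tag_node(self, uid, nid, raw_tags):
        """Attach comma-separated tags to a node within the user's own vocabulary."""
        vid = self.vocabulary_for(uid)
        tids = []
        for name in (part.strip().lower() for part in raw_tags.split(",")):
            if not name:
                continue  # ignore empty entries such as a trailing comma
            key = (vid, name)
            if key not in self.terms:
                # New tag for this user: it just happens, no list to pick from.
                self.terms[key] = self._next_tid
                self._next_tid += 1
            tid = self.terms[key]
            self.term_nodes.add((tid, nid))
            tids.append(tid)
        return tids
```

Users who never tag anything never get a vocabulary, and two users entering the same word end up with separate tids, which is the duplication discussed earlier in the thread.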
I'm open for patches that extend or rework core's taxonomy module. As explained in my previous e-mail, there is room for folksonomies (tags) in the taxonomy module. It might take more thought and work to extend or refactor parts of the taxonomy module but it is going to pay off. (Julian, if you decide to go with your own version of the folksonomy module, know that it might be difficult to upgrade later on.)
I've been looking at a similar idea to re-use the taxonomy module for a folksonomy. It looks to me like almost every function in taxonomy would have to be modified or replaced. And I don't think taxonomy would cope well with the target of 1000 tags in a folksonomy vocabulary.
Why is that? Can you provide us some examples? It would certainly help us grasp the challenges. -- Dries Buytaert :: http://www.buytaert.net/
Dries Buytaert <dries@buytaert.net> Wed, 9 Mar 2005 08:49:33
I'm open for patches that extend or rework core's taxonomy module. As explained in my previous e-mail, there is room for folksonomies (tags) in the taxonomy module. It might take more thought and work to extend or refactor parts of the taxonomy module but it is going to pay off.
I've just reworked my code to get rid of a folksonomy_realm table and use vocabulary instead. It all works with one small addition to taxonomy.module. taxonomy_form(), taxonomy_form_all() and taxonomy_node_form() really need a check so that they only return something for $vocabulary->module=='taxonomy', in the same way as in taxonomy_overview(). The new forum module avoids this by only associating forum vocabularies with node->type=='forum'. But a folksonomy vocabulary would provide its own form field and would be used on node types that also have categories. Perhaps this is overloading the module field. I can imagine a module that needed its own admin but used taxonomy's form fields.
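The check described above amounts to filtering vocabularies by their owning module before building the generic category form. The following is a language-neutral sketch written in Python, not the actual taxonomy.module code: the Vocabulary dataclass, its field names, and the function name vocabularies_for_generic_form are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vocabulary:
    vid: int
    name: str
    module: str        # which module owns this vocabulary
    node_types: tuple  # node types the vocabulary applies to


def vocabularies_for_generic_form(vocabularies, node_type):
    """Keep only vocabularies the generic taxonomy form should render.

    Vocabularies owned by another module (for example a folksonomy
    module that supplies its own free-form field) are skipped even when
    they apply to the same node type.
    """
    return [
        v for v in vocabularies
        if v.module == "taxonomy" and node_type in v.node_types
    ]
```

For a "story" node that has both a normal category vocabulary and a folksonomy vocabulary, only the former would get the standard select box; the folksonomy module would add its own text field separately.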
(Julian, if you decide to go with your own version of the folksonomy module, know that it might be difficult to upgrade later on.)
Noted ;-)
I've been looking at a similar idea to re-use the taxonomy module for a folksonomy. It looks to me like almost every function in taxonomy would have to be modified or replaced. And I don't think taxonomy would cope well with the target of 1000 tags in a folksonomy vocabulary.
Why is that? Can you provide us some examples? It would certainly help us grasp the challenges.
This 1000 tag thing: I've found vocabularies with a tree of 100 tags a bit awkward to use. The admin vocab overview gets very long, and the few places where you have all tags for a vocab in a combo box will also get awkward to use. I'm progressively using more and more of taxonomy as is. At the moment, I'm replacing _help, _save, _delete, _menu, _admin, _overview and _form_vocabulary. -- Julian Bond Email&MSM: julian.bond at voidstar.com Webmaster: http://www.ecademy.com/ Personal WebLog: http://www.voidstar.com/ M: +44 (0)77 5907 2173 T: +44 (0)192 0412 433 S: callto://julian.bond/
participants (9)
- Brady Jarvis
- Bèr Kessels
- Dries Buytaert
- Gerhard Killesreiter
- Julian Bond
- Morbus Iff
- Moshe Weitzman
- Negyesi Karoly
- Vladimir Zlatanov