I hope to go from "I have an itch" to "I have a concrete task on which to concentrate" soon. If anyone has suggestions, I'm all ears.

aggregator add-on - feed item classification

The problem: you have chunks of text, and some feeds provide tags. You want that filtered and mapped to your own website's tags/classification/taxonomy.
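To make the problem concrete, here is a minimal sketch of one way to frame it: a small naive Bayes learner that takes the item text plus whatever tags the feed supplies and predicts one of your own site's terms. The class name NaiveBayesTagger, the "tag:" feature prefix and the sample items are all made up for illustration; none of this is existing Drupal or aggregator code.

```python
import math
import re
from collections import Counter, defaultdict


def features(text, feed_tags=()):
    """Word tokens from the text, plus one 'tag:' feature per feed tag."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return words + ["tag:" + t.lower() for t in feed_tags]


class NaiveBayesTagger:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.feature_counts = defaultdict(Counter)  # site term -> feature -> count
        self.item_counts = Counter()                # site term -> items seen
        self.vocab = set()

    def train(self, text, feed_tags, site_term):
        """Learn from one item that already carries one of our own site terms."""
        feats = features(text, feed_tags)
        self.feature_counts[site_term].update(feats)
        self.item_counts[site_term] += 1
        self.vocab.update(feats)

    def classify(self, text, feed_tags=()):
        """Return the most probable site term, or None if nothing was trained."""
        feats = features(text, feed_tags)
        total_items = sum(self.item_counts.values())
        best_term, best_score = None, float("-inf")
        for term, n_items in self.item_counts.items():
            counts = self.feature_counts[term]
            denom = sum(counts.values()) + len(self.vocab)
            # log prior plus smoothed log likelihood of every feature
            score = math.log(n_items / total_items)
            for f in feats:
                score += math.log((counts[f] + 1) / denom)
            if score > best_score:
                best_term, best_score = term, score
        return best_term


# Toy training data: (text, tags supplied by the feed, our own site term).
tagger = NaiveBayesTagger()
tagger.train("New Drupal module released for the aggregator", ["cms", "php"], "drupal")
tagger.train("Drupal theme and module updates", ["cms"], "drupal")
tagger.train("Neural networks and Bayesian learning explained", ["ai"], "machine-learning")
tagger.train("Training a classifier on text data", [], "machine-learning")

print(tagger.classify("aggregator module for drupal", ["cms"]))
print(tagger.classify("bayesian classifier training"))
```

Treating feed tags as extra features means the learner works out for itself which foreign tags line up with which of your terms, so no hand-written mapping table is needed. On this toy data the two calls should come out as drupal and machine-learning respectively.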
This screams for an AI-based approach. It is text classification on a very regular stream of very small text chunks, and some of it already comes pre-classified. What learning methods can be used? SOM/WebSOM (maybe, but too static), Bayesian learners, LVQ and other vector space methods, ... You can have a lot of different models and scenarios to play with, and tons of data: just hook into technorati, drupal.org/planet, icerocket, ... and choose your preferences. You can play with both supervised and unsupervised learning algorithms; there is space for both kinds here.

Actually, search has improved its data model in HEAD. You might want to have a look at that. The data from the search table can be used without conversion with most of the learning algorithms in the literature out there.

Wide open, and best of all, this is really needed.

Cheers
Vlado
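
P.S. A rough sketch of the vector space idea, to show why word/score data from a search index is convenient. I am assuming rows shaped like (item id, word, score); the real column names and layout in HEAD may differ, and every function name here is hypothetical. The classifier is a plain nearest-prototype one, a simpler relative of LVQ with no iterative prototype updates.

```python
import math
from collections import defaultdict


def rows_to_vectors(rows):
    """Group (item_id, word, score) rows into one sparse vector per item."""
    vectors = defaultdict(dict)
    for item_id, word, score in rows:
        vectors[item_id][word] = vectors[item_id].get(word, 0.0) + score
    return dict(vectors)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as word -> weight dicts."""
    dot = sum(w * b.get(word, 0.0) for word, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_prototypes(vectors, labels):
    """Sum the vectors of all labelled items into one prototype per site term.

    Summing rather than averaging is fine here because cosine similarity
    ignores vector length.
    """
    protos = defaultdict(lambda: defaultdict(float))
    for item_id, term in labels.items():
        for word, w in vectors[item_id].items():
            protos[term][word] += w
    return {term: dict(vec) for term, vec in protos.items()}


def nearest_term(vector, protos):
    """Return the site term whose prototype is most similar to the vector."""
    return max(protos, key=lambda term: cosine(vector, protos[term]))


# Assumed row layout: (item id, word, score). Item 4 has no label yet.
rows = [
    (1, "drupal", 3.0), (1, "module", 1.0),
    (2, "drupal", 1.0), (2, "theme", 2.0),
    (3, "bayesian", 2.0), (3, "learning", 1.0),
    (4, "drupal", 2.0), (4, "module", 1.0),
]
labels = {1: "drupal", 2: "drupal", 3: "machine-learning"}

vectors = rows_to_vectors(rows)
protos = build_prototypes(vectors, labels)
print(nearest_term(vectors[4], protos))
```

The only preprocessing is grouping rows by item, which is the sense in which such data needs no conversion: each item is already a weighted bag of words.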