On Mon, 2005-12-05 at 13:11 -0600, Mark Fredrickson wrote:
Hello,
I have an interest in machine learning that I would like to bring to bear on Drupal, and I am hoping to enlist the help of some other people who share this interest - or can help me by providing data.
I am interested. But everything depends on time and ability.
Briefly, machine learning is the algorithmic application of statistical principles. A classic example is the Bayesian spam filter in your email program/gateway/etc. Using a learned model, this filter classifies incoming mail as either SPAM or NOT SPAM based on a vector of data drawn from the message.
Jeremy uses a Bayesian classifier in spam.module.
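To make the classifier idea concrete, here is a minimal sketch of a naive Bayes text classifier in Python. It is an illustration only, not the code in spam.module: the class name, method names, and training messages are all made up for the example, and the "vector of data" is assumed to be nothing more than lowercase word counts.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes classifier with add-one smoothing (hypothetical example)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()           # every word seen during training

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Start from the log prior: how common this label was in training.
            score = math.log(self.doc_counts[label] / total_docs)
            # Add-one smoothing keeps unseen words from zeroing the probability.
            denom = sum(counts.values()) + len(self.vocab)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, purely for illustration.
clf = NaiveBayes()
clf.train("cheap pills buy now", "SPAM")
clf.train("buy cheap watches now", "SPAM")
clf.train("patch for the node module attached", "NOT SPAM")
clf.train("please review my module patch", "NOT SPAM")

print(clf.classify("buy cheap pills"))
print(clf.classify("review the patch"))
```

A real filter would need better tokenization and far more training data. The sketch only shows the shape of the approach: learn word frequencies per class, then pick the class under which the new message is most probable.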
I am looking for interested parties to join me in developing a series of machine learning modules for Drupal. These modules will use data that Drupal can collect to predict outcomes. Examples might include smart "What's Related" type modules, better troll and spam bot protection, better searching, auto categorization, and a wide variety of other predictive tasks.
Actually, this is why I started doing the relations stuff I'm currently coding. For a primitive, non-learning feasibility test based on some simple metrics, have a look at http://dikini.net/30.11.2005/relations_battle_plan_ii_and_first_results and the "similar things" block.
I envision the following phases to this project:
I think the plan may be good, but it looks very lengthy and very general.
What I have learned about machine learning and data mining over the years is that they are most successful when you have a very well-defined target for what you want to achieve or find. With Drupal, we have a multitude of applications, a zillion data models, and an infinite number of "this thing is in my head, but I'll do it" to-dos. A generic catch-all module is going to fail badly.
What might be useful is a framework of basic methods (a Bayesian learner, a rule-based learner, etc.) which can be used in concrete applications, but if nothing uses it, it is a waste. Alternatively, take the evolutionary approach: pick a target, adaptive behaviour for example, so that a website adapts to user preferences and current trends and presents the most relevant and up-to-date information.
It's a good idea overall. And good luck.
Cheers,
Vlado