[development] Scratching an itch: Machine Learning

Bèr Kessels ber at webschuur.com
Tue Dec 6 14:14:30 UTC 2005


I am interested. A looong time ago I developed a relations.module that used 
the search index plus some algorithms to define relations between nodes. 
http://cvs.drupal.org/viewcvs/drupal/contributions/sandbox/ber/related/

On Tuesday 06 December 2005 10:32, vlado wrote:
> On Mon, 2005-12-05 at 13:11 -0600, Mark Fredrickson wrote:
> > Hello,
> >
> > I have an interest in machine learning that I would like to bring to bear
> > on Drupal, and I am hoping to enlist the help of some other people who
> > share this interest - or can help me by providing data.
>
> I am interested. But everything depends on time and ability.
>
> > Briefly, machine learning is the algorithmic application of statistical
> > principles. A classic example is the Bayesian spam in your email
> > program/gateway/etc. Based on a learned model, this filter classifies
> > incoming mail as either SPAM or NOT SPAM based on a vector of data drawn
> > from the message.
>
> Jeremy uses a Bayesian classifier in spam.module.
>
> > I am looking for interested parties to join me in developing a series of
> > machine learning modules for Drupal. These modules will use data that
> > Drupal can collect to predict outcomes. Examples might include smart
> > "What's Related" type modules, better troll and spam bot protection,
> > better searching, auto categorization, and a wide variety of other
> > predictive tasks.
>
> Actually, this is why I started doing the relations stuff I'm currently
> coding. For a primitive, non-learning feasibility test based on some
> simple metrics, have a look at
> http://dikini.net/30.11.2005/relations_battle_plan_ii_and_first_results
> and the "similar things" block.
>
> > I envision the following phases to this project:
>
> I think the plan may be good, but it looks very lengthy and very
> general.
>
> What I learned about machine learning and data mining over the years is
> that they are most successful when you have a very well defined target
> for what you want to achieve/find. With Drupal, we have a multitude of
> applications, a zillion data models, and an infinite number of "this thing
> is in my head, but I'll do it" todos. Having a generic catch-all module
> is going to fail badly. What might be useful is a framework of basic
> methods (Bayesian learner, rule-based learner, etc.) which can be
> used in concrete applications, but if it is not used, it is a waste. Or,
> going the evolutionary approach, pick a target: adaptive behaviour, for
> example, so that a website adapts to the user's preferences and the current
> trends and presents the most relevant and up-to-date information.
>
> It's a good idea overall.
>
> And good luck.
>
> Cheers,
> Vlado
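
For readers unfamiliar with the Bayesian filter mentioned in the quoted
thread, here is a rough sketch of the idea: a tiny naive Bayes text
classifier with Laplace smoothing. It is not taken from spam.module or any
Drupal code; the class name, method names and training messages are all
hypothetical, and Python is used only for brevity.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of word frequencies
        self.doc_counts = Counter()  # label -> number of training messages
        self.vocab = set()           # every word seen during training

    def train(self, text, label):
        """Record the words of one labelled message."""
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability score."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior: share of training messages carrying this label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in words:
                # Laplace (add-one) smoothing avoids log(0) for unseen words.
                score += math.log(
                    (counts[word] + 1) / (total_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, made up for this example.
nb = NaiveBayes()
nb.train("buy cheap pills now", "spam")
nb.train("cheap pills online", "spam")
nb.train("drupal module patch review", "not spam")
nb.train("please review my patch", "not spam")

print(nb.classify("cheap pills"))
print(nb.classify("patch review"))
```

The "vector of data drawn from the message" is simply the word list here; a
real filter would likely use richer features and far more training data.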

