[bayes][bcc][faked-from] Re: [development] Scratching an itch: Machine Learning

Mark Fredrickson mfredrickson at ppmns.org
Tue Dec 6 15:17:44 UTC 2005


> I am interested. But everything depends on time and ability.
> 
Excellent. For others sitting on the fence, I am not asking for a lot of
time or coding (though I will not turn it away), but more for help gathering
data. I do not have an active Drupal installation at my fingertips (yet),
so I need assistance with data collection.

> Jeremy uses a Bayesian classifier in spam.module
> 
I'll check it out. Perhaps the classifiers should be factored out into a
separate component module. I'll look at the feasibility of that.


> Actually, this is why I started doing the relations stuff I'm currently
> coding.
> For a primitive, non-learning, feasibility test based on some simple
> metrics have a look at
> http://dikini.net/30.11.2005/relations_battle_plan_ii_and_first_results
> and the similar things block.

Thanks. I'll investigate.

>> I envision the following phases to this project:
> I think the plan may be good, but it looks very lengthy and very
> general.
> 

I agree it is long, but I want to be honest about the project. My experience
has been that creating a successful machine learning model is a slow,
iterative process. One refines the data collection, and then modifies the
model, tests, and repeats. Depending on your hunches at the beginning, this
can be either a quick process or a painfully lengthy one.

> What I learned about machine learning and data mining over the years is
> that they are most successful when you have a very well-defined target
> for what you want to achieve/find.

This is good advice. I hope to go from "I have an itch" to "I have a
concrete task on which to concentrate" soon. If anyone has suggestions, I'm
all ears.

-Mark


