[drupal-devel] looking for text/RSS data

Vladimir Zlatanov vlado at dikini.net
Mon Jan 24 14:30:40 UTC 2005


I'm working on a set of text classification algorithms. They are
extremely experimental at the moment, but I have the basics covered.
If they prove useful, the code will end up in contrib.

I want to use them for automatic classification of RSS
feed items into a taxonomy. To test the algorithms and see what to
improve and how, I need access to some well-classified text
data, preferably Drupal-based.

Can anyone share their db, or a part of it, so I can try running the
tests? The websites that started me on this path are too plain, for
the time being, to use as guinea pigs.

I would prefer to run the tests on some isolated machines here, so
I can measure the performance and run some more elaborate statistics
which, while useful, would definitely bring an average web server to
a crawl.

Thanks,
Vlado




