[drupal-devel] looking for text/RSS data
I'm working on a set of text classification algorithms. They are extremely experimental at the moment, but I have the basics covered. If they prove useful, the code will end up in contrib. I want to use them to automatically classify RSS feed items into a taxonomy.

To test the algorithms and see what to improve and how, I need access to some well-classified text data, preferably Drupal-based. Can anyone share their db, or a part of it, so I can try running the tests? The websites that started me on this path are too plain for the time being to use as guinea pigs.

I would prefer to run the tests on some isolated machines here, so I can measure the performance and run some more elaborate statistics which, while useful, would definitely bring an average web server to a crawl.

Thanks,
Vlado
Participants (1): Vladimir Zlatanov