[drupal-devel] Dealing with spam (was rel=nofollow)

Chris Messina chris.messina at gmail.com
Fri Jan 21 20:39:11 UTC 2005


These parts are of particular interest to me:

> What I don't like about exchanging hard rules is that you will end up
> with a unified database of urls, regexes, whatever... 
> Hard rules mean clever people will find a way around
> them - just look at the progress of the email spam fight.

> In the short run the regex exchange is going to work. 
> Its downside is the downside of every overtrained
> learning algorithm. It loses precision big time, especially in an
> evolving system. It does not adapt well.

> A brief comparison: regex exchange - surgery; content exchange between
> learning systems - holistic medicine. 

I was thinking some more about these adaptive filters that learn and
the sharing of spam posts/comments between sites. It seems to me like
the p2p model would work really well in this case.

Just as we had proposed a sort of "trusted sites" system where you
exchange regexps with your trusted network of sites (which, to my
thinking, seems inherently *more* risky than just dealing with spam
locally), we could leverage spammers' own behavior by syndicating an
unpublished feed of ONLY spam content between trusted sites.

I mean, that way, if someone taps your "spam feed", who cares?! BUT,
if Drupal can tap into Spread Firefox's spam.rss feed, Drupal's
learning filters would learn that much faster -- and in real time,
because as SFX gets hit, Drupal would have the content from the feed
which has been pre-identified as spam. Thus when and if the spammer
moves to Drupal, Drupal will be one step ahead of them, having learned
from SFX.
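
To make the idea concrete, here is a rough Python sketch of a local
learning filter that trains on a peer site's spam-only feed. Everything
in it is an assumption for illustration: the RSS 2.0 layout of the feed,
the sample items, the SpamLearner class, and the choice of a simple
naive Bayes word model. It is not how Drupal's spam module or SFX
actually work.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample of a peer site's spam-only feed (RSS 2.0 layout assumed).
SAMPLE_SPAM_RSS = """<rss version="2.0"><channel>
  <title>spam feed (illustrative)</title>
  <item><description>cheap pills online casino bonus</description></item>
  <item><description>casino poker bonus win cheap pills now</description></item>
</channel></rss>"""


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def feed_descriptions(rss_text):
    """Return the <description> text of every <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [item.findtext("description", default="") for item in root.iter("item")]


class SpamLearner:
    """Tiny naive Bayes word model, standing in for an adaptive filter."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.counts[label].update(tokenize(text))
        self.docs[label] += 1

    def learn_from_feed(self, rss_text):
        # Every item in the feed was already identified as spam by the peer site.
        for body in feed_descriptions(rss_text):
            self.train(body, "spam")

    def spam_score(self, text):
        """Log-odds that text is spam; a positive value leans spam."""
        vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        size = len(vocab)
        totals = {label: sum(c.values()) for label, c in self.counts.items()}
        # Class prior with add-one smoothing.
        score = math.log((self.docs["spam"] + 1) / (self.docs["ham"] + 1))
        for tok in tokenize(text):
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            p_spam = (self.counts["spam"][tok] + 1) / (totals["spam"] + size)
            p_ham = (self.counts["ham"][tok] + 1) / (totals["ham"] + size)
            score += math.log(p_spam / p_ham)
        return score


learner = SpamLearner()
learner.train("thanks for the patch, the module works now", "ham")
learner.train("I think the node api needs a new hook", "ham")
before = learner.spam_score("cheap casino bonus")
learner.learn_from_feed(SAMPLE_SPAM_RSS)
after = learner.spam_score("cheap casino bonus")
print(before, after)  # the score rises once the feed has been learned
```

The point of the sketch is only that the receiving site never has to be
hit by the spammer itself: the score for that kind of comment goes up
purely from what the peer's feed taught it.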

Thoughts?

Chris


