[drupal-devel] Dealing with spam (was rel=nofollow)

Fri Jan 21 10:41:31 UTC 2005

> >>What I /do/ believe will be a great improvement is p2p sharing
> >>of the filtered tokens, regexps et al.
.....
> So, you use the MT public master, and maintain this with cron'd daily
> fetches on drupal instances. You also have a private personal blacklist,
> when items are added to this, they are web-of-trust written DIRECTLY to
> other drupal instances via XML-RPC. (this solves Bèr's issue)
> 
> You also publish your merged list, and just your personal lists so other
> instances NOT in your web of trust can anon pull from you. You also
> periodically submit your new items to the MT master to help maintain
> everyone in the world.

All that is good, but it introduces another problem:
If a spammer knows a filter or a subset of it, it is trivial to write
a generator for spam based on the filter.
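
To make the point concrete, here is a minimal sketch in Python. The
rules in LEAKED_RULES and the helper names are made up for illustration;
they are not taken from any real blacklist. It only shows that once the
rules are known, checking candidate text against them is a few lines:

```python
import re

# Hypothetical leaked filter rules, for illustration only.
LEAKED_RULES = [
    re.compile(r"cheap\s+pills", re.I),
    re.compile(r"casino", re.I),
]

def is_blocked(text, rules=LEAKED_RULES):
    """True if any known rule matches the text."""
    return any(rule.search(text) for rule in rules)

def first_passing_variant(variants, rules=LEAKED_RULES):
    """Return the first variant the known rules do not match, or None."""
    for variant in variants:
        if not is_blocked(variant, rules):
            return variant
    return None

variants = ["cheap pills", "CHEAP  pills", "ch3ap p1lls"]
passing = first_passing_variant(variants)
```

Anyone holding the same rules can run the same check before posting,
which is why a widely shared, identical rule set is a weak defence.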

This means that there should be a web of trust, and the various spam
filters out there should be different enough that it is very hard for
a spammer to create such a generator.

Sending, pushing, pulling, whatever, regexes and other rules securely so
that they are not eavesdropped on is difficult enough, and there will
always be leaks.

What I suggest is sending not rules but actual spam content.
Having each site apply its own learning approach to that content
should produce very different spam filters. A spammer will be able to
build a filter of their own, but will have no certainty that their new
content will not be caught immediately. The spam content can be
distributed freely; it just has to be kept away from the search
engines, etc...
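
A rough sketch of the idea in Python, assuming a toy token-scoring
learner (add-one smoothed log-likelihood ratios). The class, the two
tokenizers and the sample texts are all invented for illustration. Two
sites train on the same shared spam but tokenize differently, so they
end up with different models:

```python
import math
import re
from collections import Counter

def words(text):
    """Tokenize into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def trigrams(text):
    """Tokenize into overlapping 3-character sequences."""
    s = re.sub(r"\s+", " ", text.lower())
    return [s[i:i + 3] for i in range(len(s) - 2)]

class TokenLearner:
    """Toy learner; each site may plug in its own tokenizer."""

    def __init__(self, tokenize):
        self.tokenize = tokenize
        self.spam = Counter()
        self.ham = Counter()

    def train(self, text, is_spam):
        (self.spam if is_spam else self.ham).update(self.tokenize(text))

    def score(self, text):
        """Sum of smoothed log-likelihood ratios; above 0 leans spam."""
        n_spam = sum(self.spam.values())
        n_ham = sum(self.ham.values())
        vocab = len(set(self.spam) | set(self.ham)) or 1
        total = 0.0
        for tok in self.tokenize(text):
            p_spam = (self.spam[tok] + 1) / (n_spam + vocab)
            p_ham = (self.ham[tok] + 1) / (n_ham + vocab)
            total += math.log(p_spam / p_ham)
        return total

# Shared spam content (exchanged between servers) plus local ham.
shared_spam = ["buy cheap pills online", "cheap casino bonus now"]
local_ham = ["patch for the node module", "review of the filter patch"]

word_learner = TokenLearner(words)
trigram_learner = TokenLearner(trigrams)
for learner in (word_learner, trigram_learner):
    for text in shared_spam:
        learner.train(text, is_spam=True)
    for text in local_ham:
        learner.train(text, is_spam=False)
```

Both learners score "cheap pills" as spam-leaning, but their internal
token tables differ, so knowing one site's model says little about
what the other will catch.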

This way you will have enough data to start your new learner, and
you can have immediate updates via a p2p-like exchange between
servers. You only need a minimum level of trust: 'identify yourself'.
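
One possible reading of 'identify yourself', sketched in Python: each
peer shares a secret with the receiver and signs the spam sample it
sends with an HMAC. This is an assumption on my part, not a worked-out
protocol; KNOWN_PEERS, the peer name and both function names are
hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical registry of peers we accept samples from: id -> shared secret.
KNOWN_PEERS = {"peer-a.example": b"secret-a"}

def sign_sample(peer_id, secret, spam_text):
    """Wrap a spam sample with the sender's id and an HMAC-SHA256 tag."""
    payload = json.dumps({"peer": peer_id, "spam": spam_text}, sort_keys=True)
    mac = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "mac": mac}

def accept_sample(message, known_peers=KNOWN_PEERS):
    """Return the spam text if the sender is known and the tag verifies."""
    try:
        payload = message["payload"]
        mac = str(message["mac"])
        data = json.loads(payload)
        secret = known_peers[data["peer"]]
        spam_text = data["spam"]
    except (KeyError, ValueError, TypeError):
        return None
    expected = hmac.new(secret, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("utf-8"), mac.encode("utf-8")):
        return None
    return spam_text

msg = sign_sample("peer-a.example", b"secret-a", "buy cheap pills")
received = accept_sample(msg)
```

The sample itself is not secret here; the tag only tells the receiver
which known peer sent it, which is all the minimum trust level asks.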

I thought I'd add my 2p.
Cheers,
Vlado

-- 
Vladimir Zlatanov <vlado at dikini.net>