[drupal-devel] Dealing with spam (was rel=nofollow)

mike at fuckingbrit.com mike at fuckingbrit.com
Thu Jan 20 14:53:47 UTC 2005


>Jeremy pointed out earlier that black and whitelisting made no sense.
>Because of the huge amount of (open proxies) domains spammers use. 

Black and white listing of IPs/hosts isn't sensible, no, but the MT
(Movable Type) blacklist is actually a content list. It contains lots of
regular expressions, primarily for target URLs.

>The best method still is to filter on content. 

Which is exactly what these lists do: if a post contains a link to a known
spam site, drop it. Essentially, it's the same thing being discussed, a
list of places to block. The only difference is that the MT approach is to
share the list publicly rather than within small webs of trust.
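
A minimal sketch of that kind of content filter, in Python for brevity
rather than as Drupal code. The two patterns are made-up sample entries,
and compile_blacklist() and is_spam() are hypothetical names, not part of
MT or Drupal:

```python
import re

# Hypothetical sample entries; the real master list is much longer.
BLACKLIST_PATTERNS = [
    r"freeviagra\.(com|org)",
    r"cheap-pills\.example",
]

# Rough URL matcher, good enough for a sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def compile_blacklist(patterns):
    """Compile each entry as a case-insensitive regular expression."""
    return [re.compile(p, re.IGNORECASE) for p in patterns]


def is_spam(text, compiled):
    """Return True if any URL in the text matches any blacklist entry."""
    urls = URL_RE.findall(text)
    return any(rx.search(url) for url in urls for rx in compiled)


compiled = compile_blacklist(BLACKLIST_PATTERNS)
```

Only the link targets are checked here, so a post that merely mentions a
blocked name without linking to it is not dropped.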

>>What I /do/ believe will be a great improvement is p2p sharing
>>of the filtered tokens, regexps et al.

As I said, MT already does this. The master list is here:
http://www.jayallen.org/comment_spam/blacklist.txt

Latest Changes:
http://www.jayallen.org/comment_spam/blacklist_changes.txt
Also as RSS 1.0:
http://www.jayallen.org/comment_spam/feeds/blacklist-changes.rdf
And RSS 2.0:
http://www.jayallen.org/comment_spam/feeds/blacklist-changes.xml
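
Pulling the master list could look roughly like the sketch below. The
file format is an assumption on my part (one regular expression per line,
with '#' starting a comment), and fetch_blacklist() and parse_blacklist()
are hypothetical helper names:

```python
import re
import urllib.request

MASTER_URL = "http://www.jayallen.org/comment_spam/blacklist.txt"


def fetch_blacklist(url=MASTER_URL, timeout=30):
    """Download the list and return it as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def parse_blacklist(text):
    """Return the patterns in a blacklist file, one per line.

    Assumed format: blank lines are skipped, and a '#' at the start of
    a line or after whitespace begins a comment.
    """
    patterns = []
    for line in text.splitlines():
        entry = re.split(r"\s+#", line, maxsplit=1)[0].strip()
        if entry and not entry.startswith("#"):
            patterns.append(entry)
    return patterns
```

A cron job would call parse_blacklist(fetch_blacklist()) and store the
result wherever the comment filter reads its patterns from.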


I, and apparently MT and GL, think it's better to have this information
public.

Yes, if the spammer is blocked everywhere they may move their
freeviagra.com to freeviagra.org. But if, each time they do, /all/ sites
block the new URL as soon as /one/ site blocks it, then the technique is
going to be much more efficient than having to add each newly advertised
site to the lists on hundreds of small webs of trust.

(OK, with the manual submission to the MT master list there is a small
lag, but I think GL implements list sharing via XML amongst GL sites, or
at least has the first stages of this, and with Drupal, cron and what you
are suggesting...)

Yeah, the spammers will know they are blocked and can buy a new URL for
their viagra and animal porn emporium, but they will be blocked faster and
more efficiently. A better trade-off, I think.

Now, what you are suggesting /also/ has its place in this. If you set up
trust between sites, then those sites could automatically inject their
blacklist entries into each other with no intervention.

So, you use the MT public master list and keep it current with cron'd
daily fetches on Drupal instances. You also have a private personal
blacklist; when items are added to it, they are written DIRECTLY to the
other Drupal instances in your web of trust via XML-RPC. (This solves
Bèr's issue.)
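
As a rough sketch of the merge and the web-of-trust push (Python rather
than Drupal code, and untested against any real site): 'blacklist.add' is
a made-up XML-RPC method name, not an existing Drupal API, and the
function names are hypothetical too.

```python
import xmlrpc.client


def merge_lists(master, personal):
    """Combine the public master list with the private list, without duplicates."""
    seen = set()
    merged = []
    for pattern in list(master) + list(personal):
        if pattern not in seen:
            seen.add(pattern)
            merged.append(pattern)
    return merged


def xmlrpc_send(peer_url, pattern):
    """Write one pattern to a trusted peer.

    'blacklist.add' is a hypothetical method name used for illustration.
    """
    proxy = xmlrpc.client.ServerProxy(peer_url)
    return proxy.blacklist.add(pattern)


def push_to_trusted(pattern, peers, send=xmlrpc_send):
    """Push a newly added personal entry to every trusted peer.

    Returns the peers that could not be reached so they can be retried.
    """
    failed = []
    for peer in peers:
        try:
            send(peer, pattern)
        except (OSError, xmlrpc.client.Error):
            failed.append(peer)
    return failed
```

Passing the sender in as a parameter keeps the push logic testable without
a network, and one unreachable peer does not stop the others from being
updated.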

You also publish your merged list, and your personal list on its own, so
that other instances NOT in your web of trust can pull from you
anonymously. You also periodically submit your new items to the MT master
list to help keep it current for everyone in the world.
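
The publishing side could be as small as the sketch below. The plain-text
layout (a '#' title line, then one pattern per line) is my assumption,
and pending_submissions() and render_list() are hypothetical names:

```python
def pending_submissions(personal, master):
    """Personal entries the master list does not have yet, in original order."""
    known = set(master)
    return [p for p in personal if p not in known]


def render_list(patterns, title):
    """Serialise a list as plain text, one pattern per line, for anonymous pulls."""
    lines = ["# " + title] + sorted(set(patterns))
    return "\n".join(lines) + "\n"
```

Whatever pending_submissions() returns is what would go to the MT master
list on the next periodic submission.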

But





