[development] a bikeshed color problem
Bèr Kessels
ber at webschuur.com
Sat Sep 23 21:11:57 UTC 2006
Hello
On Friday 22 September 2006 05:11, Augustin (Beginner) wrote:
> > If all Drupal web sites were collaborating on gathering useful data, and
> > passing on this data to relevant organizations, we might collectively
> > achieve something.
> > One spam report against one IP might achieve nothing, but a concerted
> > effort to systematically denounce bad IPs might force people to take
> > positive actions.
> >
> > I really don't know how such a thing could be organized. One has to study
> > first how organizations fighting spam and organizations setting up
> > blacklists operate.
I used to publish my/our spam export lists from our host. What I did was
simple: pipe all the IP addresses and all the blocked domains from the MySQL
db into a text file and put that file online. It was downloaded exactly 124
times: about 100 times by bots and at least 4 times by me (tests), which
leaves about 20 people interested in this data.
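
For illustration, the export step could look roughly like the sketch below.
It is not the script I actually ran: the function name and the file layout
(one entry per line, with two comment headers) are assumptions, and reading
the rows out of the database is left out.

```python
def format_blocklist(ips, domains):
    """Return the blocklist as plain text, one entry per line.

    `ips` and `domains` are plain strings, e.g. rows already read
    from the database. The layout is a hypothetical one.
    """
    lines = ["# blocked IP addresses"]
    lines += sorted(set(ips))  # drop duplicates, keep a stable order
    lines.append("# blocked domains")
    lines += sorted(set(d.lower() for d in domains))  # domains are case-insensitive
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file under the web root is then all it
takes to have the list online.
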
Spam.module has an import/export function which I have used several times,
and I must say that it works. People will argue that it won't work, but I can
assure you: if you have a starter set of Bayesian tokens, you are at least
five weeks of training (on an average blog) ahead of someone who starts
without one. Spam.module comes (or used to, I haven't checked in a while)
with SQL dumps to fill your filters.
Right. How about a distribution system for this? Let's say I ping a flock of
five or six sites over XML-RPC and ask whether they have new tokens, IPs,
etc. If they do, I update my database with what (some of) these sites have
learned. "Together we learn a lot more". Each of these five or six sites does
the same, so the number of sites whose data reaches you grows exponentially
with every hop, and each ping can bring in huge amounts of spammer data.
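
A minimal sketch of one such polling round, to make the idea concrete. Every
name in it is an assumption: the remote method spam.exportTokens does not
exist in Spam.module as far as I know, and the token table is simplified to
a mapping of token -> [spam count, ham count].

```python
import xmlrpc.client


def fetch_over_xmlrpc(url, since):
    """Ask one peer for tokens learned after `since` (a Unix timestamp).

    The remote method name is hypothetical.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.spam.exportTokens(since)


def merge_tokens(local, remote):
    """Add a peer's per-token [spam, ham] counts into the local table."""
    for token, (spam, ham) in remote.items():
        counts = local.setdefault(token, [0, 0])
        counts[0] += spam
        counts[1] += ham
    return local


def sync(local, peers, since, fetch=fetch_over_xmlrpc):
    """Poll every peer once and merge whatever each one returns."""
    for url in peers:
        try:
            remote = fetch(url, since)
        except (OSError, xmlrpc.client.Error):
            continue  # an unreachable or broken peer must not stop the round
        merge_tokens(local, remote)
    return local
```

Passing the fetch function in as a parameter keeps the merge logic testable
without a network. A real exchange would also have to avoid counting the
same observation twice when it arrives via two different peers.
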
Now, consider me being a smart spammer (I still need to update my CV one day)
who actually knows of this P2P system for updating each other's tokens. In
fact, I am so smart (I really need to write that CV) that I know how to
reverse engineer those tokens. I learn, for example, that "bikeshed" is
bounced as a word. I then use this data mine to improve my spamming
techniques, and write mails that no longer contain the word bikeshed or any
color known in the rainbow.
Basically I, as a smart spammer, can use that 'data mine' just as well as
anyone else.
So before we can use such a ring/flock/group/P2P update system, we need to
find a way to sort out trust. Options I see right now are GPG/PGP keyrings,
eBay-like trust ratings, or the ability to define the people who can access
your data mine.
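
To make the last option concrete, here is a rough sketch of an access list.
Note that it swaps in a simpler technique than a GPG/PGP keyring: each
trusted peer shares a secret with me (agreed out of band) and signs its
requests with HMAC-SHA256. The peer name, the secret, and the request format
are all made up for the example.

```python
import hashlib
import hmac

# Hypothetical allowlist: peer name -> shared secret agreed out of band.
TRUSTED_PEERS = {"webschuur.example": b"change-me"}


def sign(secret, message):
    """Return the hex HMAC-SHA256 signature of `message` (bytes)."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def is_trusted(peer, message, signature, peers=TRUSTED_PEERS):
    """Accept a request only from a listed peer with a valid signature."""
    secret = peers.get(peer)
    if secret is None:
        return False  # unknown peers never get at the data mine
    expected = sign(secret, message)
    # Constant-time comparison, so timing does not leak the signature.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

This only answers "who may download my data"; it does not help with a
trusted peer who turns out to be a spammer, which is where ratings or a web
of trust would come in.
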
Bèr