[drupal-devel] [bug] comment preview "Required" is easily bypassed

18 Sep 2005

Issue status update for http://drupal.org/node/28420
Post a follow up: http://drupal.org/project/comments/add/28420

 Project:      Drupal
 Version:      cvs
 Component:    comment.module
 Category:     bug reports
 Priority:     normal
 Assigned to:  Jeremy@kerneltrap.org
 Reported by:  Jeremy@kerneltrap.org
 Updated by:   Jeremy@kerneltrap.org
 Status:       patch (code needs review)

I have written a function called hash_is_duplicate() that uses a
serialized array stored in the variable table to validate that the
token is not duplicated for the n last tokens.  (n is a constant,
defined as 256 in my patch).  It is relatively simple, and avoids the
need to introduce a new database table.  What remains is to teach it
the difference between a preview and a submit (i.e., multiple previews
are okay, but only one submit is okay).  I'll try to post a finished
patch soon.
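
As a rough illustration of the duplicate check described above (this is
not the code from the patch): the sketch below is written in Python
rather than Drupal's PHP and keeps the last n tokens in memory.  The
patch stores a serialized array in the variable table instead; the
`recent` argument and the in-memory deque are assumptions made only for
this example.

```python
from collections import deque

MAX_TOKENS = 256  # n, the history size mentioned in the patch description


def hash_is_duplicate(token, recent):
    """Return True if token is among the remembered tokens; otherwise record it."""
    if token in recent:
        return True
    # A deque created with maxlen drops its oldest entry once it is full.
    recent.append(token)
    return False


recent_tokens = deque(maxlen=MAX_TOKENS)
first_use = hash_is_duplicate("abc123", recent_tokens)
second_use = hash_is_duplicate("abc123", recent_tokens)
```

Teaching this the difference between preview and submit would mean
recording the token only on submit, so repeated previews never count as
duplicates.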

An alternative idea is to simply remember the last token used (rather
than remembering n tokens).  At this time, spammers simply submit the
same thing over and over as fast as they can.  But they're not stupid,
and they'd quickly learn to switch back and forth between two different
spams.
...
For the contact module, maybe it is better to use the recipient's e-mail
address to calculate the token. Like that, each personal contact form
would have a unique token, making it slightly more difficult to reuse
tokens.
That is what it currently does, on one of the forms.  On the other, the
information is dynamic so not a good token.  The only way we can use
dynamic data as tokens is to first force previews.
...
Another solution might be IP- or session-based tokens. Do spammers post
from a single IP, or do they come from different IPs (e.g. using hacked
computers). If you use the poster's IP to calculate the token, tokens
become more dynamic.
Very good idea, and a simple change.  From what I've observed, spammers
use a wide range of IPs.
...
For sake of simplicity, I didn't commit the cron-based private key
regenerator.
Please reconsider.  There are dozens of md5 brute force crackers freely
available.  They would have to be customized to easily crack our tokens,
but as all data but the private key is freely available the
customization is simple.  Thus, a smart spammer could brute force crack
a private key with relative ease and a little patience.  An alternative
approach would be to double the length of the private key, which would
at least make this harder.  Thoughts?
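
To make the key-length argument concrete, here is a toy sketch in
Python (not Drupal code).  It assumes the token is simply md5(private
key + public data) and uses a deliberately tiny three-letter key; the
function names and the "node/28420" public data are invented for the
example.

```python
import hashlib
import itertools
import string


def make_token(private_key, public_data):
    """Toy token: md5 over the private key followed by publicly known data."""
    return hashlib.md5((private_key + public_data).encode()).hexdigest()


def brute_force_key(observed_token, public_data, length):
    """Try every lowercase key of the given length until the token matches."""
    for letters in itertools.product(string.ascii_lowercase, repeat=length):
        candidate = "".join(letters)
        if make_token(candidate, public_data) == observed_token:
            return candidate
    return None


observed = make_token("dog", "node/28420")
recovered = brute_force_key(observed, "node/28420", 3)
```

A three-letter lowercase key has 26**3 = 17,576 candidates.  Doubling
the length to six letters squares that to 26**6, roughly 309 million,
which is why a longer key raises the attacker's cost so quickly.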

Jeremy@kerneltrap.org

Previous comments:
------------------------------------------------------------------------

Mon, 08 Aug 2005 01:55:34 +0000 : Jeremy@kerneltrap.org

Setting "Preview comment" to "Required" does not strictly require that
the comment be previewed first.  This is being abused by spammers to
quickly and efficiently post spam comments.

I discovered this after I added a new feature to my new spam module [1]
to auto-blacklist spammer IP addresses, allowing me to block comment
spammers when they preview a comment and thus preventing them from ever
inserting their spam into my database.  I configured my comment module
to "require" comment previews, and yet found that the comments were
slipping past my filter.  I finally realized what the spammer is doing
is setting $_POST['op'] to 'Post comment', effectively bypassing the
preview phase.

I'm currently looking for a clean solution to this.  At the moment the
only idea I have is to generate a token at the preview phase, and
validate the token at the post phase.  Unfortunately the token would
have to be stored in the database between the preview and the post,
which adds overhead.

Alternatively, I've considered using a time-based hash which would
constantly update depending on the time of day.  This could easily be
validated without storing anything in the database.  If too long has
gone between the preview and the post, an additional preview step would
be required...  The down side here is that the time-based hash would be
publicly available, and thus the spammer could easily duplicate it in
their script.  A private key could solve for that, but increases the
complexity as it adds a configuration step.
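
A minimal sketch of the time-based hash with a private key, in Python
rather than Drupal's PHP.  It uses HMAC-SHA256 as a stand-in for
whatever hash a real patch would choose; the one-hour window, the
function names, and the choice to also accept the previous time slot
are all assumptions made for illustration.

```python
import hashlib
import hmac

WINDOW_SECONDS = 3600  # hypothetical lifetime of one time slot


def time_token(private_key, nid, now):
    """Token that changes whenever the clock moves into a new time slot."""
    slot = int(now // WINDOW_SECONDS)
    message = f"{nid}:{slot}".encode()
    return hmac.new(private_key, message, hashlib.sha256).hexdigest()


def token_is_valid(token, private_key, nid, now):
    """Accept tokens from the current or the previous slot; nothing is stored."""
    candidates = (
        time_token(private_key, nid, now),
        time_token(private_key, nid, now - WINDOW_SECONDS),
    )
    return any(hmac.compare_digest(token.encode(), c.encode()) for c in candidates)
```

Without the private key, anyone who knows the scheme can compute the
same value, which is the weakness noted above.  With it, an expired
token simply forces another preview.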

I have the feeling I'm missing a simpler, cleaner solution. 
Suggestions?
[1] http://kerneltrap.org/jeremy/drupal/spam/

------------------------------------------------------------------------

Mon, 08 Aug 2005 02:26:21 +0000 : moshe weitzman

even if you get this fixed, won't these bots just add a preview step?

this 'preview required' feature is designed to maintain high quality
submissions by forcing users to proof read. it isn't designed for
security.

i think you want to hook into comment_validate(). just add a hook here
- there is already a hook_comment() waiting for you to add an
operation.

------------------------------------------------------------------------

Mon, 08 Aug 2005 08:49:30 +0000 : Eaton

I posted a patch a few days ago (http://drupal.org/node/28255) that adds
validation and form construction hooks for comments. It's similar to the
one that the captcha module uses, though it adds comment form_pre and
form_post hooks instead of a single comment form hook.

------------------------------------------------------------------------

Mon, 08 Aug 2005 13:30:34 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/comment.module_11.patch (2.5 KB)
...
even if you get this fixed, won't these bots just add a preview step?
Eventually, yes, but it drastically changes their ability to fling spam
at a site.  As is, they simply have a script that shoots data out at
high speed without having to wait for messages to return from the
server.  It is the server that is doing all the work, thus making it
simple for a spammer to DoS a site.

If "preview required" really meant "preview required", they would be
forced to first automate clicking "preview", and then wait for a
response before clicking "submit".  This requires more resources on
their side, and allows us to add delays after clicking "preview" (if we
detect that they are a spammer) further using their resources.
...
this 'preview required' feature is designed to maintain high quality
submissions by forcing users to proof read. it isn't designed for
security.
Regardless of the intention, I was misled to believe that configuring
my site to require previews would require that all comments were first
previewed.  As a site administrator, I would prefer to know that
"required" really means "required".
...
i think you want to hook into comment_validate(). just add a hook here
- there is already a hook_comment() waiting for you to add an
operation.
Yes and no.  Ultimately yes that will work and will allow my spam
module to prevent the spam from ever being posted.  But it still leaves
the greatest burden on the web server, instead of on the spammer.  The
spammer can still use a very simple script that only pushes data, and
thus can generate spam at an unbelievably rapid rate.

Here is an example patch to enforce "preview required".  It's one idea,
I'm sure there are better ones.

------------------------------------------------------------------------

Mon, 08 Aug 2005 14:01:38 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/comment.module_12.patch (1.4 KB)

Here's a second version of the patch that doesn't require any manual
configuration.

------------------------------------------------------------------------

Mon, 08 Aug 2005 16:27:27 +0000 : Jose A Reyero

I like this idea, and the patch looks good

Still, I think it misses something, like some timestamp related hash,
because once you get the hash code you can post multiple comments with
that.

Another problem I can think of is, what happens when a cron run happens
between the preview and the post?? I'm afraid comments would get lost

For this second problem, I think a key generated only once after module
activation could do. About the first one....mmm... I'll sit down for a
while and think.....

------------------------------------------------------------------------

Tue, 09 Aug 2005 12:27:20 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/comment.module_13.patch (1.33 KB)
...
I think it misses something, like some timestamp related hash, because
once you get the hash code you can post multiple comments with that.
Using a timestamp will mean that the comment form "expires".  That is,
if you wait too long to preview your comment, it will generate an error
when you try to post.

Yes, technically a spammer could post one real comment, then based on
what was in the session from that they could post the same identical
comment over and over, so long as it was attached to the same node. 
But this is not what they do, they try and spread their spam throughout
your webpage.  Furthermore, the spam module is perfectly capable of
detecting and preventing this.
...
Another problem I can think of is, what happens when a cron run happens
between the preview and the post?? I'm afraid comments would get lost
The key is only generated once, that's what the first test is about. 
In any case, in the unlikely event that the key were to change between
preview and post they would simply have to post a second time.

My earlier patch wasn't quite right, I was testing the token in the
wrong place.  This patch fixes that.

BTW:  This is beneficial for maintaining high quality submissions too,
as prior to this change someone could:
  1) enter a comment
  2) press preview
  3) completely change their comment (introducing a mistake)
  4) press submit and the comment (mistake and all) would go into the
database unpreviewed

After this change:
  1) enter a comment
  2) press preview
  3) completely change their comment (introducing a mistake)
  4) press submit and they get an error because they didn't preview
their changes - forcing them to preview once after any change
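
The behaviour in the second list follows from deriving the token from
the comment text.  A minimal sketch in Python (not the patch itself),
assuming the token is a keyed hash over the node id and the previewed
text; `preview_token`, the sample key, and the use of HMAC-SHA256 are
illustrative choices.

```python
import hashlib
import hmac


def preview_token(private_key, nid, comment_text):
    """Keyed hash over the node id and the exact text that was previewed."""
    message = f"{nid}\n{comment_text}".encode()
    return hmac.new(private_key, message, hashlib.sha256).hexdigest()


key = b"site-private-key"  # hypothetical server-side secret
issued = preview_token(key, 28420, "Nice article!")

# Submitting the previewed text unchanged reproduces the token.
unchanged_ok = hmac.compare_digest(issued, preview_token(key, 28420, "Nice article!"))
# Editing the text after the preview yields a different token, so the post fails.
edited_ok = hmac.compare_digest(issued, preview_token(key, 28420, "Nice artcle!"))
```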

------------------------------------------------------------------------

Wed, 10 Aug 2005 03:50:33 +0000 : Jeremy@kerneltrap.org

FWIW:  I've been getting slammed by spam attacks this whole week. 
Installing this patch has made a huge difference.  Well over 100 spam
attempts per minute (sometimes two and three times that) and I hardly
notice the spammer, whereas before it was choking my database. 

(Granted, the spammer has not yet upgraded his script to first preview,
then submit.  But even if he did it wouldn't help him as testing has
verified that the new spam module would prevent the comments from ever
getting to the database.)

Additionally, user and anonymous (nonspam) comments continue to show up
at a normal rate.

------------------------------------------------------------------------

Tue, 16 Aug 2005 14:08:04 +0000 : Jeremy@kerneltrap.org

I would love to see _any_ discussion on this.  Drupal is currently too
easy to spam, with little effort on the spammer's side, and lots of
resources wasted on the Drupal side.  A patch like this will greatly
increase the spammer's burden, and make it possible to effectively
block even the most aggressive spammer attacks.

------------------------------------------------------------------------

Wed, 17 Aug 2005 16:24:04 +0000 : Jose A Reyero

Well, this patch is definitely better than what we have, and would save
some spam for sure.

But maybe keeping track, at the session level, of generated hashes for
a user, and then removing them when the comment is sent, could do the
work.

This way we can forget about previewing comments or not, and also the
"permission" to post the comment would expire when the session expires.
Any randomly generated value could do for this, no need for complex
hashes, but having nid and pid in the hash would add some extra
security.

------------------------------------------------------------------------

Wed, 17 Aug 2005 19:58:02 +0000 : breyten

Jeremy, a big +1 on the idea, but why not generate the private key when
it is actually needed (Ie, when displaying the comment form), instead
of wasting a _cron() hook on it?

------------------------------------------------------------------------

Thu, 18 Aug 2005 03:20:02 +0000 : Jeremy@kerneltrap.org
...
Well, this patch is definitely better than what we have, and would
save some spam for sure.
It is continuing to work very well on my site, which seems to be under
nearly perpetual spam attacks from multiple sources.
...
But maybe keeping track, at the session level, of generated hashes for
a user, and then removing them when the comment is sent, could do the
work.
The catch is:  the key has to be something unique to the server, not
guessable or learnable from the outside.  Simply storing the hash data
in the session alone is not enough, as then the spammer could create
any random data and store it in the session.

That said, the hash could be generated off something other than the
text of the comment as it is now, so that a preview is not required. 
I'll look at doing something like that and submit another patch.
...
This way we can forget about previewing comments or not, and also the
"permission" to post the comment would expire when the session expires.
Any randomly generated value could do for this, no need for complex
hashes, but having nid and pid in the hash would add some extra
security.
nid and pid alone are worthless, as they are easy to learn.  The pid
can always be 0 (spam is rarely attached to a pre-existing comment). 
The nid is obtained in the path of where the spam is being posted.  

The solution is a "private-key", which is what my patch adds.  Then
sure, hash the private key plus the nid and the pid, and you've got
enough protection to prevent most spammers.  To make it even more
secure, automatic re-keying could be easily accomplished.

------------------------------------------------------------------------

Thu, 18 Aug 2005 04:09:10 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/comment.module_15.patch (1.18 KB)

The attached patch:
 1) gets rid of the _cron() hook
 2) no longer requires that comments be previewed

Prior to this patch, comment spammers were able to send data to a
Drupal server acting as though they'd filled out a comment form and
pressed submit.  As they didn't actually use the form, they could
submit spam comments at an obscene rate.  

With this patch, comment spammers will have to actually load the form,
enter text, and press submit.  Yes, that can still be automated, but it
takes much more work and slows them down, as they have to wait for the
entry form to load each time.

Unfortunately a spammer could manually submit one comment, then re-use
that same session info over and over to attach repeated spam comments
to the same node.  Such an attempt would be detected and blocked by the
spam module if enabled, but again such a session re-use attack could be
done without loading the form each time.  Fortunately there is much
less gain for a spammer to submit 100 spam comments on the same page,
versus submitting 100 spam comments each on a different page as they do
now.

Ideas to improve upon this concept include:
 - re-key every day or week, changing the private key regularly to be
sure it couldn't ever be permanently cracked
 - add a key table, and generate a unique key for every comment form. 
essentially, upon comment form creation generate a random key which is
stored both in a database table and in the session.  when a comment is
submitted, look for the key from the session in the database table, if a
match is found delete it from the database table and post the comment. 
this would prevent session re-use, but adds overhead.  i don't know if
it's worth it, perhaps as an external module if the hooks were
available.
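
The key-table idea in the second bullet can be sketched as follows, in
Python rather than Drupal's PHP.  A plain set stands in for the
database table and a dict for the session; both, along with the
function names, are assumptions made for the example.

```python
import secrets

key_table = set()  # stand-in for the proposed database table


def issue_form_key(session):
    """On form creation: store a fresh random key in the table and the session."""
    key = secrets.token_hex(16)
    key_table.add(key)
    session["comment_key"] = key
    return key


def redeem_form_key(session):
    """On submit: accept the session's key once, then delete it from the table."""
    key = session.get("comment_key")
    if key is None or key not in key_table:
        return False
    key_table.remove(key)
    return True


session = {}
issue_form_key(session)
first_submit = redeem_form_key(session)   # key found and consumed
replayed = redeem_form_key(session)       # same session data re-sent
```

Because the key is removed on first use, replaying the same session data
fails, at the cost of one write per form and one lookup per submit.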

------------------------------------------------------------------------

Fri, 19 Aug 2005 18:55:11 +0000 : drumm

<?php
form_set_error('error', t('Validation error, please be sure cookies are enabled on your browser.'));
?>

form_set_error [2]()'s first argument should be the name of a form
field, not 'error.' Using (..., 'error') would be better in this case.

And the actual message needs work. Since this is a hidden field I don't
think it has anything to do with cookies.
[2] http://drupaldocs.org/api/head/function/form_set_error

------------------------------------------------------------------------

Fri, 19 Aug 2005 18:56:41 +0000 : drumm

The unclosed link in my last update was supposed to say
drupal_set_message(..., 'error')

------------------------------------------------------------------------

Sat, 20 Aug 2005 16:00:15 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/comment.module_16.patch (1.11 KB)

drupal_set_message(..., 'error') isn't sufficient to prevent the comment
from being posted.  I have instead updated the patch to set the error on
the hidden 'token' form field.

I have updated the message to read:
"Unable to validate your comment, please try again.  If this error
persists, please contact the site administrator."

If you don't like the error message, better suggestions are welcome.

------------------------------------------------------------------------

Fri, 09 Sep 2005 03:16:06 +0000 : Jeremy@kerneltrap.org

Any feedback on this patch?  I have been running it on my website for a
couple of weeks, and it has completely stopped the most persistent
auto-spam scripts that had been posting poker type comments constantly
to my site.

------------------------------------------------------------------------

Sat, 10 Sep 2005 18:12:15 +0000 : Zed Pobre

This patch is against HEAD?  It doesn't want to apply to my 4.6.3
comment.module.

------------------------------------------------------------------------

Sun, 11 Sep 2005 18:09:08 +0000 : Abalieno

It's for cvs but I'm trying to manually apply it to 4.6.3.

Will comment later to tell how it went.

------------------------------------------------------------------------

Mon, 12 Sep 2005 19:49:17 +0000 : Abalieno

Well, it worked.

No spam at all in more than a day. I don't know if other users are
having problems but this patch broke the tool the spammer was using.

:)

------------------------------------------------------------------------

Wed, 14 Sep 2005 02:47:40 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/form_validation.patch (2.72 KB)

Here's a completely rewritten version of the patch, based on an email
discussion with Dries.  This provides a more generic interface that can
be used to validate other form submissions, not just comments.  The
patch introduces two new functions, form_token() and form_validate(). 
The first function uses a private key and a public key to set a token
in a hidden field.  The second function validates the token.  The patch
also updates the comment module, demonstrating how these new functions
are used.

More information as to how the patch works can be found in the comments
that are within the code.

Based on my own experiences on kerneltrap.org, this patch blocks 99% of
the current comment spammers.

------------------------------------------------------------------------

Wed, 14 Sep 2005 03:16:21 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/form_validate-2.patch (1.1 KB)

This optional patch is intended to be applied after my
form_validation.patch attached to #20 above.  It makes the
drupal_private_key more secure by regenerating it every 24 hours. 
Without this patch, it would be possible for a spammer to use
brute-force to learn a site's private key by reading the code and
observing the token generated for a given form.  With this patch, brute
force is still possible, but it would have to be repeated every 24
hours.

It is possible for a form to be generated prior to a rekey, and then to
be validated after a rekey.  In this event, the key will fail validation
and the user will see a message telling them something like "Validation
error, please try again.  If this error persists, please contact the
site administrator."  When they try pressing "submit" again it will
work fine.
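
The regeneration step can be pictured with this small Python sketch
(not the patch itself).  The `store` dict stands in for persistent site
settings, and the function and field names are invented for the
example.

```python
import secrets

REKEY_INTERVAL = 24 * 60 * 60  # seconds; the 24 hours described above


def current_private_key(store, now):
    """Return the site key, replacing it once it is at least 24 hours old."""
    expired = "key" in store and now - store["key_created"] >= REKEY_INTERVAL
    if "key" not in store or expired:
        store["key"] = secrets.token_hex(32)
        store["key_created"] = now
    return store["key"]


store = {}
key_at_start = current_private_key(store, 0)
key_same_day = current_private_key(store, 1000)
key_next_day = current_private_key(store, REKEY_INTERVAL)
```

A token computed with `key_at_start` no longer validates once
`key_next_day` is in use, which is the one-time failure described in
the paragraph above.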

I kept this patch separate as it adds complexity that may not be
desired in core.  Personally I think it is important and should be
merged along with the first patch.  I added it to the system module as
it seemed the most logical place, and is a "required" module that can
not be disabled.

------------------------------------------------------------------------

Wed, 14 Sep 2005 03:58:35 +0000 : Jeremy@kerneltrap.org

Attachment: http://drupal.org/files/issues/contact.module_8.patch (1.44 KB)

This third patch updates the contact.module to use the new form token
validation functions.

Note that this is not yet a perfect solution.  Someone wanting to spam
the contact form could manually fill it out once to obtain a valid
token.  They could then use that token to spam the contact form
repeatedly, so long as they don't change the fields that were used to
generate the token.  

The solution is to keep a history of recently validated tokens.  Each
time a token is validated, make sure it was not already recently
validated -- if it was, don't allow it a second time.  I would be happy
to provide patches for this if it is deemed necessary, and these first
patches are merged.  It would require a new database table, and a
db_query to validate the token.  Ideas for alternative solutions are
welcome.

------------------------------------------------------------------------

Sun, 18 Sep 2005 11:51:00 +0000 : Dries

For the contact module, maybe it is better to use the recipient's e-mail
address to calculate the token.  Like that, each personal contact form
would have a unique token, making it slightly more difficult to reuse
tokens.  

Another solution might be IP- or session-based tokens.  Do spammers
post from a single IP, or do they come from different IPs (e.g. using
hacked computers).  If you use the poster's IP to calculate the token,
tokens become more dynamic.

------------------------------------------------------------------------

Sun, 18 Sep 2005 11:53:50 +0000 : Dries

I committed the current implementation of this patch, though I'd like us
to think some more about how we can make reusing tokens more difficult
without introducing complex logic.  For sake of simplicity, I didn't
commit the cron-based private key regenerator.