Issue status update for http://drupal.org/node/28420 Post a follow up: http://drupal.org/project/comments/add/28420 Project: Drupal Version: cvs Component: comment.module Category: bug reports Priority: normal Assigned to: Jeremy@kerneltrap.org Reported by: Jeremy@kerneltrap.org Updated by: Jeremy@kerneltrap.org Status: patch (code needs review) I have written a function called hash_is_duplicate() that uses a serialized array stored in the variable table to validate that the token is not duplicated for the n last tokens. (n is a constant, defined as 256 in my patch). It is relatively simple, and avoids the need to introduce a new database table. However, left to do is to teach it the difference between a preview and a submit (ie, multiple previews are okay, but only one submit is okay), I'll try and post a finished patch soon. An alternative idea is to simply remember the last token used (rather than remembering n tokens). At this time, spammers simply submit the same thing over and over as fast as they can. But they're not stupid, and they'd quickly learn to switch back and forth between two different spams.
For the contact module, maybe it is better to use the recipient's e-mail address to calculate the token. Like that, each personal contact form would have a unique token, making it slightly more difficult to reuse tokens.
That is what it currently does, on one of the forms. On the other, the information is dynamic so not a good token. The only way we can use dynamic data as tokens is to first force previews.
Another solution might be IP- or session-based tokens. Do spammers post from a single IP, or do they come from different IPs (e.g. using hacked computers). If you use the poster's IP to calculate the token, tokens become more dynamic.
Very good idea, and a simple change. From what I've observed, spammers use a wide range of IPs.
For sake of simplicity, I didn't commit the cron-based private key regenerator.
Please reconsider. There are dozens of md5 brute force crackers freely available. They would have to be customized to easily crack our tokens, but as all data but the private key is freely available the customization is simple. Thus, a smart spammer could brute force crack a private key with relative ease and a little patience. An alternative approach would be to double the length of the private key, which would at least make this harder. Thoughts? Jeremy@kerneltrap.org Previous comments: ------------------------------------------------------------------------ Mon, 08 Aug 2005 01:55:34 +0000 : Jeremy@kerneltrap.org Setting "Preview comment" to "Required" does not strictly require that the comment be previewed first. This is being abused by spammers to quickly and efficiently post spam comments. I discovered this after I added a new feature to my new spam module [1] to auto-blacklist spammer IP addresses, allowing me to block comment spammers when they preview a comment and thus preventing them from ever inserting their spam into my database. I configured my comment module to "require" comment previews, and yet found that the comments were slipping past my filter. I finally realized what the spammer is doing is setting $_POST['op'] to 'Post comment', effectively bypassing the preview phase. I'm currently looking for a clean solution to this. At the moment the only idea I have is to generate a token at the preview phase, and validate the token at the post phase. Unfortunately the token would have to be stored in the databse between the preview and the post, which adds overhead. Alternatively, I've considered using a time-based hash which would constantly update depending on the time of day. This could easily be validated without storing anything in the database. If too long has gone between the preview and the post, an additional preview step would be required... The down side here is that the time-based hash would be publically available, and thus the spammer could easily duplicate it in their script. A private key could solve for that, but increases the complexity as it adds a configuration step. I have the feeling I'm missing a simpler, cleaner solution. Suggestions? [1] http://kerneltrap.org/jeremy/drupal/spam/ ------------------------------------------------------------------------ Mon, 08 Aug 2005 02:26:21 +0000 : moshe weitzman even if you get this fixed, won't these bots just add a preview step? this 'preview required' feature is designed to maintain high quality submissions by forcing users to proof read. it isn;t designed for security. i think you want to hook into comment_validate(). just add a hook here - there is already a hook_comment() waiting for you to add an operation. ------------------------------------------------------------------------ Mon, 08 Aug 2005 08:49:30 +0000 : Eaton I posted a patch a few days ago (http://drupal.org/node/28255) that adds validation and form construction hooks for comments. It's similar to the one that the captcha module uses, though it adds comment form_pre and form_post hooks instead of a single comment form hook. ------------------------------------------------------------------------ Mon, 08 Aug 2005 13:30:34 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/comment.module_11.patch (2.5 KB)
even if you get this fixed, won't these bots just add a preview step?
Eventually, yes, but it drastically changes their ability to fling spam at a site. As is, they simply have a script that shoots data out at high speed without having to wait for messages to return from the server. It is the server that is doing all the work, thus making it simple for a spammer to DoS a site. If "preview required" really meant "preview required", they would be forced to first automate clicking "preview", and then wait for a response before clicking "submit". This requires more resources on their side, and allows us to add delays after clicking "preview" (if we detect that they are a spammer) further using their resources.
this 'preview required' feature is designed to maintain high quality submissions by forcing users to proof read. it isn;t designed for security.
Regardless of the intention, I was misled to believe that configuring my site to require previews would require that all comments were first previewed. As a site administrator, I would prefer to know that "required" really means "required".
i think you want to hook into comment_validate(). just add a hook here - there is already a hook_comment() waiting for you to add an operation.
Yes and no. Ultimately yes that will work and will allow my spam module to prevent the spam from ever being posted. But it still leaves the greatest burden on the web server, instead of on the spammer. The spammer can still use a very simple script that only pushes data, and thus can generate spam at an unbelievable rapid rate. Here is an example patch to enforce "preview required". It's one idea, I'm sure there are better ones. ------------------------------------------------------------------------ Mon, 08 Aug 2005 14:01:38 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/comment.module_12.patch (1.4 KB) Here's a second version of the patch that doesn't require any manual configuration. ------------------------------------------------------------------------ Mon, 08 Aug 2005 16:27:27 +0000 : Jose A Reyero I like this idea, and the patch looks good Still, I think it misses something, like some timestamp related hash, because once you get the hash code you can post multiple comments with that. Another problem I can think of is, what happens when a cron run happens between the preview and the post?? I'm afraid comments would get lost For this second problem, I think a key generated only once after module activation could do. About the first one....mmm... I'll sit down for a while and think..... ------------------------------------------------------------------------ Tue, 09 Aug 2005 12:27:20 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/comment.module_13.patch (1.33 KB)
I think it misses something, like some timestamp related hash, because once you get the hash code you can post multiple comments with that.
Using a timestamp will mean that the comment form "expires". That is, if you wait too long to preview your comment, it will generate an error when you try to post. Yes, technically a spammer could post one real comment, then based on what was in the session from that they could post the same identical comment over and over, so long as it was attached to the same node. But this is not what they do, they try and spread their spam throughout your webpage. Furthermore, the spam module is perfectly capable of detecting and preventing this.
Another problem I can think of is, what happens when a cron run happens between the preview and the post?? I'm afraid comments would get lost
The key is only generated once, that's what the first test is about. In any case, in the unlikely event that the key were to change between preview and post they would simply have to post a second time. My earlier patch wasn't quite right, I was testing the token in the wrong place. This patch fixes that. BTW: This is beneficial for maintaining high quality submissions too, as prior to this change someone could: 1) enter a comment 2) press preview 3) completely change their comment (introducing a mistake) 4) press submit and the comment (mistake and all) would go into the database unpreviewed After this change: 1) enter a comment 2) press preview 3) completely change their comment (introducing a mistake) 4) press submit and they get an error because they didn't preview their changes - forcing them to preview once after any change ------------------------------------------------------------------------ Wed, 10 Aug 2005 03:50:33 +0000 : Jeremy@kerneltrap.org FWIW: I've been getting slammed by spam attacks this whole week. Installing this patch has made a huge difference. Well over 100 spam attempts per minute (sometimes two and three times that) and I hardly notice the spammer, whereas before it was choking my database. (Granted, the spammer has not yet upgraded his script to first preview, then submit. But even if he did it wouldn't help him as testing has verified that the new spam module would prevent the comments from ever getting to the database.) Additionally, user and anonymous (nonspam) comments continue to show up at a normal rate. ------------------------------------------------------------------------ Tue, 16 Aug 2005 14:08:04 +0000 : Jeremy@kerneltrap.org I would love to see _any_ discussion on this. Drupal is currently too easy to spam, with little effort on the spammer's side, and lots of resources wasted on the Drupal side. A patch like this will greatly increase the spammer's burden, and make it possible to effectively block even the most aggressive spammer attacks. ------------------------------------------------------------------------ Wed, 17 Aug 2005 16:24:04 +0000 : Jose A Reyero Well, this patch is definitely better than what we have, and would save some spam for sure. But maybe keeping track, at the session level, of generated hashes for a user, and then removing them when the comment is sent, could do the work. This way we can forget about previewing comments or not, and also the "permission" to post the comment would expire when the session expires. Any randomly generated value could do for this, no need for complex hashes, but having nid and pid in the hash would add some extra security. ------------------------------------------------------------------------ Wed, 17 Aug 2005 19:58:02 +0000 : breyten Jeremy, a big +1 on the idea, but why not generate the private key when it is actually needed (Ie, when displaying the comment form), instead of wasting a _cron() hook on it? ------------------------------------------------------------------------ Thu, 18 Aug 2005 03:20:02 +0000 : Jeremy@kerneltrap.org
Well, this patch is definitely better than what we have, and would save some spam for sure.
It is continuing to work very well on my site, which seems to be under nearly perpetual spam attacks from multiple sources.
But maybe keeping track, at the session level, of generated hashes for a user, and then removing them when the comment is sent, could do the work.
The catch is: the key has to be something unique to the server, not guessable or learnable from the outside Simply storing the hash data in the session alone is not enough, as then the spammer could create any random data and store it in the session. That said, the hash could be generated off something other than the text of the comment as it is now, so that a preview is not required. I'll look at doing something like that and submit another patch.
This way we can forget about previewing comments or not, and also the "permission" to post the comment would expire when the session expires. Any randomly generated value could do for this, no need for complex hashes, but having nid and pid in the hash would add some extra security.
nid and pid alone are worthless, as they are easy to learn. The pid can always be 0 (spam is rarely attached to a pre-existing comment). The nid is obtained in the path of where the spam is being posted. The solution is a "private-key", which is what my patch adds. Then sure, hash the private key plus the nid and the pid, and you've got enough protection to prevent most spammers. To make it even more secure, automatic rey-keying could be easily accomplished. ------------------------------------------------------------------------ Thu, 18 Aug 2005 04:09:10 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/comment.module_15.patch (1.18 KB) The attached patch: 1) gets rid of the _cron() hook 2) no longer requires that comments be previewed Prior to this patch, comment spammers were able to send data to a Drupal server acting as though they'd filled out a comment form and pressed submit. As they didn't actually use the form, they could submit spam comments at an obscene rate. With this patch, comment spammers will have to actually load the form, enter text, and press submit. Yes, that can still be automated, but it takes much more work and slows them down, as they have to wait for the entry form to load each time. Unfortunately a spammer could manually submit one comment, then re-use that same session info over and over to attach repeated spam comments to the same node. Such an attempt would be detected and blocked by the spam module if enabled, but again such a session re-use attack could be done without loading the form each time. Fortunately there is much less gain for a spammer to submit 100 spam comments on the same page, versus submitting 100 spam comments each on a different page as they do now. Ideas to improve upon this concept include: - re-key every day or week, changing the private key regularly to be sure it couldn't ever be permanently cracked - add a key table, and generate a unique key for every comment form. essentially, upon comment form creation generate a random key which is stored both in a database table and in the session. when a comment is submited, look for the key from the session in the database table, if a match is found delete it from the database table and post the comment. this would prevent session re-use, but adds overhead. i don't know if it's worth it, perhaps as an external module if the hooks were available. ------------------------------------------------------------------------ Fri, 19 Aug 2005 18:55:11 +0000 : drumm <?php form_set_error('error', t('Validation error, please be sure cookies are enabled on your browser.')); ?> form_set_error [2]()'s fist argument should be the name of a form field, not 'error.' Using (..., 'error') would be better in this case. And the actual message needs work. Since this is a hidden field I don't think it has anything to do with cookies. [2] http://drupaldocs.org/api/head/function/form_set_error ------------------------------------------------------------------------ Fri, 19 Aug 2005 18:56:41 +0000 : drumm The unclosed link in my last update was supposed to say drupal_set_message(..., 'error') ------------------------------------------------------------------------ Sat, 20 Aug 2005 16:00:15 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/comment.module_16.patch (1.11 KB) drupal_set_message(..., 'error') isn't sufficient to prevent the comment from being posted. I have instead updated the patch to set the error on the hidden 'token' form field. I have updated the message to read: "Unable to validate your comment, please try again. If this error persists, please contact the site administrator." If you don't like the error message, better suggestions are welcome. ------------------------------------------------------------------------ Fri, 09 Sep 2005 03:16:06 +0000 : Jeremy@kerneltrap.org Any feedback on this patch? I have been running it on my website for a couple of weeks, and it has completely stopped the most persistent auto-spam scripts that had been posting poker type comments constantly to my site. ------------------------------------------------------------------------ Sat, 10 Sep 2005 18:12:15 +0000 : Zed Pobre This patch is against HEAD? It doesn't want to apply to my 2.6.3 comment.module. ------------------------------------------------------------------------ Sun, 11 Sep 2005 18:09:08 +0000 : Abalieno It's is for cvs but I'm trying to manually apply it to 4.6.3. Will comment later to tell how it went. ------------------------------------------------------------------------ Mon, 12 Sep 2005 19:49:17 +0000 : Abalieno Well, it worked. No spam at all in more than a day. I don't know if other users are having problem but this patch broke the tool the spammer was using. :) ------------------------------------------------------------------------ Wed, 14 Sep 2005 02:47:40 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/form_validation.patch (2.72 KB) Here's a completely rewritten version of the patch, based on an email discussion with Dries. This provides a more generic interface that can be used to validate other form submissions, not just comments. The patch introduces two new functions, form_token() and form_validate(). The first function uses a private key and a public key to set a token in a hidden field. The second function validates the token. The patch also updates the comment module, demonstrating how these new functions are used. More information as to how the patch works can be found in the comments that are within the code. Based on my own experiences on kerneltrap.org, this patch blocks 99% of the current comment spammers. ------------------------------------------------------------------------ Wed, 14 Sep 2005 03:16:21 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/form_validate-2.patch (1.1 KB) This optional patch is intended to be applied after my form_validation.patch attached to #20 above. It makes the drupal_private_key more secure by regenerating it every 24 hours. Without this patch, it would be possible for a spammer to use brute-force to learn a site's private key by reading the code and observing the token generated for a given form. With this patch, brute force is still possible, but with this patch it would have to be done every 24 hours. It is possible for a form to be generated prior to a rekey, and then to be validated after a rekey. In this event, the key will fail validation and the user will see a message telling them something like "Validation error, please try again. If this error persists, please contact the site administrator." When they try pressing "submit" again it will work fine. I kept this patch separate as it adds complexity that may not be desired in core. Personally I think it is important and should be merged along with the first patch. I added it to the system module as it seemed the most logical place, and is a "required" module that can not be disabled. ------------------------------------------------------------------------ Wed, 14 Sep 2005 03:58:35 +0000 : Jeremy@kerneltrap.org Attachment: http://drupal.org/files/issues/contact.module_8.patch (1.44 KB) This third patch updates the contact.module to use the new form token validation functions. Note that this is not yet a perfect solution. Someone wanting to spam the contact form could manually fill it out once to obtain a valid token. They could then use that token to spam the contact form repeatedly, so long as they don't change the fields that were used to generate the token. The solution is to keep a history of recently validated tokens. Each time a token is validated, make sure it was not already recently validated -- if it was, don't allow it a second time. I would be happy to provide patches for this if it is deemed necessary, and these first patches are merged. It would require a new database table, and a db_query to validate the token. Ideas for alternative solutions are welcome. ------------------------------------------------------------------------ Sun, 18 Sep 2005 11:51:00 +0000 : Dries For the contact module, maybe it is better to use the recipient's e-mail address to calculate the token. Like that, each personal contact form would have a unique token, making it slightly more difficult to reuse tokens. Another solution might be IP- or session-based tokens. Do spammers post from a single IP, or do they come from different IPs (e.g. using hacked computers). If you use the poster's IP to calculate the token, tokens become more dynamic. ------------------------------------------------------------------------ Sun, 18 Sep 2005 11:53:50 +0000 : Dries I committed the current implementation of this patch, though I'd like us to think some more about how we can make reusing tokens more difficult without introducing complex logic. For sake of simplicity, I didn't commit the cron-based private key regenerator.