On Thursday 09 February 2006 13:08, Arnab Nandi wrote:
Hi,
All the 8+ challenges that you mention are things that it's easy to write a script for. Plus, since captcha.module is a publicly available script, script writers can use it to come up with code that solves the challenges. For every set of N problems that you pose to the user, a determined enemy will come up with something that solves the N problems.
All true. I don't think I'm dealing with a "determined" enemy, though. But I'll concede your point that what I proposed is scriptable. [...]
I'm not fond of typing in strange words from images. It's rather demeaning, actually. However, an image captcha is the ONLY challenge mechanism that you CANNOT write a feasible script for. By
Here I disagree. According to my research on the 'net, there are plenty of existing scripts, and...
feasible, I mean something that can execute in limited time and memory for DDoS attacks. (Image recognition requires CPU and memory). Hence, a good challenge test is something that a human requires very little effort to do, and a computer requires a lot of CPU and memory.
...Moore's Law will soon change that, if it hasn't already. The image is small, after all -- not that many pixels.
So how do we get rid of the stupid image checks? I've been looking into trapdoor functions for a while, which are very easy to pose and check, but take time to solve. Factoring a product of two primes is a possible challenge. For example, I take 31 and 13, multiply them, and tell you "403". How long does it take you to divide and tell me the two factors? A while, if I use large prime numbers instead of 13 and 31. But if you do give me the answer, it's easy to check if you're right. This is the basis of much of modern public-key cryptography (RSA, for example), btw.
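To make the asymmetry concrete, here is a minimal Python sketch of the factoring challenge described above. The function names are hypothetical, and trial division merely stands in for whatever solver a client would really use; the only point is that checking an answer costs one multiplication while finding it costs a search.

```python
import math
from typing import Tuple

def factor_semiprime(n: int) -> Tuple[int, int]:
    """Solve the challenge by trial division (slow as n grows)."""
    if n > 2 and n % 2 == 0:
        return 2, n // 2
    # Only odd candidates up to sqrt(n) need to be tried.
    for p in range(3, math.isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factors")

def check_answer(n: int, p: int, q: int) -> bool:
    """Verify a claimed factorization with a single multiplication."""
    return p > 1 and q > 1 and p * q == n

challenge = 13 * 31  # the server publishes only the product, 403
p, q = factor_semiprime(challenge)
print(p, q, check_answer(challenge, p, q))
```

With primes this small the search finishes instantly; the cost only becomes noticeable as the primes get larger.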
I knew that last bit, but hadn't thought of using JavaScript to let the *browser* do the validation. That's a very cool idea!
Hence, if we write some code that poses this challenge on the server side, and asks a small bit of JavaScript to solve it on the client side, any DDoS attacker will give up, because his CPU will die. However, if you're just writing casual comments on a web page, a little CPU spike is ignorable. Your computer solves the problem by the time you finish typing, we validate everything on the server side, and the human is not bothered at all.
Especially since in this situation the C/R mechanism only happens when the human wants to create an account or post something. Most page views are read-only, requiring no activation of the trapdoor.
But there are some problems to this approach, too lengthy to explain here :)
How about this (pseudocode):

    // Make a random 40-character string using A-Z and a-z chars.
    $random_string = make_random_chars(40, 'ABCDE.....XYZabcde.....xyz');
    $n = 4; // Measure of complexity of challenge; admin-adjustable
    $hash = md5($random_string);
    $challenge = substring($random_string, 0, 40-$n) . ":" . $hash . ":" . $n;

Now, the client has something like this:

    "AxEvBw......rUs:51cae0322f00f123.....93c:4"

It knows the MD5 of the full string, and it knows the first 36 characters of that string. Now it needs to guess the rest of the characters by brute force (52**$n possible suffixes), concatenating each candidate with the known 36-character random string until it finds the one whose MD5 matches the MD5 supplied by the server. The correct answer is the last four characters of $random_string.

A lookup table in a bot-client won't help, because the challenge is random each time -- unless the client has enough disk to store 52**40 precomputed MD5 sums.

Since 52**4 is about 7.3 million, the client will have to compute an average of about 3.7 million MD5s for each challenge if $n is 4. If $n is 3, that number drops to an average of about 70 thousand MD5 computes. You can adjust the steepness of the complexity by changing the number of characters in the allowed set (e.g., using only uppercase would cause $n=3 to average about 8,800 computes and $n=4 to average about 228 thousand). This, and the ability to change $n to a higher number, allow the site admin to keep up with computational speed advances over time.

Just sending the MD5 of $n random characters won't help, because it just might be feasible to store 7.3 million pregenerated MD5s in a bot. :-) That's the purpose of the added one-time prefix (in effect a nonce) in the string. Even though we disclose it in the clear, it's virtually unique to this transaction.

The PHP session ID could be used instead to save some compute time on the server generating all those random characters.
You would use the PHPSESSID as the openly-disclosed part of $random_string, and the server would generate only the $n characters to append to PHPSESSID. (Remember that the client can easily obtain PHPSESSID from the GET variable or cookie, so the concealed "answer" part of $random_string can't be from PHPSESSID.)
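As a rough illustration of the scheme above, here is a Python sketch of the server and client halves. The function names are made up for this example, Python stands in for the PHP and JavaScript a real Drupal setup would use, and the server simply remembers the hidden suffix (say, in the session) to check the answer later.

```python
import hashlib
import hmac
import itertools
import secrets
import string
from typing import Tuple

ALPHABET = string.ascii_uppercase + string.ascii_lowercase  # 52 characters

def make_challenge(n: int = 4, length: int = 40) -> Tuple[str, str]:
    """Server side: return (challenge, secret_suffix)."""
    random_string = "".join(secrets.choice(ALPHABET) for _ in range(length))
    digest = hashlib.md5(random_string.encode("ascii")).hexdigest()
    # Disclose everything except the last n characters.
    challenge = f"{random_string[:length - n]}:{digest}:{n}"
    return challenge, random_string[length - n:]

def solve_challenge(challenge: str) -> str:
    """Client side: brute-force the hidden suffix (up to 52**n MD5 computes)."""
    prefix, digest, n_text = challenge.split(":")
    for candidate in itertools.product(ALPHABET, repeat=int(n_text)):
        suffix = "".join(candidate)
        if hashlib.md5((prefix + suffix).encode("ascii")).hexdigest() == digest:
            return suffix
    raise ValueError("no suffix matches the digest")

def check_response(secret_suffix: str, answer: str) -> bool:
    """Server side: one cheap comparison against the remembered suffix."""
    return hmac.compare_digest(secret_suffix.encode(), answer.encode())

# n=2 keeps the demo fast: at most 52**2 = 2704 MD5 computes.
challenge, secret = make_challenge(n=2)
answer = solve_challenge(challenge)
print(challenge, answer, check_response(secret, answer))
```

For the session-ID variant, the disclosed prefix would be the session ID and only the n-character suffix would be generated randomly; the solving loop stays the same.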
Another possible and simple solution is to restrict form submits from a single IP address to, for example, 3 per second. However, this fails if you have a group of people behind a proxy.
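A per-IP limit like this could look roughly as follows in Python. The class name and the in-memory storage are assumptions made for the example; a real site would need shared storage and cleanup of idle addresses.

```python
import time
from collections import defaultdict, deque
from typing import Optional

class SubmitLimiter:
    """Allow at most max_submits form submits per IP in a sliding window."""

    def __init__(self, max_submits: int = 3, window_seconds: float = 1.0):
        self.max_submits = max_submits
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # ip -> timestamps of recent submits

    def allow(self, ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        stamps = self._recent[ip]
        # Forget submits that have aged out of the window.
        while stamps and now - stamps[0] >= self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.max_submits:
            return False
        stamps.append(now)
        return True

# Four different people behind one proxy share an IP: the fourth is refused.
limiter = SubmitLimiter()
print([limiter.allow("192.0.2.10", now=0.0) for _ in range(4)])
```

The last line shows the proxy problem: the limiter cannot tell four legitimate users from one busy bot when they all arrive from the same address.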
Won't help in this case. They are submitting at a relatively slow rate, relying on the nuisance of the extra accounts rather than actually trying to bring down my server. (I suspect they may be probing for Drupal sites that allow authenticated users to post without moderation, so they can post spam. My sites are all fully moderated, so that tactic fails.)
CONCLUSION: If you ARE interested in writing up a function like this, OR one of the functions you had suggested, no problem! Simply pick up the captcha.module in CVS, and start coding! It has an API that allows you to write simple _challenge() and _response parts, and it does the rest. For example, the default challenge in captcha.module CVS is the math problem you had posed (3 times 5).
Been there, done that... My site still runs Drupal 4.4 (and a BETA at that!), and I actually had to back-port captcha to get it working with that old Drupal. I was operating in urgent-mode to get the site at least minimally protected against this 'bot. Captcha as it stands seems to be working in that regard.

I'll look into coding some enhanced C/R capabilities in a newer version. First, though, I need to get this site updated at least to Drupal 4.6. :-)

Thanks for the comments!

Scott

--
-------------------------------------------------------------------------------
Syscrusher (Scott Courtney)          Drupal page: http://drupal.org/user/9184
syscrusher at 4th dot com            Home page:   http://4th.com/