[drupal-support] how does the throttle module work?

Jeremy Andrews lists at kerneltrap.org
Wed Apr 13 01:42:13 UTC 2005


> But it would be possible to just modify the throttle.module to add
> more  dynamic settings to it? My idea isn't to change every
> module, but just to  add functions on top of what Drupal will do
> with the next version.

Go for it.  If you feel you've made useful modifications, you can
submit patches via Drupal's project tracker.

> I rather dislike this continuous loss of features to add
> "simplification".  Designing a good system doesn't mean removing
> its potential but just to  better organize and show it. But this
> is a personal opinion that is not  important.

The system has to be usable.  A complaint about the previous
throttle system was that it was too complicated for people to
understand and use.

> You know what could work wonderfully? To still have each of the
> six levels  (0-5) and then let the user to type directly the
> number of pages for each.  This would allow to use all the five
> levels or just the first and the last.  Depending on what the
> admin chooses.

You are asking for added complexity, which makes it much more
confusing to the end user.  As the author of the throttle module, I
have mixed feelings about this.  My personal preference is for
everything to be configurable.  That is, until I have to support it.
Then simplicity is preferable.

> Anyway, how does the new system work? I tried to follow your links
> but I  didn't grasp on what the new switch is based. The current
> throttle checks  the hits from time to time depending on how it's
> set. The link tells that  the new module (4.6.0) doesn't use
> anymore the access log, so what does it  use?

Take a look at throttle_exit().  It counts the number of anonymous
users and the number of registered users that are currently online,
using the sessions table.  If either number is greater than its
configured maximum, the throttle is enabled.  If both numbers are
below their maximums, it is disabled.  This test is done on every
page load.
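
The check described above can be sketched roughly as follows.  This
is an illustrative Python sketch, not the module's actual PHP code:
the function names, the dict-based session records, the "recently
active" window, the convention that uid 0 marks an anonymous visitor,
and the use of None for "this count is not tested" are all
assumptions made for the example.

```python
def count_online(sessions, now, window_seconds):
    """Count (anonymous, registered) sessions active within the window.

    Each session is a dict with "uid" and "timestamp" keys; uid 0 is
    taken to mean an anonymous visitor (assumption for this sketch).
    """
    recent = [s for s in sessions if now - s["timestamp"] <= window_seconds]
    anonymous = sum(1 for s in recent if s["uid"] == 0)
    registered = len(recent) - anonymous
    return anonymous, registered


def throttle_enabled(anonymous, registered, max_anonymous=None, max_registered=None):
    """Return True if either tested count exceeds its configured maximum.

    A maximum of None means that count is not tested (assumption).
    """
    over_anonymous = max_anonymous is not None and anonymous > max_anonymous
    over_registered = max_registered is not None and registered > max_registered
    return over_anonymous or over_registered
```

Because the result is recomputed from the current counts each time,
the throttle switches off again on its own as soon as both counts
drop back to or below their limits.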

> Side-question: why a spider bot is counted as "many different
> users" and not  just as the same one requesting different pages?
> If I'm an anonymous user  and if I keep refreshing the page I'm
> still counted as "1" in the "Who's  online" block. So why Dries
> wants to base the throttle on the number of  anonymous users and
> NOT on the number of pages served? I don't understand  how it
> could work, that block doesn't seem consistent. If there's just

Spiders generally use a large number of IP addresses, not just one
IP address.  The more aggressive the bot, generally the larger the
number of IP addresses that are being used.

That's not to say a spammer/spider couldn't use just one IP address.

> one  users that is savagely spamming my site the method Dries
> proposes won't  detect that. I also do not understand what he
> writes in that link, it  actually sounds like the OPPOSITE should
> happen. If the throttle checks how  many pages are served in a
> minute it is supposed to easily locate the spam,  while if it is
> based on the "Who's online" block it will simply ignore an 
> emergency.

A normal Drupal user is only aware of the number that is displayed
in the Who's online block.  That's their view of what's happening on
the site.  Whether or not this is sufficient for the throttle remains
to be seen once 4.6 is released.  In particular, Dries found that the
old method was too difficult to configure and in practice didn't
work for drupal.org.  I have not heard any complaints about the new
method, and I suspect the change will be met favorably.

> I've read that message but so how the throttle know when to switch
> back at  level 4? If no more checks are performed why the site
> doesn't stay at level  5 forever once it reaches it?

In pre-4.6 it was a cron event.  In 4.6+, the throttle does one or
two queries per page load regardless of whether the throttle is
enabled or not.  (Whether it is one or two depends on whether the
throttle is testing the number of users, the number of guests, or
both.)

> This is my proposal:
> Let the admin set the number of hits before the throttle switches
> on, then  add a field where the admin can choose for how long the
> emergency module  will remain active. No matter of usage. When
> this timeout is over (for  example after ten minutes) a new check
> is performed to decide if there are  the conditions to go back at
> the "normal" mode.

That's how the cron implementation worked.  Look at throttle_cron()
in the pre-4.6 version of the throttle module.
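
The general idea in the proposal (decide once, hold that decision for
a fixed interval, then re-check) can be sketched like this.  It is a
Python illustration of the idea only, not a port of throttle_cron():
the class name, the hit-count input, and the interval handling are
hypothetical.

```python
class PeriodicThrottle:
    """Re-evaluate the throttle only when the re-check interval has elapsed."""

    def __init__(self, max_hits, recheck_seconds):
        self.max_hits = max_hits
        self.recheck_seconds = recheck_seconds
        self.enabled = False
        self.next_check = 0.0  # time of the next allowed re-evaluation

    def tick(self, now, recent_hits):
        """Called on a schedule (for example from cron); returns the state."""
        if now < self.next_check:
            # Still inside the hold period: keep the previous decision.
            return self.enabled
        self.enabled = recent_hits > self.max_hits
        self.next_check = now + self.recheck_seconds
        return self.enabled
```

With a limit of 100 hits and a 600-second interval, a burst seen at
time 0 keeps the throttle on until the next evaluation at time 600,
whatever the traffic does in between.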

> Basing the throttle module on the "Who's online" block is NOT
> CONSISTENT.

Writing in capital letters is annoying.

> Yes, but I don't want by default to always use the cache on my
> site. This is  why I asked if it would be useful to dynamically
> switch it on and off by  checking the throttle status. That's what
> I coded myself into the current  module, I hope it will work:

I can't imagine why you'd not want the cache enabled.  However, it's
simple enough to tie it to the throttle, as you've done. 

-Jeremy


