[drupal-devel] [feature] Ban hosts from the 'Top Visitors' page

moshe weitzman drupal-devel at drupal.org
Tue Jun 7 03:58:11 UTC 2005


Issue status update for http://drupal.org/node/24135

 Project:      Drupal
 Version:      cvs
 Component:    user.module
 Category:     feature requests
 Priority:     normal
 Assigned to:  moshe weitzman
 Reported by:  moshe weitzman
 Updated by:   moshe weitzman
 Status:       patch
 Attachment:   http://drupal.org/files/issues/ban_3.diff (12.73 KB)

I made the requested changes. I also optimized drupal_deny() so that it
usually issues only 1 query.




moshe weitzman



Previous comments:
------------------------------------------------------------------------

June 2, 2005 - 00:54 : moshe weitzman

Attachment: http://drupal.org/files/issues/ban.diff (5.94 KB)

The intent of this patch is to provide an easy way to ban hosts that are
consuming too much bandwidth. Ideal for over-aggressive crawlers.




------------------------------------------------------------------------

June 2, 2005 - 00:59 : moshe weitzman

Attachment: http://drupal.org/files/issues/ban_0.diff (5.95 KB)

Oops. Here is the latest version.




------------------------------------------------------------------------

June 2, 2005 - 05:59 : Dries

Any particular reason we are not storing this in the database?  Storing
it in the database sounds like the better option to me.




------------------------------------------------------------------------

June 2, 2005 - 06:15 : moshe weitzman

well, we can't store it in the sessions table because many crawlers
don't persist cookies and thus get a new session for every request. so
we would have to store in a new table and I figured that I didn't need
to add a new query for every page view. it would be a fast query, so
perhaps a new table is OK. any more comments about variable table
versus new table?




------------------------------------------------------------------------

June 2, 2005 - 07:14 : breyten

Having a table would allow for a simple query like:


$result  = db_query("SELECT * FROM {banned_hosts} WHERE ipaddr = '%s'",
$_SERVER["REMOTE_ADDR"]);


which is probably faster than a combined in_array and array_keys I
suppose.




------------------------------------------------------------------------

June 3, 2005 - 07:02 : Dries

How about including this in the 'administer - access control - account
rules' page (?q=admin/access/rules).  We'd have to rephrase, and
possibly rework it a little to extend the e-mail and username rules
with IP/hostname rules but it sounds like a logical place to maintain
IP-bans.  In addition, we would be able to reuse the existing SQL
table.  (Let's not use variables!)  Thoughts?




------------------------------------------------------------------------

June 3, 2005 - 08:40 : breyten

Only a +1. It sounds very sensible :)




------------------------------------------------------------------------

June 5, 2005 - 17:18 : moshe weitzman

Attachment: http://drupal.org/files/issues/ban_1.diff (10.5 KB)

As suggested, this version integrates nicely into the admin/access/rules
pages. You are now able to ban hosts from statistics page or the access
rules page. I think there are use cases for both.


I moved renamed user_deny() tp drupal_deny() and moved it to
bootstrap.inc in order to let us block hosts during the bootstrap. We
check for a blocked host right after loading the database APi, which is
the earliest possible moment.


This feature makes it possible now to restrict your site to a few
approved IP addresses (or a range of addresses). Might be useful for
high security sites.




------------------------------------------------------------------------

June 5, 2005 - 21:30 : Jaza

This is a great start, but I think that this feature still needs
improvement before it's ready to be used by joe shmoe administrator.
The biggest danger that I foresee is administrators banning IPs that
they really shouldn't be banning. "Great, so you've banned the
aggressive bot from Kazakhstan, that put your throttle up to 3 when it
slammed your site last night. But you've also banned GoogleBot! Now
your site won't get indexed!"


There need to be mechanisms in place to stop this from happening.
Possibilities:


- IP lookup, so you can see the resolved hostname before banning the
host. Perhaps show the resolved hostname on the confirm dialog, e.g.
"Are you sure you want to ban the IP 64.68.82.14
(crawler10.googlebot.com)?"
- Automatic rather than (or as well as) manual banning. This might
create more problems than it solves, but in the long-term it's a better
solution. Perhaps something similar to the filtering mechanisms used by
the spam module?


But overall, +1 for this patch, as long as people know what they're
doing when they use it.




------------------------------------------------------------------------

June 5, 2005 - 22:38 : moshe weitzman

Attachment: http://drupal.org/files/issues/ban_2.diff (10.49 KB)

same patch, but removing some debug comments which slipped in




------------------------------------------------------------------------

June 6, 2005 - 13:13 : Dries

Moshe, can you do one more round of updates?  The patch can and will go
in after that.


1. q=admin/access/rules/check needs to be updated.  Maybe use
form_group() too.


2. The ban/unban URL in statistics.module is really ugly and not very
Drupal-ish.  Mind to tidy up the URL scheme.  (Haven't checked but do
we perform a check_url() on that URL with user data?  Might be taken
care of at a lower level.)


3. I'd rename ip_address to either hostname (what do we use in other
tables?) or ip.


Otherwise this works perfect; I just banned myself. ;)







More information about the drupal-devel mailing list