Issue status update for http://drupal.org/node/29328
Post a follow up: http://drupal.org/project/comments/add/29328

 Project:      Drupal
 Version:      cvs
 Component:    statistics.module
 Category:     feature requests
 Priority:     normal
 Assigned to:  Anonymous
 Reported by:  mikeryan
 Updated by:   Bèr Kessels
 Status:       patch (code needs review)

A big -1. We should STORE all (read: absolutely all) logs, yet FILTER them in the reports.

What makes you think crawlers are not users? Or that I am not interested in crawlers?

I think you might be more interested in adding value to xstatistics, which wants to be a more advanced stats module.

And last, but not least, adding checks for contrib modules in core (if_module_exists) is a no-go. In that case, you could try to introduce a hook, but hardcoded checks for modules will simply not do.

Let us hear more comments and then decide the status of this patch.

Bèr Kessels

Previous comments:
------------------------------------------------------------------------

Sun, 21 Aug 2005 18:08:32 +0000 : mikeryan

Attachment: http://drupal.org/files/issues/statistics.module_2.patch (6.73 KB)

Justification:
For all but the most heavily trafficked sites, the statistics reported by Drupal are severely skewed by visits from crawlers and from the administrators themselves. Assuming that the purpose of the statistics is to inform administrators about visits from human beings other than themselves, it is highly desirable to do our best to ignore other visits. To that end, I developed the statistics_filter module [1] (and its spinoff, the browscap module [2]).

Why core?
There is enough concern over the logging the statistics module does in the exit hook that the performance issues are detailed in the help. To work as a contributed module, the statistics_filter module needs to undo what the statistics module did, essentially doubling the overhead for accesses that are meant to be ignored.
If incorporated into the statistics module directly, the filtering functionality will actually reduce the database overhead (no database queries at all for ignored roles).

Open issue
Ignoring crawlers requires the browscap database to identify them. Crawlers are the biggest part of the issue for most sites; my own site, with modest volume, gets 40% of its raw traffic from the Google crawler. Currently I have maintenance of the browscap data (as well as provision for browser/crawler statistics) encapsulated in a separate module. Should this support be submitted to core as a separate module, or integrated into the statistics module?

Attached is a patch to statistics.module implementing filtering by roles, with filtering out of crawlers dependent on an external browscap module. I hope this patch can be accepted into Drupal 4.7. If the feeling is that the browscap code should be incorporated into statistics.module, I can do that.

Thanks.

[1] http://drupal.org/node/18013
[2] http://drupal.org/node/26569
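
The two positions in this thread (skip ignored accesses at write time, versus store everything and filter in the reports) can be contrasted with a minimal sketch. It is written in Python for illustration only and is not the code in the attached patch. All names here (Access, IGNORED_ROLES, CRAWLER_MARKERS, is_ignored, log_filtered, report_filtered) are hypothetical, and the user-agent substring match is only a stand-in for a real browscap lookup.

```python
from dataclasses import dataclass

# Hypothetical record of one page access; not the statistics module's schema.
@dataclass
class Access:
    path: str
    roles: frozenset
    user_agent: str

# Assumed configuration: roles whose visits should not count.
IGNORED_ROLES = frozenset({"administrator"})
# Stand-in for a browscap lookup: a naive user-agent substring match.
CRAWLER_MARKERS = ("googlebot",)

def is_ignored(access):
    """Return True if the access comes from an ignored role or a known crawler."""
    if access.roles & IGNORED_ROLES:
        return True
    agent = access.user_agent.lower()
    return any(marker in agent for marker in CRAWLER_MARKERS)

def log_filtered(log, access):
    """Write-time filtering: an ignored access never reaches the log."""
    if is_ignored(access):
        return False  # nothing is written for this access
    log.append(access)
    return True

def report_filtered(log):
    """Report-time filtering: everything is stored, ignored rows are hidden on read."""
    return [access for access in log if not is_ignored(access)]
```

Both approaches produce the same report. The difference is that write-time filtering does no storage work for ignored accesses but discards them permanently, while report-time filtering keeps the full log, so crawler traffic remains available to anyone who wants to see it.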