[drupal-devel] [feature] Pass include and exclude parameters to cron.php for fine grained cron timing
Issue status update for http://drupal.org/node/19173
Post a follow up: http://drupal.org/project/comments/add/19173

 Project:      Drupal
 Version:      cvs
 Component:    base system
 Category:     feature requests
 Priority:     normal
 Assigned to:  robertDouglass
 Reported by:  robertDouglass
 Updated by:   Dries
 Status:       patch

Not sure this is really needed. Most (if not all) cron hooks are smart enough not to trash your system. If you have to do include/exclude-tricks something is wrong. I'm tempted to say '-1'.

There is a typo in the documentation: 'includ' -> 'include'. There is also a tab in the code.

Dries

Previous comments:
------------------------------------------------------------------------

Sun, 20 Mar 2005 10:08:59 +0000 : robertDouglass

Attachment: http://drupal.org/files/issues/cronpatch.txt (1.99 KB)

Some sites need fine-grained timing for cron runs. For example, a site that relies heavily on moblogging, mailhandler or any other email-based function would want to connect with the email server pretty often, say every 2-5 minutes. On another site I'm working on we generate flat files for all of a certain node type on a regular basis, though not every 2-5 minutes. Thus the need for cron tasks on different schedules.

This patch allows you to make a list of modules to either include or exclude when running cron, so that several cron tasks for one site can be defined. The default behavior in the absence of both parameters is to run all, so the patch preserves backwards compatibility.

 * Examples:
 *
 * runs all hooks
 * http://mysite.com/cron.php
 *
 * runs all hooks except the search and aggregator modules
 * http://mysite.com/cron.php?exclude=search,aggregator
 *
 * runs only search, archive and aggregator modules
 * http://mysite.com/cron.php?includ=search,archive,aggregator

------------------------------------------------------------------------

Wed, 23 Mar 2005 21:23:59 +0000 : Dries

Isn't it possible to run all cron functions every 2-5 minutes? The impact of that should be minimal.
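For readers without the attachment at hand, the include/exclude selection described in the patch summary can be sketched roughly as follows. This is a Python illustration, not the patch's PHP; the function name `select_cron_modules` and the behavior when both parameters are supplied (include narrows first, exclude then removes) are assumptions, since the thread only specifies what happens when neither is given.

```python
def select_cron_modules(all_modules, include=None, exclude=None):
    """Return the modules whose cron hook should run for one request.

    include / exclude are the raw comma-separated query values
    (hypothetical names mirroring ?include=... and ?exclude=...).
    """
    def split(value):
        # Split on commas and drop empty entries and stray whitespace.
        return [name.strip() for name in (value or "").split(",") if name.strip()]

    included = split(include)
    excluded = set(split(exclude))

    if included:
        wanted = set(included)
        # Keep the site's module order; unknown names are simply ignored.
        candidates = [m for m in all_modules if m in wanted]
    else:
        # No include list: start from every module (backwards compatible).
        candidates = list(all_modules)

    # Assumption: exclude is applied last, even if include was given.
    return [m for m in candidates if m not in excluded]
```

With neither parameter the full list comes back unchanged, which matches the backwards-compatibility claim in the patch description.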
------------------------------------------------------------------------

Wed, 23 Mar 2005 21:33:06 +0000 : moshe weitzman

An alternative approach is to make a quick PHP page which simply calls mailhandler_cron and any others that need to be called more frequently than normal, then call that PHP page from cron every couple of minutes ... not so sure this complexity is worth having.

------------------------------------------------------------------------

Thu, 24 Mar 2005 07:32:32 +0000 : robertDouglass

Whether or not this "complexity" is needed depends a lot on what you expect to do with cron. As soon as one starts aggregating not tens but several hundred RSS feeds or more, it is impossible to run cron every 2 minutes. Yet if the site depends on mailhandler keeping user moblog submissions current, 2 minutes is already about the maximum acceptable latency. Thus the need for crons on different schedules.

In terms of usage, writing ?include=mailhandler,search,event isn't so complex. On the code level, splitting that list on commas and looping over the resulting array also isn't really that complex. The patch defaults to normal behavior for people who don't need the feature, and I've seen more than one site that had to solve this problem.

------------------------------------------------------------------------

Thu, 24 Mar 2005 10:48:09 +0000 : Robert Castelo

Could different cron runs be set to run different tasks based on a cron ID number?

For instance, set up 3 separate cron jobs to run every hour, each of which calls the cron.php page with a number variable that selects a certain set of tasks.

An admin control panel would list all cron tasks and allow administrators to split them into separate cron runs based on the number variable passed to the script when calling cron.php.
------------------------------------------------------------------------

Thu, 24 Mar 2005 11:24:53 +0000 : robertDouglass

Robert, the id parameter is already a practical necessity for this patch if cron_busy is to be effective. I'm open to suggestions as to how we would track various cron tasks and whether they are busy.

First things first, though: does anyone except me need this? I'm not going to program an admin-configurable cron mechanism unless there is demand for it. Furthermore, such a system would really be incomplete unless there were some actual mechanism for setting these cron runs up from the admin interface as well. This wouldn't be such a bad idea; I know a lot of people who still type /cron.php into their browser every day or so to update their sites (yes, I've mentioned poormanscron).

Anyway, first reactions to this patch have been cold, so maybe nobody wants it.

------------------------------------------------------------------------

Thu, 24 Mar 2005 11:39:13 +0000 : Robert Castelo

+1 from me.

I'll be needing something like this once my email newsletter is released and starts hogging cron runs.

------------------------------------------------------------------------

Thu, 24 Mar 2005 16:06:31 +0000 : Chris Johnson

It does seem like a good idea to allow running different cron tasks at different intervals, so as to be efficient and reduce web server loads. There is little point in running tasks that do not need to be run. Many cron tasks only need to be run once to a few times per day.

Note also that cron job scheduling on most Unix variants is only accurate to within +/- 1 minute, so trying to run a cron program every 2 minutes is approaching the crumbling edge of being "on time."
------------------------------------------------------------------------

Thu, 24 Mar 2005 16:37:39 +0000 : robertDouglass

I think I am leaning in favor of extending cron.php in the following way:

1) Add an admin screen which scans all modules for the existence of a cron hook
2) Set a cron_interval variable for each one
3) Allow the admin to set an interval
4) Check whether the current cron run falls outside of each module's interval before invoking its cron hook
5) Before a module's cron hook gets called, a {module_name}_cron_busy variable gets set. This would function like the current cron_busy, just at a modular level.

The admin would then set one crontab entry to run at the smallest needed interval and could then decide on a module-by-module basis how often each is to run.

How does this sound to people?

------------------------------------------------------------------------

Wed, 27 Jul 2005 16:18:23 +0000 : Bèr Kessels

-1 on this one. I think the hook_cron should carry the logic of whether or not it wants to be run.

Maybe introduce an API that allows this check to be done more easily? drupal_elapsed_run($name, $interval) would check when $name was last updated; if that was longer than $interval ago, it would return TRUE and update the last run of $name.

------------------------------------------------------------------------

Wed, 27 Jul 2005 21:32:29 +0000 : robertDouglass

Bèr, how can you adjust the interval then? If it is hardcoded how often the cron should run, administrators lose all control and you can't have different sites on different schedules, unless of course you have cron configuration in the interface itself. I think cron scheduling should be done completely with the cron tool itself, and that the URL or command that is given should be able to handle the fine tuning of which modules' hooks get called.
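As a point of reference for this exchange, the helper Bèr describes could be sketched roughly as below. This is Python rather than Drupal code and no such function is confirmed to exist in core; the name `elapsed_run`, the in-memory `_last_run` dictionary (standing in for persistent storage), the optional `now` argument, and the choice to return True on the very first call are all assumptions made for illustration.

```python
import time

# Hypothetical stand-in for persistent storage of last-run timestamps.
_last_run = {}


def elapsed_run(name, interval, now=None):
    """Return True, and record the run, if `name` is due again.

    `interval` is in seconds. `now` exists only so the sketch can be
    exercised with fixed timestamps; it defaults to the current time.
    """
    if now is None:
        now = time.time()
    last = _last_run.get(name)
    # Assumption: a task that has never run is always due.
    if last is not None and now - last < interval:
        return False
    _last_run[name] = now
    return True
```

A cron hook would call it as its first step and return early when the result is False. Since the interval is an argument, the caller could read it from a site setting instead of hardcoding it, which bears on the objection raised in the reply above.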
------------------------------------------------------------------------

Thu, 28 Jul 2005 05:53:15 +0000 : clydefrog

When I last used aggregator (which was probably back in the 4.4 days), it let me set a refresh interval for each feed. Is this no longer the case?

Note I have nothing against this patch. It sounds like a good general solution for skipping hooks that are resource-heavy every time they run while still allowing other hooks to run more often. However, I also like Bèr's idea of a convenience function to make delays easier to measure.

------------------------------------------------------------------------

Thu, 28 Jul 2005 07:01:00 +0000 : Uwe Hermann

+1 for the include/exclude feature, I'd use this on multiple sites...

Uwe.

------------------------------------------------------------------------

Thu, 28 Jul 2005 07:24:01 +0000 : kbahey

+1 on an include/exclude mechanism.

However, I am not sure that having it in the URL is a good idea. This could expose things that we may not want exposed in the future. Anyone on the net can run your stuff more frequently than you want.

What I would like to see is a cron settings page with all the modules that have a cron hook listed, and a frequency that can be changed. cron.php can then check that setting per module and decide whether to run it or not.

Something along those lines would be better than URL-based solutions, which require many entries in crontab and may open exploits.
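The settings-driven variant suggested here, combined with the per-module busy flag from the earlier five-step proposal, might look roughly like the following. It is a Python sketch under stated assumptions, not Drupal code: `run_cron`, the `state` dictionary (standing in for stored site variables), and the `_cron_last` key suffix are hypothetical; only the `{module_name}_cron_busy` naming comes from the thread.

```python
def run_cron(hooks, intervals, state, now):
    """Run each module's cron hook only if it is due and not busy.

    hooks:     dict of module name -> callable (the module's cron hook)
    intervals: dict of module name -> minimum seconds between runs,
               as an admin settings page might store them
    state:     dict standing in for persistent site variables
    now:       current timestamp in seconds
    Returns the list of modules whose hooks were actually invoked.
    """
    ran = []
    for module, hook in hooks.items():
        interval = intervals.get(module, 0)  # assumption: unset means every run
        last = state.get(module + "_cron_last")
        if last is not None and now - last < interval:
            continue  # not due yet
        busy_key = module + "_cron_busy"
        if state.get(busy_key):
            continue  # a previous run of this module has not finished
        state[busy_key] = True
        try:
            hook()
            state[module + "_cron_last"] = now
            ran.append(module)
        finally:
            # Always clear the flag so one failure cannot block future runs.
            state[busy_key] = False
    return ran
```

With this shape a single crontab entry at the smallest interval is enough, and nothing about the schedule is exposed in the URL.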