[drupal-devel] [feature] Pass include and exclude parameters to cron.php for fine grained cron timing

Dries drupal-devel at drupal.org
Fri Jul 29 07:43:56 UTC 2005


Issue status update for 
http://drupal.org/node/19173
Post a follow up: 
http://drupal.org/project/comments/add/19173

 Project:      Drupal
 Version:      cvs
 Component:    base system
 Category:     feature requests
 Priority:     normal
 Assigned to:  robertDouglass
 Reported by:  robertDouglass
 Updated by:   Dries
 Status:       patch

Not sure this is really needed.  Most (if not all) cron hooks are smart
enough not to trash your system.  If you have to do
include/exclude-tricks something is wrong. I'm tempted to say '-1'.


There is a typo in the documentation: 'includ' -> 'include'.  There is
also a tab in the code.




Dries



Previous comments:
------------------------------------------------------------------------

Sun, 20 Mar 2005 10:08:59 +0000 : robertDouglass

Attachment: http://drupal.org/files/issues/cronpatch.txt (1.99 KB)

Some sites need fine grained timing for cron runs. For example, a site
that relies heavily on moblogging, mailhandler or any other email based
function would want to connect with the email server pretty often, say
every 2-5 minutes. On another site I'm working on we generate flat files
for all of a certain node type on a regular basis, though not every 2-5
minutes. Thus the need for cron tasks on different schedules. This
patch allows you to make a list of modules to either include or exclude
when running cron, so that several cron tasks for one site can be
defined. The default behavior in the absence of both parameters is to
run all, so the patch preserves backwards compatibility.


 * Examples:
 *
 * runs all hooks
 * http://mysite.com/cron.php
 *
 * runs all hooks except the search and aggregator modules
 * http://mysite.com/cron.php?exclude=search,aggregator
 *
 * runs only search, archive and aggregator modules
 * http://mysite.com/cron.php?includ=search,archive,aggregator




------------------------------------------------------------------------

Wed, 23 Mar 2005 21:23:59 +0000 : Dries

Isn't it possible to run all cron functions every 2-5 minutes?  The
impact of that should be minimal.




------------------------------------------------------------------------

Wed, 23 Mar 2005 21:33:06 +0000 : moshe weitzman

An alternative approach is to make a quick PHP page which simply calls
mailhandler_cron and any others that need to be called more frequently
than normal. then call that php page from cron every couple minutes ...
not so sure this complexity is worth having.




------------------------------------------------------------------------

Thu, 24 Mar 2005 07:32:32 +0000 : robertDouglass

Whether or not this "complexity" is needed depends a lot on what you
expect to do with cron. As soon as one starts aggregating not tens but
several hundred RSS feeds or more, it is impossible to run cron every 2
minutes. Yet if the site depends on mailhandler keeping user moblog
submissions current, 2 minutes is already about the maximum latency
time acceptable. Thus the need for crons on different schedules.


In terms of usage, writing 


  ?include=mailhandler,search,event 


isn't so complex. On the code level, splitting that list on comma and
looping over the resultant array also isn't really that complex. The
patch defaults to normal behavior for people who don't need the
feature, and I've seen more than one site that had to solve this
problem.
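
The split-and-loop step described above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the code from the attached patch: the function name select_modules and the rule that include is applied before exclude when both are given are assumptions made for the example.

```python
def select_modules(all_modules, include=None, exclude=None):
    """Return the modules whose cron hook should run for this request.

    Hypothetical helper: include/exclude are the raw comma-separated
    query-string values, or None when the parameter is absent.
    """
    def parse(value):
        # "search, aggregator" -> {"search", "aggregator"}; None or "" -> empty set
        return {name.strip() for name in (value or "").split(",") if name.strip()}

    included, excluded = parse(include), parse(exclude)
    selected = []
    for module in all_modules:
        # With an include list, only listed modules are candidates.
        if included and module not in included:
            continue
        # Assumption: exclude is applied after include when both are present.
        if module in excluded:
            continue
        selected.append(module)
    return selected
```

With neither parameter set the full module list comes back unchanged, which matches the backwards-compatible default described in the original report.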




------------------------------------------------------------------------

Thu, 24 Mar 2005 10:48:09 +0000 : Robert Castelo

Could different cron runs be set to run different tasks based on a cron
ID number....


For instance, set up 3 separate cron jobs to run every hour, each one calls
the cron.php page including a number variable which runs a certain set
of tasks.


An Admin control panel would list all cron tasks and allow
administrators to split them into separate cron runs based on the
number variable passed to the script when calling cron.php.
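
A minimal sketch of the ID-based idea, in Python for illustration only. The CRON_GROUPS table stands in for whatever the admin control panel would store, and the choice to run nothing for an unknown ID is an assumption; none of these names come from Drupal.

```python
# Hypothetical assignment an admin might make: cron ID -> modules run by that job.
CRON_GROUPS = {
    1: ["mailhandler"],
    2: ["search", "aggregator"],
    3: ["archive"],
}


def modules_for_cron_id(cron_id, groups=CRON_GROUPS):
    """Return the modules assigned to the given cron ID.

    Assumption: an ID with no assignment runs nothing (empty list).
    """
    # Copy the list so callers cannot modify the stored assignment.
    return list(groups.get(cron_id, []))
```

Each crontab entry would then pass its own number to cron.php, and only the modules assigned to that number would have their hooks invoked.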




------------------------------------------------------------------------

Thu, 24 Mar 2005 11:24:53 +0000 : robertDouglass

Robert,


the id parameter is already a practical necessity for this patch if
cron_busy is to be effective. I'm open to suggestions as to how we
would track various cron tasks and whether they are busy.


First things first, though: does anyone except me need this?


I'm not going to program an admin-configurable cron mechanism unless
there is demand for it. Furthermore, such a system would really be
incomplete unless there were some actual mechanism for setting these
cron runs up from the admin interface as well. This wouldn't be such a
bad idea, I know a lot of people who still type /cron.php into their
browser every day or so to update their sites (yes, I've mentioned
poormanscron). Anyway, first reactions to this patch have been cold, so
maybe nobody wants it.




------------------------------------------------------------------------

Thu, 24 Mar 2005 11:39:13 +0000 : Robert Castelo

+1 from me.


I'll be needing something like this once my email newsletter is
released and starts hogging cron runs.




------------------------------------------------------------------------

Thu, 24 Mar 2005 16:06:31 +0000 : Chris Johnson

It does seem like a good idea to allow running different cron tasks at
different intervals, so as to be efficient and reduce web server loads.
There is little point in running tasks that do not need to be run.
Many cron tasks only need to be run once to a few times per day.


Note also that cron job scheduling on most Unix-variants is only within
a +/- 1 minute range, so trying to run a cron program every 2 minutes is
approaching the crumbling edge of being "on time."




------------------------------------------------------------------------

Thu, 24 Mar 2005 16:37:39 +0000 : robertDouglass

I think I am leaning in favor of extending cron.php in the following
way:


1) Add admin screen which scans all modules for existence of cron hook
2) set a cron_interval variable for each one
3) allow admin to set an interval
4) check whether the current cron run falls outside of each module's
interval before invoking its cron hook
5) before a module's cron hook gets called a {module_name}_cron_busy
variable gets set. This would function like the current cron_busy now,
just at a modular level.


The admin would then set one cron tab to run at the smallest needed
interval and could then decide on a module-by-module basis how often
each is to run.
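
Steps 4 and 5 above could look something like the following sketch. It is written in Python as a language-neutral illustration; run_cron, the state dictionary, and the "<module>_cron_last" key are hypothetical stand-ins for Drupal's persistent variables, and only the "<module>_cron_busy" naming follows the proposal.

```python
import time


def run_cron(hooks, intervals, state, now=None):
    """Run each module's hook if its interval has elapsed and it is not busy.

    hooks:     {module: callable}
    intervals: {module: seconds between runs}
    state:     mutable dict holding "<module>_cron_last" and "<module>_cron_busy"
    Returns the list of modules whose hook was invoked.
    """
    now = time.time() if now is None else now
    ran = []
    for module, hook in hooks.items():
        busy_key = f"{module}_cron_busy"
        last_key = f"{module}_cron_last"
        if state.get(busy_key):
            continue  # an earlier run of this module has not finished
        last = state.get(last_key)
        if last is not None and now - last < intervals.get(module, 0):
            continue  # still inside this module's interval
        state[busy_key] = True  # per-module equivalent of cron_busy
        try:
            hook()
            state[last_key] = now
            ran.append(module)
        finally:
            state[busy_key] = False
    return ran
```

The single crontab entry would fire at the smallest interval any module needs, and this loop would decide per module whether to do any work on that tick.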


How does this sound to people?




------------------------------------------------------------------------

Wed, 27 Jul 2005 16:18:23 +0000 : Bèr Kessels

-1 on this one.
I think the hook_cron should carry the logic whether or not it wants to
be run. Maybe introduce an API that makes this check easier?
drupal_elapsed_run($name, $interval) would check when $name was last
updated; if that was longer than $interval ago, it would return TRUE and
update the last run of $name.
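
A rough sketch of such a helper, in Python for illustration. drupal_elapsed_run is only a proposed name in this thread and does not exist in Drupal; the in-memory dictionary below stands in for persistent storage, and treating a name with no recorded run as due is an assumption.

```python
import time

# Stand-in for persistent storage of last-run timestamps (an assumption).
_last_run = {}


def elapsed_run(name, interval, now=None):
    """Return True and record the run if `name` last ran more than
    `interval` seconds ago; otherwise return False and change nothing."""
    now = time.time() if now is None else now
    last = _last_run.get(name)
    # Assumption: a name that has never run is treated as due.
    if last is None or now - last > interval:
        _last_run[name] = now
        return True
    return False
```

A module's cron hook would call this first and return immediately when it gets False, so the interval decision lives inside the hook rather than in cron.php.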




------------------------------------------------------------------------

Wed, 27 Jul 2005 21:32:29 +0000 : robertDouglass

Bèr, how can you adjust the interval then? If it is hardcoded how often
the cron should run, administrators lose all control and you can't have
different sites have different schedules, unless of course you have cron
configuration in the interface itself. I think cron scheduling should be
completely done with the cron tool itself, and that the URL or command
that is given should be able to handle the fine tuning of which
module's hooks get called.




------------------------------------------------------------------------

Thu, 28 Jul 2005 05:53:15 +0000 : clydefrog

When I last used aggregator (which was probably back in the 4.4 days),
it let me set a refresh interval for each feed. Is this no longer the
case?


Note I have nothing against this patch. It sounds like a good general
solution for removing calls to hooks that are resource-heavy every time
they run but still allowing some other hooks to be run more often.
However, I also like Ber's idea of a convenience function to make
delays easier to measure.




------------------------------------------------------------------------

Thu, 28 Jul 2005 07:01:00 +0000 : Uwe Hermann

+1 for the include/exclude feature, I'd use this on multiple sites...
Uwe.




------------------------------------------------------------------------

Thu, 28 Jul 2005 07:24:01 +0000 : kbahey

+1 on an include/exclude mechanism.


However, I am not sure that having it in the URL is a good idea. This
could expose things that we may not want exposed in the future. Anyone
on the net can run your stuff more frequently than you want to.


What I would like to see is a cron settings page with all the modules
that have a cron hook listed, and a frequency that can be changed.


cron.php can then check that setting per module and decide to run it or
not.


Something along those lines would be better than URL based solutions
which require many entries in crontab, and may open exploits.






