Improving the performance of Drupal's cron by using the Elysia cron module

One great feature that Drupal has is the ability to make modules run certain tasks, often heavy ones, in the background at preset intervals. This can be achieved by a module implementing hook_cron.

Core uses this feature to index new content for the search module, ping module to notify remote sites of new content, fetch new release information from drupal.org, poll other sites for RSS feeds, and more.

Contributed modules use this for many purposes, such as mailing out newsletters, cleaning up logs, and synchronizing content with other servers or sites.

Core Cron: All or None

This powerful core feature has some limitations though, such as:

  • All hook_cron implementations for all modules are run in the same cron run, in sequence, ordered alphabetically or according to module weight.
  • When cron for one module is stuck, all modules following it will not be executed, and cron will not run again until 1 hour has passed.
  • There is no way to know which module is the one that caused the entire cron to get stuck. Moreover, there is no instrumentation information to know which cron hook takes the most time.
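
The limitations above can be illustrated with a small conceptual model. This is not Drupal code: the function and class names are hypothetical, a raised exception stands in for a hook that gets stuck, and sorting by weight first and module name second is an assumption about how the two orderings combine.

```python
# Conceptual model only: not Drupal code, and every name here is hypothetical.

class StuckError(Exception):
    """Raised by a hook to simulate a cron implementation that never finishes."""


def run_core_style_cron(hooks):
    """Run every hook in one pass, ordered by weight and then by module name.

    `hooks` maps a module name to a (weight, callable) pair.
    Returns the list of modules that completed before the run stopped.
    """
    completed = []
    ordered = sorted(hooks.items(), key=lambda item: (item[1][0], item[0]))
    for module, (_weight, hook) in ordered:
        try:
            hook()
        except StuckError:
            # The whole run stops here: nothing after this module executes,
            # and the return value does not say which hook was responsible.
            break
        completed.append(module)
    return completed


def ok():
    pass


def stuck():
    raise StuckError()


hooks = {
    "xmlsitemap": (0, ok),
    "search": (0, stuck),
    "dblog": (0, ok),
    "statistics": (0, ok),
}
completed = run_core_style_cron(hooks)
print(completed)
```

In this model only dblog completes, because search sorts second and everything after it is skipped.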

2bits.com has proposed core patches to overcome the lack of instrumentation by logging the information to the watchdog. The patches are useful only to those who apply them, and it is unlikely that they will get into core any time soon.

For a practical example, you can use Job Queue with our own queue mail module to improve end user response time, and avoid timeouts due to sending a lot of emails. This scheme defers sending to when cron is run, and not when a user submits a node or a comment.

This works well, but for core, all cron hooks run at the same time. If you set cron to run every hour, then email sending could be delayed by an hour or even more if job queue cannot send them all in one run. If you make cron run more frequently, e.g. every 15 minutes, then all the heavy hooks such as search indexing and log cleanup will also run every 15 minutes consuming lots of resources.
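
To make the trade-off concrete, here is a rough back-of-the-envelope sketch. The queue size and the number of emails sent per run are made-up figures, and the function name is hypothetical. It only bounds the wait for the last email in the queue, assuming each cron run sends a fixed batch.

```python
import math


def worst_case_delay_minutes(queued, per_run, interval_minutes):
    """Upper bound on the wait for the last queued email.

    Assumes each cron run sends at most `per_run` emails and that the
    email was queued just after a run finished.
    """
    runs_needed = math.ceil(queued / per_run)
    return runs_needed * interval_minutes


# Hypothetical load: 500 queued emails, 200 sent per cron run.
hourly = worst_case_delay_minutes(500, 200, 60)
quarter_hourly = worst_case_delay_minutes(500, 200, 15)
every_minute = worst_case_delay_minutes(500, 200, 1)
print(hourly, quarter_hourly, every_minute)
```

Under these assumed numbers the bound drops from 180 minutes to 45 minutes to 3 minutes as the interval shrinks, which is why running the mail queue often matters, provided the heavy hooks do not have to run just as often.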

Enter Elysia Cron ...

With Elysia cron, you can now have the best of both worlds: you can set cron for job_queue to run every minute, and defer other heavy stuff to once a day during off hours, or once an hour. The email is delivered quickly, within minutes, and we don't incur the penalty of long running cron hooks.
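
The idea of per-job schedules can be sketched as a dispatcher that is called every minute and runs only the jobs that are due. This is a simplified model, not Elysia cron's implementation: fixed intervals in minutes stand in for whatever schedule rules the module actually uses, and the names `due_jobs`, `schedule`, and `last_run` are hypothetical.

```python
def due_jobs(schedule, last_run, now):
    """Return the names of jobs whose interval has elapsed at minute `now`.

    `schedule` maps a job name to its interval in minutes.
    `last_run` maps a job name to the minute it last ran (absent if never).
    """
    return sorted(
        name
        for name, interval in schedule.items()
        if name not in last_run or now - last_run[name] >= interval
    )


# Hypothetical schedule: mail queue every minute, search twice an hour,
# log cleanup once a day.
schedule = {"job_queue": 1, "search": 30, "dblog": 1440}
last_run = {}
runs = {name: 0 for name in schedule}

# Simulate one hour with the dispatcher invoked once per minute.
for minute in range(60):
    for name in due_jobs(schedule, last_run, minute):
        runs[name] += 1
        last_run[name] = minute

print(runs)
```

Over the simulated hour the mail queue runs 60 times, search runs twice, and the log cleanup runs once, so the frequent job no longer drags the heavy ones along with it.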

Features and Benefits of Elysia Cron

Elysia cron offers many features. The important ones, with a focus on performance, are:

  • You can run the hook_cron implementations of different modules at different frequencies.
  • You can see the resource and performance impact of each hook_cron implementation, including the time its last run took, the average time, and the maximum time. This information is very valuable when distributing hooks across the day and choosing their frequencies.
  • You can set a configurable maximum run time for cron invocations. Drupal core has a hard coded value of 240 seconds. You can adjust this up or down as per your needs.
  • Handles "stuck" crons better than core. In core, if cron is stuck, it takes one hour for it to automatically recover. In Elysia cron, the other hook invocations continue to run normally.
  • You can set the weight for each module, independent from the weight for the module in the system table. Using this, you can have a different order of execution for modules.
  • You can group modules in "contexts", assigning different run schedules for different contexts, or disable contexts globally.
  • The ability to assign a cron key, or a white list of allowed hosts that can execute cron.
  • Selectively disable cron for one or more modules, but not others, or all cron.
  • Selectively run cron for only one module.
  • Defining a cronapi that developers can use.
  • It requires no patching of core or contributed modules.
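
The per-hook timing information mentioned in the list (last, average, and maximum run time) can be modeled in a few lines. This is an illustrative sketch with a hypothetical `JobStats` class and made-up durations, not the way Elysia cron stores its statistics.

```python
class JobStats:
    """Track last, average, and maximum run time for one cron job."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.last = 0.0
        self.maximum = 0.0

    def record(self, seconds):
        """Record the duration of one completed run."""
        self.count += 1
        self.total += seconds
        self.last = seconds
        self.maximum = max(self.maximum, seconds)

    @property
    def average(self):
        """Mean duration over all recorded runs, or 0.0 if there are none."""
        return self.total / self.count if self.count else 0.0


# Hypothetical durations, in seconds, for three runs of one job.
stats = JobStats()
for seconds in (2.0, 8.0, 5.0):
    stats.record(seconds)

print(stats.last, stats.average, stats.maximum)
```

Comparing these three numbers across jobs is what makes it possible to decide which hooks are cheap enough to run often and which should be pushed to off hours.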

Examples of Elysia Cron in action

Captcha's cron has been configured to run only once a day, in the early hours of the morning.

Likewise, dblog's cron runs once a day. There is no need to trigger it every hour or twice an hour.

Search is the heaviest of all the cron hooks here. Still, we run it twice every hour so that the search index is always fresh.

Statistics cleanup is kind of heavy too, so we run it only once a day.

Finally, xmlsitemap is a useful module, yet it is also heavy on a site with lots of nodes. Therefore we run it only once a day.

The above settings are not cast in stone for these modules. They will vary from one site to another depending on the server configuration, the resources available, and data set sizes. Moreover, even for the same site, it is recommended to monitor regularly and adjust these settings on an ongoing basis.

Alternatives to Elysia Cron

Elysia cron is not alone though. There are other modules that have overlapping functions, such as Super Cron, Cron Plus, and even a Cron API module. Super Cron seems promising, but Elysia does everything we need so far, so evaluating it is low on our list of priorities.

Here is an attempt to compare the various cron modules, but so far it is sparse on information.

A more powerful solution, but also a more complex and heavyweight one, is to use a tool like Hudson Continuous Integration. Because it runs on Java, it adds dependencies to the usual LAMP-only server and is more demanding on resources. You can read a full article on it here.
