Upgrade Your Drupal Skills

We trained 1,000+ Drupal Developers over the last decade.

See Advanced Courses NAH, I know Enough

Module Caches – When, Where and How

Parent Feed: 

Posted Oct 8, 2012 // 0 comments

In the Drupal community, you see caching discussions related to pages, blocks, reverse-proxies, opcodes, and everything in between. These are often tied to render- and database-intensive optimizations to decrease the load on a server and increase throughput. However, there is another form of caching that can have a huge impact on your site’s performance – module level data caching. This article explores Drupal 7 core caching mechanisms that modules can take advantage of.

When?

Not all modules require data caching, and in some cases due to “real-time” requirements it might not be an option. However, here are some questions to ask yourself to determine if module-level data caching can help you out:

  • Does the module make queries to an external data provider (e.g. web service API) that returns large datasets?
  • If the module pulls data from an external source, is it a slow or unreliable connection?
  • If calling a web service, are there limits to the number of calls the module can make (hourly, daily, monthly, etc.)? Also, if it is a pay service, is it a variable cost based on number of calls?
  • Does the hosting provider have penalties for large amounts of inbound data?
  • Does the data my module handles require significant processing (e.g. heavy XML parsing)?
  • Is the data the module loads from an external source relatively stable and not change rapidly?

If you answered, “yes,” to more than a third of the questions above, module-level data caching can probably help your module’s performance by providing the following features:

  • Decrease external bandwidth
  • Decrease page load times
  • Reduce load on the site’s server
  • Provide reliable data services

Where?

OK, so you’ve decided your module could probably benefit from some form of module-level data caching. The next thing to determine is where to store it. You can always use some form of file-based caching, but to implement that with the proper abstractions to run on a variety of servers requires calls through the Drupal core File APIs, which can be a bit convoluted at times. File-based caching mechanisms also cannot take advantage of scalable performance solutions like memcache or multiple database server configurations that might be changed at any time.

Luckily, Drupal core provides a cache mechanism available to any module using the cache_get and cache_set functions, fully documented on http://api.drupal.org:

<?php
cache_get
($cid, $bin = 'cache')
cache_set($cid, $data, $bin = 'cache', $expire = CACHE_PERMANENT)
?>

By default, these functions will work with the core cache bin called simply “cache.” This is the main dumping ground for Drupal core for data that can persist in the system for a length of time beyond the one page call, and are not tied to a session. However, many modules define their own cache bins so they can provide their own cache management processes. A few core module ones are:

  • cache_block
  • cache_field
  • cache_filter
  • cache_form
  • cache_menu
  • cache_page

Seeing as how several core Drupal modules implement their own cache bins, the next questions for your new module are:

  • Does the module need to manage its cache in a manner that is not consistent with the main cache bin?
  • Will its cache need to be flushed independently of the main cache at any time, or have some other expiration logic assigned to it that falls outside of the core cron cache clear calls?

If the answer to either of these questions is, “yes,” then a dedicated cache bin is probably a wise idea.

Cache bin management is abstracted in the Drupal system via classes implementing DrupalCacheInterface. The core codebase provides a default database-driven cache mechanism via DrupalDatabaseCache that is used for any cache bin type that has not been overridden with a custom class (see the documentation on DrupalCacheInterface for details on how to do that) and has a table in the database named the same as the bin. This table conforms to the same schema as the core cache tables. For reference, this is the core cache table schema in MySQL that we will use as the base for our module’s cache bin:

+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| cid        | varchar(255) | NO   | PRI |         |       |
| data       | longblob     | YES  |     | NULL    |       |
| expire     | int(11)      | NO   | MUL | 0       |       |
| created    | int(11)      | NO   |     | 0       |       |
| serialized | smallint(6)  | NO   |     | 0       |       |
+------------+--------------+------+-----+---------+-------+

How?

For the sake of simplicity, we will assume that our module is fine with using the default cache mechanism and database schema. As an exercise, we will also assume that we meet the criteria for defining our own cache bin so we can explore all the hooks required to implement a complete custom bin leveraging the default cache implementation. The sample module is called cachemod, and the cache bin name is cache_cachemod.

Define the cache bin schema

In order to add a table with the correct schema to the system, we borrow from some code found in the block module that copies the schema from the core cache table and add this to our install hooks in cachemod.install:

<?php
/**
* Implements hook_schema
*/
function cachemod_schema() {
 
// Create new cache table using core cache schema
 
$schema['cache_cachemod'] = drupal_get_schema_unprocessed('system', 'cache');
 
$schema['cache_cachemod']['description'] = 'Cache bin for the cachemod module';  return $schema;
}
?>

Now that we have defined a table for our cache bin that replicates the schema of the core cache table, we can make basic set and get calls using the following:

<?php
cache_get
($cid, 'cache_cachemod');
cache_set($cid, $data, 'cache_cachemod');
?>

Using our new cache bin

Notice the CID (cache ID) parameter. This will need to be unique to the data being stored, so in the case of something like a web service, the CID might be built from the arguments being passed to the service and the data will be the returned data. One way to abstract this so you get consistent CID values for calls to cache_get and cache_set is to build a helper function. This sample assumes our service call takes an array of key-value pairs:

<?php
/**
* Util function to generate cid from service call args
*/
function _cachemod_cid($args) {
 
// Make sure we have a valid set of args
 
if (empty($args)) {
    return
NULL;
  } 
// Make sure we are consistently operating on an array
 
If (!is_array($args)) {
   
$args = array($args);
  } 
// Sort the array by key, serialize it, and calc the hash
 
ksort($args);
 
$cid = md5(serialize($args));
  return
$cid;
}
?>

Now we can implement a basic public web service function leveraging our cache like this:

<?php
/**
* Public function to execute web service call
*/
function cachemod_call($args) {
 
// Create our cid from args
 
$cid = _cachemod_cid($args);  // See if we have cached data already
 
$data = cache_get($cid, 'cache_cachemod')
  if (!
$data) {
   
// No such luck, go try to pull it from the web service
   
$data = _cachemod_call_service($args);
    if (
$data) {
     
// Great, we have data!  Store it off in the cache
     
cache_set($cid, $data, 'cache_cachemod');
    }
  }  return
$data;
}
?>

Note that there are several values for the optional expire parameter to the cache_set call that are fully documented in the API docs.

Hooking into the core cache management system

If you want your module’s cache bin to clear out when Drupal executes a cache wipe during cron runs or a general cache_clear_all, set the expire parameter in your cache_set call above to either CACHE_TEMPORARY or a Unix timestamp to expire after, and add the following hook to your module:

<?php
/**
* Implements hook_flush_caches
*/
function cachemod_flush_caches() {
 
$bins = array('cache_cachemod');
  return
$bins;
}
?>

This will add your cache bin to the list of bins that Drupal’s cron task will empty.

Additionally, if you would like to add your cache bin to the list of caches that drush can selectively clear, add the following to your module in a file named cachemod.drush.inc:

<?php
// Implements hook_drush_cache_clear
function cachemod_drush_cache_clear(&$types) {
 
$types['cachemod'] = '_cachemod_cache_clear';
}
// Util function to clear the cachemod bin
function _cachemod_cache_clear() {
 
cache_clear_all('*', 'cache_cachemod', true);
}
?>

Note that if you set the expiration of the cache item to CACHE_PERMANENT (the default), only an explicit call to cache_clear_all with the item’s CID will remove it from the cache.

Conclusion

Sometimes it makes sense to have a module cache data for its own use, and even possibly in its own cache bin to maintain a finer-grained control of the data and cache management if something beyond the core cache management is required. Utilizing the cache abstraction built into Drupal 7 core and some custom classes, hooks, and drush callbacks can give your module a range of options for reducing data calls, processing overhead, and bandwidth consumption. For more detailed info, check out the API pages at http://api.drupal.org for the functions, classes and hooks mentioned above.

As a Senior Developer at Phase2, Robert Bates is able to pursue his interests in solving complex multi-tier integration challenges with elegant solutions. He has experience not only in traditional web programming languages such as PHP and ...

Author: 
Original Post: 

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web