Caching Lots of Data at Once in Drupal 7

I'm working on a Drupal project that requires calculating some custom stats for each registered user. To get it working, I ran my SQL queries right in hook_user_load(), which is wildly inefficient; it's much better to run the queries once and cache the results for a while. Drupal has nice cache functions built-in, and I thought I should use them, but wasn't totally clear on how to do it based on the documentation.

Jeff Eaton of Lullabot wrote a really helpful article on how to do basic caching in Drupal 7. It's a good introduction, but it didn't really explain how to cache a bunch of related, but separately calculated, pieces of data using a function you'd call repeatedly. The Examples module is also great, and includes a cache_example module, but it didn't have an example that was right up my alley either.

So, here's how to write a single function that can be called multiple times on the same kind of object. This is not what you'd want to do for users or other built-in entities (for those, you should use Entity Cache module). Thanks to Planet Drupal readers for the feedback and help clarifying this.

What follows is a skeleton function from a module we'll call mymodule. Here are the highlights of this function that are a little different than what Jeff's article showed:

  • Instead of using just __FUNCTION__ as my key in drupal_static(), I'm adding on the object id ($object->id). drupal_static() will statically cache values during a page load, and I want to make sure that each loaded object gets its own stats if many are loaded during a single page load. Otherwise they could all get the same results!
  • I'm setting a unique cache ID ($cid) for each cached item. Looking at the structure of Drupal's cache table, I saw that many modules use the pattern of modulename:type_of_data:id to name their cids, so that's what I used here.

And here's the actual function. The MySQL queries are left out, on the assumption that you'll fill them in with whatever you need.

<?php
function _mymodule_get_stats($object) {
    // Add the object id to the static key instead of using just __FUNCTION__,
    // since this can be called many times per page load, on a different object each time
    $object_stats = &drupal_static(__FUNCTION__ . $object->id);

    if (!isset($object_stats)) {

        // cid pattern - modulename:datatype:id
        $cid = 'mymodule:stats:' . $object->id;

        if ($cache = cache_get($cid)) {
            $object_stats = $cache->data;
        } else {
            // Do expensive stuff.  In this case, several MySQL queries

            $object_stats = array();

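            // TODO - Fill in your own SQL; db_query() needs at least a query string
            $result1 = db_query();

            if ($result1) {
                $record = $result1->fetchAssoc();
                $object_stats['key'] = $record['key'];
            }

            // TODO - Second query; store its result under a different key so it
            // doesn't overwrite the first one
            $result2 = db_query();

            if ($result2) {
                $record = $result2->fetchAssoc();
                $object_stats['other_key'] = $record['other_key'];
            }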

            // keep these stats cached for at least 15 minutes (900 seconds)
            cache_set($cid, $object_stats, 'cache', time() + 900);
        }
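    }

    // Hand the stats back to the caller, whether they came from the static
    // cache, the cache table, or the fresh queries
    return $object_stats;
}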
?>
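
The modulename:type_of_data:id pattern also helps when the cached stats need to be thrown away before they expire. Here's a minimal sketch, assuming two helper names I made up for illustration (_mymodule_clear_stats() and _mymodule_clear_all_stats()). cache_clear_all() and drupal_static_reset() are Drupal 7 core functions. The static key assumes the same __FUNCTION__ . $object->id naming used in _mymodule_get_stats() above.

<?php
/**
 * Sketch: clear the cached stats for one object.
 * This helper name is hypothetical, not part of Drupal.
 */
function _mymodule_clear_stats($object) {
    // Remove the persistent entry from the default 'cache' bin
    cache_clear_all('mymodule:stats:' . $object->id, 'cache');

    // Reset the matching static entry; the name has to match the key
    // passed to drupal_static() in _mymodule_get_stats()
    drupal_static_reset('_mymodule_get_stats' . $object->id);
}

/**
 * Sketch: clear the cached stats for every object.
 * This helper name is hypothetical, not part of Drupal.
 */
function _mymodule_clear_all_stats() {
    // With $wildcard = TRUE, every cid starting with this prefix is removed
    cache_clear_all('mymodule:stats:', 'cache', TRUE);
}
?>

The wildcard call only touches the cache table. Static entries from earlier in the same page load stay put unless each one is reset by name.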

Instead of using this pattern, you could also use cache_get_multiple(), building up an array of $cids and loading all your cached data at once. This complicates the drupal_static() bits, so for now I'm leaving this example as-is, but I may revisit it later. Drupal 7 has many functions that can load several objects at once (node_load_multiple(), user_load_multiple(), and many others), so it's a good idea to use those if you can.
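
For anyone who wants a starting point, here's a rough sketch of what the cache_get_multiple() approach might look like. It skips the drupal_static() layer entirely. The function name _mymodule_get_stats_multiple() and the helper _mymodule_calculate_stats() are both hypothetical. The helper is assumed to take an array of ids and return an array of stats keyed by id, ideally from a single query rather than one query per id.

<?php
/**
 * Sketch: load stats for many objects with one cache lookup.
 * _mymodule_calculate_stats() is a hypothetical helper, not a Drupal API.
 */
function _mymodule_get_stats_multiple(array $ids) {
    $stats = array();

    // Map each cid back to its object id
    $cids = array();
    foreach ($ids as $id) {
        $cids['mymodule:stats:' . $id] = $id;
    }

    // cache_get_multiple() takes its argument by reference and removes the
    // cids it found, so $missing ends up holding only the cache misses
    $missing = array_keys($cids);
    foreach (cache_get_multiple($missing) as $cid => $cache) {
        $stats[$cids[$cid]] = $cache->data;
    }

    if ($missing) {
        $missing_ids = array();
        foreach ($missing as $cid) {
            $missing_ids[] = $cids[$cid];
        }

        // Calculate everything that wasn't cached in one go, then cache each
        // result under its own cid for at least 15 minutes
        foreach (_mymodule_calculate_stats($missing_ids) as $id => $data) {
            cache_set('mymodule:stats:' . $id, $data, 'cache', time() + 900);
            $stats[$id] = $data;
        }
    }

    return $stats;
}
?>

The result is an array of stats keyed by object id, so a caller can loop over its objects and look each one up without triggering more queries.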

Comments

$user already h...

You could simply add information to the $user object in hook_user_load(), since that object is cached anyway (statically, and sometimes more persistently, e.g. with Entity Cache).

Joe Chellman

Thanks for the feedback. It seems this might not be the most practical example for when one should maintain their own cache. At least the overall idea still works!

(Figures. An actual human posts a comment, but there's no way for me to contact them.)

Bastien

The instructions about the use of caching are great, but you introduce a very awful drawback: you should never launch multiple SQL queries in a foreach loop. _mymodule_get_stats should take an array of users as a parameter and launch the SQL queries with an IN condition in the WHERE clause, along with a cache_get_multiple() strategy. It will raise the difficulty of the static cache, but the overall performance will be incomparably better.

Furthermore, I +1 the first comment, and you could use rules + entitycache to cache everything about the user, and clear that cache with some user events on rules (+ cache_action).

Joe Chellman

Thanks, Bastien. I've reworked this example so it leaves in the bits that are helpful, and makes better recommendations for the original use case. Much obliged!
