Customizing GraphQL Cache Keys

WPGraphQL Smart Cache is quickly becoming the standard solution for fast and accurate data when using WPGraphQL for decoupled applications. Since launching in December 2022, it’s grown to 500+ active users!

While it solves a lot of problems for a lot of users, it’s not perfect and might require customization.

In this blog post, we’re going to look at how to customize the keys that are returned with GraphQL responses. These keys determine how GraphQL documents are tagged in the cache, and therefore which events will purge a cached document.

How WPGraphQL Smart Cache Invalidation works

Before we dive into customizing WPGraphQL Smart Cache keys, let’s take a look at how caching and cache invalidation work with WPGraphQL Smart Cache.

Understanding how GraphQL Queries are Cached and “Tagged”

When a query is made to a WPGraphQL-powered endpoint, WPGraphQL returns headers that caching clients can use to tag the cached document.

To see this in action, I will use a local WordPress install (powered by Local). In this WordPress install, I have the following plugins installed and activated:

In the WordPress dashboard, I will navigate to the GraphiQL IDE.

I will open Chrome’s dev tools and select the “Network” tab.

Then, in the GraphiQL IDE, I will execute the following query:

query GetPosts {
  posts {
    nodes {
      title
      uri
    }
  }
}

And I see the following response:

Screenshot of the GraphiQL IDE executing a query for posts and showing the JSON response from the query.

With the Network tools open, we can click on the request and inspect the headers:

Here, we see the X-GraphQL-Keys header with a list of keys. If this query were made via a GET request to a site on a host that supports it, the response would be cached and “tagged” with these keys.

For this particular query, we see the following keys:

  • 4382426a7bebd62479da59a06d90ceb12e02967d342afb7518632e26b95acc6f (hash of the query)
  • graphql:Query (operation type)
  • operation:GetPosts (operation name)
  • list:post (identifies the query asked for a list of posts)
  • cG9zdDox (ID of each node that was resolved; in this query response, only 1 node was resolved)

The document will be tagged with these keys, and in turn will be purged (deleted) from the cache if a purge event for one of those tags is triggered.
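
To make the different kinds of keys concrete, here is a minimal PHP sketch that splits an X-GraphQL-Keys header value into its keys and labels each one. The function name describe_graphql_keys is hypothetical and not part of WPGraphQL. The sketch assumes the header value is a space-separated string, that the query hash is a 64-character hex string (as in the example above), and that any remaining key is a base64-encoded node ID.

```php
<?php
// Hypothetical helper (not part of WPGraphQL): label each key found in an
// X-GraphQL-Keys header value. Assumes keys are separated by whitespace.
function describe_graphql_keys( string $header_value ): array {
    $described = [];
    $keys      = preg_split( '/\s+/', trim( $header_value ), -1, PREG_SPLIT_NO_EMPTY );

    foreach ( $keys as $key ) {
        if ( preg_match( '/^[0-9a-f]{64}$/', $key ) ) {
            $described[ $key ] = 'query hash';
        } elseif ( 'graphql:' === substr( $key, 0, 8 ) ) {
            $described[ $key ] = 'operation type';
        } elseif ( 'operation:' === substr( $key, 0, 10 ) ) {
            $described[ $key ] = 'operation name';
        } elseif ( 'list:' === substr( $key, 0, 5 ) ) {
            $described[ $key ] = 'list of nodes';
        } else {
            // Assumption: anything else is a base64-encoded node ID.
            $decoded           = base64_decode( $key, true );
            $described[ $key ] = false !== $decoded ? 'node id (' . $decoded . ')' : 'unknown';
        }
    }

    return $described;
}

$keys = describe_graphql_keys(
    '4382426a7bebd62479da59a06d90ceb12e02967d342afb7518632e26b95acc6f graphql:Query operation:GetPosts list:post cG9zdDox'
);
// $keys['cG9zdDox'] is 'node id (post:1)'
```

Decoding cG9zdDox gives post:1, which is why updating or deleting that one post is enough to identify the cached documents that contain it.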

Understanding Cache Invalidation with WPGraphQL Smart Cache

Above, we looked at how supported hosts will cache GraphQL responses and use the X-GraphQL-Keys to “tag” the cached document.

Now let’s take a look at how WPGraphQL Smart Cache invalidates, or “purges” these tagged documents from the cache.

WPGraphQL Smart Cache listens to events that occur in WordPress, and in response to these events, calls “purge” on specific “tags”, deleting any document from the cache that was tagged with said tag.

The simplified summary of the WPGraphQL Smart Cache invalidation strategy is as follows:

  • Publish events call purge( 'list:$type_name' )
  • Update events call purge( '$nodeId' )
  • Delete events call purge( '$nodeId' )
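
The summary above can be sketched as a small lookup. This is an illustrative model only, not the plugin’s implementation: the function name purge_key_for_event and the event names publish, update and delete are hypothetical.

```php
<?php
// Illustrative model of the simplified strategy above. The function and the
// event names are hypothetical; the plugin's real logic is more involved.
function purge_key_for_event( string $event, string $type_name, string $node_id ): string {
    switch ( $event ) {
        case 'publish':
            // A newly published node could belong in any list of its type.
            return 'list:' . $type_name;
        case 'update':
        case 'delete':
            // Only documents that contain this specific node are affected.
            return $node_id;
        default:
            throw new InvalidArgumentException( "Unknown event: {$event}" );
    }
}

echo purge_key_for_event( 'publish', 'post', 'cG9zdDox' ), PHP_EOL; // list:post
echo purge_key_for_event( 'update', 'post', 'cG9zdDox' ), PHP_EOL;  // cG9zdDox
```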

How Cache Invalidation and Cache Tags work together

Now that we have an idea of how cached documents are tagged, and how events in WordPress call “purge” on different tags, let’s bring it all together.

Above we identified the query keys that the document will be tagged with. And now that we understand the cache invalidation strategy, we know that the GetPosts query will be purged from the cache whenever the following events occur:

  • A new post is published ( purge( 'list:post' ) )
  • The node identified by id cG9zdDox (the “hello world” post) is updated or deleted
  • The “purge cache” button on the GraphQL > Settings > Cache page is clicked ( purge( 'graphql:Query' ) )
  • Any custom purge event that was manually added for the operation name or query hash is triggered
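
To see tagging and purging work together, here is a minimal in-memory sketch of a tag-based cache. The TaggedCache class is hypothetical and only models the behavior described in this post; supported hosts implement this at their own caching layer.

```php
<?php
// Hypothetical in-memory model of a tag-based cache; not a WPGraphQL class.
final class TaggedCache {
    /** @var array<string, array{body: string, tags: string[]}> */
    private $documents = [];

    // Store a response body under an ID, tagged with its X-GraphQL-Keys values.
    public function store( string $id, string $body, array $tags ): void {
        $this->documents[ $id ] = [ 'body' => $body, 'tags' => $tags ];
    }

    public function has( string $id ): bool {
        return isset( $this->documents[ $id ] );
    }

    // Delete every document tagged with $tag and return the number removed.
    public function purge( string $tag ): int {
        $before          = count( $this->documents );
        $this->documents = array_filter(
            $this->documents,
            static function ( array $document ) use ( $tag ): bool {
                return ! in_array( $tag, $document['tags'], true );
            }
        );

        return $before - count( $this->documents );
    }
}

$cache = new TaggedCache();
$cache->store(
    'GetPosts',
    '{"data":{}}',
    [ 'graphql:Query', 'operation:GetPosts', 'list:post', 'cG9zdDox' ]
);

$removed_by_tag  = $cache->purge( 'list:tag' );  // No document carries this tag.
$removed_by_post = $cache->purge( 'list:post' ); // Publishing a post evicts GetPosts.
```

In this model, purging list:tag leaves the GetPosts document alone because it was never tagged with that key, while purging list:post removes it.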

So far, so good! This all seems reasonable. I think we would probably all want the cache for this query to be invalidated after we updated the “Hello World” post. We would want to see the updated data in the query response.

The Problem

The problem, however, is that the keys added to the response headers might be broader than we would like, so we might need to customize them.

For example, let’s take a look at a more complicated query:

query GetPostsWithCategoriesAndTags {
  posts {
    nodes {
      id
      title
      categories {
        nodes {
          id
          name
        }
      }
      tags {
        nodes {
          id
          name
        }
      }
    }
  }
}

In this query, we’ve asked for a list of posts, a list of categories belonging to each post and a list of tags belonging to each post.

In the response, we see that we have posts, categories and tags (only 1 of each in this case, as we’re on a simple test site with minimal data). And if we inspect the X-GraphQL-Keys header, we will see that the document is tagged with list:post, list:category and list:tag. This is because we asked for each of these types of nodes, and when a new post, category or tag is made public, it could be part of the list, so the cache is invalidated. Like magic, we could run the query again, get a cache miss and see fresh content, such as the newly published post.

This all sounds good still, but there is a problem.

The problem is that the list:category and list:tag keys will cause this document to be purged a lot more than it should be.

For example, creating a new tag and assigning it to a post that’s not shown in this query’s results will trigger purge( 'list:tag' ) and invalidate this query, leading to more cache invalidations than we probably want.

Ideally, WPGraphQL and WPGraphQL Smart Cache will eventually be able to better identify when to output the list:$type keys and when not to. For now, this is how things work, but there are ways to override it for your specific needs.

Filtering the X-GraphQL-Keys

If we want the GetPostsWithCategoriesAndTags query to NOT output the list:tag and list:category keys, we can filter the keys like so:

// Filter the keys that are returned by the Query Analyzer
add_filter( 'graphql_query_analyzer_graphql_keys', function( $graphql_keys, $return_keys, $skipped_keys, $return_keys_array, $skipped_keys_array ) {

    // Convert the keys from a space-separated string to an array
    $keys_array = explode( ' ', $return_keys );

    // Only apply the filter to the "GetPostsWithCategoriesAndTags" query
    if ( empty( $keys_array ) || ! in_array( 'operation:GetPostsWithCategoriesAndTags', $keys_array, true ) ) {
        return $graphql_keys;
    }

    // Remove the "list:tag" key from the headers
    if ( ( $key = array_search( 'list:tag', $keys_array, true ) ) !== false ) {
        unset( $keys_array[ $key ] );
    }

    // Remove the "list:category" key from the headers
    if ( ( $key = array_search( 'list:category', $keys_array, true ) ) !== false ) {
        unset( $keys_array[ $key ] );
    }

    // Convert the array of keys back to a space-separated string
    $graphql_keys['keys'] = implode( ' ', $keys_array );

    // Return the "filtered" $graphql_keys
    return $graphql_keys;

}, 10, 5 );

In the snippet above, we target the query with the operation name “GetPostsWithCategoriesAndTags” and, for that query, remove list:tag and list:category from the returned keys.

Now, we can execute the same query and inspect the headers, and we will see that list:tag and list:category are both no longer output in the X-GraphQL-Keys header.

This means that publishing new categories and tags, which triggers purge( 'list:category' ) and purge( 'list:tag' ), will not purge this document.

Success!

Now we’re getting the benefits of cached GraphQL documents, and the document is still invalidated when the post is updated or deleted, but it stays cached when categories or tags are created.
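
If several queries need different keys removed, the same idea can be factored into a reusable helper. The sketch below is one possible approach, not an official API: the function name remove_graphql_keys and the $rules map are hypothetical, and it assumes, as in the filter above, that the keys arrive as a space-separated string and are written back to $graphql_keys['keys'].

```php
<?php
// Hypothetical helper: drop unwanted keys from a space-separated key string
// when the string contains a matching "operation:<name>" key.
function remove_graphql_keys( string $keys, array $rules ): string {
    $keys_array = explode( ' ', $keys );

    foreach ( $rules as $operation_name => $keys_to_remove ) {
        if ( in_array( 'operation:' . $operation_name, $keys_array, true ) ) {
            // array_diff() keeps every key that is not in $keys_to_remove.
            $keys_array = array_diff( $keys_array, $keys_to_remove );
        }
    }

    return implode( ' ', $keys_array );
}

// Map of operation name => keys to strip for that operation.
$rules = [
    'GetPostsWithCategoriesAndTags' => [ 'list:tag', 'list:category' ],
];

// Only register the filter when running inside WordPress.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'graphql_query_analyzer_graphql_keys', function ( $graphql_keys, $return_keys ) use ( $rules ) {
        $graphql_keys['keys'] = remove_graphql_keys( $return_keys, $rules );
        return $graphql_keys;
    }, 10, 2 );
}

$filtered = remove_graphql_keys(
    'graphql:Query operation:GetPostsWithCategoriesAndTags list:post list:category list:tag cG9zdDox',
    $rules
);
// $filtered is "graphql:Query operation:GetPostsWithCategoriesAndTags list:post cG9zdDox"
```

Keeping the rules in one array makes it easier to review which operations have had keys removed, since every removed key is an invalidation you are choosing to give up.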

Conclusion

While we ultimately believe WPGraphQL Smart Cache should be, well, smarter, it might take some time to find that perfect solution that works for everyone.

In the meantime, filters like the one demonstrated above can help you tailor your WPGraphQL cache tagging and invalidation strategy to fit your specific project’s needs.

Published by Jason Bahl

Jason is a Principal Software Engineer at WP Engine based in Denver, CO where he maintains WPGraphQL. When he's not writing code or evangelizing about GraphQL to the world, he enjoys escaping from escape rooms, playing soccer, board games and Fortnite.
