Split cache

We want the highest possible cache hit ratio, but the data to be cached is large. When it is stored as a single JSON document, some entity within it keeps changing and invalidates the whole entry. For example, consider the following cache structure:

checkout_page_cache

{
  "shopping_cart": {...},
  "orders": {...},
  "user": {...}
}

With this structure, the cache is invalidated whenever a product in the shopping cart, an order status, or the user profile changes. More invalidations mean more cache misses and worse performance.

On average, the shopping cart changes far more often than the user profile. If we split the cache into separate parts, an update to the shopping cart no longer requires rebuilding the user JSON. This not only increases cache hits but also reduces the DB queries needed to fetch related entities. So instead of a single checkout_page_cache, we have three separate ones:

shopping_cart_cache
orders_cache
users_cache
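
As a rough sketch of the split, the snippet below builds the checkout page from three independently cached parts. MemoryCache is a hypothetical in-memory stand-in for a real cache store, and the key format and sample values are assumptions made for illustration only.

```ruby
# Hypothetical in-memory stand-in for a real cache store.
class MemoryCache
  attr_reader :misses

  def initialize
    @data = {}
    @misses = Hash.new(0) # miss count per key
  end

  # Return the cached value, or compute and store it on a miss.
  def fetch(key)
    return @data[key] if @data.key?(key)

    @misses[key] += 1
    @data[key] = yield
  end

  def delete(key)
    @data.delete(key)
  end
end

cache = MemoryCache.new
user_id = 42

# Each part has its own key, so each part is invalidated on its own.
load_checkout = lambda do
  {
    shopping_cart: cache.fetch("shopping_cart_cache:#{user_id}") { { items: [123] } },
    orders: cache.fetch("orders_cache:#{user_id}") { { count: 2 } },
    user: cache.fetch("users_cache:#{user_id}") { { name: "Jane" } }
  }
end

load_checkout.call                             # cold start: three misses
cache.delete("shopping_cart_cache:#{user_id}") # the cart changed
load_checkout.call                             # only the cart is rebuilt
```

After the second call, only the shopping cart key has missed twice; the orders and user entries were served from cache.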

Separate cache store

Applications generally use Redis for more than caching, e.g. storing background jobs for Sidekiq. The recommended configuration for a cache store usually differs from what these tools need: Sidekiq recommends the noeviction policy, whereas a cache should always have an eviction policy. Keeping the cache in a separate Redis instance also gives better separation of concerns and avoids a single point of failure.

Configuration

All cache keys should have an expiry. The Redis instance should have maxmemory-policy (the eviction policy) set so that it does not run out of memory; the Redis documentation lists all supported eviction policies and their details. Setting read and write timeouts is also recommended. More information on configuring a Redis cache for Rails is available in the Rails caching guide.
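
A minimal sketch of these settings for Rails' :redis_cache_store follows. The REDIS_CACHE_URL variable name, the fallback URL, and every numeric value are assumptions chosen for illustration, not recommendations.

```ruby
# Options for Rails' :redis_cache_store (all values are illustrative).
CACHE_STORE_OPTIONS = {
  # Dedicated cache Redis, separate from the one Sidekiq uses.
  # REDIS_CACHE_URL is an assumed environment variable name.
  url: ENV.fetch("REDIS_CACHE_URL", "redis://localhost:6380/0"),
  expires_in: 60 * 60, # default TTL in seconds, so every key expires
  connect_timeout: 1,  # seconds
  read_timeout: 0.5,   # seconds
  write_timeout: 0.5   # seconds
}.freeze

# In config/environments/production.rb:
#   config.cache_store = :redis_cache_store, CACHE_STORE_OPTIONS
#
# In the cache instance's redis.conf (pick values that fit your data):
#   maxmemory 512mb
#   maxmemory-policy allkeys-lru
```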

Versioning

Adding versions makes cache schema changes easier. For example, consider this shopping cart cache:

[
  {
    "product_id": 123,
    "quantity": 1
  },
  {
    "product_id": 129,
    "quantity": 2
  }
]

Suppose we decide to store basic product info (name and thumbnail) to save DB queries:

[
  {
    "product": {
      "id": 123,
      "name": "Water Bottle",
      "thumbnail": "https://example.com/water-bottle.jpg"
    },
    "quantity": 1
  },
  {...}
]

Such a change makes the cache inconsistent, and the code needs workarounds to handle both cases (existing and new cache entries). Adding a version makes this easier to handle and debug. It can be done in a few different ways:

1. Adding the version to the JSON body:

[
  {
    "product": {
      "id": 123,
      ...
    },
    "quantity": 1,
    "version": 1
  },
  {
    "product": {
      "id": 129,
      ...
    },
    "quantity": 1,
    "version": 1
  }
]

Clearly this is not the best way, since it requires storing the version in multiple places (especially with arrays, as in our example) and increases the cache size.

2. Using Redis databases:

If our cache store is Redis, we can switch to a new DB when we change the schema. This has a drawback: we start fresh on the new DB, so the application sees heavy cache misses and DB load for a brief period during the transition.

3. Adding the version to the cache key:

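We can add a version number to the cache key in addition to the entity ID, e.g. user-123-v1. Here's a generic class to make it easier:

require "redis"

class CacheStore
  VERSION = 'v1'.freeze

  # Read the value stored under the versioned key (nil on a miss).
  def self.get(key)
    conn.get(key_name(key))
  end

  # Append the schema version, e.g. "user-123" becomes "user-123-v1".
  def self.key_name(key)
    "#{key}-#{VERSION}"
  end

  # Shared Redis connection; REDIS_CACHE_URL is an assumed variable name.
  def self.conn
    @conn ||= Redis.new(url: ENV["REDIS_CACHE_URL"])
  end
end

CacheStore.get("user-#{user.id}")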
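
To illustrate what a version bump does, here is a small sketch that uses a plain Hash as a stand-in for Redis. The versioned_key helper, the cart-42 key, and the sample data are hypothetical; the point is only that readers on the new version look up a different key, so they get a miss rather than an entry in the old shape.

```ruby
# Build a versioned cache key, e.g. "user-123-v1" (format is illustrative).
def versioned_key(key, version)
  "#{key}-v#{version}"
end

store = {} # stand-in for Redis

# Entry written under the version 1 schema: product IDs only.
store[versioned_key("cart-42", 1)] = [{ "product_id" => 123, "quantity" => 1 }]

# Code deployed with the version 2 schema reads a different key.
old_entry = store[versioned_key("cart-42", 1)] # still present, no longer read
new_entry = store[versioned_key("cart-42", 2)] # nil: a miss, rebuilt in the new shape
```

The old entry is never read again and, provided every key has an expiry, eventually drops out of the cache on its own.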

Tejas Bubane
