Cache Strategies in Distributed Systems

🤔 Why Are These Strategies Even Needed?

The Core Problem

Your system has a cache (like Redis). It stores data that is expensive to produce: DB query results, ML model outputs, user dashboards. Every cache entry has a TTL (Time To Live), after which it expires and gets deleted.

Now here's the danger moment:

The instant that cache entry expires, every request that comes in finds nothing in the cache and goes straight to the database.

If 10,000 users are hitting that same key, that's 10,000 simultaneous DB queries in one second. Your database was never designed to handle that. It collapses. 💥

🧱 Why Can't You Just… Not Expire the Cache?

Because stale data is a real problem too.

  • Product prices change 💰

  • Stock values update every second 📈

  • User balances must be accurate 🏦

You need expiry. But expiry is exactly what causes the stampede. That's the tension these strategies resolve.

🧱 Why Can't You Just Scale the Database?

Scaling costs money and time. More importantly, the stampede hits instantly. Even fast auto-scaling takes 30–60 seconds to spin up new instances, and database replicas often take longer. By the time new DB instances are ready, the damage is done and your system is already down.

You need a strategy that works at the moment of cache miss, not after.

🧱 Why Can't You Just Let Requests Retry?

This actually makes it worse. As we saw in the Thundering Herd blog, retries multiply traffic. If 10,000 clients each retry 3 times, that adds 30,000 DB queries on top of the original 10,000. The system collapses even faster. 📈💀

🧂 1. Jitter on TTL: Spread the Expiry

What it is: Adding random variation to cache expiration times so entries don't all expire simultaneously.

🏦 Real App Example: Banking Dashboard

Imagine 10,000 users have their "account summary" cached with a TTL of 60s, and the entries were all written at about the same moment (for example, right after a deploy or a cache warm-up). Without jitter, all 10,000 entries expire in the same second → your DB gets 10,000 simultaneous queries 💀

✅ With jitter (±10s), entries expire anywhere between 50s and 70s. Refreshes spread out over a 20-second window instead of landing in a single second. DB breathes easy 😌
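The banking example above can be sketched in a few lines. This is a minimal illustration, not a production recipe: the function name `jittered_ttl` and the constants are assumptions made for this sketch, and the `cache.set(...)` call in the comment stands in for whatever client your cache uses.

```python
import random

BASE_TTL_SECONDS = 60   # base TTL from the banking example
JITTER_SECONDS = 10     # the +/-10s spread from the banking example


def jittered_ttl(base=BASE_TTL_SECONDS, jitter=JITTER_SECONDS):
    """Return a TTL drawn uniformly from [base - jitter, base + jitter]."""
    return base + random.uniform(-jitter, jitter)


# Hypothetical usage with a Redis-like client object named `cache`:
#   cache.set(key, value, ex=int(jittered_ttl()))

# Simulate 10,000 entries written at the same moment.
ttls = [jittered_ttl() for _ in range(10_000)]
print(f"earliest expiry: {min(ttls):.1f}s, latest expiry: {max(ttls):.1f}s")
```

Every entry still lives for roughly a minute, but no single second carries all 10,000 expirations.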

🔐 2. Mutex Locking: One Rebuilds, Rest Wait

What it is: Only one request regenerates the expired cache. All others wait for the fresh value.

1️⃣ Cache miss detected
2️⃣ Request #1 acquires the lock 🔒
3️⃣ Request #1 fetches from DB & updates cache
4️⃣ Lock released 🔓
5️⃣ Requests #2–5000 read the fresh cached value ✅

Only 1 DB hit instead of 5,000. 🙌
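The five steps above can be sketched inside a single process as follows. This is a simplified illustration under stated assumptions: a plain dict stands in for Redis, `slow_db_query` is a made-up placeholder for the expensive query, and the locks are in-process `threading.Lock` objects. Across several app servers the lock would have to live in a shared store instead (for example a Redis `SET` with the `NX` option and an expiry).

```python
import threading
import time
from collections import defaultdict

cache = {}                                # stand-in for Redis
key_locks = defaultdict(threading.Lock)   # one lock per cache key
key_locks_guard = threading.Lock()        # protects the key_locks dict itself
db_hits = 0


def slow_db_query(key):
    """Hypothetical expensive query; counts how often the DB is hit."""
    global db_hits
    db_hits += 1          # only ever called while holding the key's lock
    time.sleep(0.05)
    return f"value-for-{key}"


def get_with_mutex(key):
    value = cache.get(key)
    if value is not None:            # cache hit: no lock needed
        return value
    with key_locks_guard:
        lock = key_locks[key]
    with lock:                       # only one thread rebuilds this key
        value = cache.get(key)       # re-check: someone may have rebuilt it
        if value is None:
            value = slow_db_query(key)
            cache[key] = value
        return value


# 50 concurrent requests for the same missing key.
threads = [
    threading.Thread(target=get_with_mutex, args=("account:42",))
    for _ in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("DB hits:", db_hits)
```

The second `cache.get` inside the lock is what turns 50 potential queries into one: waiting threads find the fresh value as soon as they acquire the lock.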

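🏦 Stock Trading: Order Execution

Scenario: User places a "Buy 100 shares of TCS" order. System must not execute it twice even if the request is sent twice due to a network failure. The same one-at-a-time idea applies: with a lock keyed on the order, only one copy of the request does the work, and the duplicate waits and then sees that the order has already been handled.

  • Money is involved, so absolute correctness is needed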

  • Duplicate orders = financial loss

🎲 3. Probabilistic Early Expiration (PER): Expire Before You Expire

What it is: PER uses a simple but effective formula derived from the XFetch algorithm. Instead of waiting for the cache to expire, each request has a probability of triggering a refresh that increases as the expiration time approaches.

📊 Analytics Dashboard

Dashboard widgets are cached for 30 seconds. If all widgets expire together → the analytics DB gets flooded.

With PER:

  • Widgets refresh at slightly different times

  • Load spreads naturally

  • Backend remains stable
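The refresh decision itself is small. The sketch below follows the commonly cited XFetch rule: recompute early when `now - delta * beta * ln(rand)` reaches the expiry time, where `delta` is how long the last recomputation took and `beta` is a tuning knob (1.0 by default). The function and parameter names here are assumptions chosen for illustration.

```python
import math
import random
import time


def should_refresh_early(expiry, delta, beta=1.0, now=None, rng=random.random):
    """XFetch-style check: True means 'refresh now, before the TTL runs out'.

    expiry: absolute time (seconds) at which the entry expires
    delta:  how long the last recomputation took (seconds)
    beta:   > 1 refreshes earlier, < 1 refreshes later
    """
    now = time.time() if now is None else now
    # rng() is in [0, 1); using 1 - rng() keeps log() away from 0.
    # log of a number in (0, 1] is <= 0, so gap is a random head start >= 0.
    gap = -delta * beta * math.log(1.0 - rng())
    return now + gap >= expiry


# Estimate how the refresh probability grows as expiry gets closer
# (entry expires at t=30, recomputation takes about 1 second).
for seconds_left in (5.0, 2.0, 1.0, 0.5):
    trials = 100_000
    hits = sum(
        should_refresh_early(expiry=30.0, delta=1.0, now=30.0 - seconds_left)
        for _ in range(trials)
    )
    print(f"{seconds_left:>4}s before expiry: ~{hits / trials:.1%} of requests refresh")
```

Because each request rolls its own random number, only a few of them refresh early, and the odds climb steeply in the last moments before expiry. A slower recomputation (larger `delta`) starts the early refreshes sooner.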

♻️ 4. Stale-While-Revalidate (SWR)

What it is: Serve old (stale) data immediately → refresh in background → next request gets fresh data.

Use SWR when your users care more about speed than perfect freshness, and when showing data that's a few seconds (or minutes) old is completely acceptable.

How it works:

  • 🗂️ Cache entry has a defined TTL (Time-To-Live)

  • When the TTL expires, instead of blocking the request:

    • Return the stale (expired) value immediately, which means the entry is kept past its TTL rather than deleted

    • Trigger a background refresh to fetch fresh data

    • Once new data is ready, update the cache with the fresh value
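A minimal in-process sketch of the flow above, assuming a hypothetical `SWRCache` class and a caller-supplied `loader` function (neither comes from a real library). Stale entries stay in the dict and are served as-is while a background thread fetches the replacement.

```python
import threading
import time


class SWRCache:
    """Toy stale-while-revalidate cache; names are illustrative only."""

    def __init__(self, loader, ttl):
        self._loader = loader          # function: key -> fresh value
        self._ttl = ttl                # seconds an entry counts as fresh
        self._data = {}                # key -> (value, fresh_until)
        self._refreshing = set()       # keys with a refresh in flight
        self._lock = threading.Lock()

    def _refresh(self, key):
        try:
            value = self._loader(key)  # slow call happens outside the lock
            with self._lock:
                self._data[key] = (value, time.monotonic() + self._ttl)
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def get(self, key):
        with self._lock:
            entry = self._data.get(key)
            if entry is not None:
                value, fresh_until = entry
                is_stale = time.monotonic() >= fresh_until
                if is_stale and key not in self._refreshing:
                    # Kick off at most one background refresh per key.
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key,), daemon=True
                    ).start()
                return value           # fresh or stale, never blocks on the DB
        # Cold miss: nothing stale to serve, so load synchronously.
        # (Concurrent cold misses are not deduplicated in this sketch.)
        self._refresh(key)
        with self._lock:
            return self._data[key][0]


# Hypothetical usage:
#   feed_cache = SWRCache(loader=load_feed_from_db, ttl=30)
#   feed = feed_cache.get("user:42:feed")
```

Only the very first request for a key ever waits on the loader. After that, readers always get an answer immediately, at the cost of it sometimes being one refresh behind.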

📰 News Feed (Reddit, Twitter): The feed renders instantly from cache, so users never wait on a refresh.

🛍️ Product Recommendations: ML recompute is slow. Show the last known recommendations instantly while the model quietly recomputes in the background.

🏁 Final Thoughts

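Cache Stampede is not a rare edge case. It's a ticking time bomb in any system that relies on caching at scale. The moment your cache key expires under heavy load, the herd is already at the door. 🐘

Each strategy attacks the problem differently: jitter spreads expiry times apart, mutex locking collapses concurrent rebuilds into one, PER refreshes entries before they expire, and SWR serves stale data while refreshing in the background.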

There is no single silver bullet. The best systems combine these strategies based on how critical freshness is, how expensive the DB call is, and how many concurrent users they serve.

Cache smart. Stay stable. Don't let the herd win. ⚑