HyperRoute

Distributed Caching

HyperRoute includes a built-in distributed caching layer that accelerates query plan resolution, entity lookups, and persisted query retrieval. Three backends are available — use them individually or combined in a layered configuration.


Cache Backends

Memory Cache

In-process, zero-latency cache. Ideal for single-instance deployments or as the L1 layer in a layered setup.

cache:
  backend: memory
  max_entries: 10000
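
The max_entries bound implies an eviction policy once the cache fills. As a rough sketch (assuming LRU eviction — HyperRoute's actual policy is not specified here), a bounded memory cache behaves like this:

```python
from collections import OrderedDict

class BoundedMemoryCache:
    """Illustrative in-process cache with a max_entries bound (LRU eviction assumed)."""

    def __init__(self, max_entries: int = 10000):
        self.max_entries = max_entries
        self._entries: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        # Mark as recently used so hot entries survive eviction.
        self._entries.move_to_end(key)
        return self._entries[key]

    def set(self, key, value):
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict the least-recently-used entry once the bound is exceeded.
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
```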

Redis Cache

Shared cache across multiple router instances. Ensures consistency in horizontally scaled deployments.

cache:
  backend: redis
  url: redis://redis:6379
  pool_size: 20
  connection_timeout: 5s

Layered Cache (L1 + L2)

The recommended production configuration. Combines memory speed with Redis consistency:

cache:
  backend: layered

  l1:
    type: memory
    max_entries: 10000

  l2:
    type: redis
    url: ${REDIS_URL}
    pool_size: 20
    connection_timeout: 5s

How it works:

Request → L1 (Memory) → HIT  → Return instantly (~0ms)
                      → MISS → L2 (Redis) → HIT  → Populate L1, return (~1ms)
                                          → MISS → Execute, populate L1 + L2
Layer         Speed   Shared               Survives Restart
L1 (Memory)   ~0ms    No (per-instance)    No
L2 (Redis)    ~1ms    Yes (all instances)  Yes
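
The lookup flow above can be sketched as a read-through function over two dict-like layers (illustrative only — this is not HyperRoute's internal code):

```python
def layered_get(key, l1, l2, resolve):
    """Read-through lookup mirroring the L1 -> L2 -> execute flow.

    l1 and l2 are dict-like caches; resolve(key) executes the query on a full miss.
    """
    value = l1.get(key)
    if value is not None:            # L1 hit: return at memory speed
        return value
    value = l2.get(key)
    if value is not None:            # L2 hit: warm L1 for later requests
        l1[key] = value
        return value
    value = resolve(key)             # Full miss: execute upstream...
    l1[key] = value                  # ...and populate both layers
    l2[key] = value
    return value
```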

What Gets Cached

Cache Type    Key          Default TTL    Description
Query Plan    Query hash   3600s (1h)     Parsed and optimized execution plans
Entity        Entity key   300s (5m)      Resolved entity data from subgraphs
APQ           Hash         86400s (24h)   Automatic Persisted Queries
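
For illustration, a query-plan key like the "Query hash" above could be derived by hashing a normalized query string. The normalization and hash algorithm here are assumptions, not HyperRoute's documented scheme:

```python
import hashlib

def query_plan_cache_key(query: str) -> str:
    # Hypothetical derivation: collapse whitespace so formatting differences
    # map to the same plan entry, then hash the normalized text.
    normalized = " ".join(query.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```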

TTL Configuration

cache:
  ttl:
    query_plan: 3600     # 1 hour
    entity: 300          # 5 minutes
    apq: 86400           # 24 hours
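
TTL-based expiry can be sketched as storing an absolute deadline with each entry; an expired entry then behaves like a miss. Field names are illustrative:

```python
import time

def make_entry(value, ttl_seconds):
    # Store an absolute expiry alongside the value.
    return {"value": value, "expires_at": time.monotonic() + ttl_seconds}

def get_if_fresh(entry):
    # Expired (or absent) entries behave like cache misses.
    if entry is None or time.monotonic() >= entry["expires_at"]:
        return None
    return entry["value"]
```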

In-Flight Deduplication

Separate from caching, HyperRoute deduplicates identical in-flight requests. When 10,000 identical requests arrive simultaneously:

10,000 identical requests → 1 upstream call → response shared to all 10,000 clients

This protects your subgraphs from thundering herd events during traffic spikes. The hyperroute_inflight_dedup_hits_total metric tracks how often deduplication fires.
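
This deduplication pattern is often called "singleflight": the first caller for a key performs the upstream call, and concurrent callers for the same key wait and share its result. A minimal thread-based sketch (illustrative, not HyperRoute's actual implementation):

```python
import threading

class InflightDeduplicator:
    """Coalesce concurrent identical requests into one upstream call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> {"event": Event, "result": value}

    def do(self, key, fn):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                # First caller becomes the leader and performs the call.
                entry = {"event": threading.Event(), "result": None}
                self._inflight[key] = entry
                leader = True
            else:
                leader = False
        if leader:
            try:
                entry["result"] = fn()
            finally:
                with self._lock:
                    del self._inflight[key]
                entry["event"].set()
            return entry["result"]
        # Followers wait and share the leader's result, never calling upstream.
        entry["event"].wait()
        return entry["result"]
```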


Cache Metrics

Monitor cache effectiveness with built-in Prometheus metrics:

Metric                                  Type     Description
hyperroute_cache_hits_total             Counter  Cache hits (plan + entity)
hyperroute_cache_misses_total           Counter  Cache misses
hyperroute_inflight_dedup_hits_total    Counter  In-flight dedup savings

Cache hit ratio formula:

hit_ratio = hyperroute_cache_hits_total / (hyperroute_cache_hits_total + hyperroute_cache_misses_total)

A healthy production deployment typically sees >95% cache hit rates for query plans.
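
The formula is straightforward to compute from scraped counter values, with one guard for the zero-traffic case (the helper name is illustrative):

```python
def cache_hit_ratio(hits: int, misses: int) -> float:
    """Hit ratio as defined above; returns 0.0 before any traffic."""
    total = hits + misses
    return hits / total if total else 0.0
```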


Complete Cache Config

cache:
  backend: layered

  l1:
    type: memory
    max_entries: 10000

  l2:
    type: redis
    url: ${REDIS_URL}
    pool_size: 20
    connection_timeout: 5s

  ttl:
    query_plan: 3600
    entity: 300
    apq: 86400

Best Practices

  1. Always use layered caching in multi-instance deployments — L1 handles hot data at memory speed, L2 provides cross-instance consistency
  2. Size L1 appropriately — max_entries: 10000 covers most workloads; increase for APIs with many unique queries
  3. Monitor cache hit ratios — if hit rates drop below 90%, consider increasing TTLs or L1 capacity
  4. Tune entity TTL carefully — shorter TTLs mean fresher data but more subgraph load; longer TTLs reduce load but increase staleness

Next Steps