Response Caching

Dramatically improve performance with in-memory GraphQL response caching.

Enabling Caching

use grpc_graphql_gateway::{Gateway, CacheConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_response_cache(CacheConfig {
        max_size: 10_000,                              // Max cached responses
        default_ttl: Duration::from_secs(60),          // 1 minute TTL
        stale_while_revalidate: Some(Duration::from_secs(30)),
        invalidate_on_mutation: true,
    })
    .build()?;

Configuration Options

Option	Type	Description
`max_size`	usize	Maximum number of cached responses
`default_ttl`	Duration	Time before entries expire
`stale_while_revalidate`	Option<Duration>	Serve stale content while refreshing
`invalidate_on_mutation`	bool	Clear cache on mutations
`redis_url`	Option<String>	Redis connection URL for distributed caching
`vary_headers`	Vec<String>	Headers to include in cache key (default: `["Authorization"]`)

Distributed Caching (Redis)

Use Redis for shared caching across multiple gateway instances:

let gateway = Gateway::builder()
    .with_response_cache(CacheConfig {
        redis_url: Some("redis://127.0.0.1:6379".to_string()),
        default_ttl: Duration::from_secs(60),
        ..Default::default()
    })
    .build()?;

Vary Headers

By default, the cache key includes the Authorization header to prevent leaking user data. You can configure which headers affect the cache key:

CacheConfig {
    // Cache per user and per tenant
    vary_headers: vec!["Authorization".to_string(), "X-Tenant-ID".to_string()],
    ..Default::default()
}

How It Works

First Query: Cache miss → Execute gRPC → Cache response → Return
Second Query: Cache hit → Return cached response immediately (<1ms)
Mutation: Execute mutation → Invalidate related cache entries
Next Query: Cache miss (invalidated) → Execute gRPC → Cache → Return

What Gets Cached

Operation	Cached?	Triggers Invalidation?
Query	✅ Yes	No
Mutation	❌ No	✅ Yes
Subscription	❌ No	No

Cache Key Generation

The cache key is a SHA-256 hash of:

Normalized query string
Sorted variables JSON
Operation name (if provided)

Stale-While-Revalidate

Serve stale content immediately while refreshing in the background:

CacheConfig {
    default_ttl: Duration::from_secs(60),
    stale_while_revalidate: Some(Duration::from_secs(30)),
    ..Default::default()
}

Timeline:

0-60s: Fresh content served
60-90s: Stale content served, refresh triggered
90s+: Cache miss, fresh fetch

Mutation Invalidation

When invalidate_on_mutation: true:

// This mutation invalidates cache
mutation { updateUser(id: "123", name: "Alice") { id name } }

// Subsequent queries fetch fresh data
query { user(id: "123") { id name } }

Testing with curl

# 1. First query - cache miss
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

# 2. Same query - cache hit (instant)
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

# 3. Mutation - invalidates cache
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "mutation { updateUser(id: \"123\", name: \"Bob\") { name } }"}'

# 4. Query again - cache miss (fresh data)
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

Performance Impact

Cache hits: <1ms response time
10-100x fewer gRPC backend calls
Significant reduction in backend load

Keyboard shortcuts

gRPC-GraphQL Gateway