Introduction

gRPC-GraphQL Gateway

A high-performance Rust gateway that bridges gRPC services to GraphQL with full Apollo Federation v2 support.

Available on crates.io · MIT licensed

Transform your gRPC microservices into a unified GraphQL API with zero GraphQL code. This gateway dynamically generates GraphQL schemas from protobuf descriptors and routes requests to your gRPC backends via Tonic, providing a seamless bridge between gRPC and GraphQL ecosystems.

✨ Features

Core Capabilities

  • 🚀 Dynamic Schema Generation - Automatic GraphQL schema from protobuf descriptors
  • ⚡ Full Operation Support - Queries, Mutations, and Subscriptions
  • 🔌 WebSocket Subscriptions - Real-time data via GraphQL subscriptions (graphql-transport-ws protocol)
  • 📤 File Uploads - Multipart form data support for file uploads
  • 🎯 Type Safety - Leverages Rust's type system for robust schema generation

Federation & Enterprise

  • 🌐 Apollo Federation v2 - Complete federation support with entity resolution
  • 🔄 Entity Resolution - Production-ready resolver with DataLoader batching
  • 🚫 No N+1 Queries - Built-in DataLoader prevents performance issues
  • 🔗 All Federation Directives - @key, @external, @requires, @provides, @shareable
  • 📊 Batch Operations - Efficient entity resolution with automatic batching

Developer Experience

  • 🛠️ Code Generation - protoc-gen-graphql-template generates starter gateway code
  • 🔧 Middleware Support - Extensible middleware for auth, logging, and observability
  • 📝 Rich Examples - Complete working examples for all features
  • 🧪 Well Tested - Comprehensive test coverage

Production Ready

  • πŸ₯ Health Checks - /health and /ready endpoints for Kubernetes
  • πŸ“Š Prometheus Metrics - /metrics endpoint with request counts and latencies
  • πŸ”­ OpenTelemetry Tracing - Distributed tracing with GraphQL and gRPC spans
  • πŸ›‘οΈ DoS Protection - Query depth and complexity limiting
  • πŸ”’ Introspection Control - Disable schema introspection in production
  • πŸ” Query Whitelisting - Restrict to pre-approved queries (PCI-DSS compliant)
  • ⚑ Rate Limiting - Built-in rate limiting middleware
  • πŸ“¦ Automatic Persisted Queries - Reduce bandwidth with query hash caching
  • πŸ”Œ Circuit Breaker - Prevent cascading failures
  • πŸ—„οΈ Response Caching - In-memory LRU cache with TTL
  • πŸ“‹ Batch Queries - Execute multiple operations in one request
  • πŸ›‘ Graceful Shutdown - Clean shutdown with request draining
  • πŸ—œοΈ Response Compression - Automatic gzip/brotli compression
  • πŸ”€ Header Propagation - Forward HTTP headers to gRPC backends
  • 🧩 Multi-Descriptor Support - Combine multiple protobuf descriptors
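As a taste of how one of these protections works, the depth limit under DoS Protection can be pictured as a cap on query nesting. The sketch below is illustrative only; a real limiter, including the gateway's, would walk the parsed query rather than count braces in the raw string:

```rust
/// Naive depth estimate for a GraphQL query string: the deepest `{`
/// nesting level. Purely illustrative; production limiters walk the
/// parsed AST instead of counting braces (string literals containing
/// braces would fool this version).
fn max_depth(query: &str) -> usize {
    let (mut depth, mut max) = (0usize, 0usize);
    for c in query.chars() {
        match c {
            '{' => {
                depth += 1;
                max = max.max(depth);
            }
            '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max
}

fn main() {
    let query = "{ user(id: \"1\") { friends { name } } }";
    let limit = 10;
    // A request nesting deeper than the limit is rejected up front,
    // before any gRPC call is made.
    assert!(max_depth(query) <= limit);
}
```

Complexity limiting follows the same shape, with a weighted count of selected fields instead of raw depth.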

Why gRPC-GraphQL Gateway?

If you have existing gRPC microservices and want to expose them via GraphQL without writing GraphQL resolvers manually, this gateway is for you. It:

  1. Reads your protobuf definitions - Including custom GraphQL annotations
  2. Generates a GraphQL schema automatically - Types, queries, mutations, subscriptions
  3. Routes requests to your gRPC backends - With full async/await support
  4. Supports federation - Build a unified supergraph from multiple services

Quick Example

use grpc_graphql_gateway::{Gateway, GrpcClient};

const DESCRIPTORS: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/descriptor.bin"));

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gateway = Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .add_grpc_client(
            "greeter.Greeter",
            GrpcClient::builder("http://127.0.0.1:50051").connect_lazy()?,
        )
        .build()?;

    gateway.serve("0.0.0.0:8888").await?;
    Ok(())
}

That's it! Your gateway is now running at:

  • GraphQL HTTP: http://localhost:8888/graphql
  • GraphQL WebSocket: ws://localhost:8888/graphql/ws

Getting Started

Ready to dive in? Start with the Installation guide.

Installation

Add to Cargo.toml

[dependencies]
grpc_graphql_gateway = "0.2"
tokio = { version = "1", features = ["full"] }
tonic = "0.12"

Optional Features

The gateway supports optional features that can be enabled in Cargo.toml:

[dependencies]
grpc_graphql_gateway = { version = "0.2", features = ["otlp"] }

Feature    Description
otlp       Enable OpenTelemetry Protocol export for distributed tracing

Prerequisites

Before using the gateway, ensure you have:

  1. Rust 1.70+ - The gateway uses modern Rust features
  2. Protobuf Compiler - protoc for generating descriptor files
  3. gRPC Services - Backend services to proxy requests to

Installing protoc

macOS

brew install protobuf

Ubuntu/Debian

sudo apt-get install protobuf-compiler

Windows

Download from the protobuf releases page.

Proto Annotations

To use the gateway, your .proto files need GraphQL annotations. Copy the graphql.proto file from the repository:

curl -o proto/graphql.proto https://raw.githubusercontent.com/Protocol-Lattice/grpc_graphql_gateway/main/proto/graphql.proto

This file defines the custom options like graphql.service, graphql.schema, graphql.field, and graphql.entity that the gateway uses to generate the GraphQL schema.

Next Steps

Once installed, proceed to the Quick Start guide to create your first gateway.

Quick Start

This guide will get you up and running with a basic gRPC-GraphQL gateway in minutes.

Basic Gateway

use grpc_graphql_gateway::{Gateway, GrpcClient};

const DESCRIPTORS: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/graphql_descriptor.bin"));

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gateway = Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .add_grpc_client(
            "greeter.Greeter",
            GrpcClient::builder("http://127.0.0.1:50051").connect_lazy()?,
        )
        .build()?;

    gateway.serve("0.0.0.0:8888").await?;
    Ok(())
}

What This Does

  1. Loads protobuf descriptors - The binary descriptor file contains your service definitions
  2. Connects to gRPC backend - Lazily connects to your gRPC service
  3. Generates GraphQL schema - Automatically creates types, queries, and mutations
  4. Starts HTTP server - Serves GraphQL at /graphql

Endpoints

Once running, your gateway exposes:

Endpoint                         Description
http://localhost:8888/graphql    GraphQL HTTP endpoint (POST)
ws://localhost:8888/graphql/ws   GraphQL WebSocket for subscriptions

Testing Your Gateway

Using curl

curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ sayHello(name: \"World\") { message } }"}'

Using GraphQL Playground

The gateway includes a built-in GraphQL Playground. Open your browser and navigate to:

http://localhost:8888/graphql

Example Proto File

Here's a simple proto file that works with the gateway:

syntax = "proto3";

package greeter;

import "graphql.proto";

service Greeter {
  option (graphql.service) = {
    host: "localhost:50051"
    insecure: true
  };

  rpc SayHello(HelloRequest) returns (HelloReply) {
    option (graphql.schema) = {
      type: QUERY
      name: "sayHello"
    };
  }
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}

Next Steps

Continue with Generating Descriptors to learn how to produce the descriptor files the gateway loads.

Generating Descriptors

The gateway reads protobuf descriptor files (.bin) to understand your service definitions. This page explains how to generate them.

Add a build.rs file to your project:

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let out_dir = std::env::var("OUT_DIR")?;
    
    tonic_build::configure()
        .build_server(false)
        .build_client(false)
        .file_descriptor_set_path(
            std::path::PathBuf::from(&out_dir).join("graphql_descriptor.bin")
        )
        .compile_protos(&["proto/your_service.proto"], &["proto"])?;
    
    Ok(())
}

Build Dependencies

Add to your Cargo.toml:

[build-dependencies]
tonic-build = "0.12"

Loading Descriptors

In your main code, load the generated descriptor:

const DESCRIPTORS: &[u8] = include_bytes!(concat!(env!("OUT_DIR"), "/graphql_descriptor.bin"));

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .build()?;

Using protoc Directly

You can also generate descriptors using protoc directly:

protoc \
  --descriptor_set_out=descriptor.bin \
  --include_imports \
  --include_source_info \
  -I proto \
  proto/your_service.proto

Then load it from a file:

let gateway = Gateway::builder()
    .with_descriptor_set_file("descriptor.bin")?
    .build()?;

Multiple Proto Files

If you have multiple proto files, include them all:

tonic_build::configure()
    .file_descriptor_set_path(
        std::path::PathBuf::from(&out_dir).join("descriptor.bin")
    )
    .compile_protos(
        &[
            "proto/users.proto",
            "proto/products.proto",
            "proto/orders.proto",
        ],
        &["proto"]
    )?;

Multi-Descriptor Support

For microservice architectures where each team owns their proto files, you can combine multiple descriptor sets:

const USERS_DESCRIPTORS: &[u8] = include_bytes!("path/to/users.bin");
const PRODUCTS_DESCRIPTORS: &[u8] = include_bytes!("path/to/products.bin");

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(USERS_DESCRIPTORS)
    .add_descriptor_set_bytes(PRODUCTS_DESCRIPTORS)
    .build()?;

See Multi-Descriptor Support for more details.

Required Proto Imports

Your proto files must import the GraphQL annotations:

import "graphql.proto";

Make sure graphql.proto is in your include path when compiling.

Troubleshooting

Missing graphql.schema extension

If you see this error:

missing graphql.schema extension

Ensure that:

  1. graphql.proto is included in your proto compilation
  2. You're using --include_imports with protoc
  3. Your tonic-build includes all necessary proto files

Descriptor file not found

If the descriptor file isn't found at runtime:

  1. Check that OUT_DIR is set correctly
  2. Verify the file was generated during build
  3. Use cargo clean && cargo build to regenerate

Queries, Mutations & Subscriptions

The gateway supports all three GraphQL operation types, automatically derived from your protobuf service definitions.

Annotating Proto Methods

Use the graphql.schema option to define how each RPC method maps to GraphQL:

service UserService {
  option (graphql.service) = {
    host: "localhost:50051"
    insecure: true
  };

  // Query - for fetching data
  rpc GetUser(GetUserRequest) returns (User) {
    option (graphql.schema) = {
      type: QUERY
      name: "user"
    };
  }

  // Mutation - for modifying data
  rpc CreateUser(CreateUserRequest) returns (User) {
    option (graphql.schema) = {
      type: MUTATION
      name: "createUser"
      request { name: "input" }
    };
  }

  // Subscription - for real-time data (server streaming)
  rpc WatchUser(WatchUserRequest) returns (stream User) {
    option (graphql.schema) = {
      type: SUBSCRIPTION
      name: "userUpdates"
    };
  }
}

Operation Type Mapping

Proto RPC Type      GraphQL Type        Use Case
Unary               Query / Mutation    Fetch or modify data
Server Streaming    Subscription        Real-time updates
Client Streaming    Not supported       -
Bidirectional       Not supported       -

Queries

Queries are used for fetching data:

query {
  user(id: "123") {
    id
    name
    email
  }
}

Query Example

Proto:

rpc GetUser(GetUserRequest) returns (User) {
  option (graphql.schema) = {
    type: QUERY
    name: "user"
  };
}

GraphQL:

query GetUser {
  user(id: "123") {
    id
    name
    email
  }
}

Mutations

Mutations are used for creating, updating, or deleting data:

mutation {
  createUser(input: { name: "Alice", email: "alice@example.com" }) {
    id
    name
  }
}

Using Input Types

The request option customizes how the request message is exposed:

rpc CreateUser(CreateUserRequest) returns (User) {
  option (graphql.schema) = {
    type: MUTATION
    name: "createUser"
    request { name: "input" }  // Wrap request fields under "input"
  };
}

This creates a GraphQL mutation with an input argument containing all fields from CreateUserRequest.

Subscriptions

Subscriptions provide real-time updates via WebSocket:

subscription {
  userUpdates(id: "123") {
    id
    name
    status
  }
}

WebSocket Protocol

The gateway supports the graphql-transport-ws protocol. Connect to:

ws://localhost:8888/graphql/ws

Subscription Example

Proto (server streaming RPC):

rpc WatchUser(WatchUserRequest) returns (stream User) {
  option (graphql.schema) = {
    type: SUBSCRIPTION
    name: "userUpdates"
  };
}

JavaScript Client:

import { createClient } from 'graphql-ws';

const client = createClient({
  url: 'ws://localhost:8888/graphql/ws',
});

client.subscribe(
  {
    query: 'subscription { userUpdates(id: "123") { id name status } }',
  },
  {
    next: (data) => console.log('Update:', data),
    error: (err) => console.error('Error:', err),
    complete: () => console.log('Complete'),
  }
);

Multiple Operations

You can run multiple operations in a single request using Batch Queries:

[
  {"query": "{ users { id name } }"},
  {"query": "{ products { upc price } }"}
]

File Uploads

The gateway automatically supports GraphQL file uploads via multipart requests, following the GraphQL multipart request specification.

Proto Definition

Map bytes fields to handle file uploads:

message UploadAvatarRequest {
  string user_id = 1;
  bytes avatar = 2;  // Maps to Upload scalar in GraphQL
}

message UploadAvatarResponse {
  string user_id = 1;
  int64 size = 2;
}

service UserService {
  rpc UploadAvatar(UploadAvatarRequest) returns (UploadAvatarResponse) {
    option (graphql.schema) = {
      type: MUTATION
      name: "uploadAvatar"
      request { name: "input" }
    };
  }
}

GraphQL Mutation

The generated GraphQL schema includes an Upload scalar:

mutation UploadAvatar($file: Upload!) {
  uploadAvatar(input: { userId: "123", avatar: $file }) {
    userId
    size
  }
}

Using curl

curl http://localhost:8888/graphql \
  --form 'operations={"query": "mutation($file: Upload!) { uploadAvatar(input:{userId:\"123\", avatar:$file}) { userId size } }", "variables": {"file": null}}' \
  --form 'map={"0": ["variables.file"]}' \
  --form '0=@avatar.png'

Request Format

  1. operations - JSON containing the query and variables
  2. map - Maps file indices to variable paths
  3. 0, 1, … - The actual file content
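To make that layout concrete, here is a dependency-free sketch that assembles such a body by hand. A real client should use an HTTP library instead, but this shows exactly where the operations, map, and file parts sit:

```rust
/// Assemble a GraphQL multipart request body by hand, purely to show
/// the layout of the `operations`, `map`, and file parts.
fn multipart_body(boundary: &str, operations: &str, map: &str, file: (&str, &[u8])) -> Vec<u8> {
    let mut body = Vec::new();
    let mut part = |name: &str, filename: Option<&str>, data: &[u8]| {
        body.extend_from_slice(format!("--{boundary}\r\n").as_bytes());
        let disposition = match filename {
            Some(f) => format!("Content-Disposition: form-data; name=\"{name}\"; filename=\"{f}\"\r\n\r\n"),
            None => format!("Content-Disposition: form-data; name=\"{name}\"\r\n\r\n"),
        };
        body.extend_from_slice(disposition.as_bytes());
        body.extend_from_slice(data);
        body.extend_from_slice(b"\r\n");
    };
    part("operations", None, operations.as_bytes()); // 1. query + variables
    part("map", None, map.as_bytes());               // 2. index -> variable path
    part("0", Some(file.0), file.1);                 // 3. the file content
    body.extend_from_slice(format!("--{boundary}--\r\n").as_bytes());
    body
}

fn main() {
    let ops = r#"{"query":"mutation($file: Upload!) { uploadAvatar(input:{userId:\"123\", avatar:$file}) { size } }","variables":{"file":null}}"#;
    let map = r#"{"0":["variables.file"]}"#;
    let body = multipart_body("----gateway", ops, map, ("avatar.png", b"fake bytes".as_slice()));
    assert!(String::from_utf8_lossy(&body).contains("filename=\"avatar.png\""));
}
```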

JavaScript Client

Using Apollo Client with apollo-upload-client:

import { createUploadLink } from 'apollo-upload-client';
import { ApolloClient, InMemoryCache, gql } from '@apollo/client';

const client = new ApolloClient({
  link: createUploadLink({ uri: 'http://localhost:8888/graphql' }),
  cache: new InMemoryCache(),
});

// Upload mutation
const UPLOAD_AVATAR = gql`
  mutation UploadAvatar($file: Upload!) {
    uploadAvatar(input: { userId: "123", avatar: $file }) {
      userId
      size
    }
  }
`;

// Trigger upload
const file = document.querySelector('input[type="file"]').files[0];
client.mutate({
  mutation: UPLOAD_AVATAR,
  variables: { file },
});

Multiple Files

Upload multiple files by adding more entries to the map:

curl http://localhost:8888/graphql \
  --form 'operations={"query": "mutation($files: [Upload!]!) { uploadFiles(files: $files) { count } }", "variables": {"files": [null, null]}}' \
  --form 'map={"0": ["variables.files.0"], "1": ["variables.files.1"]}' \
  --form '0=@file1.pdf' \
  --form '1=@file2.pdf'

File Size Limits

By default, uploads are limited by your web server configuration. For large files, consider:

  1. Streaming uploads to avoid memory pressure
  2. Setting appropriate timeouts
  3. Using a CDN or object storage for very large files

Backend Handling

On the gRPC backend, the file is received as bytes. Example in Rust:

async fn upload_avatar(
    &self,
    request: Request<UploadAvatarRequest>,
) -> Result<Response<UploadAvatarResponse>, Status> {
    let req = request.into_inner();
    let file_data = req.avatar;  // Vec<u8>
    let size = file_data.len() as i64;
    
    // Save file, upload to S3, etc.
    
    Ok(Response::new(UploadAvatarResponse {
        user_id: req.user_id,
        size,
    }))
}

Field-Level Control

Use the graphql.field option to customize how individual fields are exposed in the GraphQL schema.

Basic Field Options

message User {
  string id = 1 [(graphql.field) = { required: true }];
  string email = 2 [(graphql.field) = { name: "emailAddress" }];
  string internal_id = 3 [(graphql.field) = { omit: true }];
  string password_hash = 4 [(graphql.field) = { omit: true }];
}

Available Options

Option     Type    Description
name       string  Override the GraphQL field name
omit       bool    Exclude this field from the GraphQL schema
required   bool    Mark the field as non-nullable (!)
shareable  bool    Federation: field can be resolved by multiple subgraphs
external   bool    Federation: field is defined in another subgraph
requires   string  Federation: fields needed from other subgraphs
provides   string  Federation: fields this resolver provides

Renaming Fields

Use name to map protobuf field names to GraphQL conventions:

message User {
  string user_name = 1 [(graphql.field) = { name: "username" }];
  string email_address = 2 [(graphql.field) = { name: "email" }];
  int64 created_at_unix = 3 [(graphql.field) = { name: "createdAt" }];
}

Generated GraphQL:

type User {
  username: String!
  email: String!
  createdAt: Int!
}

Omitting Fields

Hide sensitive or internal fields:

message User {
  string id = 1;
  string name = 2;
  string password_hash = 3 [(graphql.field) = { omit: true }];
  string internal_notes = 4 [(graphql.field) = { omit: true }];
}

Generated GraphQL:

type User {
  id: String!
  name: String!
  # password_hash and internal_notes are not exposed
}

Required Fields

Mark fields as non-nullable in GraphQL:

message CreateUserInput {
  string name = 1 [(graphql.field) = { required: true }];
  string email = 2 [(graphql.field) = { required: true }];
  string bio = 3;  // Optional
}

Generated GraphQL:

input CreateUserInput {
  name: String!
  email: String!
  bio: String
}

Federation Directives

For Apollo Federation, use field-level directives:

message User {
  string id = 1 [(graphql.field) = { 
    required: true
    shareable: true 
  }];
  
  string email = 2 [(graphql.field) = { 
    external: true 
  }];
  
  repeated Review reviews = 3 [(graphql.field) = { 
    requires: "id" 
  }];
}

See Federation Directives for more details.

Combining Options

Options can be combined:

message Product {
  string upc = 1 [(graphql.field) = { 
    required: true
    name: "id"
    shareable: true 
  }];
}

Default Values

Protobuf fields have default values (empty string, 0, false). In GraphQL:

  • Fields with defaults may still be nullable
  • Use required: true to make them non-nullable
  • The gateway handles type conversion automatically
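From the Rust side, those proto3 zero defaults look like this. A plain struct with Default stands in for a prost-generated message; the field names are made up for illustration:

```rust
/// Stand-in for a prost-generated proto3 message: scalar fields are
/// never absent on the wire, they fall back to zero values.
#[derive(Debug, Default)]
struct UserMessage {
    name: String, // proto3 default: ""
    age: i64,     // proto3 default: 0
    active: bool, // proto3 default: false
}

fn main() {
    let u = UserMessage::default();
    assert_eq!(u.name, "");
    assert_eq!(u.age, 0);
    assert!(!u.active);
    // Without `required: true`, the corresponding GraphQL fields stay
    // nullable even though a zero value is always available.
}
```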

Multi-Descriptor Support

Combine multiple protobuf descriptor sets from different microservices into a unified GraphQL schema. This is essential for large microservice architectures where each team owns their proto files.

Overview

Instead of maintaining a single monolithic proto file, you can:

  1. Let each team generate their own descriptor file
  2. Combine them at gateway startup
  3. Serve a unified GraphQL API

Basic Usage

use grpc_graphql_gateway::{Gateway, GrpcClient};

// Load descriptor sets from different microservices
const USERS_DESCRIPTORS: &[u8] = include_bytes!("path/to/users.bin");
const PRODUCTS_DESCRIPTORS: &[u8] = include_bytes!("path/to/products.bin");
const ORDERS_DESCRIPTORS: &[u8] = include_bytes!("path/to/orders.bin");

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let gateway = Gateway::builder()
        // Primary descriptor set
        .with_descriptor_set_bytes(USERS_DESCRIPTORS)
        // Add additional services from other teams
        .add_descriptor_set_bytes(PRODUCTS_DESCRIPTORS)
        .add_descriptor_set_bytes(ORDERS_DESCRIPTORS)
        // Add clients for each service
        .add_grpc_client("users.UserService", 
            GrpcClient::builder("http://users:50051").connect_lazy()?)
        .add_grpc_client("products.ProductService", 
            GrpcClient::builder("http://products:50052").connect_lazy()?)
        .add_grpc_client("orders.OrderService", 
            GrpcClient::builder("http://orders:50053").connect_lazy()?)
        .build()?;

    gateway.serve("0.0.0.0:8888").await?;
    Ok(())
}

File-Based Loading

Load descriptors from files instead of embedding:

let gateway = Gateway::builder()
    .with_descriptor_set_file("path/to/users.bin")?
    .add_descriptor_set_file("path/to/products.bin")?
    .add_descriptor_set_file("path/to/orders.bin")?
    .build()?;

API Methods

Method                            Description
with_descriptor_set_bytes(bytes)  Set the primary descriptor (clears existing)
add_descriptor_set_bytes(bytes)   Add an additional descriptor
with_descriptor_set_file(path)    Set the primary descriptor from a file
add_descriptor_set_file(path)     Add an additional descriptor from a file
descriptor_count()                Get the number of configured descriptors

Use Cases

Microservice Architecture

Each team generates their own descriptor:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Users Team    │     │ Products Team   │     │  Orders Team    │
│                 │     │                 │     │                 │
│  users.proto    │     │ products.proto  │     │  orders.proto   │
│       ↓         │     │       ↓         │     │       ↓         │
│   users.bin     │     │  products.bin   │     │   orders.bin    │
└────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │    GraphQL Gateway      │
                    │                         │
                    │  Unified GraphQL Schema │
                    └─────────────────────────┘

Schema Stitching

Combine services at the gateway level:

# From users.bin
type Query {
  user(id: ID!): User
}

# From products.bin  
type Query {
  product(upc: String!): Product
}

# From orders.bin
type Query {
  order(id: ID!): Order
}

# Unified Schema (automatic)
type Query {
  user(id: ID!): User
  product(upc: String!): Product
  order(id: ID!): Order
}

Independent Deployments

Update individual service descriptors without restarting:

// Hot-reload could be implemented by watching descriptor files
let gateway = Gateway::builder()
    .with_descriptor_set_file("/config/users.bin")?
    .add_descriptor_set_file("/config/products.bin")?
    .build()?;

How It Works

  1. Primary descriptor is loaded with with_descriptor_set_bytes/file
  2. Additional descriptors are merged using add_descriptor_set_bytes/file
  3. Duplicate files are automatically skipped (same filename)
  4. Services and types from all descriptors are combined
  5. GraphQL schema is generated from the merged pool
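The filename-based skip in step 3 can be sketched with plain std collections. The (name, bytes) tuples below stand in for the proto files inside each decoded FileDescriptorSet; the real merge works on the decoded messages themselves:

```rust
use std::collections::HashSet;

/// Merge descriptor sets, keeping the first occurrence of each proto
/// file by name. Shared files such as graphql.proto, present in every
/// team's descriptor, are only added once.
fn merge_descriptors<'a>(sets: &[Vec<(&'a str, &'a [u8])>]) -> Vec<(&'a str, &'a [u8])> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for set in sets {
        for &(name, bytes) in set {
            if seen.insert(name) {
                merged.push((name, bytes));
            }
        }
    }
    merged
}

fn main() {
    let users = vec![("graphql.proto", &b"g"[..]), ("users.proto", &b"u"[..])];
    let products = vec![("graphql.proto", &b"g"[..]), ("products.proto", &b"p"[..])];
    // graphql.proto appears in both sets but survives only once.
    assert_eq!(merge_descriptors(&[users, products]).len(), 3);
}
```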

Requirements

  • All descriptors must include graphql.proto with annotations
  • Service names should be unique across descriptors
  • Type names are namespaced by their proto package

Logging

The gateway logs merge information:

INFO Merged 3 descriptor sets into unified schema (5 services, 42 types)
DEBUG Merged descriptor set #2 (15234 bytes) into schema pool
DEBUG Merged descriptor set #3 (8921 bytes) into schema pool

Error Handling

Common errors and solutions:

Error                                     Cause                      Solution
at least one descriptor set is required   No descriptors provided    Add at least one with with_descriptor_set_bytes
failed to merge descriptor set #N         Invalid protobuf data      Verify the descriptor file is valid
missing graphql.schema extension          Annotations not found      Ensure graphql.proto is included

Apollo Federation Overview

Build federated GraphQL architectures with multiple subgraphs. The gateway supports Apollo Federation v2, allowing you to compose a supergraph from multiple gRPC services.

What is Federation?

Apollo Federation is an architecture for building a distributed GraphQL API. Instead of a monolithic schema, you have:

  • Subgraphs: Individual GraphQL services that own part of the schema
  • Supergraph: The composed schema combining all subgraphs
  • Router: Distributes queries to appropriate subgraphs

Gateway as Subgraph

The gRPC-GraphQL Gateway can act as a federation subgraph:

┌─────────────────────────────────────────────────┐
│              Apollo Router / Gateway            │
│               (Supergraph Router)               │
└─────────────────┬─────────────────┬─────────────┘
                  │                 │
     ┌────────────▼──────┐  ┌───────▼────────────┐
     │  gRPC-GraphQL     │  │  Traditional       │
     │  Gateway          │  │  GraphQL Service   │
     │  (Subgraph)       │  │  (Subgraph)        │
     └────────────┬──────┘  └────────────────────┘
                  │
     ┌────────────▼──────────────────┐
     │         gRPC Services         │
     │   Users │ Products │ Orders   │
     └───────────────────────────────┘

Enabling Federation

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .enable_federation()  // Enable federation features
    .add_grpc_client("users.UserService", user_client)
    .build()?;

Federation Features

When federation is enabled, the gateway:

  1. Adds _service query - Returns the SDL for schema composition
  2. Adds _entities query - Resolves entity references from other subgraphs
  3. Applies directives - @key, @shareable, @external, etc.

Schema Composition

Your proto files define entities with keys:

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true
  };
  
  string id = 1 [(graphql.field) = { required: true }];
  string name = 2;
  string email = 3;
}

This generates:

type User @key(fields: "id") {
  id: ID!
  name: String
  email: String
}

Running with Apollo Router

  1. Start your federation subgraphs
  2. Compose the supergraph schema
  3. Run Apollo Router

See Running with Apollo Router for detailed instructions.

Next Steps

Continue with Defining Entities to declare federated entities in your proto files.

Defining Entities

Entities are the building blocks of Apollo Federation. They're types that can be resolved across multiple subgraphs using a unique key.

Basic Entity Definition

Use the graphql.entity option on your protobuf messages:

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true
  };
  
  string id = 1 [(graphql.field) = { required: true }];
  string name = 2;
  string email = 3 [(graphql.field) = { shareable: true }];
}

Entity Options

Option      Type    Description
keys        string  The field(s) that uniquely identify this entity
resolvable  bool    Whether this subgraph can resolve the entity
extend      bool    Whether this extends an entity from another subgraph

Generated GraphQL

The above proto generates:

type User @key(fields: "id") {
  id: ID!
  name: String
  email: String @shareable
}

Composite Keys

Use space-separated fields for composite keys:

message Product {
  option (graphql.entity) = {
    keys: "sku region"
    resolvable: true
  };
  
  string sku = 1 [(graphql.field) = { required: true }];
  string region = 2 [(graphql.field) = { required: true }];
  string name = 3;
}

Generated:

type Product @key(fields: "sku region") {
  sku: ID!
  region: ID!
  name: String
}

Multiple Keys

Define multiple key sets by repeating the graphql.entity option or using multiple key definitions:

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true
  };
  
  string id = 1;
  string email = 2;  // Could also be a key
}

Resolvable vs Non-Resolvable

Resolvable Entities

When resolvable: true, this subgraph can fully resolve the entity:

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true  // Can resolve User by id
  };
  
  string id = 1;
  string name = 2;
  string email = 3;
}

Stub Entities

When resolvable: false, this subgraph only references the entity:

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: false  // Cannot resolve, just references
  };
  
  string id = 1 [(graphql.field) = { external: true }];
}

Real-World Example

Users Service (owns User entity):

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true
  };
  
  string id = 1 [(graphql.field) = { required: true }];
  string name = 2 [(graphql.field) = { shareable: true }];
  string email = 3 [(graphql.field) = { shareable: true }];
}

Reviews Service (references User):

message Review {
  string id = 1;
  string body = 2;
  User author = 3;  // Reference to User from Users service
}

message User {
  option (graphql.entity) = {
    keys: "id"
    extend: true  // Extending User from another subgraph
  };
  
  string id = 1 [(graphql.field) = { external: true, required: true }];
  repeated Review reviews = 2 [(graphql.field) = { requires: "id" }];
}

Key Field Requirements

Key fields should be:

  1. Marked as required - Use required: true
  2. Non-null in responses - Always return a value
  3. Consistent across subgraphs - Same type everywhere

Entity Resolution

When Apollo Router receives a query that spans multiple subgraphs, it needs to resolve entity references. The gateway includes production-ready entity resolution with DataLoader batching.

How Entity Resolution Works

  1. Router sends _entities query with representations
  2. Gateway receives representations (e.g., { __typename: "User", id: "123" })
  3. Gateway calls your gRPC backend to resolve the entity
  4. Gateway returns the resolved entity data
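Before any backend is called, steps 2 and 3 amount to turning the incoming representations into per-type key batches. A minimal sketch with std collections (the (__typename, key) pairs are a simplification; real representations are JSON objects):

```rust
use std::collections::BTreeMap;

/// Group entity representations by __typename so that each backing
/// service receives one batched lookup instead of one call per entity.
fn group_representations(reps: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut batches: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for &(typename, key) in reps {
        batches.entry(typename.to_string()).or_default().push(key.to_string());
    }
    batches
}

fn main() {
    let reps = [("User", "123"), ("Product", "apollo-1"), ("User", "456")];
    let batches = group_representations(&reps);
    // One UserService call for ["123", "456"], one ProductService call.
    assert_eq!(batches["User"], vec!["123", "456"]);
    assert_eq!(batches["Product"], vec!["apollo-1"]);
}
```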

Configuring Entity Resolution

use grpc_graphql_gateway::{
    Gateway, GrpcClient, EntityResolverMapping, GrpcEntityResolver
};
use std::sync::Arc;

// Configure entity resolver with DataLoader batching
let resolver = GrpcEntityResolver::builder(client_pool)
    .register_entity_resolver(
        "User",
        EntityResolverMapping {
            service_name: "UserService".to_string(),
            method_name: "GetUser".to_string(),
            key_field: "id".to_string(),
        }
    )
    .register_entity_resolver(
        "Product",
        EntityResolverMapping {
            service_name: "ProductService".to_string(),
            method_name: "GetProduct".to_string(),
            key_field: "upc".to_string(),
        }
    )
    .build();

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .enable_federation()
    .with_entity_resolver(Arc::new(resolver))
    .add_grpc_client("UserService", user_client)
    .add_grpc_client("ProductService", product_client)
    .build()?;

DataLoader Batching

The built-in GrpcEntityResolver uses DataLoader to batch entity requests:

Query requests:
  - User(id: "1")
  - User(id: "2")  
  - User(id: "3")

Without DataLoader: 3 gRPC calls
With DataLoader: 1 batched gRPC call

Benefits

  • ✅ No N+1 Queries - Concurrent requests are batched
  • ✅ Automatic Coalescing - Duplicate keys are deduplicated
  • ✅ Per-Request Caching - Same entity isn’t fetched twice per request
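
The coalescing step can be sketched in plain Rust. This is a minimal illustration, not the gateway's internal code: duplicate keys are dropped while preserving first-seen order, so repeated requests for the same entity collapse into a single batch.

```rust
use std::collections::HashSet;

/// Deduplicate entity keys while preserving first-seen order,
/// mirroring what a DataLoader does before issuing one batched call.
fn coalesce_keys(keys: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    keys.iter()
        .filter(|k| seen.insert(**k))
        .map(|k| (*k).to_string())
        .collect()
}

fn main() {
    let requested = ["1", "2", "1", "3", "2"];
    let batch = coalesce_keys(&requested);
    // One gRPC call with the deduplicated batch instead of five calls
    assert_eq!(batch, vec!["1", "2", "3"]);
    println!("batched keys: {:?}", batch);
}
```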

Custom Entity Resolver

Implement the EntityResolver trait for custom logic:

use grpc_graphql_gateway::federation::{EntityConfig, EntityResolver};
use async_graphql::{Value, indexmap::IndexMap, Name};
use async_trait::async_trait;

struct MyEntityResolver {
    // Your dependencies
}

#[async_trait]
impl EntityResolver for MyEntityResolver {
    async fn resolve_entity(
        &self,
        config: &EntityConfig,
        representation: &IndexMap<Name, Value>,
    ) -> Result<Value, Box<dyn std::error::Error + Send + Sync>> {
        let typename = &config.type_name;
        
        match typename.as_str() {
            "User" => {
                let id = representation.get(&Name::new("id"))
                    .and_then(|v| v.as_str())
                    .ok_or("missing id")?;
                
                // Fetch from your backend
                let user = self.fetch_user(id).await?;
                
                Ok(Value::Object(indexmap! {
                    Name::new("id") => Value::String(user.id),
                    Name::new("name") => Value::String(user.name),
                    Name::new("email") => Value::String(user.email),
                }))
            }
            _ => Err(format!("Unknown entity type: {}", typename).into()),
        }
    }
}

EntityResolverMapping

Configure how each entity type maps to a gRPC method:

| Field | Description |
|-------|-------------|
| service_name | The gRPC service name |
| method_name | The RPC method to call |
| key_field | The field in the request message that holds the key |

Query Example

When Router sends:

query {
  _entities(representations: [
    { __typename: "User", id: "123" }
    { __typename: "User", id: "456" }
  ]) {
    ... on User {
      id
      name
      email
    }
  }
}

The gateway:

  1. Extracts the representations
  2. Groups by __typename
  3. Batches calls to the appropriate gRPC services
  4. Returns resolved entities
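
Steps 2 and 3 can be sketched as a grouping pass. This is a simplification: real representations are JSON objects, reduced here to (typename, key) pairs.

```rust
use std::collections::HashMap;

/// Group entity representations by `__typename` so each group can
/// become one batched call to the matching gRPC service.
fn group_by_typename(reps: &[(&str, &str)]) -> HashMap<String, Vec<String>> {
    let mut groups: HashMap<String, Vec<String>> = HashMap::new();
    for (typename, key) in reps {
        groups
            .entry(typename.to_string())
            .or_default()
            .push(key.to_string());
    }
    groups
}

fn main() {
    let reps = [("User", "123"), ("User", "456"), ("Product", "upc-1")];
    let groups = group_by_typename(&reps);
    // Two batched calls: one to UserService, one to ProductService
    assert_eq!(groups["User"], vec!["123", "456"]);
    assert_eq!(groups["Product"], vec!["upc-1"]);
}
```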

Error Handling

Entity resolution errors are returned per-entity:

{
  "data": {
    "_entities": [
      { "id": "123", "name": "Alice", "email": "alice@example.com" },
      null
    ]
  },
  "errors": [
    {
      "message": "User not found: 456",
      "path": ["_entities", 1]
    }
  ]
}

Performance Tips

  1. Use DataLoader - Always batch entity requests
  2. Implement bulk fetch - Have gRPC methods that fetch multiple entities
  3. Cache wisely - Consider caching frequently accessed entities
  4. Monitor - Track entity resolution latency with metrics

Extending Entities

Extend entities defined in other subgraphs to add fields that your service owns.

Basic Extension

Use extend: true to extend an entity from another subgraph:

// In Reviews service - extending User from Users service
message User {
  option (graphql.entity) = {
    extend: true
    keys: "id"
  };
  
  // Key field from the original entity
  string id = 1 [(graphql.field) = {
    external: true
    required: true
  }];
  
  // Fields this service adds
  repeated Review reviews = 2 [(graphql.field) = {
    requires: "id"
  }];
}

Generated Schema

The above generates a federation-compatible schema:

type User @key(fields: "id") @extends {
  id: ID! @external
  reviews: [Review] @requires(fields: "id")
}

Extension Pattern

┌─────────────────────────────────────────────────────────────────┐
│                         Supergraph                              │
│                                                                 │
│  type User @key(fields: "id") {                                 │
│    id: ID!           # From Users Service                       │
│    name: String      # From Users Service                       │
│    email: String     # From Users Service                       │
│    reviews: [Review] # From Reviews Service (extension)         │
│  }                                                              │
└─────────────────────────────────────────────────────────────────┘
         ▲                                    ▲
         │                                    │
┌────────┴────────┐              ┌────────────┴─────────────┐
│  Users Service  │              │    Reviews Service       │
│                 │              │                          │
│  type User      │              │  type User @extends      │
│    id: ID!      │              │    id: ID! @external     │
│    name: String │              │    reviews: [Review]     │
│    email: String│              │                          │
└─────────────────┘              └──────────────────────────┘

External Fields

Mark fields owned by another subgraph as external:

message User {
  option (graphql.entity) = {
    extend: true
    keys: "id"
  };
  
  string id = 1 [(graphql.field) = { 
    external: true  // This field comes from another subgraph
    required: true 
  }];
  
  string name = 2 [(graphql.field) = { 
    external: true  // Also external
  }];
  
  // This service's contribution
  int32 review_count = 3;
}

Requires Directive

Use requires when you need data from external fields to resolve a local field:

message Product {
  option (graphql.entity) = {
    extend: true
    keys: "upc"
  };
  
  string upc = 1 [(graphql.field) = { external: true }];
  float price = 2 [(graphql.field) = { external: true }];
  float weight = 3 [(graphql.field) = { external: true }];
  
  // Needs price and weight to calculate
  float shipping_estimate = 4 [(graphql.field) = { 
    requires: "price weight" 
  }];
}

The federation router will fetch price and weight from the owning subgraph before calling your resolver for shipping_estimate.

Provides Directive

Use provides to indicate which nested fields your resolver provides:

message Review {
  string id = 1;
  string body = 2;
  
  // When resolving author, we also provide their username
  User author = 3 [(graphql.field) = {
    provides: "username"
  }];
}

Complete Example

Users Subgraph (owns User):

message User {
  option (graphql.entity) = {
    keys: "id"
    resolvable: true
  };
  
  string id = 1 [(graphql.field) = { required: true }];
  string name = 2 [(graphql.field) = { shareable: true }];
  string email = 3;
}

Reviews Subgraph (extends User):

message User {
  option (graphql.entity) = {
    extend: true
    keys: "id"
  };
  
  string id = 1 [(graphql.field) = { external: true, required: true }];
  repeated Review reviews = 2;
}

message Review {
  string id = 1;
  string body = 2;
  int32 rating = 3;
  User author = 4;
}

Composed Query:

query {
  user(id: "123") {
    id
    name          # Resolved by Users subgraph
    email         # Resolved by Users subgraph
    reviews {     # Resolved by Reviews subgraph (extension)
      id
      body
      rating
    }
  }
}

Federation Directives

The gateway supports all Apollo Federation v2 directives through proto annotations.

Directive Reference

| Directive | Proto Option | Purpose |
|-----------|--------------|---------|
| @key | graphql.entity.keys | Define entity key fields |
| @shareable | graphql.field.shareable | Field resolvable from multiple subgraphs |
| @external | graphql.field.external | Field defined in another subgraph |
| @requires | graphql.field.requires | Fields needed from other subgraphs |
| @provides | graphql.field.provides | Fields this resolver provides |
| @extends | graphql.entity.extend | Extending an entity from another subgraph |

@key

Defines how an entity is uniquely identified:

message User {
  option (graphql.entity) = {
    keys: "id"
  };
  string id = 1;
}

Generated:

type User @key(fields: "id") {
  id: ID!
}

Multiple Keys

message Product {
  option (graphql.entity) = {
    keys: "upc"  // Primary key
  };
  string upc = 1;
  string sku = 2;
}

Composite Keys

message Inventory {
  option (graphql.entity) = {
    keys: "warehouseId productId"
  };
  string warehouse_id = 1;
  string product_id = 2;
  int32 quantity = 3;
}

@shareable

Marks fields that can be resolved by multiple subgraphs:

message User {
  string id = 1;
  string name = 2 [(graphql.field) = { shareable: true }];
  string email = 3 [(graphql.field) = { shareable: true }];
}

Generated:

type User {
  id: ID!
  name: String @shareable
  email: String @shareable
}

When to Use

Use @shareable when:

  • Multiple subgraphs can resolve the same field
  • You want redundancy for a commonly accessed field
  • Different subgraphs have the same data source

@external

Marks fields defined in another subgraph that you need to reference:

message User {
  option (graphql.entity) = { extend: true, keys: "id" };
  
  string id = 1 [(graphql.field) = { external: true }];
  string name = 2 [(graphql.field) = { external: true }];
  repeated Review reviews = 3;  // Your field
}

Generated:

type User @extends @key(fields: "id") {
  id: ID! @external
  name: String @external
  reviews: [Review]
}

@requires

Declares that a field requires data from external fields:

message Product {
  option (graphql.entity) = { extend: true, keys: "upc" };
  
  string upc = 1 [(graphql.field) = { external: true }];
  float price = 2 [(graphql.field) = { external: true }];
  float weight = 3 [(graphql.field) = { external: true }];
  
  float shipping_cost = 4 [(graphql.field) = { 
    requires: "price weight" 
  }];
}

Generated:

type Product @extends @key(fields: "upc") {
  upc: ID! @external
  price: Float @external
  weight: Float @external
  shippingCost: Float @requires(fields: "price weight")
}

How It Works

  1. Router fetches price and weight from the owning subgraph
  2. Router sends those values to your subgraph
  3. Your resolver uses them to calculate shippingCost
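
That last step can be illustrated with a self-contained sketch. The shipping formula below is made up for illustration; only the field names (price, weight) come from the example above.

```rust
use std::collections::HashMap;

/// Sketch of step 3: the router has already fetched `price` and
/// `weight` from the owning subgraph and includes them in the entity
/// representation, so the local resolver only has to combine them.
fn shipping_cost(representation: &HashMap<&str, f64>) -> Option<f64> {
    let price = representation.get("price")?;
    let weight = representation.get("weight")?;
    // Hypothetical rule: 5% of the price plus 1.5 per unit of weight
    Some(price / 20.0 + weight * 1.5)
}

fn main() {
    let rep = HashMap::from([("price", 100.0), ("weight", 2.0)]);
    assert_eq!(shipping_cost(&rep), Some(8.0));
    // If the @requires data is missing, the field cannot be resolved
    assert_eq!(shipping_cost(&HashMap::new()), None);
}
```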

@provides

Hints that a resolver provides additional fields on referenced entities:

message Review {
  string id = 1;
  string body = 2;
  
  User author = 3 [(graphql.field) = {
    provides: "name email"
  }];
}

Generated:

type Review {
  id: ID!
  body: String
  author: User @provides(fields: "name email")
}

When to Use

Use @provides when:

  • Your resolver already has the nested entity’s data
  • You want to avoid an extra subgraph hop
  • You’re denormalizing for performance

Complete Example

Products Subgraph:

message Product {
  option (graphql.entity) = {
    keys: "upc"
    resolvable: true
  };
  
  string upc = 1 [(graphql.field) = { required: true }];
  string name = 2 [(graphql.field) = { shareable: true }];
  float price = 3 [(graphql.field) = { shareable: true }];
}

Inventory Subgraph:

message Product {
  option (graphql.entity) = {
    extend: true
    keys: "upc"
  };
  
  string upc = 1 [(graphql.field) = { external: true }];
  float price = 2 [(graphql.field) = { external: true }];
  float weight = 3 [(graphql.field) = { external: true }];
  
  int32 stock = 4;
  bool in_stock = 5;
  float shipping_estimate = 6 [(graphql.field) = {
    requires: "price weight"
  }];
}

Running with Apollo Router

Compose your gRPC-GraphQL Gateway subgraphs with Apollo Router to create a federated supergraph.

Prerequisites

  • Apollo Router installed
  • Federation-enabled gateway subgraphs running

Step 1: Start Your Subgraphs

Start each gRPC-GraphQL Gateway as a federation subgraph:

Users Subgraph (port 8891):

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(USERS_DESCRIPTORS)
    .enable_federation()
    .add_grpc_client("users.UserService", user_client)
    .build()?;

gateway.serve("0.0.0.0:8891").await?;

Products Subgraph (port 8892):

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(PRODUCTS_DESCRIPTORS)
    .enable_federation()
    .add_grpc_client("products.ProductService", product_client)
    .build()?;

gateway.serve("0.0.0.0:8892").await?;

Step 2: Create Supergraph Configuration

Create supergraph.yaml:

federation_version: =2.3.2

subgraphs:
  users:
    routing_url: http://localhost:8891/graphql
    schema:
      subgraph_url: http://localhost:8891/graphql
  
  products:
    routing_url: http://localhost:8892/graphql
    schema:
      subgraph_url: http://localhost:8892/graphql

Step 3: Compose the Supergraph

Install Rover CLI:

curl -sSL https://rover.apollo.dev/nix/latest | sh

Compose the supergraph:

rover supergraph compose --config supergraph.yaml > supergraph.graphql

Step 4: Run Apollo Router

router --supergraph supergraph.graphql --dev

Or with configuration:

router \
  --supergraph supergraph.graphql \
  --config router.yaml

Router Configuration

Create router.yaml for production:

supergraph:
  listen: 0.0.0.0:4000
  introspection: true

cors:
  origins:
    - https://studio.apollographql.com

telemetry:
  exporters:
    tracing:
      otlp:
        enabled: true
        endpoint: http://jaeger:4317

health_check:
  listen: 0.0.0.0:8088
  enabled: true
  path: /health

Querying the Supergraph

Once running, query through the router:

query {
  user(id: "123") {
    id
    name
    email
    orders {      # From Orders subgraph
      id
      total
      products {  # From Products subgraph
        upc
        name
        price
      }
    }
  }
}

Docker Compose Example

version: '3.8'

services:
  router:
    image: ghcr.io/apollographql/router:v1.25.0
    ports:
      - "4000:4000"
    volumes:
      - ./supergraph.graphql:/supergraph.graphql
      - ./router.yaml:/router.yaml
    command: --supergraph /supergraph.graphql --config /router.yaml

  users-gateway:
    build: ./users-gateway
    ports:
      - "8891:8888"
    depends_on:
      - users-grpc

  products-gateway:
    build: ./products-gateway
    ports:
      - "8892:8888"
    depends_on:
      - products-grpc

  users-grpc:
    build: ./users-service
    ports:
      - "50051:50051"

  products-grpc:
    build: ./products-service
    ports:
      - "50052:50052"

Continuous Composition

For production environments, we recommend using Apollo GraphOS for managed federation and continuous delivery.

See the GraphOS & Schema Registry guide for detailed instructions on publishing subgraphs and setting up CI/CD pipelines.

Troubleshooting

Subgraph Schema Fetch Fails

Ensure the subgraph introspection is enabled and accessible:

curl http://localhost:8891/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ _service { sdl } }"}'

Entity Resolution Errors

Check that:

  1. Entity resolvers are configured for all entity types
  2. gRPC clients are connected
  3. Key fields match between subgraphs

Composition Errors

Run composition with verbose output:

rover supergraph compose --config supergraph.yaml --log debug

Apollo GraphOS & Schema Registry

Apollo GraphOS is a platform for building, managing, and scaling your supergraph. It provides a Schema Registry that acts as the source of truth for your supergraph’s schema, enabling Managed Federation.

Why use GraphOS?

  • Managed Federation: GraphOS handles supergraph composition for you.
  • Schema Checks: Validate changes against production traffic before deploying.
  • Explorer: A powerful IDE for your supergraph.
  • Metrics: Detailed usage statistics and performance monitoring.

Prerequisites

  1. An Apollo Studio account.
  2. The Rover CLI installed.
  3. A created Graph in Apollo Studio (of type β€œSupergraph”).

Publishing Subgraphs

Instead of composing the supergraph locally, you publish each subgraph’s schema to the GraphOS Registry. GraphOS then composes them into a supergraph schema.

1. Introspect Your Subgraph

First, start your grpc-graphql-gateway instance. Then, verify you can fetch the SDL:

# Example for the 'users' subgraph running on port 8891
rover subgraph introspect http://localhost:8891/graphql > users.graphql

2. Publish the Subgraph

Use rover to publish the schema to your graph variant (e.g., current or production).

# Replace MY_GRAPH with your Graph ID and 'users' with your subgraph name
rover subgraph publish MY_GRAPH@current \
  --name users \
  --schema ./users.graphql \
  --routing-url http://users-service:8891/graphql

Repeat this for all your subgraphs (e.g., products, reviews).

Automatic Composition

Once subgraphs are published, GraphOS automatically composes the supergraph schema.

You can view the status and build errors in the Build tab in Apollo Studio.

Fetching the Supergraph Schema

Your Apollo Router (or Gateway) needs the composed supergraph schema. With GraphOS, you have two options:

Option A: Managed Federation

Configure Apollo Router to fetch the configuration directly from GraphOS. This allows for dynamic updates without restarting the router.

Set the APOLLO_KEY and APOLLO_GRAPH_REF environment variables:

export APOLLO_KEY=service:MY_GRAPH:your-api-key
export APOLLO_GRAPH_REF=MY_GRAPH@current

./router

Option B: CI/CD Fetch

Fetch the supergraph schema during your build process:

rover supergraph fetch MY_GRAPH@current > supergraph.graphql

./router --supergraph supergraph.graphql

Schema Checks

Before deploying a change, run a schema check to ensure it doesn’t break existing clients.

rover subgraph check MY_GRAPH@current \
  --name users \
  --schema ./users.graphql

GitHub Actions Example

Here is an example workflow to check and publish your schema:

name: Schema Registry

on:
  push:
    branches: [ main ]
  pull_request:

env:
  APOLLO_KEY: ${{ secrets.APOLLO_KEY }}
  APOLLO_VCS_COMMIT: ${{ github.event.pull_request.head.sha }}

jobs:
  schema-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install Rover
        run: curl -sSL https://rover.apollo.dev/nix/latest | sh
      
      - name: Start Gateway (Background)
        run: cargo run --bin users-service &
      
      - name: Introspect Schema
        run: |
          sleep 10
          ~/.rover/bin/rover subgraph introspect http://localhost:8891/graphql > users.graphql

      - name: Check Schema
        if: github.event_name == 'pull_request'
        run: |
          ~/.rover/bin/rover subgraph check MY_GRAPH@current \
            --name users \
            --schema ./users.graphql

      - name: Publish Schema
        if: github.event_name == 'push'
        run: |
          ~/.rover/bin/rover subgraph publish MY_GRAPH@current \
            --name users \
            --schema ./users.graphql \
            --routing-url http://users-service/graphql

Authentication

The gateway provides a robust, built-in Enhanced Authentication Middleware designed for production use. It supports multiple authentication schemes, flexible token validation, and rich user context propagation.

Quick Start

use grpc_graphql_gateway::{
    Gateway,
    EnhancedAuthMiddleware,
    AuthConfig,
    AuthScheme,
    AuthClaims,
    TokenValidator,
    Result
};
use std::sync::Arc;
use async_trait::async_trait;

// 1. Define your token validator
struct MyJwtValidator;

#[async_trait]
impl TokenValidator for MyJwtValidator {
    async fn validate(&self, token: &str) -> Result<AuthClaims> {
        // Implement your JWT validation logic here
        // e.g., decode(token, &decoding_key, &validation)...
        
        Ok(AuthClaims {
            sub: Some("user_123".to_string()),
            roles: vec!["admin".to_string()],
            ..Default::default()
        })
    }
}

// 2. Configure and build the gateway
let auth_middleware = EnhancedAuthMiddleware::new(
    AuthConfig::required()
        .with_scheme(AuthScheme::Bearer),
    Arc::new(MyJwtValidator),
);

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .add_middleware(auth_middleware)
    .build()?;

Configuration

The AuthConfig builder allows you to customize how authentication is handled:

use grpc_graphql_gateway::{AuthConfig, AuthScheme};

let config = AuthConfig::required()
    // Allow multiple schemes
    .with_scheme(AuthScheme::Bearer)
    .with_scheme(AuthScheme::ApiKey)
    .with_api_key_header("x-service-token")
    
    // Public paths that don't need auth
    .skip_path("/health")
    .skip_path("/metrics")
    
    // Whether to require auth for introspection (default: true)
    .with_skip_introspection(false);

// Or create an optional config (allow unauthenticated requests)
let optional_config = AuthConfig::optional();

Supported Schemes

| Scheme | Description | Header Example |
|--------|-------------|----------------|
| AuthScheme::Bearer | Standard Bearer token | Authorization: Bearer <token> |
| AuthScheme::Basic | Basic auth credentials | Authorization: Basic <base64> |
| AuthScheme::ApiKey | Custom header API key | x-api-key: <key> |
| AuthScheme::Custom | Custom prefix | Authorization: Custom <token> |

Token Validation

You can implement the TokenValidator trait for reusable logic, or use a closure for simple cases.

Using a Closure

let auth = EnhancedAuthMiddleware::with_fn(
    AuthConfig::required(),
    |token| Box::pin(async move {
        if token == "secret-password" {
            Ok(AuthClaims {
                sub: Some("admin".to_string()),
                ..Default::default()
            })
        } else {
            Err(Error::Unauthorized("Invalid token".into()))
        }
    })
);

User Context (AuthClaims)

The middleware extracts user information into AuthClaims, which are available in the GraphQL context.

| Field | Type | Description |
|-------|------|-------------|
| sub | Option<String> | Subject (user ID) |
| roles | Vec<String> | User roles |
| iss | Option<String> | Issuer |
| aud | Option<Vec<String>> | Audience |
| exp | Option<i64> | Expiration (Unix timestamp) |
| custom | HashMap | Custom claims |

Accessing Claims in Resolvers

In your custom resolvers or middleware, you can access these claims via the context:

async fn my_resolver(ctx: &Context) -> Result<String> {
    // Convenience methods
    let user_id = ctx.user_id();     // Option<String>
    let roles = ctx.user_roles();    // Vec<String>
    
    // Check authentication status
    if ctx.get("auth.authenticated") == Some(&serde_json::json!(true)) {
        // ...
    }
    
    // Access full claims
    if let Some(claims) = ctx.get_typed::<AuthClaims>("auth.claims") {
        println!("User: {:?}", claims.sub);
    }
    
    Ok("done".to_string())
}

Error Handling

  • Missing Token: If AuthConfig::required() is used, returns 401 Unauthorized immediately.
  • Invalid Token: Returns 401 Unauthorized with error details.
  • Expired Token: Automatically checks exp claim and returns 401 if expired.

To permit unauthenticated access (e.g. for public parts of the graph), use AuthConfig::optional(). The request will proceed, but ctx.user_id() will be None.

Authorization

Once a user is authenticated, authorization determines what they are allowed to do. The gateway facilitates this by making user roles and claims available to your resolvers and downstream services.

Role-Based Access Control (RBAC)

The AuthClaims object includes a roles field (Vec<String>) which works out-of-the-box for RBAC.

Checking Roles in Logic

You can check roles programmatically within your custom resolvers or middleware:

async fn delete_user(ctx: &Context, id: String) -> Result<String> {
    let claims = ctx.get_typed::<AuthClaims>("auth.claims")
        .ok_or(Error::Unauthorized("No claims found".into()))?;
        
    if !claims.has_role("admin") {
        return Err(Error::Forbidden("Admins only".into()));
    }
    
    // Proceed with deletion...
    Ok(id)
}
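
The has_role check used above can be sketched against a minimal claims struct. The gateway's AuthClaims carries more fields, but the check has the same shape:

```rust
/// Minimal stand-in for the `roles` part of `AuthClaims`.
struct Claims {
    roles: Vec<String>,
}

impl Claims {
    /// True if the user holds the given role.
    fn has_role(&self, role: &str) -> bool {
        self.roles.iter().any(|r| r == role)
    }
}

fn main() {
    let claims = Claims {
        roles: vec!["admin".to_string(), "editor".to_string()],
    };
    assert!(claims.has_role("admin"));
    assert!(!claims.has_role("superuser"));
}
```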

Propagating Auth to Backends

The most common pattern in a gateway is to offload fine-grained authorization to the backend services. The gateway’s job is to securely propagate the identity.

Header Propagation

You can forward authentication headers directly to your gRPC services:

// Forward the 'Authorization' header automatically
let gateway = Gateway::builder()
    .with_header_propagation(HeaderPropagationConfig {
        forward_headers: vec!["authorization".to_string()],
        ..Default::default()
    })
    // ...

Metadata Propagation

Alternatively, you can extract claims and inject them as gRPC metadata (headers) for your backends. EnhancedAuthMiddleware does not do this automatically, but you can write a custom middleware to run after it:

struct AuthPropagationMiddleware;

#[async_trait]
impl Middleware for AuthPropagationMiddleware {
    async fn call(&self, ctx: &mut Context) -> Result<()> {
        if let Some(user_id) = ctx.user_id() {
            // Add to headers that will be sent to gRPC backend
            ctx.headers.insert("x-user-id", user_id.parse()?);
        }
        
        if let Some(roles) = ctx.get("auth.roles") {
             // Serialize roles to a header
             let roles_str = serde_json::to_string(roles)?;
             ctx.headers.insert("x-user-roles", roles_str.parse()?);
        }
        
        Ok(())
    }
}

Query Whitelisting

For strict control over what operations can be executed, see the Query Whitelisting feature. This acts as a coarse-grained authorization layer, preventing unauthorized query shapes entirely.

Field-Level Authorization

For advanced field-level authorization (e.g., hiding specific fields based on roles), you currently need to implement this logic in your custom resolvers or within the backend services themselves. The gateway ensures the necessary identity data is present for these decisions to be made.

Security Headers

The gateway automatically adds comprehensive security headers to all HTTP responses, providing defense-in-depth protection for production deployments.

Headers Applied

HTTP Strict Transport Security (HSTS)

Strict-Transport-Security: max-age=31536000; includeSubDomains

Forces browsers to only communicate over HTTPS for one year, including all subdomains. This prevents protocol downgrade attacks and cookie hijacking.

Content Security Policy (CSP)

Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'

Restricts resource loading to same-origin, preventing XSS attacks by blocking inline scripts and external script sources.

X-Content-Type-Options

X-Content-Type-Options: nosniff

Prevents browsers from MIME-sniffing responses, protecting against drive-by download attacks.

X-Frame-Options

X-Frame-Options: DENY

Prevents the page from being embedded in iframes, protecting against clickjacking attacks.

X-XSS-Protection

X-XSS-Protection: 1; mode=block

Enables browser’s built-in XSS filtering (for legacy browsers).

Referrer-Policy

Referrer-Policy: strict-origin-when-cross-origin

Controls referrer information sent with requests, limiting data leakage to third parties.

Cache-Control

Cache-Control: no-store, no-cache, must-revalidate

Prevents caching of sensitive GraphQL responses by browsers and proxies.

CORS Configuration

The gateway handles CORS preflight requests automatically:

OPTIONS Requests

Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, OPTIONS
Access-Control-Allow-Headers: Content-Type, Authorization, X-Request-ID
Access-Control-Max-Age: 86400

Customizing CORS

For production deployments, you may want to restrict the Access-Control-Allow-Origin to specific domains. This can be configured in your gateway setup.

Security Test Verification

The gateway includes a comprehensive security test suite (test_security.sh) that verifies all security headers:

./test_security.sh

# Expected output:
[PASS] T1: X-Content-Type-Options: nosniff
[PASS] T2: X-Frame-Options: DENY
[PASS] T12: HSTS Enabled
[PASS] T13: No X-Powered-By Header
[PASS] T14: Server Header Hidden
[PASS] T15: TRACE Rejected (405)
[PASS] T16: OPTIONS/CORS Supported (204)

Best Practices

For Production

  1. Always use HTTPS - HSTS is automatically enabled
  2. Configure specific CORS origins - Replace * with your domain
  3. Review CSP rules - Adjust based on your frontend requirements
  4. Monitor security headers - Use tools like securityheaders.com

Additional Recommendations

  • Enable TLS 1.3 on your reverse proxy (nginx/Cloudflare)
  • Use certificate pinning for high-security applications
  • Implement rate limiting at the edge
  • Enable audit logging for security events

DoS Protection

Protect your gateway and gRPC backends from denial-of-service attacks with query depth and complexity limiting.

Query Depth Limiting

Prevent deeply nested queries that could overwhelm your backends:

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_query_depth_limit(10)  // Max 10 levels of nesting
    .build()?;

What It Prevents

# This would be blocked if depth exceeds limit
query {
  users {           # depth 1
    friends {       # depth 2
      friends {     # depth 3
        friends {   # depth 4
          friends { # depth 5 - blocked if limit < 5
            name
          }
        }
      }
    }
  }
}
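
Depth checking can be sketched as a brace-depth scan. This is a simplification: the gateway checks the parsed query, not raw characters, and a real scanner must ignore braces inside string arguments.

```rust
/// Count the maximum selection-set nesting of a GraphQL query by
/// tracking brace depth. Fields in the root selection set sit at
/// depth 1, matching the labeling in the example above.
fn max_depth(query: &str) -> usize {
    let (mut depth, mut max) = (0usize, 0usize);
    for c in query.chars() {
        match c {
            '{' => {
                depth += 1;
                max = max.max(depth);
            }
            '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max
}

fn main() {
    let query = "query { users { friends { name } } }";
    // `name` sits in the third nested selection set
    assert_eq!(max_depth(query), 3);
    // With a limit of 10 this query would be allowed
    assert!(max_depth(query) <= 10);
}
```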

Error Response

{
  "errors": [
    {
      "message": "Query is nested too deep",
      "extensions": {
        "code": "QUERY_TOO_DEEP"
      }
    }
  ]
}

Query Complexity Limiting

Limit the total β€œcost” of a query:

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_query_complexity_limit(100)  // Max complexity of 100
    .build()?;

How Complexity is Calculated

Each field adds to the complexity:

# Complexity = 4 (users + friends + name + email)
query {
  users {        # +1
    friends {    # +1
      name       # +1
      email      # +1
    }
  }
}
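
Field counting can be sketched as a tokenizer pass. This simplifies the real AST-based calculation: queries with arguments, aliases, or directives would need actual parsing, and implementations often weight list fields higher.

```rust
/// Count one complexity point per field by tokenizing on whitespace
/// and braces, dropping operation keywords.
fn complexity(query: &str) -> usize {
    query
        .split(|c: char| c.is_whitespace() || c == '{' || c == '}')
        .filter(|t| !t.is_empty())
        .filter(|t| !matches!(*t, "query" | "mutation" | "subscription"))
        .count()
}

fn main() {
    let q = "query { users { friends { name email } } }";
    // users + friends + name + email = 4
    assert_eq!(complexity(q), 4);
    // Well under a limit of 100, so the query would be allowed
    assert!(complexity(q) <= 100);
}
```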

Error Response

{
  "errors": [
    {
      "message": "Query is too complex",
      "extensions": {
        "code": "QUERY_TOO_COMPLEX"
      }
    }
  ]
}

Recommended Limits

| Use Case | Depth Limit | Complexity Limit |
|----------|-------------|------------------|
| Public API | 5-10 | 50-100 |
| Authenticated Users | 10-15 | 100-500 |
| Internal/Trusted | 15-25 | 500-1000 |

Combining Limits

Use both limits together for comprehensive protection:

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_query_depth_limit(10)
    .with_query_complexity_limit(100)
    .build()?;

Environment-Based Configuration

Adjust limits based on environment:

let depth_limit = std::env::var("QUERY_DEPTH_LIMIT")
    .ok()
    .and_then(|s| s.parse().ok())
    .unwrap_or(10);

let complexity_limit = std::env::var("QUERY_COMPLEXITY_LIMIT")
    .ok()
    .and_then(|s| s.parse().ok())
    .unwrap_or(100);

let gateway = Gateway::builder()
    .with_query_depth_limit(depth_limit)
    .with_query_complexity_limit(complexity_limit)
    .build()?;

Query Whitelisting

Query Whitelisting (also known as Stored Operations or Persisted Operations) is a critical security feature that restricts which GraphQL queries can be executed. This is essential for public-facing GraphQL APIs and required for many compliance standards.

Why Query Whitelisting?

Security Benefits

  • Prevents Arbitrary Queries: Only pre-approved queries can be executed
  • Reduces Attack Surface: Prevents schema exploration and DoS attacks
  • Compliance: Required for PCI-DSS, HIPAA, SOC 2, and other standards
  • Performance: Known queries can be optimized and monitored
  • Audit Trail: Track exactly which queries are being used

Common Use Cases

  1. Public APIs: Prevent malicious actors from crafting expensive queries
  2. Mobile Applications: Apps typically have a fixed set of queries
  3. Third-Party Integrations: Control exactly what partners can query
  4. Compliance Requirements: Meet security standards for regulated industries

Configuration

Basic Setup

use grpc_graphql_gateway::{Gateway, QueryWhitelistConfig, WhitelistMode};
use std::collections::HashMap;

let mut allowed_queries = HashMap::new();
allowed_queries.insert(
    "getUserById".to_string(),
    "query getUserById($id: ID!) { user(id: $id) { id name } }".to_string()
);

let gateway = Gateway::builder()
    .with_query_whitelist(QueryWhitelistConfig {
        mode: WhitelistMode::Enforce,
        allowed_queries,
        allow_introspection: false,
    })
    .build()?;

Loading from JSON File

For production deployments, it’s recommended to load queries from a configuration file:

let config = QueryWhitelistConfig::from_json_file(
    "config/allowed_queries.json",
    WhitelistMode::Enforce
)?;

let gateway = Gateway::builder()
    .with_query_whitelist(config)
    .build()?;

Example JSON file (allowed_queries.json):

{
  "getUserById": "query getUserById($id: ID!) { user(id: $id) { id name email } }",
  "listProducts": "query { products { id name price } }",
  "createOrder": "mutation createOrder($input: OrderInput!) { createOrder(input: $input) { id } }"
}

Enforcement Modes

Enforce Mode (Production)

Rejects non-whitelisted queries with an error.

QueryWhitelistConfig {
    mode: WhitelistMode::Enforce,
    // ...
}

Error response:

{
  "errors": [{
    "message": "Query not in whitelist: Operation 'unknownQuery' (hash: 1234abcd...)",
    "extensions": {
      "code": "QUERY_NOT_WHITELISTED"
    }
  }]
}

Warn Mode (Staging)

Logs warnings but allows all queries. Useful for testing and identifying missing queries.

QueryWhitelistConfig {
    mode: WhitelistMode::Warn,
    // ...
}

Server log:

WARN grpc_graphql_gateway::query_whitelist: Query not in whitelist (allowed in Warn mode): Query hash: 0eb2d2f2e9111722

Disabled Mode (Development)

No whitelist checking. Same as not configuring a whitelist.

QueryWhitelistConfig::disabled()

Validation Methods

The whitelist supports two validation methods that can be used together:

1. Hash-Based Validation

Queries are validated by their SHA-256 hash. This is automatic and requires no client changes.

# This query's hash is calculated automatically
query { user(id: "123") { name } }

Query Normalization (v0.3.7+)

The gateway normalizes queries before hashing, so semantically equivalent queries produce the same hash. This means the following queries all match the same whitelist entry:

# Original
query { hello(name: "World") { message } }

# With extra whitespace
query   {   hello( name: "World" )   { message } }

# With comments stripped
query { # This is ignored
  hello(name: "World") { message }
}

# Multi-line format
query {
  hello(name: "World") {
    message
  }
}

Normalization rules:

  • Comments (# line comments and """ block comments) are removed
  • Whitespace is collapsed (multiple spaces β†’ single space)
  • Whitespace around punctuation ({, }, (, ), :, etc.) is removed
  • String literals are preserved exactly
  • Newlines are treated as whitespace
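As an illustration of these rules (not the gateway's actual implementation), a minimal normalizer covering line comments, whitespace collapsing, and punctuation trimming might look like the sketch below. Block-comment (`"""`) handling is omitted for brevity.

```rust
// Illustrative sketch of the normalization rules above. Whitespace is
// buffered and emitted only between two non-punctuation tokens; string
// literals are passed through untouched.
fn normalize(query: &str) -> String {
    const PUNCT: &str = "{}():,";
    let mut out = String::new();
    let mut pending_space = false;
    let mut in_string = false;
    let mut chars = query.chars().peekable();
    while let Some(c) = chars.next() {
        if in_string {
            // String literals are preserved exactly
            out.push(c);
            if c == '"' {
                in_string = false;
            }
        } else if c == '#' {
            // Skip line comments up to the end of the line
            while let Some(&n) = chars.peek() {
                if n == '\n' {
                    break;
                }
                chars.next();
            }
        } else if c.is_whitespace() {
            // Newlines count as whitespace; buffer instead of emitting
            pending_space = true;
        } else {
            // Emit one space only between two non-punctuation tokens
            if pending_space
                && !out.is_empty()
                && !PUNCT.contains(c)
                && !out.ends_with(|p: char| PUNCT.contains(p))
            {
                out.push(' ');
            }
            pending_space = false;
            if c == '"' {
                in_string = true;
            }
            out.push(c);
        }
    }
    out
}

fn main() {
    let canonical = normalize(r#"query { hello(name: "World") { message } }"#);
    // Extra whitespace and comments normalize to the same string
    assert_eq!(
        canonical,
        normalize("query   {   hello( name: \"World\" )   { message } }")
    );
    assert_eq!(
        canonical,
        normalize("query { # This is ignored\n  hello(name: \"World\") { message }\n}")
    );
    println!("{}", canonical);
}
```

All four query variants shown above collapse to the same canonical string, so they hash to the same whitelist entry.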

2. Operation ID Validation

Clients can explicitly reference queries by ID using GraphQL extensions:

Client request:

{
  "query": "query getUserById($id: ID!) { user(id: $id) { name } }",
  "variables": {"id": "123"},
  "extensions": {
    "operationId": "getUserById"
  }
}

The gateway validates the operationId against the whitelist.
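The lookup involved can be sketched as follows. This is a hypothetical, dependency-free model of the check, not the gateway's internal code: the `operationId` must exist in the whitelist, and if the client also sent query text it must match the registered query.

```rust
use std::collections::HashMap;

// Hypothetical operationId check: the ID must be registered, and any
// accompanying query text must match the registered query.
fn validate_operation_id(
    whitelist: &HashMap<String, String>,
    operation_id: &str,
    query: Option<&str>,
) -> Result<(), String> {
    match whitelist.get(operation_id) {
        None => Err(format!("Operation '{}' not in whitelist", operation_id)),
        Some(registered) => match query {
            // If the client also sent the query text, it must match
            Some(q) if q != registered => Err(format!(
                "Query text does not match operation '{}'",
                operation_id
            )),
            _ => Ok(()),
        },
    }
}

fn main() {
    let mut wl = HashMap::new();
    wl.insert(
        "getUserById".to_string(),
        "query getUserById($id: ID!) { user(id: $id) { name } }".to_string(),
    );
    assert!(validate_operation_id(&wl, "getUserById", None).is_ok());
    assert!(validate_operation_id(&wl, "unknownQuery", None).is_err());
}
```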

Introspection Control

You can optionally allow introspection queries even in Enforce mode:

QueryWhitelistConfig {
    mode: WhitelistMode::Enforce,
    allowed_queries: queries,
    allow_introspection: true,  // Allow __schema and __type queries
}

This is useful for development and staging environments where developers need to explore the schema.

Runtime Management

The whitelist supports runtime modifications for dynamic use cases:

// Get whitelist reference
let whitelist = gateway.mux().query_whitelist().unwrap();

// Register new query at runtime
whitelist.register_query(
    "newQuery".to_string(),
    "query { newField }".to_string()
);

// Remove a query
whitelist.remove_query("oldQuery");

// Get statistics
let stats = whitelist.stats();
println!("Total allowed queries: {}", stats.total_queries);
println!("Mode: {:?}", stats.mode);

Best Practices

1. Use Enforce Mode in Production

Always use WhitelistMode::Enforce in production environments:

let mode = if std::env::var("ENV").map(|e| e == "production").unwrap_or(false) {
    WhitelistMode::Enforce
} else {
    WhitelistMode::Warn
};

2. Start with Warn Mode

When first implementing whitelisting:

  1. Deploy with Warn mode in staging
  2. Monitor logs to identify all queries
  3. Add missing queries to whitelist
  4. Switch to Enforce mode once complete

3. Version Control Your Whitelist

Store allowed_queries.json in version control alongside your application code.

4. Automated Query Extraction

For frontend applications, consider using tools to automatically extract queries from your codebase:

  • GraphQL Code Generator: Extract queries from React/Vue components
  • Apollo CLI: Generate persisted query manifests
  • Relay Compiler: Built-in persisted query support

5. CI/CD Integration

Validate the whitelist file in your CI pipeline:

# Validate JSON syntax
jq empty allowed_queries.json

# Run gateway with test queries
cargo test --test query_whitelist_validation

Working with APQ

Query Whitelisting and Automatic Persisted Queries (APQ) serve different purposes and work well together:

| Feature | Purpose | Security Level |
|---------|---------|----------------|
| APQ | Bandwidth optimization (caches any query) | Low |
| Whitelist | Security (only allows pre-approved queries) | High |
| Both | Bandwidth savings + security | Maximum |

Example configuration with both:

Gateway::builder()
    // APQ for bandwidth optimization
    .with_persisted_queries(PersistedQueryConfig {
        cache_size: 1000,
        ttl: Some(Duration::from_secs(3600)),
    })
    // Whitelist for security
    .with_query_whitelist(QueryWhitelistConfig {
        mode: WhitelistMode::Enforce,
        allowed_queries: load_queries()?,
        allow_introspection: false,
    })
    .build()?

Migration Guide

Step 1: Inventory Queries

Use Warn mode to identify all queries currently in use:

.with_query_whitelist(QueryWhitelistConfig {
    mode: WhitelistMode::Warn,
    allowed_queries: HashMap::new(),
    allow_introspection: true,
})

Monitor logs for 1-2 weeks to capture all query variations.

Step 2: Build Whitelist

Extract unique query hashes from logs and build your whitelist file.
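Given the Warn-mode log format shown earlier (Query hash: 0eb2d2f2e9111722), the unique hashes can be pulled out with standard tools. The log path is illustrative:

```shell
# Extract the unique query hashes logged in Warn mode
grep -oE 'Query hash: [0-9a-f]+' /var/log/gateway.log | sort -u | awk '{print $3}'
```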

Step 3: Test in Staging

Deploy with the whitelist in Warn mode to staging:

# Monitor for any warnings
grep "Query not in whitelist" /var/log/gateway.log

Step 4: Production Deployment

Once confident, switch to Enforce mode:

.with_query_whitelist(QueryWhitelistConfig {
    mode: WhitelistMode::Enforce,
    allowed_queries: load_queries()?,
    allow_introspection: false,  // Disable in production
})

Troubleshooting

Query Rejected Despite Being in Whitelist

Problem: Query is in the whitelist but still gets rejected.

Solution: Ensure the query matches the whitelisted text. On v0.3.7+ queries are normalized before hashing, so whitespace and comments are ignored; on older versions the string must match exactly, including whitespace. Alternatively, use operation IDs, which avoid hash matching entirely.

Too Many Warnings in Warn Mode

Problem: Logs are flooded with warnings.

Solution: This is expected when first implementing. Collect all unique queries and add them to the whitelist.

Performance Impact

Problem: Concerned about validation overhead.

Solution: Validation is a single SHA-256 hash of the query plus a hash-map lookup, which takes on the order of microseconds per request. Even at 1000 RPS the added latency is well under 1ms per request.

Example: Complete Production Setup

use grpc_graphql_gateway::{Gateway, QueryWhitelistConfig, WhitelistMode};
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Determine mode from environment
    let is_production = std::env::var("ENV")
        .map(|e| e == "production")
        .unwrap_or(false);
    
    // Load whitelist configuration
    let whitelist_config = if Path::new("config/allowed_queries.json").exists() {
        QueryWhitelistConfig::from_json_file(
            "config/allowed_queries.json",
            if is_production {
                WhitelistMode::Enforce
            } else {
                WhitelistMode::Warn
            }
        )?
    } else {
        QueryWhitelistConfig::disabled()
    };
    
    // Build gateway with production settings
    let gateway = Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .with_query_whitelist(whitelist_config)
        .with_response_cache(CacheConfig::default())
        .with_circuit_breaker(CircuitBreakerConfig::default())
        .with_compression(CompressionConfig::default())
        .build()?;
    
    gateway.serve("0.0.0.0:8888").await?;
    Ok(())
}

Middleware

The gateway supports an extensible middleware system for authentication, logging, rate limiting, and custom request processing.

Built-in Middleware

Rate Limiting

use grpc_graphql_gateway::{Gateway, RateLimitMiddleware};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .add_middleware(RateLimitMiddleware::new(
        100,                          // Max requests
        Duration::from_secs(60),      // Per time window
    ))
    .build()?;

Custom Middleware

Implement the Middleware trait:

use grpc_graphql_gateway::middleware::{Middleware, Context};
use async_trait::async_trait;
use futures::future::BoxFuture;

struct AuthMiddleware {
    secret_key: String,
}

#[async_trait]
impl Middleware for AuthMiddleware {
    async fn call(
        &self,
        ctx: &mut Context,
        next: Box<dyn Fn(&mut Context) -> BoxFuture<'_, Result<()>>>,
    ) -> Result<()> {
        // Extract token from headers
        let token = ctx.headers()
            .get("authorization")
            .and_then(|v| v.to_str().ok())
            .ok_or_else(|| Error::Unauthorized)?;
        
        // Validate token
        let user = validate_jwt(token, &self.secret_key)?;
        
        // Add user info to context extensions
        ctx.extensions_mut().insert(user);
        
        // Continue to next middleware/handler
        next(ctx).await
    }
}

let gateway = Gateway::builder()
    .add_middleware(AuthMiddleware { secret_key: "secret".into() })
    .build()?;

Middleware Chain

Middlewares execute in order of registration:

Gateway::builder()
    .add_middleware(LoggingMiddleware)      // 1st: Log request
    .add_middleware(AuthMiddleware)         // 2nd: Authenticate
    .add_middleware(RateLimitMiddleware)    // 3rd: Rate limit
    .build()?
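That ordering can be shown with a simplified, dependency-free sketch, with plain functions standing in for the async Middleware trait:

```rust
// Simplified model of the middleware chain: each middleware records its
// name before the resolver runs, demonstrating registration order.
type MiddlewareFn = fn(&mut Vec<String>);

fn main() {
    // Hypothetical middlewares that just record their name
    let logging: MiddlewareFn = |log| log.push("logging".into());
    let auth: MiddlewareFn = |log| log.push("auth".into());
    let rate_limit: MiddlewareFn = |log| log.push("rate_limit".into());

    // Middlewares run in registration order, then the resolver
    let chain = [logging, auth, rate_limit];
    let mut log = Vec::new();
    for mw in chain {
        mw(&mut log);
    }
    log.push("resolver".into());
    assert_eq!(log, ["logging", "auth", "rate_limit", "resolver"]);
}
```

In the real gateway each middleware also wraps the rest of the chain via next(ctx).await, so post-processing (as in the logging example below) runs in reverse registration order on the way out.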

Context Object

The Context provides access to:

| Method | Description |
|--------|-------------|
| headers() | HTTP request headers |
| extensions() | Shared data between middlewares |
| extensions_mut() | Mutable access to extensions |

Logging Middleware Example

struct LoggingMiddleware;

#[async_trait]
impl Middleware for LoggingMiddleware {
    async fn call(
        &self,
        ctx: &mut Context,
        next: Box<dyn Fn(&mut Context) -> BoxFuture<'_, Result<()>>>,
    ) -> Result<()> {
        let start = std::time::Instant::now();
        
        let result = next(ctx).await;
        
        tracing::info!(
            duration_ms = start.elapsed().as_millis(),
            success = result.is_ok(),
            "GraphQL request completed"
        );
        
        result
    }
}

Error Handling

Return errors from middleware to reject requests:

#[async_trait]
impl Middleware for AuthMiddleware {
    async fn call(
        &self,
        ctx: &mut Context,
        next: Box<dyn Fn(&mut Context) -> BoxFuture<'_, Result<()>>>,
    ) -> Result<()> {
        if !self.is_authorized(ctx) {
            return Err(Error::new("Unauthorized").extend_with(|_, e| {
                e.set("code", "UNAUTHORIZED");
            }));
        }
        
        next(ctx).await
    }
}

Error Handler

Set a global error handler for logging or transforming errors:

Gateway::builder()
    .with_error_handler(|errors| {
        for error in &errors {
            tracing::error!(
                message = %error.message,
                "GraphQL error"
            );
        }
    })
    .build()?

Health Checks

Enable Kubernetes-compatible health check endpoints for container orchestration.

Enabling Health Checks

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .enable_health_checks()
    .add_grpc_client("service", client)
    .build()?;

Endpoints

| Endpoint | Purpose | Success Response |
|----------|---------|------------------|
| GET /health | Liveness probe | 200 OK if server is running |
| GET /ready | Readiness probe | 200 OK if gRPC clients configured |

Response Format

{
  "status": "healthy",
  "components": {
    "grpc_clients": {
      "status": "healthy",
      "count": 3
    }
  }
}

Kubernetes Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-gateway
spec:
  template:
    spec:
      containers:
        - name: gateway
          image: your-gateway:latest
          ports:
            - containerPort: 8888
          livenessProbe:
            httpGet:
              path: /health
              port: 8888
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /ready
              port: 8888
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 3

Health States

| State | Description |
|-------|-------------|
| healthy | All components working |
| degraded | Partial functionality |
| unhealthy | Service unavailable |

Custom Health Checks

The gateway automatically checks:

  • Server is running (liveness)
  • gRPC clients are configured (readiness)

For additional checks, consider using middleware or external health check services.

Load Balancer Integration

Health endpoints work with:

  • AWS ALB/NLB health checks
  • Google Cloud Load Balancer
  • Azure Load Balancer
  • HAProxy/Nginx health checks

Testing Health Endpoints

# Liveness check
curl http://localhost:8888/health
# {"status":"healthy"}

# Readiness check  
curl http://localhost:8888/ready
# {"status":"healthy","components":{"grpc_clients":{"status":"healthy","count":2}}}

Prometheus Metrics

Enable a /metrics endpoint exposing Prometheus-compatible metrics.

Enabling Metrics

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .enable_metrics()
    .build()?;

Available Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| graphql_requests_total | Counter | operation_type | Total GraphQL requests |
| graphql_request_duration_seconds | Histogram | operation_type | Request latency |
| graphql_errors_total | Counter | error_type | Total GraphQL errors |
| grpc_backend_requests_total | Counter | service, method | gRPC backend calls |
| grpc_backend_duration_seconds | Histogram | service, method | gRPC latency |

Prometheus Scrape Configuration

scrape_configs:
  - job_name: 'graphql-gateway'
    static_configs:
      - targets: ['gateway:8888']
    metrics_path: '/metrics'
    scrape_interval: 15s

Example Metrics Output

# HELP graphql_requests_total Total number of GraphQL requests
# TYPE graphql_requests_total counter
graphql_requests_total{operation_type="query"} 1523
graphql_requests_total{operation_type="mutation"} 234
graphql_requests_total{operation_type="subscription"} 56

# HELP graphql_request_duration_seconds Request duration in seconds
# TYPE graphql_request_duration_seconds histogram
graphql_request_duration_seconds_bucket{operation_type="query",le="0.01"} 1200
graphql_request_duration_seconds_bucket{operation_type="query",le="0.05"} 1480
graphql_request_duration_seconds_bucket{operation_type="query",le="0.1"} 1510
graphql_request_duration_seconds_bucket{operation_type="query",le="+Inf"} 1523

# HELP grpc_backend_requests_total Total gRPC backend calls
# TYPE grpc_backend_requests_total counter
grpc_backend_requests_total{service="UserService",method="GetUser"} 892
grpc_backend_requests_total{service="ProductService",method="GetProduct"} 631

Grafana Dashboard

Create dashboards for:

  • Request rate and latency percentiles
  • Error rates by type
  • gRPC backend health
  • Operation type distribution

Example Queries

Request Rate:

rate(graphql_requests_total[5m])

P99 Latency:

histogram_quantile(0.99, rate(graphql_request_duration_seconds_bucket[5m]))

Error Rate:

rate(graphql_errors_total[5m]) / rate(graphql_requests_total[5m])

Programmatic Access

Use the metrics API directly:

use grpc_graphql_gateway::{GatewayMetrics, RequestTimer};

// Record custom metrics
let timer = GatewayMetrics::global().start_request_timer("query");
// ... process request
timer.observe_duration();

// Record gRPC calls
let grpc_timer = GatewayMetrics::global().start_grpc_timer("UserService", "GetUser");
// ... make gRPC call
grpc_timer.observe_duration();

OpenTelemetry Tracing

Enable distributed tracing for end-to-end visibility across your system.

Setting Up Tracing

use grpc_graphql_gateway::{Gateway, TracingConfig, init_tracer, shutdown_tracer};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the tracer
    let config = TracingConfig::new()
        .with_service_name("my-gateway")
        .with_sample_ratio(1.0);  // Sample all requests

    let _provider = init_tracer(&config);

    let gateway = Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .enable_tracing()
        .build()?;

    gateway.serve("0.0.0.0:8888").await?;

    // Shutdown on exit
    shutdown_tracer();
    Ok(())
}

Spans Created

| Span | Kind | Description |
|------|------|-------------|
| graphql.query | Server | GraphQL query operation |
| graphql.mutation | Server | GraphQL mutation operation |
| grpc.call | Client | gRPC backend call |

Span Attributes

GraphQL Spans

| Attribute | Description |
|-----------|-------------|
| graphql.operation.name | The operation name, if provided |
| graphql.operation.type | query, mutation, or subscription |
| graphql.document | The GraphQL query (truncated) |

gRPC Spans

| Attribute | Description |
|-----------|-------------|
| rpc.service | gRPC service name |
| rpc.method | gRPC method name |
| rpc.grpc.status_code | gRPC status code |

OTLP Export

Enable OTLP export by adding the feature:

[dependencies]
grpc_graphql_gateway = { version = "0.2", features = ["otlp"] }

Then configure the exporter:

use grpc_graphql_gateway::TracingConfig;

let config = TracingConfig::new()
    .with_service_name("my-gateway")
    .with_otlp_endpoint("http://jaeger:4317");

Jaeger Integration

Run Jaeger locally:

docker run -d --name jaeger \
  -p 4317:4317 \
  -p 16686:16686 \
  jaegertracing/all-in-one:1.47

View traces at: http://localhost:16686

Sampling Configuration

| Sample Ratio | Description |
|--------------|-------------|
| 1.0 | Sample all requests (dev) |
| 0.1 | Sample 10% (staging) |
| 0.01 | Sample 1% (production) |

TracingConfig::new()
    .with_sample_ratio(0.1)  // 10% sampling

Context Propagation

The gateway automatically propagates trace context:

  • Incoming HTTP headers (traceparent, tracestate)
  • Outgoing gRPC metadata

Enable header propagation to forward distributed tracing headers to downstream services.

Introspection Control

Disable GraphQL introspection in production to prevent schema discovery attacks.

Disabling Introspection

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .disable_introspection()
    .build()?;

What It Blocks

When introspection is disabled, these queries return errors:

# Blocked
{
  __schema {
    types {
      name
    }
  }
}

# Blocked
{
  __type(name: "User") {
    fields {
      name
    }
  }
}

Error Response

{
  "errors": [
    {
      "message": "Introspection is disabled",
      "extensions": {
        "code": "INTROSPECTION_DISABLED"
      }
    }
  ]
}

Environment-Based Toggle

Enable introspection only in development:

let is_production = std::env::var("ENV")
    .map(|e| e == "production")
    .unwrap_or(false);

let mut builder = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS);

if is_production {
    builder = builder.disable_introspection();
}

let gateway = builder.build()?;

Security Benefits

Disabling introspection:

  • Prevents attackers from discovering your schema structure
  • Reduces attack surface for GraphQL-specific exploits
  • Hides internal type names and field descriptions

When to Disable

| Environment | Introspection |
|-------------|---------------|
| Development | ✅ Enabled |
| Staging | ⚠️ Consider disabling |
| Production | ❌ Disabled |

Alternative: Authorization

Instead of fully disabling, you can selectively allow introspection:

use std::collections::HashSet;

struct IntrospectionMiddleware {
    allowed_keys: HashSet<String>,
}

#[async_trait]
impl Middleware for IntrospectionMiddleware {
    async fn call(
        &self,
        ctx: &mut Context,
        next: Box<dyn Fn(&mut Context) -> BoxFuture<'_, Result<()>>>,
    ) -> Result<()> {
        // Only gate introspection queries; everything else passes through
        if is_introspection_query(ctx) {
            let api_key = ctx.headers()
                .get("x-api-key")
                .and_then(|v| v.to_str().ok())
                .unwrap_or("");
            if !self.allowed_keys.contains(api_key) {
                return Err(Error::new("Introspection not allowed"));
            }
        }
        next(ctx).await
    }
}

See Also

  • Query Whitelisting - For maximum security, combine introspection control with query whitelisting to restrict both schema discovery and query execution
  • DoS Protection - Query depth and complexity limits

REST API Connectors

The gateway supports REST API Connectors, enabling hybrid architectures where GraphQL fields can resolve data from both gRPC services and REST APIs. This is perfect for gradual migrations, integrating third-party APIs, or bridging legacy systems.

Quick Start

use grpc_graphql_gateway::{Gateway, RestConnector, RestEndpoint, HttpMethod};
use std::time::Duration;

let rest_connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .timeout(Duration::from_secs(30))
    .default_header("Accept", "application/json")
    .add_endpoint(RestEndpoint::new("getUser", "/users/{id}")
        .method(HttpMethod::GET)
        .response_path("$.data")
        .description("Fetch a user by ID"))
    .add_endpoint(RestEndpoint::new("createUser", "/users")
        .method(HttpMethod::POST)
        .body_template(r#"{"name": "{name}", "email": "{email}"}"#))
    .build()?;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .add_rest_connector("users_api", rest_connector)
    .add_grpc_client("UserService", grpc_client)
    .build()?;

GraphQL Schema Integration

REST endpoints are automatically exposed as GraphQL fields. The gateway generates:

  • Query fields for GET endpoints
  • Mutation fields for POST/PUT/PATCH/DELETE endpoints

Field names use the endpoint name directly (e.g., getUser, createPost).

Example GraphQL Queries

# Query a REST endpoint (GET /users/{id})
query {
  getUser(id: "123")
}

# Mutation to create via REST (POST /users)
mutation {
  createUser(name: "Alice", email: "alice@example.com")
}

Example Response

{
  "data": {
    "getUser": {
      "id": 123,
      "name": "Alice",
      "email": "alice@example.com"
    }
  }
}

REST responses are returned as the JSON scalar type, preserving the full structure from the API.

RestConnector

The RestConnector is the main entry point for REST API integration.

Builder Methods

| Method | Description |
|--------|-------------|
| base_url(url) | Required. Base URL for all endpoints |
| timeout(duration) | Default timeout (default: 30s) |
| default_header(key, value) | Add a header to all requests |
| retry(config) | Custom retry configuration |
| no_retry() | Disable retries |
| log_bodies(true) | Enable request/response body logging |
| with_cache(size) | Enable LRU response cache for GET requests |
| interceptor(interceptor) | Add a request interceptor |
| transformer(transformer) | Custom response transformer |
| add_endpoint(endpoint) | Add a REST endpoint |

RestEndpoint

Define individual REST endpoints with flexible configuration.

use grpc_graphql_gateway::{RestEndpoint, HttpMethod};

let endpoint = RestEndpoint::new("getUser", "/users/{id}")
    .method(HttpMethod::GET)
    .header("X-Custom-Header", "value")
    .query_param("include", "profile")
    .response_path("$.data.user")
    .timeout(Duration::from_secs(10))
    .description("Fetch a user by ID")
    .return_type("User");

Path Templates

Use {variable} placeholders in paths:

RestEndpoint::new("getOrder", "/users/{userId}/orders/{orderId}")

When called with { "userId": "123", "orderId": "456" }, resolves to:

/users/123/orders/456
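The substitution itself is straightforward. A minimal sketch (not the connector's implementation, which would also need to URL-encode values):

```rust
use std::collections::HashMap;

// Illustrative {variable} substitution for path templates.
fn resolve_path(template: &str, args: &HashMap<&str, &str>) -> String {
    let mut path = template.to_string();
    for (key, value) in args {
        // Replace every "{key}" placeholder with its argument value
        path = path.replace(&format!("{{{}}}", key), value);
    }
    path
}

fn main() {
    let mut args = HashMap::new();
    args.insert("userId", "123");
    args.insert("orderId", "456");
    assert_eq!(
        resolve_path("/users/{userId}/orders/{orderId}", &args),
        "/users/123/orders/456"
    );
}
```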

Query Parameters

Add templated query parameters:

RestEndpoint::new("searchUsers", "/users")
    .query_param("q", "{query}")
    .query_param("limit", "{limit}")

Body Templates

For POST/PUT/PATCH, define request body templates:

RestEndpoint::new("createUser", "/users")
    .method(HttpMethod::POST)
    .body_template(r#"{
        "name": "{name}",
        "email": "{email}",
        "role": "{role}"
    }"#)

If no body template is provided, arguments are automatically serialized as JSON.

Response Extraction

Extract nested data from responses using JSONPath:

// API returns: { "status": "ok", "data": { "user": { "id": "123" } } }
RestEndpoint::new("getUser", "/users/{id}")
    .response_path("$.data.user")  // Returns just the user object

Supported JSONPath:

  • $.field - Access field
  • $.field.nested - Nested access
  • $.array[0] - Array index
  • $.array[0].field - Combined
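The subset above is small enough to model in a few lines. The sketch below uses a toy JSON type so the example stays dependency-free (a real connector would operate on serde_json values); it is an illustration of the path semantics, not the gateway's parser.

```rust
use std::collections::HashMap;

// Toy JSON type for the example
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Str(String),
    Obj(HashMap<String, Json>),
    Arr(Vec<Json>),
}

// Walk a path like "$.data.user" or "$.items[0].name"
fn extract<'a>(root: &'a Json, path: &str) -> Option<&'a Json> {
    let mut current = root;
    for segment in path.trim_start_matches('$').split('.').filter(|s| !s.is_empty()) {
        // Split an optional "[index]" suffix off the field name
        let (field, index) = match segment.find('[') {
            Some(i) => {
                let idx: usize = segment[i + 1..segment.len() - 1].parse().ok()?;
                (&segment[..i], Some(idx))
            }
            None => (segment, None),
        };
        if !field.is_empty() {
            current = match current {
                Json::Obj(map) => map.get(field)?,
                _ => return None,
            };
        }
        if let Some(idx) = index {
            current = match current {
                Json::Arr(items) => items.get(idx)?,
                _ => return None,
            };
        }
    }
    Some(current)
}

fn main() {
    // { "data": { "user": { "id": "123" } } }
    let mut user = HashMap::new();
    user.insert("id".to_string(), Json::Str("123".to_string()));
    let mut data = HashMap::new();
    data.insert("user".to_string(), Json::Obj(user));
    let mut root = HashMap::new();
    root.insert("data".to_string(), Json::Obj(data));
    let doc = Json::Obj(root);

    assert_eq!(extract(&doc, "$.data.user.id"), Some(&Json::Str("123".to_string())));
    assert!(extract(&doc, "$.missing").is_none());
}
```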

Typed Responses

By default, REST endpoints return a JSON scalar blob. To enable field selection in GraphQL queries (e.g. { getUser { name email } }), you can define a response schema:

use grpc_graphql_gateway::{RestResponseSchema, RestResponseField};

RestEndpoint::new("getUser", "/users/{id}")
    .with_response_schema(RestResponseSchema::new("User")
        .field(RestResponseField::int("id"))
        .field(RestResponseField::string("name"))
        .field(RestResponseField::string("email"))
        // Define a nested object field
        .field(RestResponseField::object("address", "Address"))
    )

This registers a User type in the schema and allows clients to select only the fields they need.

Mutations vs Queries

Endpoints are automatically classified:

  • Queries: GET requests
  • Mutations: POST, PUT, PATCH, DELETE

Override explicitly:

// Force a POST to be a query (e.g., search endpoint)
RestEndpoint::new("searchUsers", "/users/search")
    .method(HttpMethod::POST)
    .as_query()

HTTP Methods

use grpc_graphql_gateway::HttpMethod;

HttpMethod::GET      // Read operations
HttpMethod::POST     // Create operations
HttpMethod::PUT      // Full update
HttpMethod::PATCH    // Partial update
HttpMethod::DELETE   // Delete operations

Authentication

Bearer Token

use grpc_graphql_gateway::{RestConnector, BearerAuthInterceptor};
use std::sync::Arc;

let connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .interceptor(Arc::new(BearerAuthInterceptor::new("your-token")))
    .build()?;

The interceptor adds: Authorization: Bearer your-token

API Key

use grpc_graphql_gateway::{RestConnector, ApiKeyInterceptor};
use std::sync::Arc;

let connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .interceptor(Arc::new(ApiKeyInterceptor::x_api_key("your-api-key")))
    .build()?;

The interceptor adds: X-API-Key: your-api-key

Custom Interceptor

Implement the RequestInterceptor trait for custom auth:

use grpc_graphql_gateway::{RequestInterceptor, RestRequest, Result};
use async_trait::async_trait;

struct CustomAuthInterceptor {
    // Your auth logic
}

#[async_trait]
impl RequestInterceptor for CustomAuthInterceptor {
    async fn intercept(&self, request: &mut RestRequest) -> Result<()> {
        // Add custom headers, modify URL, etc.
        request.headers.insert(
            "X-Custom-Auth".to_string(),
            "custom-value".to_string()
        );
        Ok(())
    }
}

Retry Configuration

Configure automatic retries with exponential backoff:

use grpc_graphql_gateway::{RestConnector, RetryConfig};
use std::time::Duration;

let connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .retry(RetryConfig {
        max_retries: 3,
        initial_backoff: Duration::from_millis(100),
        max_backoff: Duration::from_secs(10),
        multiplier: 2.0,
        retry_statuses: vec![429, 500, 502, 503, 504],
    })
    .build()?;
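The wait schedule this configuration implies can be sketched as backoff(n) = min(initial × multiplier^n, max_backoff). Whether the first retry waits the initial backoff or one multiplier step beyond it is an assumption of this sketch:

```rust
use std::time::Duration;

// Compute the waits before each retry under exponential backoff.
fn backoff_schedule(
    max_retries: u32,
    initial: Duration,
    max: Duration,
    multiplier: f64,
) -> Vec<Duration> {
    (0..max_retries)
        .map(|attempt| {
            let ms = initial.as_millis() as f64 * multiplier.powi(attempt as i32);
            // Cap every wait at max_backoff
            Duration::from_millis(ms as u64).min(max)
        })
        .collect()
}

fn main() {
    let schedule =
        backoff_schedule(3, Duration::from_millis(100), Duration::from_secs(10), 2.0);
    // Waits before retries 1, 2, and 3
    assert_eq!(
        schedule,
        vec![
            Duration::from_millis(100),
            Duration::from_millis(200),
            Duration::from_millis(400),
        ]
    );
}
```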

Preset Configurations

// Disable retries
RetryConfig::disabled()

// Aggressive retries for critical endpoints
RetryConfig::aggressive()

Response Caching

Enable LRU caching for GET requests:

let connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .with_cache(1000)  // Cache up to 1000 responses
    .build()?;

// Clear cache manually
connector.clear_cache().await;

Cache keys are based on endpoint name + arguments.
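One way to picture such a key (the actual format is internal to the connector) is the endpoint name joined with its arguments in a deterministic order, so argument ordering can't produce different keys for the same call:

```rust
use std::collections::BTreeMap;

// Illustrative cache key: endpoint name plus arguments. BTreeMap keeps
// the arguments sorted so the key is order-independent.
fn cache_key(endpoint: &str, args: &BTreeMap<String, String>) -> String {
    let mut key = endpoint.to_string();
    for (name, value) in args {
        key.push_str(&format!(":{}={}", name, value));
    }
    key
}

fn main() {
    let mut args = BTreeMap::new();
    args.insert("id".to_string(), "123".to_string());
    args.insert("include".to_string(), "profile".to_string());
    assert_eq!(cache_key("getUser", &args), "getUser:id=123:include=profile");
}
```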

Multiple Connectors

Register multiple REST connectors for different services:

let users_api = RestConnector::builder()
    .base_url("https://users.example.com")
    .add_endpoint(RestEndpoint::new("getUser", "/users/{id}"))
    .build()?;

let products_api = RestConnector::builder()
    .base_url("https://products.example.com")
    .add_endpoint(RestEndpoint::new("getProduct", "/products/{id}"))
    .build()?;

let orders_api = RestConnector::builder()
    .base_url("https://orders.example.com")
    .add_endpoint(RestEndpoint::new("getOrder", "/orders/{id}"))
    .build()?;

let gateway = Gateway::builder()
    .add_rest_connector("users", users_api)
    .add_rest_connector("products", products_api)
    .add_rest_connector("orders", orders_api)
    .build()?;

Executing Endpoints

Execute endpoints programmatically:

use std::collections::HashMap;
use serde_json::json;

let mut args = HashMap::new();
args.insert("id".to_string(), json!("123"));

let result = connector.execute("getUser", args).await?;

Custom Response Transformer

Transform responses before returning to GraphQL:

use grpc_graphql_gateway::{ResponseTransformer, RestResponse, Result};
use async_trait::async_trait;
use serde_json::Value as JsonValue;
use std::sync::Arc;

struct SnakeToCamelTransformer;

#[async_trait]
impl ResponseTransformer for SnakeToCamelTransformer {
    async fn transform(&self, endpoint: &str, response: RestResponse) -> Result<JsonValue> {
        // Transform snake_case keys to camelCase
        Ok(transform_keys(response.body))
    }
}

let connector = RestConnector::builder()
    .base_url("https://api.example.com")
    .transformer(Arc::new(SnakeToCamelTransformer))
    .build()?;

Use Cases

| Scenario | Description |
|----------|-------------|
| Hybrid Architecture | Mix gRPC and REST backends in one GraphQL API |
| Gradual Migration | Migrate from REST to gRPC incrementally |
| Third-Party APIs | Integrate external REST APIs (Stripe, Twilio, etc.) |
| Legacy Systems | Bridge legacy REST services with modern infrastructure |
| Multi-Protocol | Support teams using different backend technologies |

Best Practices

  1. Set Appropriate Timeouts: Use shorter timeouts for internal services, longer for external APIs.

  2. Enable Retries for Idempotent Operations: GET, PUT, DELETE are typically safe to retry.

  3. Use Response Extraction: Extract only needed data with response_path to reduce payload size.

  4. Cache Read-Heavy Endpoints: Enable caching for frequently-accessed, rarely-changing data.

  5. Secure Credentials: Use environment variables for API keys and tokens, not hardcoded values.

  6. Log Bodies in Development Only: Enable log_bodies only in development to avoid leaking sensitive data.

OpenAPI Integration

The gateway can automatically generate REST connectors from OpenAPI (Swagger) specification files. This enables quick integration of REST APIs without manual endpoint configuration.

Supported Formats

| Format | Extension | Feature Required |
|--------|-----------|------------------|
| OpenAPI 3.0.x | .json | None |
| OpenAPI 3.1.x | .json | None |
| Swagger 2.0 | .json | None |
| YAML (any version) | .yaml, .yml | yaml |

Quick Start

use grpc_graphql_gateway::{Gateway, OpenApiParser};

// Parse OpenAPI spec and create REST connector
let connector = OpenApiParser::from_file("petstore.yaml")?
    .with_base_url("https://api.petstore.io/v2")
    .build()?;

let gateway = Gateway::builder()
    .add_rest_connector("petstore", connector)
    .build()?;

Loading Options

From a File

// JSON file
let connector = OpenApiParser::from_file("api.json")?.build()?;

// YAML file (requires 'yaml' feature)
let connector = OpenApiParser::from_file("api.yaml")?.build()?;

From a URL

let connector = OpenApiParser::from_url("https://api.example.com/openapi.json")
    .await?
    .build()?;

From a String

let json_content = r#"{"openapi": "3.0.0", ...}"#;
let connector = OpenApiParser::from_string(json_content, false)?.build()?;

// For YAML content
let yaml_content = "openapi: '3.0.0'\n...";
let connector = OpenApiParser::from_string(yaml_content, true)?.build()?;

From JSON Value

let json_value: serde_json::Value = serde_json::from_str(content)?;
let connector = OpenApiParser::from_json(json_value)?.build()?;

Configuration Options

Base URL Override

Override the server URL from the spec:

let connector = OpenApiParser::from_file("api.json")?
    .with_base_url("https://api.staging.example.com")  // Use staging
    .build()?;

Timeout

Set a default timeout for all endpoints:

use std::time::Duration;

let connector = OpenApiParser::from_file("api.json")?
    .with_timeout(Duration::from_secs(60))
    .build()?;

Operation Prefix

Add a prefix to all operation names to avoid conflicts:

let connector = OpenApiParser::from_file("petstore.json")?
    .with_prefix("petstore_")  // listPets -> petstore_listPets
    .build()?;

Filtering Operations

By Tags

Only include operations with specific tags:

let connector = OpenApiParser::from_file("api.json")?
    .with_tags(vec!["pets".to_string(), "store".to_string()])
    .build()?;

Custom Filter

Use a predicate function for fine-grained control:

let connector = OpenApiParser::from_file("api.json")?
    .filter_operations(|operation_id, path| {
        // Only include non-deprecated v2 endpoints
        !operation_id.contains("deprecated") && path.starts_with("/api/v2")
    })
    .build()?;

What Gets Generated

The parser automatically generates:

Endpoints

Each path operation becomes a GraphQL field:

| OpenAPI | GraphQL |
|---|---|
| GET /pets | listPets query |
| POST /pets | createPet mutation |
| GET /pets/{petId} | getPet query |
| DELETE /pets/{petId} | deletePet mutation |

Arguments

  • Path parameters β†’ Required field arguments
  • Query parameters β†’ Optional field arguments
  • Request body β†’ Input arguments (auto-templated)

Response Types

Response schemas are converted to GraphQL types:

# OpenAPI
Pet:
  type: object
  properties:
    id:
      type: integer
    name:
      type: string
    tag:
      type: string

Becomes a GraphQL type with field selection.

Listing Operations

Before building, you can list all available operations:

let parser = OpenApiParser::from_file("api.json")?;

for op in parser.list_operations() {
    println!("{}: {} {} (tags: {:?})", 
        op.operation_id, 
        op.method, 
        op.path,
        op.tags
    );
    if let Some(summary) = op.summary {
        println!("  {}", summary);
    }
}

Accessing Spec Information

let parser = OpenApiParser::from_file("api.json")?;

let info = parser.info();
println!("API: {} v{}", info.title, info.version);
if let Some(desc) = &info.description {
    println!("Description: {}", desc);
}

YAML Support

To enable YAML parsing, add the yaml feature:

[dependencies]
grpc_graphql_gateway = { version = "0.3", features = ["yaml"] }

Example: Petstore Integration

use grpc_graphql_gateway::{Gateway, OpenApiParser};
use std::time::Duration;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Parse the Petstore OpenAPI spec
    let petstore = OpenApiParser::from_url(
        "https://petstore3.swagger.io/api/v3/openapi.json"
    )
    .await?
    .with_base_url("https://petstore3.swagger.io/api/v3")
    .with_timeout(Duration::from_secs(30))
    .with_tags(vec!["pet".to_string()])  // Only pet operations
    .with_prefix("pet_")                  // Namespace operations
    .build()?;

    // Create and serve the gateway (DESCRIPTORS and grpc_client come from
    // your own codegen and setup, as in the earlier examples)
    Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .add_grpc_client("service", grpc_client)
        .add_rest_connector("petstore", petstore)
        .serve("0.0.0.0:8888".to_string())
        .await?;

    Ok(())
}

Multiple REST APIs

Combine multiple OpenAPI specs:

// Payment API
let stripe = OpenApiParser::from_file("stripe-openapi.json")?
    .with_prefix("stripe_")
    .build()?;

// Email API
let sendgrid = OpenApiParser::from_file("sendgrid-openapi.json")?
    .with_prefix("email_")
    .build()?;

// User service (gRPC)
let gateway = Gateway::builder()
    .with_descriptor_set_bytes(USER_DESCRIPTORS)
    .add_grpc_client("users", users_client)
    .add_rest_connector("stripe", stripe)
    .add_rest_connector("sendgrid", sendgrid)
    .build()?;

Best Practices

  1. Use prefixes when combining multiple APIs to avoid naming conflicts

  2. Filter by tags to include only the operations you need

  3. Override base URLs for different environments (dev, staging, prod)

  4. Check available operations before building to understand what will be generated

  5. Enable YAML feature only if you need it (adds serde_yaml dependency)

Limitations

  • Authentication: OpenAPI security schemes are not automatically applied. Use request interceptors for auth.
  • Complex schemas: Very complex schemas (allOf, oneOf, anyOf) may be simplified.
  • Webhooks: OpenAPI 3.1 webhooks are not supported.
  • Callbacks: Async callbacks are not supported.

Query Cost Analysis

This guide explains how to use the Query Cost Analyzer to track, analyze, and enforce cost budgets for GraphQL queries, preventing expensive queries from spiking infrastructure costs.

Overview

The Query Cost Analyzer assigns a "cost" to each GraphQL query based on its complexity and enforces budgets at both the query level and per-user level. This prevents expensive queries from overwhelming your infrastructure and helps maintain predictable costs.

Key Benefits

  • Prevent Cost Spikes: Block queries that exceed cost thresholds
  • User Budget Enforcement: Limit query costs per user over time windows
  • Adaptive Cost Multipliers: Automatically increase costs during high system load
  • Cost Analytics: Track query costs and identify expensive patterns
  • Database Protection: Prevent over-provisioning by blocking runaway queries

Configuration

use grpc_graphql_gateway::{Gateway, QueryCostConfig};
use std::time::Duration;
use std::collections::HashMap;

let mut field_multipliers = HashMap::new();
field_multipliers.insert("user.posts".to_string(), 50); // 50x cost multiplier
field_multipliers.insert("posts.comments".to_string(), 100); // 100x cost multiplier

let cost_config = QueryCostConfig {
    max_cost_per_query: 1000,        // Reject queries above this cost
    base_cost_per_field: 1,          // Base cost per field
    field_cost_multipliers: field_multipliers,
    user_cost_budget: 10_000,        // Max cost per user per window
    budget_window: Duration::from_secs(60), // 1 minute window
    track_expensive_queries: true,   // Log costly queries
    expensive_percentile: 0.95,      // 95th percentile = "expensive"
    adaptive_costs: true,            // Increase costs during high load
    high_load_multiplier: 2.0,       // 2x cost during peak load
};
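The tallying rule this configuration implies can be sketched in isolation. The function and field list below are illustrative assumptions; the real analyzer walks the parsed query to derive field paths:

```rust
use std::collections::HashMap;

// Illustrative tally: each field costs base_cost_per_field, scaled by any
// matching entry in field_cost_multipliers. Field paths are supplied
// directly here rather than extracted from a parsed query.
fn query_cost(fields: &[&str], base: u64, multipliers: &HashMap<&str, u64>) -> u64 {
    fields
        .iter()
        .map(|f| base * multipliers.get(f).copied().unwrap_or(1))
        .sum()
}

fn main() {
    let mut multipliers = HashMap::new();
    multipliers.insert("user.posts", 50);
    multipliers.insert("posts.comments", 100);

    // user, user.id, user.name cost 1 each; the multiplied fields add 50 + 100.
    let fields = ["user", "user.id", "user.name", "user.posts", "posts.comments"];
    assert_eq!(query_cost(&fields, 1, &multipliers), 153);
    println!("total cost: {}", query_cost(&fields, 1, &multipliers));
}
```

With this rule, a query touching `user.posts` and `posts.comments` quickly approaches `max_cost_per_query`, which is the point of the multipliers.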

Basic Usage

Calculate Query Cost

use grpc_graphql_gateway::QueryCostAnalyzer;

let analyzer = QueryCostAnalyzer::new(cost_config);

let query = r#"
    query {
        user(id: 1) {
            id
            name
            posts {
                id
                title
                comments {
                    id
                    text
                }
            }
        }
    }
"#;

// Calculate cost
match analyzer.calculate_query_cost(query).await {
    Ok(result) => {
        println!("Query cost: {}", result.total_cost);
        println!("Field count: {}", result.field_count);
        println!("Complexity: {}", result.complexity);
    }
    Err(e) => {
        println!("Query rejected: {}", e);
        // Return error to client
    }
}

Enforce User Budgets

let user_id = "user_123";
let query_cost = 250;

// Check if user has budget remaining
match analyzer.check_user_budget(user_id, query_cost).await {
    Ok(()) => {
        // User has budget, execute query
    }
    Err(e) => {
        // User exceeded budget
        println!("Budget exceeded: {}", e);
        // Return rate limit error to client
    }
}
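A minimal sketch of the windowed budget logic behind check_user_budget, assuming a fixed window that resets once it elapses. The struct and method names are hypothetical, not the crate's API:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical per-user budget tracker: spend accumulates until the window
// elapses, then resets. Illustrative only.
struct BudgetTracker {
    budget: u64,
    window: Duration,
    spent: HashMap<String, (Instant, u64)>,
}

impl BudgetTracker {
    fn new(budget: u64, window: Duration) -> Self {
        Self { budget, window, spent: HashMap::new() }
    }

    fn try_spend(&mut self, user: &str, cost: u64) -> Result<(), String> {
        let now = Instant::now();
        let entry = self.spent.entry(user.to_string()).or_insert((now, 0));
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0); // window elapsed: reset the budget
        }
        if entry.1 + cost > self.budget {
            return Err(format!("budget exceeded for {user}"));
        }
        entry.1 += cost;
        Ok(())
    }
}

fn main() {
    let mut tracker = BudgetTracker::new(10_000, Duration::from_secs(60));
    assert!(tracker.try_spend("user_123", 9_000).is_ok());
    assert!(tracker.try_spend("user_123", 2_000).is_err()); // would exceed 10k
    assert!(tracker.try_spend("user_456", 2_000).is_ok());  // separate budget
    println!("ok");
}
```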

Integration with Gateway

Integrate the cost analyzer into your gateway middleware:

use grpc_graphql_gateway::{
    Gateway, QueryCostAnalyzer, QueryCostConfig, Middleware,
};
use axum::{
    extract::Extension,
    http::StatusCode,
    response::{IntoResponse, Response},
    Json,
};
use std::sync::Arc;

// Create cost analyzer
let cost_analyzer = Arc::new(QueryCostAnalyzer::new(QueryCostConfig::default()));

// Middleware to check query costs
async fn cost_check_middleware(
    Extension(analyzer): Extension<Arc<QueryCostAnalyzer>>,
    query: String,
    user_id: String,
) -> Result<(), Response> {
    // Calculate query cost
    let cost_result = match analyzer.calculate_query_cost(&query).await {
        Ok(result) => result,
        Err(e) => {
            return Err((
                StatusCode::BAD_REQUEST,
                Json(serde_json::json!({
                    "error": "Query too complex",
                    "message": e,
                })),
            ).into_response());
        }
    };

    // Check user budget
    if let Err(e) = analyzer.check_user_budget(&user_id, cost_result.total_cost).await {
        return Err((
            StatusCode::TOO_MANY_REQUESTS,
            Json(serde_json::json!({
                "error": "Budget exceeded",
                "message": e,
            })),
        ).into_response());
    }

    Ok(())
}

Field Cost Multipliers

Assign higher costs to expensive fields:

let mut multipliers = HashMap::new();

// Relationship fields (can cause N+1 queries)
multipliers.insert("user.posts".to_string(), 50);
multipliers.insert("user.followers".to_string(), 100);
multipliers.insert("post.comments".to_string(), 50);

// Aggregation fields (expensive computations)
multipliers.insert("analytics".to_string(), 200);
multipliers.insert("statistics".to_string(), 150);

// External API calls
multipliers.insert("thirdPartyData".to_string(), 500);

let config = QueryCostConfig {
    field_cost_multipliers: multipliers,
    ..Default::default()
};

Adaptive Cost Multipliers

Automatically increase costs during high system load:

// Update load factor based on system metrics
let cpu_usage = 0.85; // 85% CPU
let memory_usage = 0.75; // 75% memory

analyzer.update_load_factor(cpu_usage, memory_usage).await;

// Costs will be automatically multiplied by high_load_multiplier
// when average load > 80%
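The rule described above can be sketched as a standalone function. The 80% threshold and the simple averaging are assumptions drawn from the comment, not the crate's implementation:

```rust
// Illustrative adaptive-cost rule: when the average of CPU and memory load
// exceeds 80%, scale costs by high_load_multiplier.
fn effective_cost(base_cost: u64, cpu: f64, mem: f64, high_load_multiplier: f64) -> u64 {
    let avg_load = (cpu + mem) / 2.0;
    if avg_load > 0.8 {
        (base_cost as f64 * high_load_multiplier).round() as u64
    } else {
        base_cost
    }
}

fn main() {
    // 85% CPU + 70% memory averages to 77.5%: below the threshold, no scaling.
    assert_eq!(effective_cost(100, 0.85, 0.70, 2.0), 100);
    // 90% CPU + 80% memory averages to 85%: above the threshold, costs double.
    assert_eq!(effective_cost(100, 0.90, 0.80, 2.0), 200);
    println!("ok");
}
```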

Cost Analytics

Track and analyze query costs:

// Get analytics
let analytics = analyzer.get_analytics().await;

println!("Total queries tracked: {}", analytics.total_queries);
println!("Average cost: {}", analytics.average_cost);
println!("Median cost: {}", analytics.median_cost);
println!("P95 cost: {}", analytics.p95_cost);
println!("P99 cost: {}", analytics.p99_cost);
println!("Max cost: {}", analytics.max_cost);

// Get threshold for "expensive" queries
let expensive_threshold = analyzer.get_expensive_threshold().await;
println!("Queries above {} are considered expensive", expensive_threshold);

Periodic Cleanup

Clean up expired user budgets to prevent memory growth:

use tokio::time::{interval, Duration};

// Run cleanup every 5 minutes
let mut cleanup_interval = interval(Duration::from_secs(300));

tokio::spawn({
    let analyzer = Arc::clone(&cost_analyzer);
    async move {
        loop {
            cleanup_interval.tick().await;
            analyzer.cleanup_expired_budgets().await;
        }
    }
});

Cost Optimization Strategies

1. Set Appropriate Base Costs

QueryCostConfig {
    base_cost_per_field: 1,  // Start with 1, adjust based on your schema
    max_cost_per_query: 1000, // Tune based on 95th percentile
    ..Default::default()
}

2. Identify Expensive Fields

// Get analytics to find expensive query patterns
let analytics = analyzer.get_analytics().await;

// Queries above P95 should be investigated
if query_cost > analytics.p95_cost {
    // Log for review
    println!("Expensive query detected: cost={}, query={}", query_cost, query);
}

3. Use Query Whitelisting

Combine with query whitelisting for production:

// Pre-calculate costs for whitelisted queries
// Reject ad-hoc expensive queries
Gateway::builder()
    .with_query_cost_config(cost_config)
    .with_query_whitelist(whitelist_config)
    .build()?

Cost Impact

By implementing query cost analysis, you can:

| Benefit | Impact |
|---|---|
| Prevent Runaway Queries | Avoid database overload |
| Predictable Costs | No surprise cost spikes |
| Fair Resource Allocation | Per-user budgets prevent abuse |
| Right-Size Infrastructure | Avoid over-provisioning databases |

Estimated Monthly Savings: $200-500 by preventing over-provisioning and database spikes

Example: E-commerce Schema

let mut multipliers = HashMap::new();

// Products (relatively cheap)
multipliers.insert("products".to_string(), 1);
multipliers.insert("product.reviews".to_string(), 20);

// Users (moderate cost)
multipliers.insert("user.orders".to_string(), 50);
multipliers.insert("user.wishlist".to_string(), 10);

// Analytics (expensive)
multipliers.insert("salesAnalytics".to_string(), 500);
multipliers.insert("trendingProducts".to_string(), 200);

let config = QueryCostConfig {
    base_cost_per_field: 1,
    max_cost_per_query: 2000,
    field_cost_multipliers: multipliers,
    user_cost_budget: 20_000,
    budget_window: Duration::from_secs(60),
    ..Default::default()
};

Monitoring

Export cost metrics to Prometheus:

// Add to your metrics collection
gauge!("graphql_query_cost_p95", analytics.p95_cost as f64);
gauge!("graphql_query_cost_p99", analytics.p99_cost as f64);
counter!("graphql_queries_rejected_cost_limit", 1);
counter!("graphql_users_rate_limited_budget", 1);

Best Practices

  1. Start Conservative: Begin with high limits, then tune down based on analytics
  2. Monitor P95/P99: Use percentiles to set thresholds, not max values
  3. Whitelist Production Queries: Pre-approve and optimize expensive queries
  4. Test Under Load: Verify adaptive cost multipliers work as expected
  5. Budget Windows: Use 1-minute windows for APIs, 5-minute+ for dashboards

Live Queries

Live queries provide real-time data updates to clients using the @live directive. When underlying data changes, connected clients automatically receive updated results.

Overview

Unlike traditional GraphQL subscriptions that push specific events, live queries automatically re-execute the query when relevant data mutations occur, sending the complete updated result to the client.

Key Features

  • @live Directive: Add to any query to make it "live"
  • WebSocket Delivery: Real-time updates via /graphql/live endpoint
  • Invalidation-Based: Mutations trigger query re-execution
  • Configurable Strategies: Invalidation, polling, or hash-diff modes
  • Throttling: Prevent flooding clients with too many updates

Quick Start

1. Client: Send a Live Query

Connect to the WebSocket endpoint and subscribe with the @live directive:

const ws = new WebSocket('ws://localhost:9000/graphql/live', 'graphql-transport-ws');

ws.onopen = () => {
  // Initialize connection
  ws.send(JSON.stringify({ type: 'connection_init' }));
};

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  
  if (msg.type === 'connection_ack') {
    // Subscribe with @live query
    ws.send(JSON.stringify({
      id: 'users-live',
      type: 'subscribe',
      payload: {
        query: `query @live {
          users {
            id
            name
            status
          }
        }`
      }
    }));
  }
  
  if (msg.type === 'next') {
    console.log('Received update:', msg.payload.data);
  }
};

2. Proto: Configure Live Query Support

Mark RPC methods as live query compatible:

service UserService {
  rpc ListUsers(Empty) returns (UserList) {
    option (graphql.schema) = { 
      type: QUERY 
      name: "users" 
    };
    option (graphql.live_query) = {
      enabled: true
      strategy: INVALIDATION
      triggers: ["User.create", "User.update", "User.delete"]
      throttle_ms: 100
    };
  }
  
  rpc CreateUser(CreateUserRequest) returns (User) {
    option (graphql.schema) = { 
      type: MUTATION 
      name: "createUser" 
    };
    // Mutations don't need live_query config - they trigger invalidation
  }
}

3. Server: Trigger Invalidation

After mutations, trigger invalidation to notify live queries:

use grpc_graphql_gateway::{InvalidationEvent, LiveQueryStore};

// In your mutation handler
async fn create_user(&self, req: CreateUserRequest) -> Result<User, Status> {
    // ... create user logic ...
    
    // Notify live queries that User data changed
    if let Some(store) = &self.live_query_store {
        store.invalidate(InvalidationEvent::new("User", "create"));
    }
    
    Ok(user)
}

Live Query Strategies

Invalidation

Re-execute the query only when relevant mutations occur:

option (graphql.live_query) = {
  enabled: true
  strategy: INVALIDATION
  triggers: ["User.update", "User.delete"]
};

Polling

Periodically re-execute query at fixed intervals:

option (graphql.live_query) = {
  enabled: true
  strategy: POLLING
  poll_interval_ms: 5000  // Every 5 seconds
};

Hash Diff

Only send updates if result actually changed:

option (graphql.live_query) = {
  enabled: true
  strategy: HASH_DIFF
  poll_interval_ms: 1000
};

Configuration Options

| Option | Type | Description |
|---|---|---|
| enabled | bool | Enable live query for this operation |
| strategy | enum | INVALIDATION, POLLING, or HASH_DIFF |
| triggers | string[] | Invalidation event patterns (e.g., "User.update") |
| throttle_ms | uint32 | Minimum time between updates (default: 100ms) |
| poll_interval_ms | uint32 | Polling interval for POLLING/HASH_DIFF strategies |
| ttl_seconds | uint32 | Auto-expire subscription after N seconds |

API Reference

Public Functions

// Check if query contains @live directive
pub fn has_live_directive(query: &str) -> bool;

// Strip @live directive for execution
pub fn strip_live_directive(query: &str) -> String;

// Create a shared live query store
pub fn create_live_query_store() -> SharedLiveQueryStore;

// Create with custom config
pub fn create_live_query_store_with_config(config: LiveQueryConfig) -> SharedLiveQueryStore;
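Standalone sketches of the first two helpers, assuming a plain substring scan. The crate presumably inspects the parsed document, so treat this as illustration rather than its implementation:

```rust
// Illustrative: detect and strip the @live directive via string operations.
fn has_live_directive(query: &str) -> bool {
    query.contains("@live")
}

fn strip_live_directive(query: &str) -> String {
    query.replace("@live", "")
}

fn main() {
    let q = "query @live { users { id } }";
    assert!(has_live_directive(q));
    assert!(!has_live_directive("query { users { id } }"));
    // The stripped query is safe to hand to a normal executor.
    assert!(!strip_live_directive(q).contains("@live"));
    println!("ok");
}
```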

LiveQueryStore Methods

impl LiveQueryStore {
    // Register a new live query subscription
    pub fn register(&self, query: ActiveLiveQuery, sender: Sender<LiveQueryUpdate>) -> Result<(), LiveQueryError>;
    
    // Unregister a subscription
    pub fn unregister(&self, subscription_id: &str) -> Option<ActiveLiveQuery>;
    
    // Trigger invalidation for matching subscriptions
    pub fn invalidate(&self, event: InvalidationEvent) -> usize;
    
    // Get current statistics
    pub fn stats(&self) -> LiveQueryStats;
}

InvalidationEvent

// Create an invalidation event
let event = InvalidationEvent::new("User", "update");

// With specific entity ID
let event = InvalidationEvent::with_id("User", "update", "user-123");

WebSocket Protocol

The /graphql/live endpoint uses the graphql-transport-ws protocol:

Client β†’ Server

| Message Type | Description |
|---|---|
| connection_init | Initialize connection |
| subscribe | Start a live query subscription |
| complete | End a subscription |
| ping | Keep-alive ping |

Server β†’ Client

| Message Type | Description |
|---|---|
| connection_ack | Connection accepted |
| next | Query result (initial or update) |
| error | Error occurred |
| complete | Subscription ended |
| pong | Keep-alive response |

Example: Full CRUD with Live Updates

See the complete example at examples/live_query/:

# Run the example
cargo run --example live_query

# In another terminal, run the WebSocket test
node examples/live_query/test_ws.js

The test demonstrates:

  1. Initial live query returning 3 users
  2. Delete mutation removing a user
  3. Re-query showing 2 users
  4. Create mutation adding a new user
  5. Final query showing updated user list

Best Practices

  1. Use Specific Triggers: Only subscribe to relevant entity types
  2. Set Appropriate Throttle: Prevent overwhelming clients (100-500ms)
  3. Use TTL for Temporary Subscriptions: Auto-cleanup inactive queries
  4. Prefer Invalidation over Polling: More efficient for most use cases
  5. Handle Reconnection: Clients should re-subscribe after disconnect

Advanced Features

The live query system includes 4 advanced features for optimizing bandwidth, performance, and user experience.

1. Filtered Live Queries

Apply server-side filtering to live queries to receive only relevant updates.

Usage

# Only receive updates for online users
query @live {
  users(status: ONLINE) {
    users { id name }
    total_count
  }
}

Implementation

use grpc_graphql_gateway::{parse_query_arguments, matches_filter};

// Parse filter from query
let args = parse_query_arguments("users(status: ONLINE) @live");
// β†’ { "status": "ONLINE" }

// Check if entity matches filter
let user = json!({"id": "1", "status": "ONLINE", "name": "Alice"});
if matches_filter(&args, &user) {
    // Include in live query results
}

Benefits

  • 50-90% bandwidth reduction for filtered datasets
  • Natural GraphQL query syntax
  • No client-side filtering needed

2. Field-Level Invalidation

Track which specific fields changed and communicate this to clients for surgical updates.

Response Format

{
  id: "sub-123",
  data: { user: { id: "1", name: "Alice Smith", age: 31 } },
  changed_fields: ["user.name", "user.age"],  // ← Only these changed!
  is_initial: false,
  revision: 5
}

Implementation

use grpc_graphql_gateway::detect_field_changes;

let old_data = json!({"user": {"name": "Alice", "age": 30}});
let new_data = json!({"user": {"name": "Alice Smith", "age": 31}});

let changes = detect_field_changes(&old_data, &new_data, "", 0, 10);

// changes = [
//   FieldChange { field_path: "user.name", old_value: "Alice", new_value: "Alice Smith" },
//   FieldChange { field_path: "user.age", old_value: 30, new_value: 31 }
// ]

Client-Side Usage

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  
  if (msg.type === 'next' && msg.payload.changed_fields) {
    // Only update changed fields in UI
    msg.payload.changed_fields.forEach(field => {
      updateFieldInDOM(field, msg.payload.data);
    });
  }
};

Benefits

  • 30-70% bandwidth reduction when few fields change
  • Surgical UI updates - only re-render changed components
  • Reduced client-side processing overhead

3. Batch Invalidation

Merge multiple rapid invalidation events into a single update to reduce network traffic.

Configuration

use grpc_graphql_gateway::BatchInvalidationConfig;

let config = BatchInvalidationConfig {
    enabled: true,
    debounce_ms: 50,        // Wait 50ms before flushing
    max_batch_size: 100,     // Auto-flush at 100 events
    max_wait_ms: 500,        // Force flush after 500ms max
};

How It Works

Without batching:
━━━━━━━━━━━━━━━━━━━━━━━
Event 1 (0ms)   β†’ Update 1
Event 2 (10ms)  β†’ Update 2
Event 3 (20ms)  β†’ Update 3
Event 4 (30ms)  β†’ Update 4
Event 5 (40ms)  β†’ Update 5
━━━━━━━━━━━━━━━━━━━━━━━
Result: 5 updates sent

With batching (100ms throttle):
━━━━━━━━━━━━━━━━━━━━━━━
Events 1-5 (0-40ms)
  ↓ (wait 100ms)
Single merged update
━━━━━━━━━━━━━━━━━━━━━━━
Result: 1 update sent
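The merge step can be sketched as deduplicating a burst of events by (type, operation). The function below is illustrative, not the crate's batching code:

```rust
use std::collections::HashSet;

// Illustrative merge: a burst of invalidation events collapses to the
// distinct (type, operation) pairs, preserving first-seen order.
fn merge_events(events: &[(&str, &str)]) -> Vec<(String, String)> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for &(ty, op) in events {
        if seen.insert((ty, op)) {
            merged.push((ty.to_string(), op.to_string()));
        }
    }
    merged
}

fn main() {
    // Five rapid events, as in the diagram above...
    let burst = [
        ("User", "update"),
        ("User", "update"),
        ("User", "delete"),
        ("User", "update"),
        ("User", "update"),
    ];
    // ...collapse into two distinct invalidations in one flush.
    assert_eq!(merge_events(&burst).len(), 2);
    println!("ok");
}
```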

Proto Configuration

option (graphql.live_query) = {
  enabled: true
  strategy: INVALIDATION
  throttle_ms: 100  // ← Enables batching
  triggers: ["User.create", "User.update"]
};

Benefits

  • 70-95% fewer network requests during high-frequency updates
  • Lower client processing overhead
  • Better performance for rapidly changing data

4. Client-Side Caching Hints

Send cache control directives to help clients optimize caching based on data volatility.

Response Format

{
  id: "sub-123",
  data: { user: { name: "Alice" } },
  cache_control: {
    max_age: 300,          // Cache for 5 minutes
    must_revalidate: true,
    etag: "abc123def456"   // For efficient revalidation
  }
}

Implementation

use grpc_graphql_gateway::{generate_cache_control, DataVolatility};

// Generate cache control based on data type
let cache = generate_cache_control(
    DataVolatility::Low,  // User profiles change infrequently
    Some("etag-user-123".to_string())
);

// Result:
// CacheControl {
//     max_age: 300,  // 5 minutes
//     must_revalidate: true,
//     etag: Some("etag-user-123")
// }

Data Volatility Levels

| Volatility | Cache Duration | Use Case |
|---|---|---|
| VeryHigh | 0s (no cache) | Stock prices, real-time metrics |
| High | 5s | User online status, live counts |
| Medium | 30s | Notification counts, activity feeds |
| Low | 5 minutes | User profiles, post content |
| VeryLow | 1 hour | Settings, configuration data |

Client-Side Implementation

const cache = new Map();

ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  
  if (msg.type === 'next' && msg.payload.cache_control) {
    const { max_age, etag } = msg.payload.cache_control;
    
    // Store in cache with expiration
    cache.set(msg.id, {
      data: msg.payload.data,
      etag: etag,
      expires: Date.now() + (max_age * 1000)
    });
  }
};

Benefits

  • 40-80% reduced server load through client caching
  • Faster perceived performance
  • Automatic cache invalidation on updates

Advanced Features API Reference

Functions

// Filter Support - Feature #1
pub fn parse_query_arguments(query: &str) -> HashMap<String, String>;
pub fn matches_filter(filter: &HashMap<String, String>, data: &Value) -> bool;

// Field-Level Changes - Feature #2
pub fn detect_field_changes(
    old: &Value,
    new: &Value,
    path: &str,
    depth: usize,
    max_depth: usize
) -> Vec<FieldChange>;

// Cache Control - Feature #4
pub fn generate_cache_control(
    volatility: DataVolatility,
    etag: Option<String>
) -> CacheControl;

Types

// Cache Control
pub struct CacheControl {
    pub max_age: u32,
    pub public: bool,
    pub must_revalidate: bool,
    pub etag: Option<String>,
}

// Field Change
pub struct FieldChange {
    pub field_path: String,
    pub old_value: Option<Value>,
    pub new_value: Value,
}

// Batch Configuration
pub struct BatchInvalidationConfig {
    pub enabled: bool,
    pub debounce_ms: u64,
    pub max_batch_size: usize,
    pub max_wait_ms: u64,
}

// Data Volatility
pub enum DataVolatility {
    VeryHigh,  // Changes multiple times per second
    High,      // Changes every few seconds
    Medium,    // Changes every minute
    Low,       // Changes hourly
    VeryLow,   // Changes daily or less
}

Enhanced LiveQueryUpdate

pub struct LiveQueryUpdate {
    pub id: String,
    pub data: serde_json::Value,
    pub is_initial: bool,
    pub revision: u64,
    
    // Advanced features (all optional)
    pub cache_control: Option<CacheControl>,
    pub changed_fields: Option<Vec<String>>,
    pub batched: Option<bool>,
    pub timestamp: Option<u64>,
}

Performance Comparison

Real-World Scenario

Setup: Live dashboard with 1000 users, 10 fields each, 60 updates/minute

| Metric | Without Features | With Features | Improvement |
|---|---|---|---|
| Users sent | 1000 | 100 (filtered) | 90% reduction |
| Fields/user | 10 | 2 (changed only) | 80% reduction |
| Updates/min | 60 | 10 (batched) | 83% reduction |
| Cache hits | 0% | 50% | 50% less load |
| Total data/min | ~2.3 MB | ~23 KB | 99% reduction |

Complete Example

A comprehensive example demonstrating all 4 features is available:

# Run the server
cargo run --example live_query

# Test all advanced features
cd examples/live_query
node test_advanced_features.js


Migration Guide

Adding Filtered Queries

Before:

query @live {
  users {
    users { id name status }
  }
}

After:

query @live {
  users(status: ONLINE) {  // ← Add filter
    users { id name status }
  }
}

Using Field-Level Updates

Before:

if (msg.type === 'next') {
  // Update entire component
  updateUserComponent(msg.payload.data.user);
}

After:

if (msg.type === 'next') {
  if (msg.payload.changed_fields) {
    // Update only changed fields
    msg.payload.changed_fields.forEach(field => {
      updateField(field, msg.payload.data);
    });
  } else {
    // Initial load
    updateUserComponent(msg.payload.data.user);
  }
}

Enabling Batching

Simply increase the throttle in your proto config:

option (graphql.live_query) = {
  throttle_ms: 100  // ← Increase from 0 to enable batching
};

Troubleshooting

Filtered queries not working?

  • Verify filter syntax: key: value format (e.g., status: ONLINE)
  • Filters are case-sensitive
  • Check that entity data contains the filter fields

Too many updates still?

  • Increase throttle_ms for more aggressive batching
  • Add more specific filters to reduce result set
  • Review your invalidation triggers

Cache not working?

  • Ensure client respects max_age header
  • Check that cache_control is present in response
  • Verify ETag handling on client side

Changed fields not showing?

  • Feature requires throttle_ms > 0
  • Check that data actually changes between updates
  • Ensure client is checking changed_fields property

High Performance Optimization

grpc_graphql_gateway is designed for extreme throughput and can handle 100,000+ requests per second (RPS) per instance. To achieve these targets, the gateway employs several advanced architectural optimizations.

Performance Targets

With High-Performance mode enabled:

  • 100K+ RPS: For cached queries serving from memory.
  • 50K+ RPS: For uncached queries performing gRPC backend calls.
  • Sub-millisecond P99: Latency for cache hits.

Key Optimizations

SIMD-Accelerated JSON Parsing

Standard JSON parsing is often the primary bottleneck in GraphQL gateways. We use simd-json, which employs SIMD (Single Instruction, Multiple Data) instructions (AVX2, SSE4.2, NEON) to parse JSON.

  • 2x – 5x faster than serde_json for typical payloads.
  • Reduced CPU cycles per request, allowing more concurrency on the same hardware.

Lock-Free Sharded Caching

Global locks cause severe contention as CPU core counts increase. Our ShardedCache implementation:

  • Splits the cache into 64 – 128 independent shards.
  • Uses lock-free reads and independent write locks per shard.
  • Eliminates the "Global Lock" bottleneck.
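The sharding idea can be sketched with standard-library locks. The real ShardedCache uses lock-free reads; RwLock stands in here so the example is self-contained:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

// Illustrative sharded cache: hash the key to pick one of N independently
// locked shards, so writers on different shards never contend.
struct ShardedCache {
    shards: Vec<RwLock<HashMap<String, String>>>,
}

impl ShardedCache {
    fn new(n: usize) -> Self {
        Self { shards: (0..n).map(|_| RwLock::new(HashMap::new())).collect() }
    }

    fn shard_for(&self, key: &str) -> usize {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        (h.finish() as usize) % self.shards.len()
    }

    fn insert(&self, key: &str, value: &str) {
        let i = self.shard_for(key);
        self.shards[i].write().unwrap().insert(key.to_string(), value.to_string());
    }

    fn get(&self, key: &str) -> Option<String> {
        let i = self.shard_for(key);
        self.shards[i].read().unwrap().get(key).cloned()
    }
}

fn main() {
    let cache = ShardedCache::new(64);
    cache.insert("user:1", "Alice");
    assert_eq!(cache.get("user:1").as_deref(), Some("Alice"));
    assert_eq!(cache.get("user:2"), None);
    println!("ok");
}
```

Because each key maps deterministically to one shard, a write only takes that shard's lock while reads on the other 63 shards proceed untouched.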

Object Pooling

Memory allocation is expensive at 100K RPS. We use high-performance object pools for request/response buffers:

  • Zero-allocation steady state for many request patterns.
  • Pre-allocated buffers are returned to a lock-free ArrayQueue for reuse.

Connection Pool Tuning

The gateway automatically tunes gRPC and HTTP/2 settings for maximum throughput:

  • HTTP/2 Prior Knowledge: Speaks HTTP/2 from the first byte, skipping the HTTP/1.1 upgrade handshake.
  • Adaptive Window Sizes: Optimizes flow control for high-bandwidth/low-latency local networks.
  • TCP NoDelay: Disables Nagle’s algorithm for immediate packet dispatch.

Configuration

Enable High-Performance mode in your GatewayBuilder:

use grpc_graphql_gateway::{Gateway, HighPerfConfig};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    Gateway::builder()
        // ... standard config ...
        .with_high_performance(HighPerfConfig::ultra_fast())
        .build()?
        .serve("0.0.0.0:8888".to_string())
        .await?;
    Ok(())
}

Configuration Profiles

We provide three pre-tuned profiles:

| Profile | Use Case |
|---|---|
| ultra_fast() | Maximum Throughput: Optimized for 100K+ RPS. |
| balanced() | Balanced: Good mix of throughput and latency. |
| low_latency() | Low Latency: Optimized for minimal response time over raw RPS. |

Benchmarking

We include a performance benchmark suite in the repository.

# Start the example server
cargo run --example greeter --release

# Run the benchmark
cargo run --bin benchmark --release -- --concurrency=200 --duration=30

For a complete automated test, use ./benchmark.sh which handles builds and runs multiple profiles.

Response Caching

Dramatically improve performance with in-memory GraphQL response caching.

Enabling Caching

use grpc_graphql_gateway::{Gateway, CacheConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_response_cache(CacheConfig {
        max_size: 10_000,                              // Max cached responses
        default_ttl: Duration::from_secs(60),          // 1 minute TTL
        stale_while_revalidate: Some(Duration::from_secs(30)),
        invalidate_on_mutation: true,
    })
    .build()?;

Configuration Options

| Option | Type | Description |
|---|---|---|
| max_size | usize | Maximum number of cached responses |
| default_ttl | Duration | Time before entries expire |
| stale_while_revalidate | Option<Duration> | Serve stale content while refreshing |
| invalidate_on_mutation | bool | Clear cache on mutations |
| redis_url | Option<String> | Redis connection URL for distributed caching |
| vary_headers | Vec<String> | Headers to include in cache key (default: ["Authorization"]) |

Distributed Caching (Redis)

Use Redis for shared caching across multiple gateway instances:

let gateway = Gateway::builder()
    .with_response_cache(CacheConfig {
        redis_url: Some("redis://127.0.0.1:6379".to_string()),
        default_ttl: Duration::from_secs(60),
        ..Default::default()
    })
    .build()?;

Vary Headers

By default, the cache key includes the Authorization header to prevent leaking user data. You can configure which headers affect the cache key:

CacheConfig {
    // Cache per user and per tenant
    vary_headers: vec!["Authorization".to_string(), "X-Tenant-ID".to_string()],
    ..Default::default()
}

How It Works

  1. First Query: Cache miss β†’ Execute gRPC β†’ Cache response β†’ Return
  2. Second Query: Cache hit β†’ Return cached response immediately (<1ms)
  3. Mutation: Execute mutation β†’ Invalidate related cache entries
  4. Next Query: Cache miss (invalidated) β†’ Execute gRPC β†’ Cache β†’ Return

What Gets Cached

| Operation | Cached? | Triggers Invalidation? |
|---|---|---|
| Query | βœ… Yes | No |
| Mutation | ❌ No | βœ… Yes |
| Subscription | ❌ No | No |

Cache Key Generation

The cache key is a SHA-256 hash of:

  • Normalized query string
  • Sorted variables JSON
  • Operation name (if provided)

Stale-While-Revalidate

Serve stale content immediately while refreshing in the background:

CacheConfig {
    default_ttl: Duration::from_secs(60),
    stale_while_revalidate: Some(Duration::from_secs(30)),
    ..Default::default()
}

Timeline:

  • 0-60s: Fresh content served
  • 60-90s: Stale content served, refresh triggered
  • 90s+: Cache miss, fresh fetch

Mutation Invalidation

When invalidate_on_mutation: true:

// This mutation invalidates cache
mutation { updateUser(id: "123", name: "Alice") { id name } }

// Subsequent queries fetch fresh data
query { user(id: "123") { id name } }

Testing with curl

# 1. First query - cache miss
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

# 2. Same query - cache hit (instant)
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

# 3. Mutation - invalidates cache
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "mutation { updateUser(id: \"123\", name: \"Bob\") { name } }"}'

# 4. Query again - cache miss (fresh data)
curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ user(id: \"123\") { name } }"}'

Performance Impact

  • Cache hits: <1ms response time
  • 10-100x fewer gRPC backend calls
  • Significant reduction in backend load

Smart TTL Management

This guide explains how Smart TTL Management optimizes cache durations based on query patterns and data volatility, maximizing cache hit rates and reducing infrastructure costs.

Overview

Instead of using a single TTL for all cached responses, Smart TTL Management dynamically adjusts cache durations based on:

  • Query Type: Different TTLs for user profiles, static content, real-time data, etc.
  • Data Volatility: Automatically learns how often data changes
  • Mutation Patterns: Tracks which mutations affect which queries
  • Cache Control Hints: Respects @cacheControl directives from your schema

Key Benefits

  • Higher Cache Hit Rates: Increase from 75% to 90%+ by optimizing TTLs
  • Reduced Database Load: 15% additional reduction in database queries
  • Automatic Optimization: ML-based volatility detection learns optimal TTLs
  • Cost Savings: $100-200/month in reduced database costs

Configuration

use grpc_graphql_gateway::{SmartTtlConfig, SmartTtlManager};
use std::time::Duration;
use std::collections::HashMap;

let mut custom_patterns = HashMap::new();
custom_patterns.insert("specialQuery".to_string(), Duration::from_secs(7200));

let config = SmartTtlConfig {
    default_ttl: Duration::from_secs(300),              // 5 minutes
    user_profile_ttl: Duration::from_secs(900),         // 15 minutes
    static_content_ttl: Duration::from_secs(86400),     // 24 hours
    real_time_data_ttl: Duration::from_secs(5),         // 5 seconds
    aggregated_data_ttl: Duration::from_secs(1800),     // 30 minutes
    list_query_ttl: Duration::from_secs(600),           // 10 minutes
    item_query_ttl: Duration::from_secs(300),           // 5 minutes
    auto_detect_volatility: true,                       // Enable ML-based learning
    min_observations: 10,                               // Learn after 10 executions
    max_adjustment_factor: 2.0,                         // Can double or halve TTL
    custom_patterns,                                    // Custom query patterns
    respect_cache_hints: true,                          // Honor @cacheControl
};

let ttl_manager = SmartTtlManager::new(config);

Basic Usage

Calculate Optimal TTL

use grpc_graphql_gateway::SmartTtlManager;

let query = r#"
    query {
        categories {
            id
            name
        }
    }
"#;

// Calculate TTL
let ttl_result = ttl_manager.calculate_ttl(
    query,
    "categories",
    None, // No cache hint
).await;

println!("TTL: {:?}", ttl_result.ttl);
println!("Strategy: {:?}", ttl_result.strategy);
println!("Confidence: {}", ttl_result.confidence);

Query Type Detection

Smart TTL automatically detects query types and applies appropriate TTLs:

Real-Time Data (5 seconds)

query {
  liveScores { team score }      # Contains "live"
  currentPrice { symbol price }  # Contains "current"
  realtimeData { value }         # Contains "realtime"
}

Static Content (24 hours)

query {
  categories { id name }         # Contains "categories"
  tags { name }                  # Contains "tags"
  settings { key value }         # Contains "settings"
  appConfig { version }          # Contains "config"
}

User Profiles (15 minutes)

query {
  profile { name email }         # Contains "profile"
  user(id: 1) { name }          # Contains "user"
  me { id name }                # Contains "me"
  account { settings }          # Contains "account"
}

Aggregated Data (30 minutes)

query {
  statistics { count average }   # Contains "statistics"
  analytics { views clicks }     # Contains "analytics"
  aggregateData { sum }         # Contains "aggregate"
}

List Queries (10 minutes)

query {
  listUsers(limit: 10) { id }   # Contains "list"
  posts(page: 1) { title }      # Contains "page"
  itemsWithOffset(offset: 20) { id } # Contains "offset"
}

Single Item Queries (5 minutes)

query {
  getUserById(id: 1) { name }   # Contains "byid"
  getPost(id: 123) { title }    # Contains "get"
  findProduct(id: 42) { name }  # Contains "find"
}

Volatility-Based Learning

Smart TTL learns from query execution patterns:

// Record query results to track changes
let query = "query { user(id: 1) { name } }";
let result_hash = calculate_hash(&result); // Your hash function

ttl_manager.record_query_result(query, result_hash).await;

// After 10+ executions, TTL will auto-adjust based on volatility
let ttl_result = ttl_manager.calculate_ttl(query, "user", None).await;

match ttl_result.strategy {
    TtlStrategy::VolatilityBased { base_ttl, volatility_score } => {
        println!("Base TTL: {:?}", base_ttl);
        println!("Volatility: {:.2}%", volatility_score * 100.0);
        println!("Adjusted TTL: {:?}", ttl_result.ttl);
    }
    _ => {}
}
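The volatility score itself can be understood as the fraction of consecutive executions whose result hash changed. A minimal sketch of that idea (our own helper, not the crate's API):

```rust
/// Fraction of consecutive observations whose result hash differed:
/// 0.0 = never changes, 1.0 = changes on every execution.
/// Illustrative sketch, not the crate's implementation.
fn volatility_score(result_hashes: &[u64]) -> f64 {
    if result_hashes.len() < 2 {
        return 0.0; // not enough observations to measure change
    }
    let changes = result_hashes.windows(2).filter(|w| w[0] != w[1]).count();
    changes as f64 / (result_hashes.len() - 1) as f64
}

fn main() {
    // Same result five times in a row: perfectly stable.
    assert_eq!(volatility_score(&[7, 7, 7, 7, 7]), 0.0);
    // Result changed on 2 of 4 transitions: 50% volatile.
    assert_eq!(volatility_score(&[1, 1, 2, 2, 3]), 0.5);
}
```

This is why `min_observations` matters: with too few samples, the score is noise rather than signal.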

Volatility Adjustment

| Volatility Score | Data Behavior | TTL Adjustment |
|---|---|---|
| > 0.7 | Changes 70%+ of the time | 0.5x (halve TTL) |
| 0.3 - 0.7 | Moderate changes | 0.75x |
| 0.1 - 0.3 | Stable | 1.5x |
| < 0.1 | Very stable (< 10%) | 2.0x (double TTL) |
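The table maps directly to a small function: the returned factor is multiplied into the base TTL, bounded by `max_adjustment_factor`. A sketch under those assumptions (the exact thresholds are taken from the table; the function name is ours):

```rust
/// Map a volatility score to a TTL multiplier, per the adjustment table.
/// Illustrative sketch; the crate's exact thresholds may differ.
fn ttl_adjustment_factor(volatility: f64) -> f64 {
    if volatility > 0.7 {
        0.5  // very volatile: halve the TTL
    } else if volatility >= 0.3 {
        0.75 // moderate changes
    } else if volatility >= 0.1 {
        1.5  // stable
    } else {
        2.0  // very stable: double the TTL
    }
}

fn main() {
    assert_eq!(ttl_adjustment_factor(0.8), 0.5);
    assert_eq!(ttl_adjustment_factor(0.5), 0.75);
    assert_eq!(ttl_adjustment_factor(0.2), 1.5);
    assert_eq!(ttl_adjustment_factor(0.05), 2.0);
}
```

With `max_adjustment_factor: 2.0`, the effective TTL stays within [0.5x, 2.0x] of the base value.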

Cache Control Hints

Respect @cacheControl directives from your GraphQL schema:

type Query {
  # Cache for 1 hour
  products: [Product!]! @cacheControl(maxAge: 3600)
  
  # Don't cache
  liveData: LiveData! @cacheControl(maxAge: 0)
}

// Parse cache hint from schema metadata
use grpc_graphql_gateway::parse_cache_hint;

let schema_meta = "@cacheControl(maxAge: 3600)";
let hint = parse_cache_hint(schema_meta);

let ttl_result = ttl_manager.calculate_ttl(
    query,
    "products",
    hint, // Will use 3600 seconds
).await;

Mutation Tracking

Track which mutations affect which queries to invalidate caches intelligently:

// When a mutation occurs
let mutation_type = "updateUser";
let affected_queries = vec![
    "user(id: 1)".to_string(),
    "me".to_string(),
    "userProfile".to_string(),
];

ttl_manager.record_mutation(mutation_type, affected_queries).await;

// Affected queries will have shorter TTLs based on mutation frequency

Custom Pattern Matching

Define custom TTLs for specific query patterns:

let mut custom_patterns = HashMap::new();

// VIP queries get longer cache
custom_patterns.insert("premiumData".to_string(), Duration::from_secs(7200));

// Expensive queries get aggressive caching
custom_patterns.insert("complexReport".to_string(), Duration::from_secs(3600));

// Frequently updated data gets short cache
custom_patterns.insert("inventory".to_string(), Duration::from_secs(30));

let config = SmartTtlConfig {
    custom_patterns,
    ..Default::default()
};

Integration with Response Cache

Integrate with the existing response cache:

use grpc_graphql_gateway::{Gateway, CacheConfig, SmartTtlManager};
use std::sync::Arc;

// Create TTL manager
let ttl_manager = Arc::new(SmartTtlManager::new(SmartTtlConfig::default()));

// Modify cache lookup to use smart TTL
async fn cache_with_smart_ttl(
    cache: &ResponseCache,
    ttl_manager: &SmartTtlManager,
    query: &str,
    query_type: &str,
) -> Option<CachedResponse> {
    // Get optimal TTL
    let ttl_result = ttl_manager.calculate_ttl(query, query_type, None).await;
    
    // Check cache
    if let Some(cached) = cache.get(query).await {
        // Use smart TTL for freshness check
        if cached.age() < ttl_result.ttl {
            return Some(cached);
        }
    }
    
    None
}

TTL Analytics

Monitor TTL effectiveness:

let analytics = ttl_manager.get_analytics().await;

println!("Total query patterns tracked: {}", analytics.total_queries);
println!("Average volatility: {:.2}%", analytics.avg_volatility_score * 100.0);
println!("Average recommended TTL: {:?}", analytics.avg_recommended_ttl);
println!("Highly volatile queries: {}", analytics.highly_volatile_queries);
println!("Stable queries: {}", analytics.stable_queries);

Periodic Cleanup

Clean up old statistics to prevent memory growth:

use tokio::time::{interval, Duration};

// Run cleanup every hour
let mut cleanup_interval = interval(Duration::from_secs(3600));

tokio::spawn({
    let ttl_manager = Arc::clone(&ttl_manager);
    async move {
        loop {
            cleanup_interval.tick().await;
            
            // Keep stats for last 24 hours
            ttl_manager.cleanup_old_stats(Duration::from_secs(86400)).await;
        }
    }
});

Cost Impact

Before Smart TTL (Single 5-minute TTL)

| Metric | Value |
|---|---|
| Cache hit rate | 75% |
| Database queries (100k req/s) | 25k/s |
| Database instance | db.t3.medium ($72/mo) |

After Smart TTL (Intelligent TTLs)

| Metric | Value |
|---|---|
| Cache hit rate | 90% |
| Database queries (100k req/s) | 10k/s |
| Database instance | db.t3.small ($36/mo) |
| Monthly savings | $36-100/mo |

Best Practices

1. Start with Conservative Defaults

SmartTtlConfig {
    default_ttl: Duration::from_secs(300),  // 5 minutes
    auto_detect_volatility: false,          // Disable learning initially
    ..Default::default()
}

2. Enable Learning After Understanding Patterns

SmartTtlConfig {
    auto_detect_volatility: true,
    min_observations: 20,  // More observations = better learning
    ..Default::default()
}

3. Monitor Analytics Regularly

// Log analytics daily
let analytics = ttl_manager.get_analytics().await;
info!("Smart TTL Analytics: {:#?}", analytics);

4. Combine with Static Patterns

// Use both automatic learning AND manual patterns
let mut custom_patterns = HashMap::new();
custom_patterns.insert("criticalData".to_string(), Duration::from_secs(30));

SmartTtlConfig {
    custom_patterns,
    auto_detect_volatility: true,
    ..Default::default()
}

5. Respect Cache Hints in Production

SmartTtlConfig {
    respect_cache_hints: true,  // Always honor developer intent
    ..Default::default()
}

Example: Multi-Tier TTL Strategy

use grpc_graphql_gateway::{SmartTtlConfig, SmartTtlManager};
use std::collections::HashMap;
use std::time::Duration;

let config = SmartTtlConfig {
    // Core content (balance freshness vs load)
    default_ttl: Duration::from_secs(300),
    
    // User data (moderate freshness)
    user_profile_ttl: Duration::from_secs(900),
    
    // Reference data (cache aggressively)
    static_content_ttl: Duration::from_secs(86400),
    
    // Live data (very short cache)
    real_time_data_ttl: Duration::from_secs(5),
    
    // Reports (expensive to compute, cache longer)
    aggregated_data_ttl: Duration::from_secs(1800),
    
    // Lists (ok to be slightly stale)
    list_query_ttl: Duration::from_secs(600),
    
    // Details (fresher data)
    item_query_ttl: Duration::from_secs(300),
    
    // Learn and optimize
    auto_detect_volatility: true,
    min_observations: 15,
    max_adjustment_factor: 2.0,
    
    // Honor developer hints
    respect_cache_hints: true,
    
    // Custom overrides
    custom_patterns: {
        let mut patterns = HashMap::new();
        patterns.insert("dashboard".to_string(), Duration::from_secs(60));
        patterns.insert("search".to_string(), Duration::from_secs(300));
        patterns
    },
};

let ttl_manager = SmartTtlManager::new(config);

Monitoring

Export TTL metrics to Prometheus:

// Export analytics as metrics
let analytics = ttl_manager.get_analytics().await;

gauge!("smart_ttl_avg_volatility", analytics.avg_volatility_score);
gauge!("smart_ttl_avg_ttl_seconds", analytics.avg_recommended_ttl.as_secs() as f64);
gauge!("smart_ttl_volatile_queries", analytics.highly_volatile_queries as f64);
gauge!("smart_ttl_stable_queries", analytics.stable_queries as f64);

Integrating Smart TTL with Cache

Quick Start Example

use grpc_graphql_gateway::{
    Gateway, CacheConfig, SmartTtlManager, SmartTtlConfig
};
use std::sync::Arc;
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create Smart TTL Manager
    let smart_ttl_config = SmartTtlConfig {
        default_ttl: Duration::from_secs(300),              // 5 minutes
        user_profile_ttl: Duration::from_secs(900),         // 15 minutes  
        static_content_ttl: Duration::from_secs(86400),     // 24 hours
        real_time_data_ttl: Duration::from_secs(5),         // 5 seconds
        auto_detect_volatility: true,                       // Learn optimal TTLs
        ..Default::default()
    };
    
    let smart_ttl = Arc::new(SmartTtlManager::new(smart_ttl_config));
    
    // Create Cache Config with Smart TTL
    let cache_config = CacheConfig {
        max_size: 50_000,
        default_ttl: Duration::from_secs(300),  // Fallback TTL
        smart_ttl_manager: Some(Arc::clone(&smart_ttl)),
        redis_url: Some("redis://127.0.0.1:6379".to_string()),
        stale_while_revalidate: Some(Duration::from_secs(60)),
        invalidate_on_mutation: true,
        vary_headers: vec!["Authorization".to_string()],
    };
    
    // Build Gateway
    let gateway = Gateway::builder()
        .with_descriptor_set_bytes(DESCRIPTORS)
        .add_grpc_client("service", grpc_client)
        .with_response_cache(cache_config)
        .build()?;
    
    gateway.serve("0.0.0.0:8888").await?;
    Ok(())
}

How It Works

When Smart TTL is enabled:

  1. Cache Lookup: Normal cache lookup (no change)
  2. Cache Miss - Calculate Smart TTL:
    • Detect query type (user profile, static content, etc.)
    • Check historical volatility data
    • Apply custom pattern rules
    • Respect @cacheControl hints
  3. Store with Optimal TTL: Cache response with calculated TTL
  4. Learning: Track query results to improve TTL predictions

Cost Impact

Before Smart TTL (Static 5-minute TTL for all queries):

  • Cache hit rate: 75%
  • Database load: 25k queries/s (for 100k req/s)
  • Database cost: ~$72/mo

After Smart TTL (Intelligent per-query TTLs):

  • Cache hit rate: 90% (+15%)
  • Database load: 10k queries/s (-60%)
  • Database cost: ~$36/mo (-50%)

Monthly Savings: $36-100/mo

Usage Patterns

Pattern 1: Static + Auto-Learning

SmartTtlConfig {
    // Define base TTLs for query types
    user_profile_ttl: Duration::from_secs(900),
    static_content_ttl: Duration::from_secs(86400),
    
    // Enable learning to fine-tune
    auto_detect_volatility: true,
    min_observations: 20,
    
    ..Default::default()
}

Pattern 2: Custom Patterns Only

let mut custom_patterns = HashMap::new();
custom_patterns.insert("dashboard".to_string(), Duration::from_secs(60));
custom_patterns.insert("reports".to_string(), Duration::from_secs(1800));

SmartTtlConfig {
    custom_patterns,
    auto_detect_volatility: false,  // Disable learning
    ..Default::default()
}

Pattern 3: Full Auto-Optimization

SmartTtlConfig {
    auto_detect_volatility: true,
    min_observations: 10,           // Learn quickly
    max_adjustment_factor: 3.0,     // Allow aggressive adjustments
    ..Default::default()
}

Monitoring

Track Smart TTL effectiveness:

// Get analytics
let analytics = smart_ttl.get_analytics().await;

println!("Query patterns tracked: {}", analytics.total_queries);
println!("Average volatility: {:.2}%", analytics.avg_volatility_score * 100.0);
println!("Average TTL: {:?}", analytics.avg_recommended_ttl);

Response Compression

Reduce bandwidth with automatic response compression.

Enabling Compression

use grpc_graphql_gateway::{Gateway, CompressionConfig, CompressionLevel};

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_compression(CompressionConfig {
        enabled: true,
        level: CompressionLevel::Default,
        min_size_bytes: 1024,  // Only compress responses > 1KB
        algorithms: vec!["br".into(), "gzip".into()],
    })
    .build()?;

Preset Configurations

// Fast compression for low latency
Gateway::builder().with_compression(CompressionConfig::fast())

// Best compression for bandwidth savings
Gateway::builder().with_compression(CompressionConfig::best())

// Default balanced configuration
Gateway::builder().with_compression(CompressionConfig::default())

// Disable compression
Gateway::builder().with_compression(CompressionConfig::disabled())

Supported Algorithms

| Algorithm | Accept-Encoding | Compression Ratio | Speed |
|---|---|---|---|
| Brotli | br | Best | Slower |
| Gzip | gzip | Good | Fast |
| Deflate | deflate | Good | Fast |
| Zstd | zstd | Excellent | Fast |

Algorithm Selection

The gateway selects the best algorithm based on client Accept-Encoding:

Accept-Encoding: br, gzip, deflate

Priority order matches your algorithms configuration.
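Negotiation can be sketched as: walk the configured algorithms in priority order and pick the first one that also appears in the client's Accept-Encoding. A simplified sketch that ignores quality values like `gzip;q=0.5` (the function name is ours, not the crate's API):

```rust
/// Pick the first configured algorithm the client also accepts.
/// Simplified: ignores `q=` quality values in Accept-Encoding.
fn negotiate<'a>(accept_encoding: &str, configured: &[&'a str]) -> Option<&'a str> {
    let accepted: Vec<&str> = accept_encoding.split(',').map(str::trim).collect();
    configured
        .iter()
        .copied()
        .find(|&alg| accepted.iter().any(|&a| a == alg))
}

fn main() {
    // Server prefers brotli, then gzip; client supports both, so brotli wins.
    assert_eq!(negotiate("br, gzip, deflate", &["br", "gzip"]), Some("br"));
    // Client without brotli falls back to gzip.
    assert_eq!(negotiate("gzip, deflate", &["br", "gzip"]), Some("gzip"));
    // No overlap: send the response uncompressed.
    assert_eq!(negotiate("zstd", &["br", "gzip"]), None);
}
```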

Compression Levels

| Level | Description | Use Case |
|---|---|---|
| Fast | Minimal compression, fast | Low-latency APIs |
| Default | Balanced | Most applications |
| Best | Maximum compression | Bandwidth-constrained |

Configuration Options

| Option | Type | Description |
|---|---|---|
| enabled | bool | Enable/disable compression |
| level | CompressionLevel | Compression speed vs. ratio |
| min_size_bytes | usize | Skip compression for small responses |
| algorithms | Vec&lt;String&gt; | Enabled algorithms in priority order |

Testing Compression

# Request with brotli
curl -H "Accept-Encoding: br" \
  -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '{"query": "{ users { id name email } }"}' \
  --compressed -v

# Check Content-Encoding header in response
< Content-Encoding: br

Performance Considerations

  • JSON responses typically compress 50-90%
  • Set min_size_bytes to skip small responses
  • Use CompressionLevel::Fast for latency-sensitive apps
  • Balance CPU cost vs. bandwidth savings

GBP+LZ4 Ultra-Fast Compression

GraphQL Binary Protocol (GBP) combined with LZ4 provides a novel, ultra-high-performance binary encoding for GraphQL responses. While standard LZ4 is fast, GBP+LZ4 achieves near-maximal compression ratios (up to 99%) by exploiting the structural redundancy of GraphQL data before applying block compression.

Benefits

| Feature | GBP+LZ4 (Turbo O(1)) | Standard LZ4 | Gzip | Brotli |
|---|---|---|---|---|
| Compression ratio | 50-99% | 50-60% | 70-80% | 75-85% |
| Compression speed | Ultra fast (O(1)) | Ultra fast | Fast | Slow |
| Deduplication | Zero-clone structural | Byte-level | Byte-level | Byte-level |
| Scale support | 1GB+ payloads | Generic binary | Browsers | Static assets |

Compression Scenarios

GBP compression effectiveness depends on the repetitiveness of your data. Here’s what to expect:

Data Pattern Guide

| Data Pattern | Compression | Example |
|---|---|---|
| Highly repetitive | 95-99% | Lists where most fields repeat (same status, permissions, metadata) |
| Moderately repetitive | 70-85% | Typical production data with shared types and enums |
| Unique/varied | 50% | Unique strings per item (names, descriptions, unique IDs) |

Scenario 1: Highly Repetitive (99% Compression)

Best case for GBP - data with structural repetition:

{
  "products": [
    { "id": 1, "status": "ACTIVE", "category": "Electronics", "org": { "id": "org-1", "name": "Acme" } },
    { "id": 2, "status": "ACTIVE", "category": "Electronics", "org": { "id": "org-1", "name": "Acme" } },
    // ... 20,000 more items with same status, category, org
  ]
}

Result: 41 MB β†’ 266 KB (99.37% reduction)

GBP leverages:

  • String interning for repeated values (β€œACTIVE”, β€œElectronics”, β€œAcme”)
  • Shape deduplication for identical object structures
  • Columnar encoding for arrays of objects
  • Run-length encoding for consecutive identical values

Scenario 2: Moderately Repetitive (70-85% Compression)

Typical production data with some variation:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "ADMIN", "status": "ACTIVE", "region": "US" },
    { "id": 2, "name": "Bob", "role": "USER", "status": "ACTIVE", "region": "EU" },
    // ... users with unique names but repeated roles/statuses/regions
  ]
}

Result: ~75% compression typical

GBP benefits from:

  • Repeated enum values (role, status, region)
  • Shape deduplication (all User objects have same structure)
  • __typename field repetition

Scenario 3: Unique/Varied (50% Compression)

Worst case - highly unique data:

{
  "logs": [
    { "id": "uuid-1", "message": "Unique log message 1", "timestamp": "2024-01-01T00:00:01Z" },
    { "id": "uuid-2", "message": "Different log message 2", "timestamp": "2024-01-01T00:00:02Z" },
    // ... every field is unique
  ]
}

Result: ~50% compression

GBP still provides:

  • Binary encoding (smaller than JSON text)
  • LZ4 block compression
  • Shape deduplication (structure is same even if values differ)

Real-World Expectations

Most production GraphQL responses fall into the moderately repetitive category:

| Repeated Elements | Unique Elements |
|---|---|
| __typename values | Entity IDs |
| Enum values (status, role) | Timestamps |
| Nested references (org, category) | User-generated content |
| Boolean flags | Unique identifiers |

Realistic expectation: 70-85% compression for typical production workloads.

Maximizing Compression

To achieve higher compression rates:

  1. Use enums instead of freeform strings for status fields
  2. Normalize data with shared references (e.g., all products reference same category object)
  3. Batch similar queries to increase repetition within responses
  4. Design schemas with repeated metadata objects

Why GBP? (O(1) Turbo Mode)

Standard compression algorithms (Gzip, Brotli, LZ4) treat the response as a bucket of bytes. GBP (GraphQL Binary Protocol) v9 understands the GraphQL structure at the memory level:

  1. Positional References (O(1)): Starting in v0.5.9, GBP eliminates expensive value cloning. It uses buffer position references for deduplication, resulting in constant-time lookups and zero additional memory overhead per duplicate.
  2. Shallow Hashing: Replaced recursive tree-walking hashes with O(1) shallow hashing for large structures. This enables massive 1GB+ payloads to be processed without quadratic performance degradation.
  3. Structural Templates (Shapes): It identifies that users { id name } always has the same keys and only encodes the β€œshape” once.
  4. Columnar Storage: Lists of objects are transformed into columns, allowing the compression algorithm to see similar data types together, which drastically increases the compression ratio.
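Columnar storage (point 4) is simple to picture: rows of identical shape are rotated into one vector per field, so the block compressor sees long runs of similar values instead of interleaved types. A toy sketch of the transform, not the actual GBP encoding:

```rust
/// Rotate row-oriented records into one column per field. With repetitive
/// data (e.g. a long run of "ACTIVE"), the status column compresses far
/// better than interleaved rows would. Toy sketch, not the GBP format.
fn to_columns(rows: &[(u32, &str)]) -> (Vec<u32>, Vec<String>) {
    let ids = rows.iter().map(|&(id, _)| id).collect();
    let statuses = rows.iter().map(|&(_, s)| s.to_string()).collect();
    (ids, statuses)
}

fn main() {
    let rows = [(1, "ACTIVE"), (2, "ACTIVE"), (3, "INACTIVE")];
    let (ids, statuses) = to_columns(&rows);
    assert_eq!(ids, vec![1, 2, 3]);
    assert_eq!(statuses, vec!["ACTIVE", "ACTIVE", "INACTIVE"]);
}
```

The id column is now a monotonic integer run and the status column a near-constant string run; both are ideal inputs for LZ4's match finder.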

Quick Start

Basic Configuration

use grpc_graphql_gateway::{Gateway, CompressionConfig};

let gateway = Gateway::builder()
    // ultra_fast() now defaults to GBP+LZ4
    .with_compression(CompressionConfig::ultra_fast())
    .build()?;

Manual Configuration

use grpc_graphql_gateway::CompressionConfig;

let config = CompressionConfig {
    enabled: true,
    min_size_bytes: 128, // GBP is efficient even for small fragments
    algorithms: vec!["gbp-lz4".into(), "lz4".into()],
    ..Default::default()
};

Client Support

Accept-Encoding Header

Clients must opt in to the binary protocol by sending the following header:

Accept-Encoding: gbp-lz4, lz4, gzip

The gateway will respond with:

  • Content-Encoding: gbp-lz4
  • Content-Type: application/graphql-response+gbp

Decoding in Rust

use grpc_graphql_gateway::gbp::GbpDecoder;

let bytes = response.bytes().await?;
let mut decoder = GbpDecoder::new();
let json_value = decoder.decode_lz4(&bytes)?;

Decoding in Browser (TypeScript/JavaScript)

Use the official @protocol-lattice/gbp-decoder library:

npm install @protocol-lattice/gbp-decoder
import { GbpDecoder } from '@protocol-lattice/gbp-decoder';

const decoder = new GbpDecoder();

// Recommended for browsers: Gzip-compressed GBP
const decoded = decoder.decodeGzip(uint8Array);

// For ultra-performance: LZ4-compressed GBP
const decodedLz4 = decoder.decodeLz4(uint8Array);

Performance Benchmarks

100MB+ GraphQL Behemoth (200k Users)

| Metric | Original JSON | Standard Gzip (Est.) | GBP+LZ4 (Turbo O(1)) |
|---|---|---|---|
| Size | 107.1 MB | ~22.0 MB | 804 KB |
| Reduction | 0% | ~79% | 99.25% |
| Throughput | - | ~25 MB/s | 195.7 MB/s |
| Integrity | - | - | 100% verified |

Result: With Turbo O(1) Mode, GBP+LZ4 is 133x smaller than the original JSON and scales effortlessly to 1GB+ payloads with minimal CPU and memory overhead.

Use Cases

  • βœ… Internal Microservices: Use GBP+LZ4 for all internal service-to-service GraphQL communication to minimize network overhead and CPU usage.
  • βœ… High-Density Mobile Apps: Large lists of data can be sent to mobile clients in a fraction of the time, saving battery and data plans (requires custom decoder).
  • βœ… Cache Optimization: Store GBP-encoded data in Redis or in-memory caches to fit 10-50x more data in the same memory space.

LZ4 Ultra-Fast Compression

LZ4 is an extremely fast compression algorithm ideal for high-throughput scenarios where CPU time is more valuable than bandwidth.

Benefits

| Feature | LZ4 | Gzip | Brotli |
|---|---|---|---|
| Compression speed | 700 MB/s | 35 MB/s | 8 MB/s |
| Decompression speed | 4 GB/s | 300 MB/s | 400 MB/s |
| Compression ratio | 50-60% | 70-80% | 75-85% |
| CPU usage | Very low | Medium | High |
| Best for | High throughput, real-time | General use | Bandwidth-constrained |

When to Use LZ4

βœ… Use LZ4 when:

  • High throughput (100k+ req/s)
  • Low latency is critical (< 10ms P99)
  • CPU is more expensive than bandwidth
  • Real-time applications
  • Internal APIs (microservices communication)

❌ Don’t use LZ4 when:

  • Bandwidth is extremely expensive
  • Users on slow connections (use Brotli)
  • Maximum compression ratio needed

Quick Start

Basic Configuration

use grpc_graphql_gateway::{Gateway, CompressionConfig};

let gateway = Gateway::builder()
    .with_compression(CompressionConfig::ultra_fast())  // LZ4!
    .build()?;

Advanced Configuration

use grpc_graphql_gateway::CompressionConfig;

let config = CompressionConfig {
    enabled: true,
    level: CompressionLevel::Fast,
    min_size_bytes: 256,  // Lower threshold for LZ4
    algorithms: vec!["lz4".into()],
};

Multi-Algorithm Support

// Prefer LZ4, fallback to gzip for browsers
let config = CompressionConfig {
    algorithms: vec![
        "lz4".into(),    // For high-performance clients
        "gzip".into(),   // For browsers
    ],
    ..Default::default()
};

Client Support

JavaScript/TypeScript

// Axios example
import axios from 'axios';

const client = axios.create({
  baseURL: 'http://localhost:8888/graphql',
  headers: {
    'Accept-Encoding': 'lz4, gzip, deflate',
  },
  // Add LZ4 decompression
  transformResponse: [(data) => {
    // Handle LZ4 decompression if needed
    return JSON.parse(data);
  }],
});

Rust Client

use reqwest::Client;

let client = Client::builder()
    .gzip(true)
    .build()?;

// The gateway will automatically use LZ4 if client supports it
let response = client
    .post("http://localhost:8888/graphql")
    .header("Accept-Encoding", "lz4, gzip")
    .json(&graphql_query)
    .send()
    .await?;

Go Client

import (
    "github.com/pierrec/lz4"
    "net/http"
)

client := &http.Client{
    Transport: &lz4Transport{},
}

// Add LZ4 decompression support
type lz4Transport struct{}

func (t *lz4Transport) RoundTrip(req *http.Request) (*http.Response, error) {
    req.Header.Set("Accept-Encoding", "lz4, gzip")
    resp, err := http.DefaultTransport.RoundTrip(req)
    // ... handle LZ4 decompression of resp.Body
    return resp, err
}

Performance Comparison

Benchmark: 1KB GraphQL Response

| Algorithm | Compression Time | Decompression Time | Compressed Size |
|---|---|---|---|
| LZ4 | 0.002ms | 0.001ms | 580 bytes |
| Gzip | 0.15ms | 0.05ms | 320 bytes |
| Brotli | 2.5ms | 0.08ms | 280 bytes |

Result: LZ4 is 75x faster to compress than gzip with acceptable size.

Benchmark: 100KB GraphQL Response

| Algorithm | Compression Time | Decompression Time | Compressed Size |
|---|---|---|---|
| LZ4 | 0.14ms | 0.05ms | 52 KB |
| Gzip | 12ms | 3ms | 28 KB |
| Brotli | 180ms | 4ms | 24 KB |

Result: LZ4 is 85x faster to compress, 60x faster to decompress.

Cost Impact at 100k req/s

Scenario: 2KB average response size

With Gzip:

CPU: 4 cores @ 100% = 4 vCPU
Cost: ~$140/mo
Bandwidth: 155 MB/s compressed
Latency: +2ms P99

With LZ4:

CPU: 2 cores @ 40% = 0.8 vCPU
Cost: ~$28/mo (80% reduction!)
Bandwidth: 180 MB/s compressed
Latency: +0.3ms P99

Savings: $112/month on compression CPU alone

Integration Examples

Example 1: Ultra-Fast Internal APIs

For microservices communication where throughput matters more than bandwidth:

let gateway = Gateway::builder()
    .with_compression(CompressionConfig {
        enabled: true,
        algorithms: vec!["lz4".into()],
        min_size_bytes: 256,
        level: CompressionLevel::Fast,
    })
    .build()?;

Example 2: Hybrid Strategy

Use LZ4 for internal calls, Brotli for external:

// In middleware
async fn compression_selector(req: Request) -> CompressionConfig {
    if is_internal_request(&req) {
        CompressionConfig::ultra_fast()  // LZ4
    } else {
        CompressionConfig::best()  // Brotli
    }
}

Example 3: Content-Type Based

Use LZ4 for JSON, Gzip for HTML:

let config = if response_is_json {
    CompressionConfig::ultra_fast()
} else {
    CompressionConfig::default()
};

Cache Optimization with LZ4

Use LZ4 to compress cached responses for better memory efficiency:

use grpc_graphql_gateway::Lz4CacheCompressor;

// Store in cache
let json = serde_json::to_string(&response)?;
let compressed = Lz4CacheCompressor::compress(&json)?;
cache.set("key", compressed).await?;

// Retrieve from cache
let compressed = cache.get("key").await?;
let json = Lz4CacheCompressor::decompress(&compressed)?;
let response: GraphQLResponse = serde_json::from_str(&json)?;

Result: 50-60% memory savings in cache with minimal CPU overhead.

Advanced: Custom Middleware

Add LZ4 compression as custom middleware:

use grpc_graphql_gateway::lz4_compression_middleware;
use axum::{Router, middleware};

let app = Router::new()
    .route("/graphql", post(graphql_handler))
    .layer(middleware::from_fn(lz4_compression_middleware));

Monitoring

Track LZ4 compression effectiveness:

// Export metrics
gauge!("compression_ratio_lz4", compression_ratio);
histogram!("compression_time_lz4_ms", compression_time.as_millis() as f64);
counter!("bytes_saved_lz4", bytes_saved);

Best Practices

1. Set Reasonable Thresholds

CompressionConfig {
    min_size_bytes: 256,  // Don't compress tiny responses
    // ...
}

2. Combine with Caching

LZ4 + caching = maximum performance:

Gateway::builder()
    .with_response_cache(cache_config)
    .with_compression(CompressionConfig::ultra_fast())
    .build()?

3. Monitor CPU vs Bandwidth Trade-off

// If CPU > 80%: Use LZ4
// If bandwidth > 80%: Use Brotli
// Otherwise: Use Gzip
let config = match (cpu_usage, bandwidth_usage) {
    (cpu, _) if cpu > 0.8 => CompressionConfig::ultra_fast(),
    (_, bw) if bw > 0.8 => CompressionConfig::best(),
    _ => CompressionConfig::default(),
};

4. Test with Your Data

use grpc_graphql_gateway::compress_lz4;

let sample_response = get_typical_graphql_response();
let compressed = compress_lz4(sample_response.as_bytes())?;

let ratio = compressed.len() as f64 / sample_response.len() as f64;
println!("Compression ratio: {:.1}%", ratio * 100.0);

Production Deployment

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graphql-gateway
spec:
  template:
    spec:
      containers:
      - name: gateway
        env:
        - name: COMPRESSION_ALGORITHM
          value: "lz4"
        - name: COMPRESSION_MIN_SIZE
          value: "256"
        resources:
          requests:
            cpu: "500m"  # LZ4 uses less CPU
            memory: "512Mi"

Docker

FROM rust:1.75-alpine AS builder
RUN apk add --no-cache lz4-dev

# ... build gateway with LZ4 support

ENTRYPOINT ["./gateway", "--compression=lz4"]

FAQ

Q: Is LZ4 supported by browsers? A: Not natively. Use gzip/brotli for browser clients, LZ4 for server-to-server.

Q: Can I use both LZ4 and Gzip? A: Yes! The gateway automatically selects based on Accept-Encoding header.

Q: Does LZ4 work with Cloudflare? A: Cloudflare doesn’t support LZ4. Use it for origin-to-Cloudflare traffic, and let Cloudflare handle client compression.

Q: How much CPU does LZ4 save? A: 60-80% less CPU than gzip at 100k req/s (see benchmarks above).

Request Collapsing

Request collapsing (also known as request deduplication) is a powerful optimization that reduces the number of gRPC backend calls by identifying and coalescing identical concurrent requests.

How It Works

When a GraphQL query contains multiple fields that call the same gRPC method with identical arguments, request collapsing ensures only one gRPC call is made:

query {
  user1: getUser(id: "1") { name }
  user2: getUser(id: "2") { name }
  user3: getUser(id: "1") { name }  # Duplicate of user1!
}

Without Request Collapsing: 3 gRPC calls are made.

With Request Collapsing: Only 2 gRPC calls are made (user1 and user3 share the same response).

The Leader-Follower Pattern

  1. Leader: The first request with a unique key executes the gRPC call
  2. Followers: Subsequent identical requests wait for the leader’s result
  3. Broadcast: When the leader completes, it broadcasts the result to all followers
  4. Cleanup: The in-flight entry is removed after broadcasting
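The four steps above can be sketched with a small in-flight registry. This is an illustrative, std-only sketch, not the gateway's actual implementation; the names here (InFlight, Slot, get_or_exec) are made up for the example.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

// Hypothetical in-flight registry illustrating the leader/follower flow.
struct InFlight {
    entries: Mutex<HashMap<String, Arc<Slot>>>,
}

struct Slot {
    result: Mutex<Option<String>>,
    ready: Condvar,
}

impl InFlight {
    fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// The first caller per key becomes the leader and runs `exec`;
    /// concurrent callers with the same key wait for its broadcast.
    fn get_or_exec(&self, key: &str, exec: impl FnOnce() -> String) -> String {
        let (slot, is_leader) = {
            let mut entries = self.entries.lock().unwrap();
            if let Some(slot) = entries.get(key) {
                (slot.clone(), false)
            } else {
                let slot = Arc::new(Slot {
                    result: Mutex::new(None),
                    ready: Condvar::new(),
                });
                entries.insert(key.to_string(), slot.clone());
                (slot, true)
            }
        };
        if is_leader {
            let value = exec(); // the single backend call for this key
            *slot.result.lock().unwrap() = Some(value.clone());
            slot.ready.notify_all(); // broadcast to followers
            self.entries.lock().unwrap().remove(key); // cleanup
            value
        } else {
            // Follower: wait until the leader publishes the result.
            let mut result = slot.result.lock().unwrap();
            while result.is_none() {
                result = slot.ready.wait(result).unwrap();
            }
            result.clone().unwrap()
        }
    }
}

/// Demo: three concurrent identical requests, counting backend calls.
fn collapse_demo() -> (usize, Vec<String>) {
    let registry = Arc::new(InFlight::new());
    let calls = Arc::new(AtomicUsize::new(0));
    let mut handles = Vec::new();
    for i in 0..3 {
        let (registry, calls) = (registry.clone(), calls.clone());
        handles.push(thread::spawn(move || {
            if i > 0 {
                // Give the leader time to register as in-flight.
                thread::sleep(Duration::from_millis(50));
            }
            registry.get_or_exec("getUser:1", || {
                calls.fetch_add(1, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(200)); // simulated gRPC call
                "user-1".to_string()
            })
        }));
    }
    let results = handles.into_iter().map(|h| h.join().unwrap()).collect();
    (calls.load(Ordering::SeqCst), results)
}
```

A condition variable delivers the leader's broadcast to every follower, and the while loop guards against spurious wakeups.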

Configuration

use grpc_graphql_gateway::{Gateway, RequestCollapsingConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_request_collapsing(RequestCollapsingConfig::default())
    .add_grpc_client("service", client)
    .build()?;

Configuration Options

| Option | Default | Description |
|--------|---------|-------------|
| coalesce_window | 50ms | Maximum time to wait for in-flight requests |
| max_waiters | 100 | Maximum followers waiting for a single leader |
| enabled | true | Enable/disable collapsing |
| max_cache_size | 10000 | Maximum in-flight requests to track |

Builder Pattern

let config = RequestCollapsingConfig::new()
    .coalesce_window(Duration::from_millis(100))  // Longer window
    .max_waiters(200)                              // More waiters allowed
    .max_cache_size(20000)                         // Larger cache
    .enabled(true);

Presets

Request collapsing comes with several presets for common scenarios:

Default (Balanced)

let config = RequestCollapsingConfig::default();
// coalesce_window: 50ms
// max_waiters: 100
// max_cache_size: 10000

Best for most workloads with a balance between latency and deduplication.

High Throughput

let config = RequestCollapsingConfig::high_throughput();
// coalesce_window: 100ms
// max_waiters: 500
// max_cache_size: 50000

Best for high-traffic scenarios where maximizing deduplication is more important than latency.

Low Latency

let config = RequestCollapsingConfig::low_latency();
// coalesce_window: 10ms
// max_waiters: 50
// max_cache_size: 5000

Best for latency-sensitive applications where quick responses are critical.

Disabled

let config = RequestCollapsingConfig::disabled();

Completely disables request collapsing.

Monitoring

You can monitor request collapsing effectiveness using the built-in statistics:

// Get the registry from ServeMux
if let Some(registry) = mux.request_collapsing() {
    let stats = registry.stats();
    println!("In-flight requests: {}", stats.in_flight_count);
    println!("Max cache size: {}", stats.max_cache_size);
    println!("Enabled: {}", stats.enabled);
}

Request Key Generation

Each request is identified by a SHA-256 hash of:

  1. Service name - The gRPC service identifier
  2. gRPC path - The method path (e.g., /greeter.Greeter/SayHello)
  3. Request bytes - The serialized protobuf message

This ensures that only truly identical requests are collapsed.
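Conceptually, the derivation looks like the sketch below. The gateway uses SHA-256; this hypothetical request_key function substitutes std's DefaultHasher so the example runs without extra crates.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative key derivation (the real gateway hashes with SHA-256).
fn request_key(service: &str, grpc_path: &str, request_bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    service.hash(&mut hasher);       // 1. service name
    grpc_path.hash(&mut hasher);     // 2. gRPC method path
    request_bytes.hash(&mut hasher); // 3. serialized protobuf message
    hasher.finish()
}
```

Identical inputs produce identical keys, so only byte-for-byte identical requests collapse onto one in-flight entry.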

Relationship with Other Features

Response Caching

Request collapsing and response caching work together:

  • Request Collapsing: Deduplicates concurrent identical requests
  • Response Caching: Caches completed responses for future requests

The typical flow is:

  1. Check response cache β†’ cache hit? Return cached response
  2. Check in-flight requests β†’ follower? Wait for leader
  3. Execute gRPC call as leader
  4. Broadcast result to followers
  5. Cache response for future requests

Circuit Breaker

Request collapsing works seamlessly with the circuit breaker:

  • If the circuit is open, all collapsed requests fail fast together
  • The leader request respects circuit breaker state
  • Followers receive the same error as the leader

Best Practices

  1. Start with defaults: The default configuration works well for most use cases

  2. Monitor collapse ratio: Track how many requests are being deduplicated

    • Low ratio? Requests may be too unique, consider if collapsing adds value
    • High ratio? Great! You’re saving significant backend load
  3. Tune for your workload:

    • High read traffic? Use high_throughput() preset
    • Real-time requirements? Use low_latency() preset
  4. Consider request patterns:

    • GraphQL queries with aliases benefit most
    • Unique requests per field won’t see much benefit

Example: Full Configuration

use grpc_graphql_gateway::{Gateway, RequestCollapsingConfig, CacheConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    // Enable response caching
    .with_response_cache(CacheConfig {
        max_size: 10_000,
        default_ttl: Duration::from_secs(60),
        stale_while_revalidate: Some(Duration::from_secs(30)),
        invalidate_on_mutation: true,
    })
    // Enable request collapsing
    .with_request_collapsing(
        RequestCollapsingConfig::new()
            .coalesce_window(Duration::from_millis(75))
            .max_waiters(150)
    )
    .add_grpc_client("service", client)
    .build()?;

Automatic Persisted Queries (APQ)

Reduce bandwidth by caching queries on the server and sending only hashes.

Enabling APQ

use grpc_graphql_gateway::{Gateway, PersistedQueryConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_persisted_queries(PersistedQueryConfig {
        cache_size: 1000,                        // Max cached queries
        ttl: Some(Duration::from_secs(3600)),    // 1 hour expiration
    })
    .build()?;

How APQ Works

  1. First request: Client sends hash only β†’ Gateway returns PERSISTED_QUERY_NOT_FOUND
  2. Retry: Client sends hash + full query β†’ Gateway caches and executes
  3. Subsequent requests: Client sends hash only β†’ Gateway uses cached query
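The server side of this handshake reduces to a three-way decision, sketched here with a plain HashMap standing in for the gateway's LRU cache (function and parameter names are illustrative). A real implementation would also verify that the SHA-256 of the supplied query matches the hash before caching it.

```rust
use std::collections::HashMap;

// Hypothetical APQ lookup: `cache` maps sha256 hash -> query text.
// Returns the query to execute, or the APQ error code.
fn resolve_persisted(
    cache: &mut HashMap<String, String>,
    hash: &str,
    full_query: Option<&str>,
) -> Result<String, &'static str> {
    if let Some(cached) = cache.get(hash) {
        // 3. Hash known: execute the cached query.
        return Ok(cached.clone());
    }
    match full_query {
        // 2. Retry with hash + query: cache it, then execute.
        Some(query) => {
            cache.insert(hash.to_string(), query.to_string());
            Ok(query.to_string())
        }
        // 1. Hash only, not yet cached: tell the client to retry.
        None => Err("PERSISTED_QUERY_NOT_FOUND"),
    }
}
```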

Client Request Format

Hash only (after caching):

{
  "extensions": {
    "persistedQuery": {
      "version": 1,
      "sha256Hash": "ecf4edb46db40b5132295c0291d62fb65d6759a9eedfa4d5d612dd5ec54a6b38"
    }
  }
}

Hash + query (initial):

{
  "query": "{ user(id: \"123\") { id name } }",
  "extensions": {
    "persistedQuery": {
      "version": 1,
      "sha256Hash": "ecf4edb46db40b5132295c0291d62fb65d6759a9eedfa4d5d612dd5ec54a6b38"
    }
  }
}

Apollo Client Setup

import { createPersistedQueryLink } from '@apollo/client/link/persisted-queries';
import { sha256 } from 'crypto-hash';
import { createHttpLink } from '@apollo/client';

const link = createPersistedQueryLink({ sha256 }).concat(
  createHttpLink({ uri: 'http://localhost:8888/graphql' })
);

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| cache_size | usize | 1000 | Max number of cached queries |
| ttl | Option<Duration> | None | Optional expiration time |

Benefits

  • βœ… 90%+ reduction in request payload size
  • βœ… Compatible with Apollo Client APQ
  • βœ… LRU eviction prevents unbounded memory growth
  • βœ… Optional TTL for cache expiration

Error Response

When hash is not found:

{
  "errors": [
    {
      "message": "PersistedQueryNotFound",
      "extensions": {
        "code": "PERSISTED_QUERY_NOT_FOUND"
      }
    }
  ]
}

Cache Statistics

Monitor APQ performance through logs and metrics.

Circuit Breaker

Protect your gateway from cascading failures when backend services are unhealthy.

Enabling Circuit Breaker

use grpc_graphql_gateway::{Gateway, CircuitBreakerConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_circuit_breaker(CircuitBreakerConfig {
        failure_threshold: 5,                      // Open after 5 failures
        recovery_timeout: Duration::from_secs(30), // Wait 30s before testing
        half_open_max_requests: 3,                 // Allow 3 test requests
    })
    .build()?;

Circuit States

   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
   β”‚                                                 β”‚
   β–Ό                                                 β”‚
β”Œβ”€β”€β”€β”€β”€β”€β”  failure_threshold  β”Œβ”€β”€β”€β”€β”€β”€β”  recovery   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚CLOSEDβ”‚ ─────────────────▢  β”‚ OPEN β”‚ ──────────▢ β”‚HALF-OPENβ”‚
β””β”€β”€β”€β”€β”€β”€β”˜     reached         β””β”€β”€β”€β”€β”€β”€β”˜   timeout   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
   β–²                                                 β”‚
   β”‚         success                                 β”‚
   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
| State | Description |
|-------|-------------|
| Closed | Normal operation, all requests flow through |
| Open | Service unhealthy, requests fail fast |
| Half-Open | Testing recovery with limited requests |

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| failure_threshold | u32 | 5 | Consecutive failures to open circuit |
| recovery_timeout | Duration | 30s | Time before testing recovery |
| half_open_max_requests | u32 | 3 | Test requests in half-open state |

How It Works

  1. Closed: Requests flow normally, failures are counted
  2. Threshold reached: Circuit opens after N consecutive failures
  3. Open: Requests fail immediately with SERVICE_UNAVAILABLE
  4. Timeout: After recovery timeout, circuit enters half-open
  5. Half-Open: Limited requests test if service recovered
  6. Success: Circuit closes, normal operation resumes
  7. Failure: Circuit reopens, back to step 3
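The steps above amount to a small state machine. The following is a minimal std-only sketch of the idea; struct and method names are illustrative, not the gateway's internals, and the half_open_max_requests limit is omitted for brevity.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum State {
    Closed,
    Open,
    HalfOpen,
}

// Minimal circuit-breaker sketch mirroring the states above.
struct Breaker {
    state: State,
    consecutive_failures: u32,
    failure_threshold: u32,
    opened_at: Option<Instant>,
    recovery_timeout: Duration,
}

impl Breaker {
    fn new(failure_threshold: u32, recovery_timeout: Duration) -> Self {
        Self {
            state: State::Closed,
            consecutive_failures: 0,
            failure_threshold,
            opened_at: None,
            recovery_timeout,
        }
    }

    /// Called before each request; Open -> HalfOpen after the timeout.
    fn allow_request(&mut self) -> bool {
        if self.state == State::Open {
            match self.opened_at {
                Some(t) if t.elapsed() >= self.recovery_timeout => {
                    self.state = State::HalfOpen; // test recovery
                }
                _ => return false, // fail fast with SERVICE_UNAVAILABLE
            }
        }
        true
    }

    fn on_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = State::Closed; // recovery confirmed
    }

    fn on_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.state == State::HalfOpen
            || self.consecutive_failures >= self.failure_threshold
        {
            self.state = State::Open; // open (or reopen) the circuit
            self.opened_at = Some(Instant::now());
        }
    }
}
```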

Error Response

When circuit is open:

{
  "errors": [
    {
      "message": "Service unavailable: circuit breaker is open",
      "extensions": {
        "code": "SERVICE_UNAVAILABLE",
        "service": "UserService"
      }
    }
  ]
}

Per-Service Circuits

Each gRPC service has its own circuit breaker:

  • UserService circuit open doesn’t affect ProductService
  • Failures are isolated to their respective services

Benefits

  • βœ… Prevents cascading failures
  • βœ… Fast-fail reduces latency when services are down
  • βœ… Automatic recovery testing
  • βœ… Per-service isolation

Monitoring

Track circuit breaker state through logs:

WARN Circuit breaker opened for UserService
INFO Circuit breaker half-open for UserService (testing recovery)
INFO Circuit breaker closed for UserService (service recovered)

Batch Queries

Execute multiple GraphQL operations in a single HTTP request.

Usage

Send an array of operations:

curl -X POST http://localhost:8888/graphql \
  -H "Content-Type: application/json" \
  -d '[
    {"query": "{ users { id name } }"},
    {"query": "{ products { upc price } }"},
    {"query": "mutation { createUser(input: {name: \"Alice\"}) { id } }"}
  ]'

Response Format

Returns an array of responses in the same order:

[
  {"data": {"users": [{"id": "1", "name": "Bob"}]}},
  {"data": {"products": [{"upc": "123", "price": 99}]}},
  {"data": {"createUser": {"id": "2"}}}
]

Benefits

  • Reduces HTTP overhead (one connection, one request)
  • Appears atomic to the client (one request, one combined response)
  • Ideal for initial page loads

Considerations

  • Operations execute concurrently (not sequentially)
  • Mutations don’t wait for previous queries
  • Total response size is sum of all responses

Error Handling

Errors are returned per-operation:

[
  {"data": {"users": [{"id": "1"}]}},
  {"errors": [{"message": "Product not found"}]},
  {"data": {"createUser": {"id": "2"}}}
]

Client Example

const batchQuery = async (queries) => {
  const response = await fetch('http://localhost:8888/graphql', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(queries),
  });
  return response.json();
};

const results = await batchQuery([
  { query: '{ users { id } }' },
  { query: '{ products { upc } }' },
]);

DataLoader

The gateway includes a built-in DataLoader implementation for batching entity resolution requests. This is essential for preventing the N+1 query problem in federated GraphQL architectures.

The N+1 Query Problem

Without DataLoader, resolving a list of entities results in one backend call per entity:

Query: users { friends { name } }

β†’ Fetch users (1 call)
β†’ For each user, fetch friends:
    - User 1's friends (call #2)
    - User 2's friends (call #3)
    - User 3's friends (call #4)
    ... (N more calls)

This is the N+1 problem: 1 initial query + N follow-up queries.

How DataLoader Solves This

DataLoader collects all entity resolution requests within a single execution frame and batches them together:

Query: users { friends { name } }

β†’ Fetch users (1 call)
β†’ Collect all friend IDs
β†’ Batch fetch all friends (1 call)

Total: 2 calls instead of N+1

EntityDataLoader

The EntityDataLoader is the main DataLoader implementation for entity resolution:

use grpc_graphql_gateway::{EntityDataLoader, EntityConfig};
use grpc_graphql_gateway::federation::EntityResolver;
use std::sync::Arc;
use std::collections::HashMap;

// Your entity resolver implementation
let resolver: Arc<dyn EntityResolver> = /* ... */;

// Entity configurations
let mut entity_configs: HashMap<String, EntityConfig> = HashMap::new();
entity_configs.insert("User".to_string(), user_config);
entity_configs.insert("Product".to_string(), product_config);

// Create the DataLoader
let loader = EntityDataLoader::new(resolver, entity_configs);

API Reference

EntityDataLoader::new

Creates a new DataLoader instance:

pub fn new(
    resolver: Arc<dyn EntityResolver>,
    entity_configs: HashMap<String, EntityConfig>,
) -> Self

  • resolver: The underlying entity resolver that performs the actual resolution
  • entity_configs: Map of entity type names to their configurations

EntityDataLoader::load

Load a single entity with automatic batching:

pub async fn load(
    &self,
    entity_type: &str,
    representation: IndexMap<Name, Value>,
) -> Result<Value>

Multiple concurrent calls to load() for the same entity type are automatically batched together.

EntityDataLoader::load_many

Load multiple entities in a batch:

pub async fn load_many(
    &self,
    entity_type: &str,
    representations: Vec<IndexMap<Name, Value>>,
) -> Result<Vec<Value>>

Explicitly batch multiple entity resolution requests.

Integration with Federation

When using Apollo Federation, the DataLoader is typically integrated through the entity resolution pipeline:

use grpc_graphql_gateway::{
    Gateway, GrpcEntityResolver, EntityDataLoader, EntityConfig
};
use std::sync::Arc;
use std::collections::HashMap;

// 1. Create the base entity resolver
let base_resolver = Arc::new(GrpcEntityResolver::default());

// 2. Configure entity types
let mut entity_configs: HashMap<String, EntityConfig> = HashMap::new();
entity_configs.insert(
    "User".to_string(),
    EntityConfig {
        type_name: "User".to_string(),
        keys: vec![vec!["id".to_string()]],
        extend: false,
        resolvable: true,
        descriptor: user_descriptor,
    },
);

// 3. Wrap with DataLoader
let loader = Arc::new(EntityDataLoader::new(
    base_resolver.clone(),
    entity_configs.clone(),
));

// 4. Build gateway; the loader from step 3 is wired in through a
//    resolver wrapper (see "Custom Entity Resolver with DataLoader" below)
let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .enable_federation()
    .with_entity_resolver(base_resolver)
    .add_grpc_client("UserService", user_client)
    .build()?;

Custom Entity Resolver with DataLoader

You can wrap a custom entity resolver with DataLoader:

use grpc_graphql_gateway::{EntityDataLoader, Result};
use grpc_graphql_gateway::federation::{EntityConfig, EntityResolver};
use async_graphql::{Value, indexmap::IndexMap, Name};
use async_trait::async_trait;
use std::collections::HashMap;
use std::sync::Arc;

struct DataLoaderResolver {
    loader: Arc<EntityDataLoader>,
}

impl DataLoaderResolver {
    pub fn new(
        base_resolver: Arc<dyn EntityResolver>,
        entity_configs: HashMap<String, EntityConfig>,
    ) -> Self {
        let loader = Arc::new(EntityDataLoader::new(
            base_resolver,
            entity_configs,
        ));
        Self { loader }
    }
}

#[async_trait]
impl EntityResolver for DataLoaderResolver {
    async fn resolve_entity(
        &self,
        config: &EntityConfig,
        representation: &IndexMap<Name, Value>,
    ) -> Result<Value> {
        // Single entity resolution goes through DataLoader
        self.loader.load(&config.type_name, representation.clone()).await
    }

    async fn batch_resolve_entities(
        &self,
        config: &EntityConfig,
        representations: Vec<IndexMap<Name, Value>>,
    ) -> Result<Vec<Value>> {
        // Batch resolution via DataLoader
        self.loader.load_many(&config.type_name, representations).await
    }
}

Key Features

Automatic Batching

Concurrent entity requests are automatically batched:

// These concurrent requests are batched into a single backend call
let (user1, user2, user3) = tokio::join!(
    loader.load("User", user1_repr),
    loader.load("User", user2_repr),
    loader.load("User", user3_repr),
);

Deduplication

Identical entity requests are deduplicated:

// Same user requested twice = only 1 backend call
let user1a = loader.load("User", user1_repr.clone());
let user1b = loader.load("User", user1_repr.clone());

let (result_a, result_b) = tokio::join!(user1a, user1b);
// result_a == result_b, and only 1 backend call was made

Normalized Cache Keys

Entity representations are normalized before caching, so field order doesn’t matter:

// These are treated as the same entity
let repr1 = indexmap! {
    Name::new("id") => Value::String("123".into()),
    Name::new("region") => Value::String("us".into()),
};

let repr2 = indexmap! {
    Name::new("region") => Value::String("us".into()),
    Name::new("id") => Value::String("123".into()),
};

// Only 1 backend call despite different field order

Per-Type Grouping

Entities are grouped by type for efficient batching:

// Mixed entity types are grouped appropriately
let (user, product, order) = tokio::join!(
    loader.load("User", user_repr),
    loader.load("Product", product_repr),
    loader.load("Order", order_repr),
);
// 3 batched backend calls (1 per entity type)

Performance Benefits

| Scenario | Without DataLoader | With DataLoader |
|----------|--------------------|-----------------|
| 10 users with friends | 11 calls | 2 calls |
| 100 products with reviews | 101 calls | 2 calls |
| N entities, M relations | N*M+1 calls | M+1 calls |

When to Use DataLoader

βœ… Always use DataLoader for:

  • Federated entity resolution
  • Nested field resolution that fetches related entities
  • Any resolver that may be called multiple times per query

❌ DataLoader may not be needed for:

  • Single root queries (no N+1 potential)
  • Mutations (typically single entity)
  • Subscriptions (streaming, not batched)

Example: Complete Federation Setup

Here’s a complete example demonstrating DataLoader with federation:

use grpc_graphql_gateway::{
    Gateway, EntityDataLoader, GrpcEntityResolver, EntityConfig,
    federation::EntityResolver,
};
use async_graphql::{Value, indexmap::IndexMap, Name};
use std::sync::Arc;
use std::collections::HashMap;

// Your store or data source
struct InMemoryStore {
    users: HashMap<String, User>,
    products: HashMap<String, Product>,
}

// Entity resolver that uses the DataLoader
struct StoreEntityResolver {
    store: Arc<InMemoryStore>,
    loader: Arc<EntityDataLoader>,
}

impl StoreEntityResolver {
    pub fn new(store: Arc<InMemoryStore>) -> Self {
        // Create base resolver
        let base = Arc::new(DirectStoreResolver { store: store.clone() });
        
        // Configure entities
        let mut configs = HashMap::new();
        configs.insert("User".to_string(), user_entity_config());
        configs.insert("Product".to_string(), product_entity_config());
        
        // Wrap with DataLoader
        let loader = Arc::new(EntityDataLoader::new(base, configs));
        
        Self { store, loader }
    }
}

#[async_trait::async_trait]
impl EntityResolver for StoreEntityResolver {
    async fn resolve_entity(
        &self,
        config: &EntityConfig,
        representation: &IndexMap<Name, Value>,
    ) -> grpc_graphql_gateway::Result<Value> {
        self.loader.load(&config.type_name, representation.clone()).await
    }

    async fn batch_resolve_entities(
        &self,
        config: &EntityConfig,
        representations: Vec<IndexMap<Name, Value>>,
    ) -> grpc_graphql_gateway::Result<Vec<Value>> {
        self.loader.load_many(&config.type_name, representations).await
    }
}

Best Practices

  1. Create DataLoader per request: For request-scoped caching, create a new DataLoader instance per GraphQL request.

  2. Share across resolvers: Pass the same DataLoader instance to all resolvers within a request.

  3. Configure appropriate batch sizes: The underlying resolver should handle batch sizes efficiently.

  4. Monitor batch efficiency: Track how many entities are batched together to identify optimization opportunities.

  5. Handle partial failures: The batch resolver should return results in the same order as the input, using null for failed items.
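Practice 5 (partial failures) boils down to keeping positional correspondence between input and output. A hypothetical helper illustrating the idea:

```rust
use std::collections::HashMap;

// Sketch of order-preserving batch resolution: results line up with the
// requested keys, with None standing in for entities that failed to resolve.
fn align_batch_results(
    requested_ids: &[&str],
    fetched: &HashMap<String, String>,
) -> Vec<Option<String>> {
    requested_ids
        .iter()
        .map(|id| fetched.get(*id).cloned()) // missing entity -> None
        .collect()
}
```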

Helm Deployment

This guide covers deploying the gRPC-GraphQL Gateway to Kubernetes using Helm charts with load balancing and high availability.

Prerequisites

  • Kubernetes cluster (v1.19+)
  • Helm 3.x installed (brew install helm)
  • kubectl configured
  • Docker image of your gateway

Quick Start

Install from Source

# Clone repository
git clone https://github.com/Protocol-Lattice/grpc_graphql_gateway.git
cd grpc_graphql_gateway

# Install chart
helm install my-gateway ./helm/grpc-graphql-gateway \
  --namespace grpc-gateway \
  --create-namespace

Install from Helm Repository

# Add helm repository (once published)
helm repo add protocol-lattice https://protocol-lattice.github.io/grpc_graphql_gateway
helm repo update

# Install
helm install my-gateway protocol-lattice/grpc-graphql-gateway \
  --namespace grpc-gateway \
  --create-namespace

Configuration Options

Basic Deployment

# values.yaml
replicaCount: 3

image:
  repository: ghcr.io/protocol-lattice/grpc-graphql-gateway
  tag: "0.2.9"

service:
  type: ClusterIP
  httpPort: 8080

With Ingress

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  hosts:
    - host: api.example.com
      paths:
        - path: /graphql
          pathType: Prefix
  tls:
    - secretName: gateway-tls
      hosts:
        - api.example.com

With Horizontal Pod Autoscaler

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

With LoadBalancer

loadBalancer:
  enabled: true
  externalTrafficPolicy: Local  # Preserve source IP
  annotations:
    # AWS NLB
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    # GCP
    # cloud.google.com/load-balancer-type: "Internal"

Load Balancing Strategies

Round Robin (Default)

service:
  sessionAffinity: None

ingress:
  annotations:
    nginx.ingress.kubernetes.io/load-balance: "round_robin"

Sticky Sessions

service:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800

Least Connections

ingress:
  annotations:
    nginx.ingress.kubernetes.io/load-balance: "least_conn"

High Availability Setup

# Minimum 3 replicas
replicaCount: 3

# Pod Disruption Budget
podDisruptionBudget:
  enabled: true
  minAvailable: 2

# Spread across nodes
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/name
                operator: In
                values:
                  - grpc-graphql-gateway
          topologyKey: kubernetes.io/hostname

Federation Deployment

Deploy multiple subgraphs with independent scaling:

# User subgraph
helm install user-subgraph ./helm/grpc-graphql-gateway \
  -f helm/values-federation-user.yaml \
  --namespace federation \
  --create-namespace

# Product subgraph
helm install product-subgraph ./helm/grpc-graphql-gateway \
  -f helm/values-federation-product.yaml \
  --namespace federation

# Review subgraph
helm install review-subgraph ./helm/grpc-graphql-gateway \
  -f helm/values-federation-review.yaml \
  --namespace federation

Or use the automated script:

./helm/deploy-federation.sh

Monitoring & Observability

Prometheus Metrics

serviceMonitor:
  enabled: true
  interval: 30s
  labels:
    release: prometheus

Pod Annotations

podAnnotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "9090"
  prometheus.io/path: "/metrics"

Security

Network Policies

networkPolicy:
  enabled: true
  ingress:
    - from:
      - namespaceSelector:
          matchLabels:
            name: ingress-nginx

Pod Security

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

securityContext:
  capabilities:
    drop:
    - ALL
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false

Common Operations

Upgrade

helm upgrade my-gateway ./helm/grpc-graphql-gateway \
  -f custom-values.yaml \
  --namespace grpc-gateway

Rollback

# View history
helm history my-gateway -n grpc-gateway

# Rollback
helm rollback my-gateway 1 -n grpc-gateway

Uninstall

helm uninstall my-gateway --namespace grpc-gateway

View Rendered Templates

helm template my-gateway ./helm/grpc-graphql-gateway \
  -f custom-values.yaml \
  --output-dir ./rendered

Troubleshooting

Pods Not Starting

kubectl describe pod <pod-name> -n grpc-gateway
kubectl logs <pod-name> -n grpc-gateway

HPA Not Scaling

# Check metrics server
kubectl top nodes
kubectl get hpa -n grpc-gateway

Service Not Accessible

kubectl get svc -n grpc-gateway
kubectl describe svc my-gateway -n grpc-gateway
kubectl get endpoints -n grpc-gateway

Best Practices

  1. Always use PodDisruptionBudget for production
  2. Enable HPA for automatic scaling
  3. Use anti-affinity to spread pods across nodes
  4. Configure health checks properly
  5. Set resource limits to prevent resource exhaustion
  6. Use secrets for sensitive data
  7. Enable monitoring with ServiceMonitor
  8. Test in staging before production deployment

Autoscaling and Load Balancing

This guide covers setting up comprehensive autoscaling and load balancing for the gRPC GraphQL Gateway.

Overview

The gateway supports three types of scaling and load balancing:

  1. Horizontal Pod Autoscaler (HPA) - Scales the number of pods based on metrics
  2. Vertical Pod Autoscaler (VPA) - Adjusts resource requests/limits for pods
  3. LoadBalancer - External load balancing for traffic distribution

Horizontal Pod Autoscaler (HPA)

HPA automatically scales the number of pods based on observed CPU, memory, or custom metrics.

Basic Configuration

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

Custom Metrics

For advanced scaling based on custom metrics:

autoscaling:
  enabled: true
  minReplicas: 5
  maxReplicas: 50
  customMetrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"

Deployment

helm install my-gateway ./grpc-graphql-gateway \
  --set autoscaling.enabled=true \
  --set autoscaling.minReplicas=3 \
  --set autoscaling.maxReplicas=10

Monitoring HPA

# Watch HPA status
kubectl get hpa -w

# Describe HPA for detailed metrics
kubectl describe hpa my-gateway

# View current metrics
kubectl top pods -l app.kubernetes.io/name=grpc-graphql-gateway

Vertical Pod Autoscaler (VPA)

VPA automatically adjusts CPU and memory requests/limits based on actual usage.

Prerequisites

Install VPA in your cluster:

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

Configuration

verticalPodAutoscaler:
  enabled: true
  updateMode: "Auto"  # Off, Initial, Recreate, Auto
  minAllowed:
    cpu: 100m
    memory: 128Mi
  maxAllowed:
    cpu: 2000m
    memory: 2Gi
  controlledResources:
    - cpu
    - memory

Update Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| Off | Only provides recommendations | Safe to use with HPA |
| Initial | Applies recommendations on pod creation only | Good for initial sizing |
| Recreate | Updates running pods (requires restart) | When you want automatic updates |
| Auto | Automatically applies recommendations | Full automation |

Using VPA with HPA

⚠️ Important: VPA and HPA should not target the same metrics (CPU/Memory).

Recommended Setup:

# Use VPA in "Off" mode for recommendations
verticalPodAutoscaler:
  enabled: true
  updateMode: "Off"
  
# Use HPA for horizontal scaling
autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10

Alternative: Use VPA for CPU/Memory and HPA for custom metrics.

Viewing VPA Recommendations

# Get VPA status
kubectl describe vpa my-gateway

# View recommendations
kubectl get vpa my-gateway -o jsonpath='{.status.recommendation}'

LoadBalancer Service

LoadBalancer provides external access with cloud provider integration.

Basic Configuration

loadBalancer:
  enabled: true
  httpPort: 80
  grpcPort: 50051
  externalTrafficPolicy: Cluster

AWS Network Load Balancer

loadBalancer:
  enabled: true
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
  externalTrafficPolicy: Local  # Preserve source IP
  loadBalancerSourceRanges:
    - "10.0.0.0/8"  # Restrict to VPC

Google Cloud Load Balancer

loadBalancer:
  enabled: true
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    cloud.google.com/backend-config: '{"default": "backend-config"}'
  externalTrafficPolicy: Cluster

Azure Load Balancer

loadBalancer:
  enabled: true
  annotations:
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
  loadBalancerIP: "10.0.0.10"  # Static internal IP

External Traffic Policy

| Policy | Pros | Cons |
|--------|------|------|
| Cluster | Even load distribution across nodes | Loses source IP |
| Local | Preserves source IP, lower latency | May cause uneven load distribution |

Complete Example

Production Deployment with All Features

# values-production.yaml
loadBalancer:
  enabled: true
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
  externalTrafficPolicy: Local
  httpPort: 80
  loadBalancerSourceRanges:
    - "0.0.0.0/0"

autoscaling:
  enabled: true
  minReplicas: 5
  maxReplicas: 50
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

verticalPodAutoscaler:
  enabled: true
  updateMode: "Off"  # Get recommendations without conflicts
  minAllowed:
    cpu: 250m
    memory: 256Mi
  maxAllowed:
    cpu: 4000m
    memory: 4Gi

resources:
  limits:
    cpu: 2000m
    memory: 2Gi
  requests:
    cpu: 1000m
    memory: 1Gi

podDisruptionBudget:
  enabled: true
  minAvailable: 3

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
                - grpc-graphql-gateway
        topologyKey: kubernetes.io/hostname

Deploy:

helm install gateway ./grpc-graphql-gateway \
  -f helm/values-production.yaml \
  --namespace production \
  --create-namespace

Load Balancing Strategies

At Service Level

service:
  sessionAffinity: ClientIP  # Sticky sessions
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800  # 3 hours

At Ingress Level

ingress:
  annotations:
    # Round Robin (default)
    nginx.ingress.kubernetes.io/load-balance: "round_robin"
    
    # Least Connections
    # nginx.ingress.kubernetes.io/load-balance: "least_conn"
    
    # IP Hash
    # nginx.ingress.kubernetes.io/load-balance: "ip_hash"

Monitoring and Troubleshooting

Check Load Distribution

# View pod distribution across nodes
kubectl get pods -o wide -l app.kubernetes.io/name=grpc-graphql-gateway

# Check service endpoints
kubectl get endpoints my-gateway

# Check LoadBalancer status
kubectl get svc my-gateway-lb

Monitor Autoscaling

# Watch HPA
watch kubectl get hpa

# Monitor resource usage
kubectl top pods

# Check VPA recommendations
kubectl describe vpa my-gateway

Load Testing

# Install k6
brew install k6

# Run load test
k6 run --vus 100 --duration 5m - <<EOF
import http from 'k6/http';

export default function () {
  const query = JSON.stringify({
    query: '{ __typename }'
  });
  
  http.post('http://<loadbalancer-ip>/graphql', query, {
    headers: { 'Content-Type': 'application/json' },
  });
}
EOF

# Watch scaling in action
watch kubectl get pods,hpa

Best Practices

  1. Start Conservative: Begin with moderate min/max replicas and adjust based on observed patterns

  2. VPA + HPA: Use VPA in "Off" mode alongside HPA to get recommendations without conflicts

  3. LoadBalancer: Use externalTrafficPolicy: Local when you need source IP preservation

  4. PodDisruptionBudget: Always configure PDB to maintain availability during updates

  5. Multi-AZ: Use pod anti-affinity to spread pods across availability zones

  6. Gradual Rollouts: Test autoscaling in staging before production

  7. Monitor Costs: Set reasonable maxReplicas to prevent runaway costs

  8. Health Checks: Ensure liveness and readiness probes are properly configured

Federation with Autoscaling

For federated deployments, each subgraph can scale independently:

# Deploy user subgraph with autoscaling
helm install user-subgraph ./grpc-graphql-gateway \
  -f helm/values-federation-user.yaml \
  --set autoscaling.maxReplicas=20

# Deploy product subgraph with different scaling
helm install product-subgraph ./grpc-graphql-gateway \
  -f helm/values-federation-product.yaml \
  --set autoscaling.maxReplicas=30

Next Steps

Production Security Checklist

The gateway is designed with a "Zero Trust" security philosophy, minimizing the attack surface by default. However, a secure deployment requires coordination between the gateway's internal features and your infrastructure.

Gateway Security Features (Built-in)

When correctly configured, the gateway provides enterprise-grade security across the following layers:

1. Zero-Trust Access Layer

  • Query Whitelisting: With WhitelistMode::Enforce, the gateway rejects all arbitrary queries. This neutralizes 99% of GraphQL-specific attacks (introspection abuse, deep nesting, resource exhaustion) effectively treating GraphQL as a secured set of RPCs.
  • Introspection Disabled: Schema exploration is blocked in production.

2. Browser Security Layer

  • HSTS: Strict-Transport-Security enforces HTTPS usage.
  • CSP: Content-Security-Policy limits script sources using β€˜self’.
  • CORS: Strict Cross-Origin Resource Sharing controls.
  • XSS Protection: Headers to prevent cross-site scripting and sniffing.

3. Infrastructure Protection Layer

  • DoS Protection: Lock poisoning prevention (using parking_lot) and safe error handling (no stack leaks).
  • Rate Limiting: Token-bucket based limiting with burst control.
  • IP Protection: Strict IP header validation (preventing X-Forwarded-For spoofing).

Operational Responsibilities (Ops)

While the gateway code is secure, your deployment environment must handle the following external responsibilities:

✅ TLS / SSL Termination

The gateway speaks plain HTTP. You must run it behind a reverse proxy (e.g., Nginx, Envoy, AWS ALB, Cloudflare) that handles:

  • HTTPS Termination: Manage certificates and TLS versions (TLS 1.2/1.3 recommended).
  • Force Redirects: Redirect all HTTP traffic to HTTPS.

✅ Secrets Management

Never hardcode sensitive credentials. Use environment variables or a secrets manager (Vault, AWS Secrets Manager, Kubernetes Secrets) for:

  • REDIS_URL
  • API Keys
  • Database Credentials
  • Private Keys (if using JWT signing)

✅ Authentication & Authorization

The gateway validates the presence of auth headers (via middleware), but your application logic must define what makes them valid:

  • JWT Verification: Ensure your EnhancedAuthMiddleware is configured with the correct public keys/secrets.
  • Role Limits: Verify that identified users have permission to execute specific operations.

✅ Network Segmentation

  • Internal gRPC: The gRPC backend services should typically be isolated in a private network, accessible only by the gateway.
  • Redis Access: Restrict Redis access to only the gateway instances.

Verification

Before deploying to production, run the included comprehensive security suite:

# Run the 60+ point security audit script
./test_security.sh

A passing suite confirms that all built-in security layers are active and functioning correctly.

Graceful Shutdown

Enable production-ready server lifecycle management with graceful shutdown.

Enabling Graceful Shutdown

use grpc_graphql_gateway::{Gateway, ShutdownConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_graceful_shutdown(ShutdownConfig {
        timeout: Duration::from_secs(30),          // Wait up to 30s
        handle_signals: true,                       // Handle SIGTERM/SIGINT
        force_shutdown_delay: Duration::from_secs(5),
    })
    .build()?;

gateway.serve("0.0.0.0:8888").await?;

How It Works

  1. Signal Received: SIGTERM, SIGINT, or Ctrl+C is received
  2. Stop Accepting: Server stops accepting new connections
  3. Drain Requests: In-flight requests are allowed to complete
  4. Cleanup: Active subscriptions cancelled, resources released
  5. Exit: Server shuts down gracefully
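The drain step (3) can be sketched with plain stdlib primitives (a simplified model, not the gateway's actual async implementation; `drain` and the in-flight counter are illustrative):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

// Wait for the in-flight request count to reach zero, up to `timeout`.
// Returns true if fully drained, false if the caller should force shutdown.
fn drain(in_flight: &AtomicUsize, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while in_flight.load(Ordering::SeqCst) > 0 {
        if Instant::now() >= deadline {
            return false; // timed out: force shutdown
        }
        thread::sleep(Duration::from_millis(10));
    }
    true // all in-flight requests completed
}
```

In the real gateway this corresponds to the window bounded by the `timeout` option below, after which `force_shutdown_delay` kicks in.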

Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| timeout | Duration | 30s | Max wait for in-flight requests |
| handle_signals | bool | true | Handle OS signals automatically |
| force_shutdown_delay | Duration | 5s | Wait before forcing shutdown |

Custom Shutdown Signal

Trigger shutdown from your own logic:

use tokio::sync::oneshot;

let (tx, rx) = oneshot::channel::<()>();

// Trigger shutdown after some condition
tokio::spawn(async move {
    tokio::time::sleep(Duration::from_secs(60)).await;
    let _ = tx.send(());
});

Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .serve_with_shutdown("0.0.0.0:8888", async { let _ = rx.await; })
    .await?;

Kubernetes Integration

The gateway responds correctly to Kubernetes termination:

spec:
  terminationGracePeriodSeconds: 30
  containers:
    - name: gateway
      lifecycle:
        preStop:
          exec:
            command: ["sleep", "5"]

Benefits

  • βœ… No dropped requests during deployment
  • βœ… Automatic OS signal handling
  • βœ… Configurable drain timeout
  • βœ… Active subscription cleanup
  • βœ… Kubernetes-compatible

Header Propagation

Forward HTTP headers from GraphQL requests to gRPC backends for authentication and tracing.

Enabling Header Propagation

use grpc_graphql_gateway::{Gateway, HeaderPropagationConfig};

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .with_header_propagation(
        HeaderPropagationConfig::new()
            .propagate("authorization")
            .propagate("x-request-id")
            .propagate("x-tenant-id")
    )
    .build()?;

Common Headers Preset

Use the preset for common auth and tracing headers:

Gateway::builder()
    .with_header_propagation(HeaderPropagationConfig::common())
    .build()?;

Includes:

  • authorization - Bearer tokens
  • x-request-id, x-correlation-id - Request tracking
  • traceparent, tracestate - W3C Trace Context
  • x-b3-* - Zipkin B3 headers

Configuration Methods

| Method | Description |
|--------|-------------|
| `.propagate("header")` | Propagate exact header name |
| `.propagate_with_prefix("x-custom-")` | Propagate headers with prefix |
| `.propagate_all_headers()` | Propagate all headers (with exclusions) |
| `.exclude("cookie")` | Exclude specific headers |

Examples

Exact Match

HeaderPropagationConfig::new()
    .propagate("authorization")
    .propagate("x-api-key")

Prefix Match

HeaderPropagationConfig::new()
    .propagate_with_prefix("x-custom-")
    .propagate_with_prefix("x-tenant-")

All with Exclusions

HeaderPropagationConfig::new()
    .propagate_all_headers()
    .exclude("cookie")
    .exclude("host")

Security

Uses an allowlist approach - only explicitly configured headers are forwarded. This prevents accidental leakage of sensitive headers like Cookie or Host.
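The allowlist decision can be sketched as follows (a hypothetical model of the matching rules; the real logic lives inside HeaderPropagationConfig and these names are illustrative):

```rust
// Sketch of allowlist matching: exclusions always win, then exact
// names, prefixes, or allow-all decide whether a header is forwarded.
struct Allowlist {
    exact: Vec<String>,
    prefixes: Vec<String>,
    excluded: Vec<String>,
    allow_all: bool,
}

impl Allowlist {
    fn should_forward(&self, header: &str) -> bool {
        // HTTP header names are case-insensitive.
        let h = header.to_ascii_lowercase();
        if self.excluded.iter().any(|e| e == &h) {
            return false; // exclusions take priority over everything
        }
        self.allow_all
            || self.exact.iter().any(|e| e == &h)
            || self.prefixes.iter().any(|p| h.starts_with(p.as_str()))
    }
}
```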

gRPC Backend

Headers become gRPC metadata:

// In your gRPC service
async fn get_user(&self, request: Request<GetUserRequest>) -> ... {
    let metadata = request.metadata();
    let auth = metadata
        .get("authorization")
        .and_then(|v| v.to_str().ok());

    // Use auth for authorization decisions
}

W3C Trace Context

For distributed tracing, propagate trace context headers:

HeaderPropagationConfig::new()
    .propagate("traceparent")
    .propagate("tracestate")
    .propagate("authorization")

Configuration Reference

Complete reference for all gateway configuration options.

GatewayBuilder Methods

Core Configuration

| Method | Description |
|--------|-------------|
| `.with_descriptor_set_bytes(bytes)` | Set primary proto descriptor |
| `.add_descriptor_set_bytes(bytes)` | Add additional proto descriptor |
| `.with_descriptor_set_file(path)` | Load primary descriptor from file |
| `.add_descriptor_set_file(path)` | Load additional descriptor from file |
| `.add_grpc_client(name, client)` | Register a gRPC backend client |
| `.with_services(services)` | Restrict to specific services |

Federation

| Method | Description |
|--------|-------------|
| `.enable_federation()` | Enable Apollo Federation v2 |
| `.with_entity_resolver(resolver)` | Custom entity resolver |

Security

| Method | Description |
|--------|-------------|
| `.with_query_depth_limit(n)` | Max query nesting depth |
| `.with_query_complexity_limit(n)` | Max query complexity |
| `.disable_introspection()` | Block __schema queries |

Middleware

| Method | Description |
|--------|-------------|
| `.add_middleware(middleware)` | Add custom middleware |
| `.with_error_handler(handler)` | Custom error handler |

Performance

| Method | Description |
|--------|-------------|
| `.with_response_cache(config)` | Enable response caching |
| `.with_compression(config)` | Enable response compression |
| `.with_persisted_queries(config)` | Enable APQ |
| `.with_circuit_breaker(config)` | Enable circuit breaker |

Production

| Method | Description |
|--------|-------------|
| `.enable_health_checks()` | Add /health and /ready endpoints |
| `.enable_metrics()` | Add /metrics Prometheus endpoint |
| `.enable_tracing()` | Enable OpenTelemetry tracing |
| `.with_graceful_shutdown(config)` | Enable graceful shutdown |
| `.with_header_propagation(config)` | Forward headers to gRPC |

Environment Variables

Configure via environment variables:

# Query limits
QUERY_DEPTH_LIMIT=10
QUERY_COMPLEXITY_LIMIT=100

# Environment
ENV=production  # Affects introspection default

# Tracing
OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4317
OTEL_SERVICE_NAME=graphql-gateway
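Reading these variables at startup can be sketched like this (the defaults here are assumptions for illustration, not the gateway's documented defaults):

```rust
// Parse a numeric limit from an optional env value, falling back to a default.
fn parse_limit(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|v| v.parse().ok()).unwrap_or(default)
}

// Usage at startup: read QUERY_DEPTH_LIMIT from the process environment.
fn depth_limit_from_env() -> usize {
    parse_limit(std::env::var("QUERY_DEPTH_LIMIT").ok().as_deref(), 10)
}
```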

Configuration Structs

CacheConfig

CacheConfig {
    max_size: 10_000,
    default_ttl: Duration::from_secs(60),
    stale_while_revalidate: Some(Duration::from_secs(30)),
    invalidate_on_mutation: true,
    redis_url: Some("redis://127.0.0.1:6379".to_string()),
}

CompressionConfig

CompressionConfig {
    enabled: true,
    level: CompressionLevel::Default,
    min_size_bytes: 1024,
    algorithms: vec!["br".into(), "gzip".into()],
}

// Presets
CompressionConfig::fast()
CompressionConfig::best()
CompressionConfig::default()
CompressionConfig::disabled()

CircuitBreakerConfig

CircuitBreakerConfig {
    failure_threshold: 5,
    recovery_timeout: Duration::from_secs(30),
    half_open_max_requests: 3,
}

PersistedQueryConfig

PersistedQueryConfig {
    cache_size: 1000,
    ttl: Some(Duration::from_secs(3600)),
}

ShutdownConfig

ShutdownConfig {
    timeout: Duration::from_secs(30),
    handle_signals: true,
    force_shutdown_delay: Duration::from_secs(5),
}

HeaderPropagationConfig

HeaderPropagationConfig::new()
    .propagate("authorization")
    .propagate_with_prefix("x-custom-")
    .exclude("cookie")

// Preset
HeaderPropagationConfig::common()

TracingConfig

TracingConfig::new()
    .with_service_name("my-gateway")
    .with_sample_ratio(0.1)
    .with_otlp_endpoint("http://jaeger:4317")

Cost Analysis

This guide provides a comprehensive cost analysis for running grpc_graphql_gateway in production environments, with specific calculations for handling 100,000 requests per second.

Performance Baseline

Based on our benchmarks, grpc_graphql_gateway achieves:

| Metric | Value |
|--------|-------|
| Single instance throughput | ~54,000 req/s |
| Comparison to Apollo Server | 27x faster |
| Memory footprint | 100-200MB per instance |

To handle 100k req/s, you need approximately 2-3 instances (with headroom for spikes).


Architecture Overview

┌────────────────────────────────────────────────────────────┐
│                       CLOUDFLARE PRO                       │
│  ┌──────────────────────────────────────────────────────┐  │
│  │  Edge Cache (200+ PoPs worldwide)                    │  │
│  │  • GraphQL response caching                          │  │
│  │  • DDoS protection                                   │  │
│  │  • WAF rules                                         │  │
│  └──────────────────────────┬───────────────────────────┘  │
└─────────────────────────────┼──────────────────────────────┘
                              │ Cache MISS
                              ▼
                     ┌─────────────────┐
                     │  Load Balancer  │
                     └────────┬────────┘
                              │
          ┌───────────────────┼───────────────────┐
          ▼                   ▼                   ▼
     ┌─────────┐         ┌─────────┐         ┌─────────┐
     │ Gateway │         │ Gateway │         │ Gateway │
     │   #1    │         │   #2    │         │   #3    │
     └────┬────┘         └────┬────┘         └────┬────┘
          │                   │                   │
          └───────────────────┼───────────────────┘
                              │
               ┌──────────────┴──────────────┐
               ▼                             ▼
       ┌───────────────┐            ┌────────────────┐
       │  Redis Cache  │            │  gRPC Services │
       │   (L2 Cache)  │            │                │
       └───────────────┘            └───────┬────────┘
                                            │
                                            ▼
                                    ┌───────────────┐
                                    │   Database    │
                                    │ (PostgreSQL)  │
                                    └───────────────┘

Cloud Provider Cost Estimates

AWS Stack

| Component | Specification | Monthly Cost |
|-----------|---------------|--------------|
| Cloudflare Pro | Pro Plan + Cache API | $20 |
| Gateway Instances | 3× c6g.large (2 vCPU, 4GB ARM) | $90 |
| Load Balancer | ALB | $22 |
| Redis (L2 Cache) | ElastiCache cache.t3.medium (3GB) | $50 |
| PostgreSQL (HA) | RDS db.t3.medium (Multi-AZ) | $140 |
| PostgreSQL (Basic) | RDS db.t3.small (Single-AZ) | $30 |
| Data Transfer | ~500GB egress (estimated) | $45 |
| Total (Production HA) | With Multi-AZ DB | ~$370/month |
| Total (Cost-Optimized) | Single-AZ DB | ~$260/month |

GCP Stack

| Component | Specification | Monthly Cost |
|-----------|---------------|--------------|
| Cloudflare Pro | Pro Plan | $20 |
| Gateway Instances | 3× e2-standard-2 | $75 |
| Load Balancer | Cloud Load Balancing | $20 |
| Redis (L2 Cache) | Memorystore 3GB | $55 |
| PostgreSQL (HA) | Cloud SQL db-custom-2-4096 (HA) | $120 |
| PostgreSQL (Basic) | Cloud SQL db-f1-micro | $10 |
| Data Transfer | ~500GB egress | $40 |
| Total (Production HA) | With HA database | ~$330/month |
| Total (Cost-Optimized) | Basic database | ~$220/month |

Azure Stack

| Component | Specification | Monthly Cost |
|-----------|---------------|--------------|
| Cloudflare Pro | Pro Plan | $20 |
| Gateway Instances | 3× Standard_D2s_v3 | $105 |
| Load Balancer | Standard LB | $25 |
| Redis (L2 Cache) | Azure Cache 3GB | $55 |
| PostgreSQL (HA) | Flexible Server (Zone Redundant) | $150 |
| Data Transfer | ~500GB egress | $45 |
| Total (Production HA) | | ~$400/month |

Cloudflare Pro Benefits

| Feature | Benefit |
|---------|---------|
| Edge Caching | Cache GraphQL responses at 200+ edge locations |
| Cache Rules | Custom caching for POST /graphql with query hash |
| WAF | Block malicious GraphQL queries |
| Rate Limiting | 10 rules included, protect per-endpoint |
| Analytics | Real-time traffic insights |
| DDoS Protection | Layer 3/4/7 protection included |

GraphQL Edge Caching with Cloudflare Workers

// workers/graphql-cache.js
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;
  if (request.method === 'POST') {
    const body = await request.clone().json();

    // Create a GET cache key from the query + variables
    // (the Cache API only caches GET requests)
    const cacheKey = new Request(
      request.url + '?q=' + btoa(JSON.stringify(body)),
      { method: 'GET' }
    );

    const cache = caches.default;
    let response = await cache.match(cacheKey);

    if (!response) {
      response = await fetch(request);

      // Cache for 60 seconds
      const headers = new Headers(response.headers);
      headers.set('Cache-Control', 'max-age=60');

      response = new Response(response.body, {
        status: response.status,
        statusText: response.statusText,
        headers,
      });
      event.waitUntil(cache.put(cacheKey, response.clone()));
    }

    return response;
  }

  return fetch(request);
}

3-Tier Caching Strategy

Implementing a multi-tier caching strategy significantly reduces costs by minimizing database load:

Request Flow:
                                   Cache Hit Rate
┌─────────────┐                    ──────────────
│ Cloudflare  │ ──── HIT (40%) ──→ Response    ← Edge, <10ms
│ Edge Cache  │
└──────┬──────┘
       │ MISS
       ▼
┌─────────────┐
│   Gateway   │
│ Redis Cache │ ──── HIT (35%) ──→ Response    ← L2, 1-5ms
└──────┬──────┘
       │ MISS
       ▼
┌─────────────┐
│   Database  │ ──── Query (25%) → Response    ← Origin, 5-50ms
└─────────────┘

Total cache hit rate: ~75%
Database load reduced by: 75%
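The arithmetic above is simply: requests that miss every cache tier reach the database. As a sketch:

```rust
// Requests that miss every cache tier reach the origin database.
// Hit rates here are fractions of *total* traffic, as in the diagram.
fn origin_load(total_rps: f64, edge_hit: f64, redis_hit: f64) -> f64 {
    total_rps * (1.0 - edge_hit - redis_hit)
}
```

With the diagram's 40% edge and 35% Redis hit rates, 100k req/s leaves 25k queries/s for the database.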

Gateway Configuration for Caching

use grpc_graphql_gateway::{Gateway, CacheConfig};

Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .add_grpc_client("service", client)
    // Enable Redis caching
    .with_response_cache(CacheConfig::builder()
        .redis_url("redis://localhost:6379")
        .default_ttl(Duration::from_secs(300))
        .build())
    // Enable DataLoader for batching
    .with_data_loader(true)
    // Protection
    .with_rate_limiter(RateLimiterConfig::new(150_000))
    .with_circuit_breaker(CircuitBreakerConfig::default())
    // Observability
    .enable_metrics()
    .enable_health_checks()
    .build()?

Database Sizing Guide

With proper caching, your database load is significantly reduced:

| Cache Hit Rate | Effective DB Load (for 100k req/s) |
|----------------|------------------------------------|
| 50% | 50,000 queries/s |
| 75% | 25,000 queries/s |
| 85% | 15,000 queries/s |
| 90% | 10,000 queries/s |

Bandwidth Cost Analysis (The Hidden Giant)

For 100k req/s, data transfer is often the largest cost. Assumption: 2KB average response size.

Total Data Transfer: 2KB × 100k/s ≈ 518 TB/month.

| Scenario | Egress Data | AWS Cost ($0.09/GB) |
|----------|-------------|---------------------|
| 1. Raw Traffic | 518 TB | $46,620/mo 😱 |
| 2. + Compression (70%) | 155 TB | $13,950/mo |
| 3. + Cloudflare (80% Hit) | 31 TB | $2,790/mo |
| 4. + Both | ~10 TB | $900/mo |

How to achieve Scenario 4:

  1. Compression: Enable Brotli/Gzip in Gateway (.with_compression(CompressionConfig::default())).
  2. APQ: Enable Automatic Persisted Queries to reduce Ingress bandwidth.
  3. Cloudflare: Cache common queries at the edge.

Savings: Compression and Caching save you over $45,000/month in bandwidth costs.
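The table's arithmetic can be reproduced directly (decimal units, 1 TB = 1000 GB, 30-day month; the table figures are rounded):

```rust
// Monthly egress in TB: average response size (KB) * requests/s * seconds/month.
fn monthly_egress_tb(avg_resp_kb: f64, rps: f64) -> f64 {
    avg_resp_kb * 1000.0 * rps * 30.0 * 86_400.0 / 1e12
}

// Egress cost at a flat per-GB rate (e.g. AWS's $0.09/GB).
fn egress_cost_usd(tb: f64, usd_per_gb: f64) -> f64 {
    tb * 1000.0 * usd_per_gb
}
```

For 2KB responses at 100k req/s this yields ~518 TB/month; at $0.09/GB, 155 TB costs $13,950/month, matching the compression row.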

Database Optimization with PgBouncer

Adding PgBouncer (connection pooler) is critical for high-throughput GraphQL workloads. It reduces connection overhead by reusing existing connections, allowing you to handle significantly more requests with smaller database instances.

| Optimization | Impact | Cost Saving |
|--------------|--------|-------------|
| PgBouncer | Increases transaction throughput by 2-4x | Downgrade DB tier (e.g., Large → Medium) |
| Read Replicas | Offloads read traffic from primary | Scale horizontally instead of vertically |

Revised Database Sizing with PgBouncer:

| Database Size | Ops/sec (Raw) | Ops/sec (w/ PgBouncer) | Monthly Cost |
|---------------|---------------|------------------------|--------------|
| Small | ~2,000 | ~8,000 | $30-50 |
| Medium | ~5,000 | ~25,000 | $100-150 |
| Large | ~15,000 | ~60,000+ | $300-500 |

Recommendation: With PgBouncer + Redis Caching, a Medium instance or even a well-tuned Small instance can often handle 100k req/s traffic if the cache hit rate is high (>85%).
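As a sketch, a minimal pgbouncer.ini for this setup might look like the following (host, pool sizes, and names are illustrative assumptions to adapt, not values shipped with the gateway):

```ini
; Illustrative PgBouncer config for a gateway-fronted Postgres.
[databases]
gateway = host=10.0.0.20 port=5432 dbname=gateway

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; best throughput for short GraphQL resolvers
default_pool_size = 20    ; server connections per db/user pair
max_client_conn = 2000    ; many gateway connections, few DB connections
```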


Cost Comparison: grpc_graphql_gateway vs Apollo Server

| Metric | grpc_graphql_gateway | Apollo Server (Node.js) |
|--------|----------------------|-------------------------|
| Single instance throughput | ~54,000 req/s | ~4,000 req/s |
| Instances for 100k req/s | 3 | 25-30 |
| Gateway instances cost | ~$90/month | ~$750/month |
| Memory per instance | 100-200MB | 512MB-1GB |
| Total monthly cost | ~$370 | ~$1,200+ |
| Annual cost | ~$4,440 | ~$14,400+ |
| Annual savings | ~$10,000 | |

Cost Savings Visualization

Apollo Server (25 instances): $$$$$$$$$$$$$$$$$$$$$$$$$
grpc_graphql_gateway (3):     $$$$

Savings: ~92% reduction in gateway costs

Pricing Tiers Summary

| Tier | Components | Monthly Cost | Best For |
|------|------------|--------------|----------|
| Development | 1 Gateway + SQLite | ~$20/month | Local/Dev |
| Staging | 2 Gateways + CF Free + Managed DB | ~$100/month | Staging |
| Production | 3 Gateways + CF Pro + Redis + PgBouncer + Postgres | ~$1,200/month | 100k req/s (Public) |
| Enterprise | 5 Gateways + CF Business + Redis Cluster + DB Cluster | ~$2,500+/month | High Volume |

Scaling Scenarios

Cost estimates based on user count (assuming 0.5 req/s per active user):

| Metric | Startup (1k Users) | Growth (10k Users) | Scale (100k Users) | High Scale |
|--------|--------------------|--------------------|--------------------|------------|
| Est. Load | ~500 req/s | ~5,000 req/s | ~50,000 req/s | 100k req/s |
| Gateways | 1 (t4g.micro) | 2 (t4g.small) | 3 (c6g.medium) | 3 (c6g.large) |
| Database | SQLite / Low | Small RDS | Medium RDS | Optimized HA |
| Bandwidth | Free Tier | ~$50/mo | ~$450/mo | ~$900/mo |
| Total Cost | ~$20/mo | ~$155/mo | ~$600/mo | ~$1,200/mo |

Note: "10k users online" usually generates ~5,000 req/s. At this scale, your infrastructure cost is negligible (<$200) because the gateway is so efficient.

Profitability Analysis (ROI)

Since your infrastructure cost is so low (~$155/mo for 10k users), you achieve profitability much faster than with traditional stacks.

Revenue Potential Scaling (Freemium Model): Assumption: 5% of users convert to a $9/mo plan.

| User Base | Monthly Revenue | Infra Cost (Ops) | Net Profit |
|-----------|-----------------|------------------|------------|
| 1,000 | $450 | ~$20 | $430 (95% Margin) |
| 10,000 | $4,500 | ~$155 | $4,345 (96% Margin) |
| 100,000 | $45,000 | ~$600 | $44,400 (98% Margin) |
| 1 Million | $450,000 | ~$6,000 | $444,000 (98% Margin) |

The "Rust Scaling Advantage": With Node.js or Java, your infrastructure costs usually grow linearly with users ($20 -> $200 -> $2,000). With this optimized Rust stack, your costs grow sub-linearly thanks to high efficiency, meaning your profit margins actually increase as you scale.


Quick Reference Card

┌──────────────────────────────────────────────────────────┐
│  100k req/s Full Stack - Optimized                       │
├──────────────────────────────────────────────────────────┤
│  Cloudflare Pro .......................... $20/month     │
│  3× Gateway (c6g.large) .................. $90/month     │
│  PgBouncer (t4g.micro) ................... $10/month     │
│  Redis 3GB ............................... $50/month     │
│  PostgreSQL (Optimization) ............... $80/month     │
│  Data Transfer (Optimized 10TB) .......... $900/month    │
├──────────────────────────────────────────────────────────┤
│  TOTAL ................................... ~$1,150/month │
│  Annual .................................. ~$13,800/year │
│  vs Unoptimized (~$47k/mo) ............... save $500k/yr │
└──────────────────────────────────────────────────────────┘

Cost Optimization Tips

  1. Use PgBouncer - Essential for high concurrency.
  2. Use ARM instances (c6g on AWS, t2a on GCP) - 20% cheaper than x86.
  3. Enable response caching - Reduces backend load by 60-80%.
  4. Bandwidth Optimization - Use APQ and compression to cut data transfer costs by 50-90%.
  5. Use Cloudflare edge caching - Reduces origin requests by 30-50%.
  6. Right-size your database - Start small, scale based on metrics.
  7. Use Reserved Instances - Save 30-60% on long-term commitments.
  8. Enable compression - Reduces data transfer costs.

Next Steps

Cost Optimization Strategies for Requests Per Second

This guide provides actionable strategies to reduce the cost per request for your gRPC-GraphQL gateway deployment. By implementing these optimizations, you can achieve 97%+ cost reduction while maintaining high performance.

Table of Contents

  1. Quick Wins (Immediate 80% Cost Reduction)
  2. Advanced Optimizations (Additional 15% Reduction)
  3. Infrastructure Optimizations
  4. Monitoring & Fine-Tuning

Quick Wins (Immediate 80% Cost Reduction)

1. Enable Multi-Tier Caching (60-75% Cost Reduction)

Caching is the single most impactful optimization for reducing request costs.

Implementation

use grpc_graphql_gateway::{Gateway, CacheConfig};
use std::time::Duration;

let gateway = Gateway::builder()
    .with_descriptor_set_bytes(DESCRIPTORS)
    .add_grpc_client("service", client)
    // Redis caching for distributed deployments
    .with_response_cache(CacheConfig {
        redis_url: Some("redis://127.0.0.1:6379".to_string()),
        max_size: 50_000,  // Increase from default 10k
        default_ttl: Duration::from_secs(300), // 5 minutes
        stale_while_revalidate: Some(Duration::from_secs(60)), // Serve stale for 1 min
        invalidate_on_mutation: true,
        vary_headers: vec!["Authorization".to_string()],
    })
    .build()?;

Cost Impact

| Cache Hit Rate | Database Load | Monthly DB Cost (100k req/s) | Savings |
|----------------|---------------|------------------------------|---------|
| 0% (No cache) | 100k queries/s | $500+ | Baseline |
| 50% | 50k queries/s | $250 | 50% |
| 75% | 25k queries/s | $80 | 84% |
| 85% | 15k queries/s | $50 | 90% |

Action Items:

  • βœ… Enable Redis caching
  • βœ… Increase max_size to 50,000+ entries
  • βœ… Set appropriate TTL per query type
  • βœ… Enable stale-while-revalidate

2. Enable Response Compression (50-70% Bandwidth Reduction)

Data transfer often costs more than compute at scale.

Implementation

use grpc_graphql_gateway::{Gateway, CompressionConfig, CompressionLevel};

let gateway = Gateway::builder()
    .with_compression(CompressionConfig {
        enabled: true,
        level: CompressionLevel::Default, // balanced CPU cost vs. ratio
        min_size_bytes: 1024, // only compress responses > 1KB
        algorithms: vec!["br".into(), "gzip".into()], // Brotli preferred
    })
    .build()?;

Cost Impact

Bandwidth Cost Analysis (100k req/s, 2KB avg response):

| Scenario | Monthly Data Transfer | AWS Cost ($0.09/GB) | Annual Cost |
|----------|-----------------------|---------------------|-------------|
| No Compression | 518 TB | $46,620/mo | $559,440/yr |
| With Compression (70%) | 155 TB | $13,950/mo | $167,400/yr |
| Savings | 363 TB | $32,670/mo | $392,040/yr |

Action Items:

  • βœ… Enable Brotli compression (better than gzip)
  • βœ… Use compression level 6 (balance between CPU and size)
  • βœ… Set min_size to avoid compressing small responses

3. Enable Automatic Persisted Queries (90% Request Size Reduction)

APQ reduces ingress bandwidth by sending query hashes instead of full queries.

Implementation

use grpc_graphql_gateway::{Gateway, PersistedQueryConfig};

let gateway = Gateway::builder()
    .with_persisted_queries(PersistedQueryConfig {
        cache_size: 5_000, // Cache up to 5k unique queries
        ttl: Some(Duration::from_secs(7200)), // 2 hour expiration
    })
    .build()?;

Cost Impact

Request Size Reduction:

| Request Type | Size Without APQ | Size With APQ | Reduction |
|--------------|------------------|---------------|-----------|
| Typical Query | 1.5 KB | 150 bytes | 90% |
| Complex Query | 5 KB | 150 bytes | 97% |

Bandwidth Savings (100k req/s):

  • Ingress: 130 TB/mo β†’ 13 TB/mo = $10,000+/mo savings

Action Items:

  • βœ… Enable APQ on gateway
  • βœ… Configure Apollo Client to use APQ
  • βœ… Set appropriate cache size and TTL

4. Enable Request Collapsing (Eliminate Redundant Queries)

Request collapsing deduplicates identical in-flight queries.

Implementation

use grpc_graphql_gateway::{Gateway, RequestCollapsingConfig};

let gateway = Gateway::builder()
    .with_request_collapsing(RequestCollapsingConfig {
        enabled: true,
        max_wait: Duration::from_millis(10), // Coalesce within 10ms window
    })
    .build()?;

Cost Impact

For high-traffic queries (e.g., homepage data):

  • Without collapsing: 1,000 identical requests β†’ 1,000 database queries
  • With collapsing: 1,000 identical requests β†’ 1 database query

Typical Reduction:

  • 10-25% fewer database queries during traffic spikes
  • $50-100/mo savings on database costs

Action Items:

  • βœ… Enable request collapsing
  • βœ… Monitor metrics to track deduplication rate

Advanced Optimizations (Additional 15% Reduction)

5. Use High-Performance Mode (2x Throughput)

Enable SIMD JSON parsing and sharded caching for maximum throughput.

Implementation

use grpc_graphql_gateway::{Gateway, HighPerformanceConfig};

let gateway = Gateway::builder()
    .enable_high_performance(HighPerformanceConfig {
        simd_json: true,           // SIMD-accelerated JSON parsing
        sharded_cache: true,        // Lock-free sharded cache (128 shards)
        object_pooling: true,       // Reuse buffers to reduce allocations
        num_cache_shards: 128,      // Number of cache shards (power of 2)
    })
    .build()?;

Cost Impact

Throughput Improvement:

  • Standard mode: ~54k req/s per instance
  • High-performance mode: ~100k req/s per instance

Instance Cost Savings (100k req/s):

  • Standard: 3 instances × $30/mo = $90/mo
  • High-perf: 1-2 instances × $30/mo ≈ $45/mo
  • Savings: ~$45/mo (50% reduction)

Action Items:

  • βœ… Enable high-performance mode in production
  • βœ… Use larger instance types (more CPU cores benefit from SIMD)

6. Implement Query Complexity Limits

Prevent expensive queries from consuming resources.

Implementation

let gateway = Gateway::builder()
    .with_query_depth_limit(10)      // Max nesting depth
    .with_query_complexity_limit(1000) // Max complexity score
    .build()?;
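
One way to picture depth limiting is a brace count over the selection set. A self-contained sketch (a real validator walks the parsed AST rather than scanning text, and also ignores braces inside string literals):

```rust
/// Nesting depth of a GraphQL selection set, counted by brace balance.
fn query_depth(query: &str) -> usize {
    let (mut depth, mut max) = (0usize, 0usize);
    for c in query.chars() {
        match c {
            '{' => { depth += 1; max = max.max(depth); }
            '}' => depth = depth.saturating_sub(1),
            _ => {}
        }
    }
    max
}

/// Reject queries nested beyond the configured limit.
fn enforce_depth_limit(query: &str, limit: usize) -> Result<(), String> {
    let d = query_depth(query);
    if d > limit { Err(format!("query depth {d} exceeds limit {limit}")) } else { Ok(()) }
}

fn main() {
    assert_eq!(query_depth("query { user { posts { comments } } }"), 3);
    assert!(enforce_depth_limit("{ a { b } }", 10).is_ok());
    assert!(enforce_depth_limit("{ a { b { c } } }", 2).is_err());
}
```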

Cost Impact

Protection against:

  • Deeply nested queries that cause N+1 problems
  • Overly complex queries that exhaust database connections
  • Malicious queries designed to overload the system

Potential Savings:

  • Prevents 99% of abusive queries
  • Eliminates database overload during attacks
  • $100-500/mo savings by preventing over-provisioning

Action Items:

  • βœ… Set appropriate depth limit (8-12 for most apps)
  • βœ… Set complexity limit based on your schema
  • βœ… Monitor rejected queries to fine-tune limits

7. Enable DataLoader for Batch Processing

Eliminate N+1 query problems by batching requests.

Implementation

let gateway = Gateway::builder()
    .with_data_loader(true)
    .build()?;

Cost Impact

Example: Loading 100 users with their posts

  • Without DataLoader: 1 query + 100 queries = 101 database queries
  • With DataLoader: 1 query + 1 batched query = 2 database queries
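
The batching idea in miniature: collect the keys first, then issue one backend round trip for the whole set. A self-contained sketch, not the library's DataLoader:

```rust
use std::collections::{HashMap, HashSet};

/// Simulated backend: one round trip fetches posts for many user IDs at once.
fn batch_load_posts(user_ids: &HashSet<u32>) -> (usize, HashMap<u32, Vec<String>>) {
    let posts = user_ids
        .iter()
        .map(|&id| (id, vec![format!("post-{id}-a"), format!("post-{id}-b")]))
        .collect();
    (1, posts) // the whole batch costs a single backend query
}

fn main() {
    // Resolving posts for 100 users: collect the keys, then issue one batch.
    let user_ids: HashSet<u32> = (1..=100).collect();
    let (batched_queries, posts) = batch_load_posts(&user_ids);
    assert_eq!(1 + batched_queries, 2); // 1 user query + 1 batched post query
    assert_eq!(posts.len(), 100);
}
```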

Typical Reduction:

  • 50-80% fewer database queries for relationship-heavy schemas
  • $100-200/mo savings on database costs

Action Items:

  • βœ… Enable DataLoader globally
  • βœ… Review schema for relationship fields

8. Use Circuit Breakers

Prevent cascading failures and unnecessary retries.

Implementation

use grpc_graphql_gateway::{Gateway, CircuitBreakerConfig};

let gateway = Gateway::builder()
    .with_circuit_breaker(CircuitBreakerConfig {
        failure_threshold: 5,          // Open after 5 failures
        timeout: Duration::from_secs(30), // Reset after 30s
        half_open_max_requests: 3,     // Allow 3 test requests
    })
    .build()?;
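
Behind that configuration sits a small state machine. A self-contained sketch of the closed → open → half-open cycle (independent of the library's internals):

```rust
#[derive(Debug, PartialEq)]
enum State { Closed, Open, HalfOpen }

/// Minimal circuit breaker: open after N consecutive failures,
/// probe again after the reset timeout, close on a successful probe.
struct Breaker { state: State, failures: u32, threshold: u32 }

impl Breaker {
    fn new(threshold: u32) -> Self {
        Breaker { state: State::Closed, failures: 0, threshold }
    }
    fn allow(&self) -> bool {
        self.state != State::Open // Open = fail fast, skip the backend call
    }
    fn record(&mut self, success: bool) {
        if success {
            self.failures = 0;
            self.state = State::Closed;
        } else {
            self.failures += 1;
            if self.failures >= self.threshold { self.state = State::Open; }
        }
    }
    fn on_reset_timeout(&mut self) { self.state = State::HalfOpen; }
}

fn main() {
    let mut b = Breaker::new(5);
    for _ in 0..5 { b.record(false); }
    assert!(!b.allow());    // open: requests rejected immediately
    b.on_reset_timeout();   // the 30s reset window elapsed
    assert!(b.allow());     // half-open: a test request may pass
    b.record(true);
    assert_eq!(b.state, State::Closed);
}
```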

Cost Impact

Protection against:

  • Repeated calls to failing backends
  • Resource exhaustion during outages
  • Cascading failures across services

Potential Savings:

  • Prevents 90% of unnecessary retries during outages
  • $50-100/mo savings by avoiding spike in error traffic

Action Items:

  • βœ… Enable circuit breaker per gRPC client
  • βœ… Configure appropriate thresholds
  • βœ… Monitor circuit breaker state

Infrastructure Optimizations

9. Use ARM Instances (20-30% Cost Reduction)

ARM processors (AWS Graviton, GCP Tau) offer better price-performance.

Recommendations

AWS:

Standard: c6i.large (x86) = $0.085/hr = $62/mo
Optimized: c6g.large (ARM) = $0.068/hr = $50/mo
Savings: $12/mo per instance (19% cheaper)

GCP:

Standard: e2-standard-2 (x86) = $0.067/hr = $49/mo
Optimized: t2a-standard-2 (ARM) = $0.053/hr = $39/mo
Savings: $10/mo per instance (20% cheaper)

Cost Impact (3 instances):

  • Annual savings: roughly $360-430/yr

Action Items:

  • βœ… Switch to ARM instances (Graviton2/3 on AWS)
  • βœ… Test for compatibility (Rust has excellent ARM support)

10. Use PgBouncer Connection Pooling

Reduce database connection overhead.

Implementation

# Install PgBouncer on t4g.micro ($6/mo)
docker run -d \
  --name pgbouncer \
  -e DATABASE_URL=postgres://user:pass@db-host:5432/dbname \
  -e POOL_MODE=transaction \
  -e MAX_CLIENT_CONN=10000 \
  -e DEFAULT_POOL_SIZE=25 \
  -p 6432:6432 \
  edoburu/pgbouncer

Cost Impact

Database Performance Improvement:

  • Increases throughput by 2-4x
  • Allows smaller database instances

Cost Savings:

| Without PgBouncer | With PgBouncer | Savings |
|---|---|---|
| db.m5.large ($144/mo) | db.t3.medium ($72/mo) | $72/mo |
| db.m5.xlarge ($288/mo) | db.t3.large ($144/mo) | $144/mo |

Action Items:

  • βœ… Deploy PgBouncer on micro instance ($6/mo)
  • βœ… Use transaction pooling mode
  • βœ… Downgrade database instance size

11. Implement Cloudflare Edge Caching

Cache responses at 200+ edge locations worldwide.

Implementation

Cloudflare Worker for GraphQL Caching:

// workers/graphql-cache.js
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const request = event.request;
  if (request.method === 'POST' && request.url.includes('/graphql')) {
    const body = await request.clone().json();
    
    // Create cache key from query hash
    const cacheKey = new Request(
      request.url + '?q=' + btoa(JSON.stringify(body)),
      { method: 'GET' }
    );
    
    const cache = caches.default;
    let response = await cache.match(cacheKey);
    
    if (!response) {
      response = await fetch(request);
      
      // Cache for 60 seconds (adjust per query type)
      const headers = new Headers(response.headers);
      headers.set('Cache-Control', 'public, max-age=60');
      
      // Copy status explicitly: spreading a Response does not carry it over
      response = new Response(response.body, {
        status: response.status,
        statusText: response.statusText,
        headers,
      });
      event.waitUntil(cache.put(cacheKey, response.clone()));
    }
    
    return response;
  }
  
  return fetch(request);
}

Cost Impact

Edge Cache Hit Rate: 30-50%

Before Cloudflare:

  • Origin requests: 100k req/s
  • Bandwidth from origin: 518 TB/mo
  • Cost: $46,620/mo

After Cloudflare (40% hit rate):

  • Origin requests: 60k req/s
  • Bandwidth from origin: 310 TB/mo
  • Cost: $27,900/mo
  • Savings: $18,720/mo

With Cloudflare + Compression:

  • Origin requests: 60k req/s
  • Bandwidth from origin: 93 TB/mo (compressed)
  • Cost: $8,370/mo
  • Savings: $38,250/mo

Action Items:

  • βœ… Sign up for Cloudflare Pro ($20/mo)
  • βœ… Deploy edge caching worker
  • βœ… Configure cache rules per query type

12. Right-Size Your Database

Start small and scale based on metrics.

Sizing Guide

With Caching + PgBouncer:

| Cache Hit Rate | Effective DB Load | Recommended Instance | Monthly Cost |
|---|---|---|---|
| 50% | 50k queries/s | db.m5.large | $144 |
| 75% | 25k queries/s | db.t3.medium | $72 |
| 85% | 15k queries/s | db.t3.small | $36 |
| 90% | 10k queries/s | db.t3.micro | $15 |
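
The "Effective DB Load" column is just the miss rate applied to total traffic; a one-liner worth keeping in your capacity notes:

```rust
/// Queries that actually reach the database once the cache absorbs its hits.
fn effective_db_load(req_per_sec: f64, cache_hit_rate: f64) -> f64 {
    req_per_sec * (1.0 - cache_hit_rate)
}

fn main() {
    // 100k req/s at 75% and 90% cache hit rates:
    assert!((effective_db_load(100_000.0, 0.75) - 25_000.0).abs() < 1e-6);
    assert!((effective_db_load(100_000.0, 0.90) - 10_000.0).abs() < 1e-6);
}
```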

Action Items:

  • βœ… Start with smallest instance that handles load
  • βœ… Enable auto-scaling based on CPU/connections
  • βœ… Monitor cache hit rate to optimize database size

Monitoring & Fine-Tuning

13. Track Cost Metrics

Monitor these key metrics to optimize costs:

let gateway = Gateway::builder()
    .enable_metrics()  // Prometheus metrics
    .enable_analytics(AnalyticsConfig::development())
    .build()?;

Key Metrics to Monitor

| Metric | Target | Action if Off Target |
|---|---|---|
| Cache hit rate | >75% | Increase TTL or cache size |
| APQ hit rate | >80% | Increase APQ cache size |
| Request collapsing rate | >10% | Review query patterns |
| Database connections | <50 per instance | Verify PgBouncer config |
| P99 latency | <50ms | Check for N+1 queries |

Action Items:

  • βœ… Set up Prometheus + Grafana
  • βœ… Create alerts for low cache hit rates
  • βœ… Review metrics weekly to optimize

14. Implement Query Whitelisting (Production)

Only allow pre-approved queries in production.

Implementation

use grpc_graphql_gateway::{Gateway, QueryWhitelistConfig};

let gateway = Gateway::builder()
    .with_query_whitelist(QueryWhitelistConfig {
        whitelist_file: "queries.whitelist",
        enforce: true, // Block non-whitelisted queries
    })
    .build()?;

Cost Impact

Benefits:

  • Prevents ad-hoc expensive queries
  • Allows pre-optimization of all queries
  • Enables aggressive caching (known query patterns)

Potential Savings:

  • Eliminates rogue queries that spike costs
  • 20-30% better cache hit rates (predictable queries)
  • $100-200/mo savings from better optimization

Action Items:

  • βœ… Extract queries from production traffic
  • βœ… Enable whitelist in production
  • βœ… Keep whitelist in version control

Complete Optimized Configuration

Here’s a production-ready configuration with all optimizations enabled:

use grpc_graphql_gateway::{
    Gateway, GrpcClient, CacheConfig, CompressionConfig,
    PersistedQueryConfig, RequestCollapsingConfig,
    CircuitBreakerConfig, HighPerformanceConfig,
    QueryWhitelistConfig, AnalyticsConfig,
};
use std::time::Duration;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = GrpcClient::builder("http://backend:50051")
        .lazy(false)
        .connect()
        .await?;

    let gateway = Gateway::builder()
        // DESCRIPTORS = compiled protobuf descriptor set,
        // e.g. static DESCRIPTORS: &[u8] = include_bytes!("descriptor.bin");
        .with_descriptor_set_bytes(DESCRIPTORS)
        .add_grpc_client("service", client)
        
        // Performance optimizations
        .enable_high_performance(HighPerformanceConfig {
            simd_json: true,
            sharded_cache: true,
            object_pooling: true,
            num_cache_shards: 128,
        })
        
        // Caching (60-75% cost reduction)
        .with_response_cache(CacheConfig {
            redis_url: Some("redis://redis:6379".to_string()),
            max_size: 50_000,
            default_ttl: Duration::from_secs(300),
            stale_while_revalidate: Some(Duration::from_secs(60)),
            invalidate_on_mutation: true,
            vary_headers: vec!["Authorization".to_string()],
        })
        
        // Bandwidth optimization (50-70% reduction)
        .with_compression(CompressionConfig {
            level: 6,
            min_size: 1024,
            enabled_algorithms: vec!["br", "gzip", "deflate"],
        })
        
        // APQ (90% request size reduction)
        .with_persisted_queries(PersistedQueryConfig {
            cache_size: 5_000,
            ttl: Some(Duration::from_secs(7200)),
        })
        
        // Request deduplication
        .with_request_collapsing(RequestCollapsingConfig {
            enabled: true,
            max_wait: Duration::from_millis(10),
        })
        
        // DataLoader for batching
        .with_data_loader(true)
        
        // Circuit breaker
        .with_circuit_breaker(CircuitBreakerConfig {
            failure_threshold: 5,
            timeout: Duration::from_secs(30),
            half_open_max_requests: 3,
        })
        
        // Security
        .with_query_depth_limit(12)
        .with_query_complexity_limit(1000)
        .with_query_whitelist(QueryWhitelistConfig {
            whitelist_file: "queries.whitelist",
            enforce: true,
        })
        
        // Observability
        .enable_metrics()
        .enable_analytics(AnalyticsConfig::production())
        .enable_health_checks()
        
        .build()?;

    gateway.serve("0.0.0.0:8080").await?;
    Ok(())
}

Cost Reduction Summary

| Optimization | Cost Reduction | Effort | Priority |
|---|---|---|---|
| Multi-tier caching | 60-75% | Medium | 🔴 Critical |
| Response compression | 50-70% | Low | 🔴 Critical |
| APQ | 30-50% | Medium | 🟡 High |
| Cloudflare edge caching | 30-50% | Medium | 🟡 High |
| Request collapsing | 10-25% | Low | 🟢 Medium |
| ARM instances | 20-30% | Low | 🟢 Medium |
| PgBouncer | 40-60% | Medium | 🟡 High |
| High-performance mode | 50% | Low | 🟡 High |
| DataLoader | 50-80% | Medium | 🟡 High |
| Query limits | 10-20% | Low | 🟢 Medium |

Total Potential Savings: 90-97% cost reduction


Before & After Comparison

Before Optimization (100k req/s)

| Component | Cost |
|---|---|
| Gateway (25 Node.js instances) | $750/mo |
| Database (Large instance) | $288/mo |
| Data transfer (518 TB) | $46,620/mo |
| Total | $47,658/mo |

After Optimization (100k req/s)

| Component | Cost |
|---|---|
| Cloudflare Pro | $20/mo |
| Gateway (2 ARM instances) | $45/mo |
| PgBouncer | $6/mo |
| Redis (3GB) | $50/mo |
| Database (Small instance, 90% cache hit) | $30/mo |
| Data transfer (10 TB compressed) | $900/mo |
| Total | $1,051/mo |

Annual Savings: $559,284 per year (97.8% reduction)


Cost Reduction Features Summary

This document summarizes the new cost-lowering features added to the grpc_graphql_gateway.

🎯 Overview

Two new major features have been implemented to dramatically reduce per-request costs:

  1. Query Cost Analysis - Prevent expensive queries from spiking infrastructure costs
  2. Smart TTL Management - Intelligently optimize cache durations for maximum hit rates

πŸ’° Cost Impact

| Feature | Monthly Savings | Cache Hit Rate Improvement | Database Load Reduction |
|---|---|---|---|
| Query Cost Analysis | $200-500/mo | N/A | Prevents over-provisioning |
| Smart TTL Management | $100-200/mo | +15% (75% → 90%) | -60% (25k → 10k q/s) |
| Combined | $300-700/mo | +15% | -60% |

1️⃣ Query Cost Analysis

Purpose

Assign costs to GraphQL queries and enforce budgets to prevent expensive queries from overwhelming infrastructure.

Key Features

  • Per-Query Cost Limits: Reject queries exceeding cost thresholds
  • User Budget Enforcement: Limit costs per user over time windows
  • Field-Specific Multipliers: Assign higher costs to expensive fields
  • Adaptive Costs: Increase costs during high system load
  • Cost Analytics: Track and identify expensive query patterns

Implementation

use grpc_graphql_gateway::{QueryCostAnalyzer, QueryCostConfig};
use std::collections::HashMap;
use std::time::Duration;

// Configure cost analysis
let mut field_multipliers = HashMap::new();
field_multipliers.insert("user.posts".to_string(), 50);  // 50x cost
field_multipliers.insert("analytics".to_string(), 200);  // 200x cost

let cost_config = QueryCostConfig {
    max_cost_per_query: 1000,
    base_cost_per_field: 1,
    field_cost_multipliers: field_multipliers,
    user_cost_budget: 10_000,
    budget_window: Duration::from_secs(60),
    track_expensive_queries: true,
    adaptive_costs: true,
    ..Default::default()
};

let analyzer = QueryCostAnalyzer::new(cost_config);

// Check query cost
let result = analyzer.calculate_query_cost(query).await?;
println!("Query cost: {}", result.total_cost);

// Enforce user budget
analyzer.check_user_budget("user_123", result.total_cost).await?;
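
Conceptually, the cost of a query is a fold over its fields: base cost times any field multiplier. A self-contained sketch of that arithmetic (not the analyzer's actual implementation):

```rust
use std::collections::HashMap;

/// Cost of a query = sum of (base cost x field multiplier) over requested fields.
fn query_cost(fields: &[&str], base: u64, multipliers: &HashMap<&str, u64>) -> u64 {
    fields
        .iter()
        .map(|f| base * multipliers.get(f).copied().unwrap_or(1))
        .sum()
}

fn main() {
    let mut multipliers = HashMap::new();
    multipliers.insert("user.posts", 50); // expensive join
    multipliers.insert("analytics", 200); // heavy aggregation
    let cost = query_cost(&["user.name", "user.posts", "analytics"], 1, &multipliers);
    assert_eq!(cost, 251); // 1 + 50 + 200: well under a 1,000-per-query limit
}
```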

Benefits

  • βœ… Prevent runaway queries
  • βœ… Fair resource allocation
  • βœ… Predictable costs
  • βœ… Database protection
  • βœ… Avoid over-provisioning

Cost Savings

  • $200-500/month by preventing database over-provisioning and spikes

2️⃣ Smart TTL Management

Purpose

Dynamically optimize cache TTLs based on query patterns and data volatility instead of using a single static TTL.

Key Features

  • Query Type Detection: Auto-detect static content, user profiles, real-time data, etc.
  • Volatility Learning: Track how often data changes and adjust TTLs
  • Mutation Tracking: Learn which mutations affect which queries
  • Cache Control Hints: Respect @cacheControl directives
  • Custom Patterns: Define TTLs for specific query patterns

TTL Defaults

| Query Type | Default TTL | Example Queries |
|---|---|---|
| Static Content | 24 hours | categories, tags, settings |
| User Profiles | 15 minutes | profile, user, me |
| Aggregated Data | 30 minutes | analytics, statistics |
| List Queries | 10 minutes | posts(page: 1), listUsers |
| Item Queries | 5 minutes | getUserById, getPost |
| Real-Time Data | 5 seconds | liveScores, currentPrice |

Implementation

use grpc_graphql_gateway::{
    SmartTtlManager, SmartTtlConfig, CacheConfig
};
use std::sync::Arc;
use std::time::Duration;

// Configure Smart TTL
let smart_ttl_config = SmartTtlConfig {
    default_ttl: Duration::from_secs(300),
    user_profile_ttl: Duration::from_secs(900),
    static_content_ttl: Duration::from_secs(86400),
    real_time_data_ttl: Duration::from_secs(5),
    auto_detect_volatility: true,  // Enable learning
    ..Default::default()
};

let smart_ttl = Arc::new(SmartTtlManager::new(smart_ttl_config));

// Integrate with cache
let cache_config = CacheConfig {
    max_size: 50_000,
    default_ttl: Duration::from_secs(300),
    smart_ttl_manager: Some(smart_ttl),
    ..Default::default()
};

let gateway = Gateway::builder()
    .with_response_cache(cache_config)
    .build()?;

Volatility-Based Adjustment

| Volatility | Data Behavior | TTL Adjustment |
|---|---|---|
| > 70% | Changes frequently | 0.5x (halve TTL) |
| 30-70% | Moderate changes | 0.75x |
| 10-30% | Stable | 1.5x |
| < 10% | Very stable | 2.0x (double TTL) |
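
The adjustment rule above is a simple piecewise multiplier on the base TTL; a self-contained sketch:

```rust
use std::time::Duration;

/// TTL multiplier from observed volatility
/// (fraction of refreshes where the data had changed).
fn ttl_multiplier(volatility: f64) -> f64 {
    if volatility > 0.70 { 0.5 }        // changes frequently: halve TTL
    else if volatility >= 0.30 { 0.75 } // moderate changes
    else if volatility >= 0.10 { 1.5 }  // stable
    else { 2.0 }                        // very stable: double TTL
}

fn adjusted_ttl(base: Duration, volatility: f64) -> Duration {
    base.mul_f64(ttl_multiplier(volatility))
}

fn main() {
    let base = Duration::from_secs(300);
    assert_eq!(adjusted_ttl(base, 0.85), Duration::from_secs(150));
    assert_eq!(adjusted_ttl(base, 0.05), Duration::from_secs(600));
}
```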

Benefits

  • βœ… Higher cache hit rates (+15%)
  • βœ… Reduced database load (-60%)
  • βœ… Automatic optimization
  • βœ… Lower latency
  • βœ… Better user experience

Cost Savings

  • $100-200/month in reduced database costs
  • Can downgrade database instance (e.g., Medium β†’ Small)

πŸ“Š Combined Impact

Before Cost Optimization (100k req/s workload)

Cache Configuration:
β”œβ”€ Static 5-minute TTL for all queries
β”œβ”€ No query cost enforcement
└─ Cache hit rate: 75%

Database Load:
β”œβ”€ Effective queries: 25,000/s
β”œβ”€ Instance: db.t3.medium
└─ Cost: $72/month

Problems:
β”œβ”€ Expensive queries spike load
β”œβ”€ Suboptimal TTLs lower cache efficiency
└─ Over-provisioned for safety

After Cost Optimization (100k req/s workload)

Cache Configuration:
β”œβ”€ Smart TTL (query-type specific)
β”œβ”€ Volatility-based learning
β”œβ”€ Query cost enforcement
└─ Cache hit rate: 90% (+15%)

Database Load:
β”œβ”€ Effective queries: 10,000/s (-60%)
β”œβ”€ Instance: db.t3.small
└─ Cost: $36/month (-50%)

Benefits:
β”œβ”€ Predictable costs (query budgets)
β”œβ”€ Optimal cache efficiency (smart TTL)
└─ Right-sized infrastructure

Cost Breakdown

| Component | Before | After | Savings |
|---|---|---|---|
| Database Instance | $72/mo | $36/mo | -$36/mo |
| Over-Provisioning Buffer | +$50/mo | $0/mo | -$50/mo |
| Spike Prevention | - | - | -$100-300/mo |
| Total Savings | - | - | -$186-386/mo |

Annual Savings: $2,232 - $4,632


πŸš€ Quick Start Guide

Step 1: Enable Query Cost Analysis

use grpc_graphql_gateway::{QueryCostAnalyzer, QueryCostConfig};

let cost_analyzer = Arc::new(QueryCostAnalyzer::new(
    QueryCostConfig::default()
));

// Add cost check to your middleware (pass the analyzer in rather than
// capturing it from the surrounding scope)
async fn cost_middleware(
    cost_analyzer: &QueryCostAnalyzer,
    query: &str,
    user_id: &str,
) -> Result<(), Error> {
    let cost = cost_analyzer.calculate_query_cost(query).await?;
    cost_analyzer.check_user_budget(user_id, cost.total_cost).await?;
    Ok(())
}

Step 2: Enable Smart TTL

use grpc_graphql_gateway::{SmartTtlManager, SmartTtlConfig, CacheConfig};

let smart_ttl = Arc::new(SmartTtlManager::new(
    SmartTtlConfig::default()
));

let cache_config = CacheConfig {
    smart_ttl_manager: Some(smart_ttl),
    ..Default::default()
};

Step 3: Monitor Effectiveness

// Query Cost Analytics
let cost_analytics = cost_analyzer.get_analytics().await;
println!("P95 query cost: {}", cost_analytics.p95_cost);

// Smart TTL Analytics
let ttl_analytics = smart_ttl.get_analytics().await;
println!("Avg recommended TTL: {:?}", ttl_analytics.avg_recommended_ttl);

// Derive the cache hit rate from your own counters, e.g.:
// let hit_rate = 1.0 - (db_queries as f64 / total_requests as f64);

πŸ“š Documentation

Query Cost Analysis

  • Full Documentation
  • Configuration examples
  • Field cost multipliers
  • Budget enforcement
  • Analytics and monitoring

Smart TTL Management

  • Full Documentation
  • Query type detection
  • Volatility learning
  • Custom patterns
  • Integration guide

Cost Optimization Strategies


🎯 Best Practices

1. Start Conservative

QueryCostConfig {
    max_cost_per_query: 2000,  // High limit initially
    track_expensive_queries: true,
    ..Default::default()
}

2. Monitor and Tune

  • Review analytics daily for first week
  • Identify expensive queries
  • Adjust field multipliers
  • Lower limits gradually

3. Combine with Existing Features

Gateway::builder()
    .with_query_cost_config(cost_config)      // NEW
    .with_response_cache(cache_config)         // Existing
    .with_smart_ttl(smart_ttl_config)         // NEW
    .with_query_depth_limit(10)               // Existing
    .with_query_complexity_limit(1000)        // Existing
    .with_query_whitelist(whitelist_config)   // Existing
    .build()?

4. Track Metrics

// Export to Prometheus
gauge!("query_cost_p95", cost_analytics.p95_cost);
gauge!("cache_hit_rate", cache_hit_rate);
gauge!("smart_ttl_avg", ttl_analytics.avg_recommended_ttl.as_secs());

These new features work best when combined with:

  • Response Caching - Smart TTL makes caching more effective
  • Query Whitelisting - Pre-calculate costs for whitelisted queries
  • APQ - Reduce bandwidth costs
  • Request Collapsing - Deduplicate identical queries
  • Circuit Breaker - Protect against cascading failures

πŸ“ˆ Expected Results

After implementing both features:

| Metric | Before | After | Improvement |
|---|---|---|---|
| Cache Hit Rate | 75% | 90% | +20% |
| Database Load | 25k q/s | 10k q/s | -60% |
| P99 Latency | 50ms | 30ms | -40% |
| Database Cost | $72/mo | $36/mo | -50% |
| Expensive Query Incidents | 5-10/mo | 0-1/mo | -90% |
| Over-Provisioning | +40% | +10% | -30% |

Total Monthly Savings: $300-700 for a 100k req/s workload


πŸŽ‰ Summary

The combination of Query Cost Analysis and Smart TTL Management provides:

βœ… Predictable Costs - No surprise spikes from expensive queries
βœ… Maximum Cache Efficiency - 90%+ hit rates with intelligent TTLs
βœ… Right-Sized Infrastructure - No over-provisioning needed
βœ… Better Performance - Lower latency, higher throughput
βœ… Automatic Optimization - Self-learning and self-tuning

Result: $300-700/month savings while improving performance!

API Documentation

The full Rust API documentation is available on docs.rs:

πŸ“š docs.rs/grpc_graphql_gateway

Main Types

Gateway

The main entry point for creating and running the gateway.

use grpc_graphql_gateway::Gateway;

let gateway = Gateway::builder()
    // ... configuration
    .build()?;

GatewayBuilder

Configuration builder with fluent API.

GrpcClient

Manages connections to gRPC backend services.

use grpc_graphql_gateway::GrpcClient;

// Lazy connection (connects on first request)
let client = GrpcClient::builder("http://localhost:50051")
    .connect_lazy()?;

// Immediate connection
let client = GrpcClient::new("http://localhost:50051").await?;

SchemaBuilder

Low-level builder for the dynamic GraphQL schema.

Module Reference

| Module | Description |
|---|---|
| gateway | Main Gateway and GatewayBuilder |
| schema | Schema generation from protobuf |
| grpc_client | gRPC client management |
| federation | Apollo Federation support |
| middleware | Request middleware |
| cache | Response caching |
| compression | Response compression |
| circuit_breaker | Circuit breaker pattern |
| persisted_queries | APQ support |
| health | Health check endpoints |
| metrics | Prometheus metrics |
| tracing_otel | OpenTelemetry tracing |
| shutdown | Graceful shutdown |
| headers | Header propagation |

Re-exported Types

pub use gateway::{Gateway, GatewayBuilder};
pub use grpc_client::GrpcClient;
pub use schema::SchemaBuilder;
pub use cache::{CacheConfig, ResponseCache};
pub use compression::{CompressionConfig, CompressionLevel};
pub use circuit_breaker::{CircuitBreakerConfig, CircuitBreaker};
pub use persisted_queries::PersistedQueryConfig;
pub use shutdown::ShutdownConfig;
pub use headers::HeaderPropagationConfig;
pub use tracing_otel::TracingConfig;
pub use middleware::{Middleware, Context};
pub use federation::{EntityResolver, EntityResolverMapping, GrpcEntityResolver};

Error Types

use grpc_graphql_gateway::{Error, Result};

// Main error type
enum Error {
    Schema(String),
    Io(std::io::Error),
    Grpc(tonic::Status),
    // ...
}

Async Traits

When implementing custom resolvers or middleware, you’ll use:

use async_trait::async_trait;

#[async_trait]
impl Middleware for MyMiddleware {
    async fn call(&self, ctx: &mut Context, next: ...) -> Result<()> {
        // ...
    }
}

Changelog

All notable changes to this project are documented here.

For the full changelog, see the CHANGELOG.md file in the repository.

[0.9.0] - 2025-12-27

Router Security Hardening πŸ›‘οΈ

Implemented a massive expansion of security headers and browser protections to ensure production-grade security.

Key Features:

  • HSTS Enforcement: Added Strict-Transport-Security to force HTTPS connections for 1 year.
  • Browser Isolation: Added COOP, COEP, and CORP headers to mitigate Spectre/Meltdown class side-channel attacks.
  • CSP Tightening: Further restricted Content-Security-Policy with object-src 'none', preventing object injection attacks.
  • Privacy First: Added X-DNS-Prefetch-Control and strict Permissions-Policy.

[0.8.9] - 2025-12-27

DDoS Protection Fixes πŸ›‘οΈ

Resolved a critical stability issue in the DDoS protection module.

Key Fixes:

  • Zero-Config Panic: Fixed a runtime panic that occurred when global_rps or per_ip_rps were set to 0.
  • Strict Blocking: Configuring limits to 0 now correctly blocks all traffic as intended, instead of crashing the application.

[0.8.8] - 2025-12-27

Production Reliability & CLI Tools πŸ› οΈ

Enhanced operational maturity with graceful shutdown support and configuration validation tools.

Key Features:

  • Graceful Shutdown: Safely drains active requests on SIGTERM/SIGINT to ensure zero dropped connections during deployments.
  • Config Check: New router --check command to validate configuration integrity in CI pipelines.
  • Critical Fix: Resolved a deadlock that could freeze hot-reloading when live queries were active.

[0.8.7] - 2025-12-27

Circuit Breaker Pattern πŸ›‘οΈ

Integrated a robust Circuit Breaker into the GBP Router to prevent cascading failures and ensure system resilience.

Key Features:

  • Fail Fast: Immediately rejects requests to unhealthy subgraphs when the circuit is β€œOpen”, preventing resource exhaustion.
  • Automatic Recovery: Periodically allows test requests in β€œHalf-Open” state to check for service recovery without overwhelming the backend.
  • Configurable: Fully configurable via router.yaml (failure threshold, recovery timeout, half-open limit).
  • State Management: Tracks success/failure rates per subgraph with atomic counters.

[0.8.6] - 2025-12-27

WAF Header Validation & Enhanced Security πŸ›‘οΈ

Significantly strengthened the security posture by extending WAF protection to HTTP headers and adding modern browser security policies.

Key Features:

  • Header Scanning: All incoming HTTP headers are now scanned for malicious payloads (SQLi, XSS, etc.) before processing.
  • CSP Header: Added a strict Content-Security-Policy to mitigate XSS risks while supporting GraphiQL development tools.
  • Permissions Policy: Enforced a restrictive Permissions-Policy to block sensitive browser features (camera, microphone, geolocation) by default.
  • Direct Query Validation: Optimized checking mechanism for raw GraphQL query strings.

[0.8.5] - 2025-12-26

Zero-Downtime Hot Reloading ♻️

The Router now supports dynamic configuration updates without process restarts, critical for high-availability environments.

Key Features:

  • Instant Updates: Modify router.yaml (e.g., add a subgraph, update WAF rules) and see changes apply immediately.
  • Atomic Swaps: State transitions are atomic, ensuring no request sees a partially applied configuration.
  • SafetyNet: Invalid configurations are rejected, logging an error while the router continues serving traffic with the last known good config.
  • Operational Agility: Rotate secrets, adjust rate limits, or deploy new subgraphs with zero packet loss.

[0.8.4] - 2025-12-26

Router Security Hardening πŸ›‘οΈ

Verified and hardened the router’s WAF and configuration systems for production stability.

Key Improvements:

  • Verified WAF Protection: Validated blocking of major attack vectors (SQLi, XSS, NoSQLi) with live security tests.
  • Robust Configuration: hardened router.yaml parsing to gracefully handle deprecated or missing fields (specifically in QueryCostConfig).
  • Startup Stability: Resolved startup panics caused by configuration mismatches, ensuring smoother upgrades.

[0.8.3] - 2025-12-26

WAF Massive Expansion πŸ›‘οΈ

Significantly expanded the Web Application Firewall rule set to cover more than 200 attack patterns across 7 distinct security categories, turning the Router into a hardened security edge.

New Protections:

  • Command Injection (CMDI): Blocks shell execution attempts (| ls, ; cat, $(...), cmd.exe).
  • Path Traversal: Prevents unauthorized file access (../, /etc/passwd, c:\windows).
  • LDAP Injection: Filters malicious LDAP queries (*, (|, &)).
  • SSTI: Detects Server-Side Template Injection payloads ({{}}, <%=, #{}).

Enhanced Protections:

  • Advanced SQLi: Added Blind SQLi, time-based attacks, and file system access checks.
  • Deep XSS: Expanded regex to catch obfuscated event handlers and dangerous tags.
  • NoSQLi: Covered advanced MongoDB operators and JavaScript execution vectors.

[0.8.2] - 2025-12-26

Web Application Firewall (WAF) πŸ›‘οΈ

Introduced a native WAF middleware for SQL Injection protection.

Key Features:

  • Active Blocking: Detects and blocks OR 1=1, UNION SELECT and other SQLi patterns.
  • Context Aware: Inspects GraphQL variables, headers, and query parameters.
  • Defense in Depth: Integrated into both the Gateway library and the high-performance Router binary.
  • Zero Overhead: Optimized regex engine ensures negligible latency impact (<10Β΅s).

[0.8.1] - 2025-12-26

Transparent Field-Level Encryption πŸ•΅οΈβ€β™€οΈ

Introduced a Zero-Trust data protection layer for federated graphs.

Key Features:

  • Encrypted Transport: Sensitive fields (like PII) are encrypted by subgraphs using the gateway secret.
  • Edge Decryption: The Router acts as a secure cryptographic edge, automatically decrypting data before final delivery to the client.
  • Seamless Integration: Works transparently with existing resolvers via the new recursive_decrypt pipeline.

[0.8.0] - 2025-12-26

Service-to-Service Authentication πŸ”

Implemented a comprehensive security layer for internal federation traffic.

Key Features:

  • Zero-Trust Subgraphs: Subgraphs now verify the origin of requests using X-Gateway-Secret.
  • Dynamic Secrets: Secrets are loaded from the GATEWAY_SECRET environment variable, eliminating hardcoded credentials.
  • Header Injection: The Router dynamically signs every outbound request with the authorized secret.

[0.7.9] - 2025-12-26

DDoS Protection Hardening πŸ›‘οΈ

Addressed a critical resource management issue in the rate limiting system.

Improvements:

  • Memory Leak Patched: Fixed a potential DoS vector where stale IP rate limiters were never cleaned up.
  • Active Lifecycle Management: Rate limiters are now tracked by last_seen timestamp and actively purged after inactivity.
  • Background Maintenance: A new background task ensures consistent memory usage under long-running operation.

[0.7.8] - 2025-12-25

Router Security Verification πŸ•΅οΈβ€β™‚οΈ

Added a robust security test suite to the GBP Router, validating its resilience against extreme conditions and attack vectors.

Verified Protections:

  1. Input Resilience:

    • Massive Queries: Validated stability with 10MB+ input payloads.
    • Deep Nesting: Verified handling of 500+ deep nested queries.
  2. Subgraph Isolation:

    • Slow Loris: Verified that slow subgraphs do not impact healthy ones.
    • Malformed Data: Graceful handling of invalid or huge subgraph responses.
  3. DDoS Verification:

    • Concurrent Flooding: Validated token bucket effectiveness under load.

[0.7.7] - 2025-12-25

Router Security Hardening πŸ›‘οΈ

Major security upgrades for the GBP Router, hardening it for production deployments.

Security Enhancements:

  1. Defensive Headers:

    • X-Frame-Options: DENY
    • X-Content-Type-Options: nosniff
    • X-XSS-Protection: 1; mode=block
    • Referrer-Policy: strict-origin-when-cross-origin
  2. Resource Protection:

    • 2MB Body Limit: Prevents large payload DoS attacks.
    • 30s Timeout: Protects against slow-loris and resource exhaustion.
  3. Dynamic CORS:

    • Full configuration support via router.yaml.
    • Strict origin allowlists for production security.

Maintainability:

  • Cleaned up dead code and improved middleware organization.

[0.7.6] - 2025-12-25

Bidirectional Binary Protocol πŸš€

Revolutionary GraphQL Binary Protocol (GBP) support for both requests AND responses, delivering 73-98% bandwidth reduction and massive cost savings!

Key Features:

  1. Binary Request Parsing - Router accepts GBP-encoded queries

    • Content-Type detection (application/x-gbp, application/graphql-request+gbp)
    • Automatic binary request decoding
    • Seamless JSON fallback for compatibility
  2. Binary Response Encoding - Enhanced content negotiation

    • Accept header detection (application/x-gbp, application/graphql-response+gbp)
    • Automatic binary response when request is binary
    • Error responses in client’s requested format
  3. Truly Bidirectional - Complete binary request/response cycle

    • 48% smaller requests (64 bytes JSON β†’ 33 bytes binary)
    • 73-98% smaller responses (depending on data pattern)
    • Both directions benefit from GBP compression
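The content-negotiation rules above can be sketched like this; only the media type strings come from this changelog, and `negotiate_response_type` is an illustrative helper rather than the router's real code:

```rust
// Illustrative content-negotiation sketch; not the router's actual implementation.
fn negotiate_response_type(accept: Option<&str>, request_was_binary: bool) -> &'static str {
    let wants_gbp = accept
        .map(|a| a.contains("application/x-gbp") || a.contains("application/graphql-response+gbp"))
        .unwrap_or(false);
    // Binary requests get binary responses; everyone else falls back to JSON.
    if wants_gbp || request_was_binary {
        "application/x-gbp"
    } else {
        "application/json"
    }
}

fn main() {
    assert_eq!(negotiate_response_type(Some("application/x-gbp"), false), "application/x-gbp");
    assert_eq!(negotiate_response_type(None, true), "application/x-gbp");
    assert_eq!(negotiate_response_type(Some("application/json"), false), "application/json");
}
```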

Performance by Data Pattern:

  • Realistic Production (73-74% compression):

    • 50K users with unique IDs: 73.2% reduction (35.26 MB β†’ 9.45 MB)
    • 10K users: 74.4% reduction (6.71 MB β†’ 1.71 MB)
    • Common in user management, CRM, authentication systems
  • Mid-Case Repetitive (85-91% compression):

    • 50K products: 91.3% reduction (28.98 MB β†’ 2.53 MB, 11.4x smaller)
    • Product catalogs, analytics dashboards, event logs
    • Shared pricing, inventory, ratings data
  • Extreme Repetitive (97-98% compression):

    • 50K users (cache-like): 98.2% reduction (30.65 MB β†’ 578 KB, 54x smaller)
    • Analytics caching, template-based responses

Real-World Impact (1M requests/month):

  • Bandwidth Savings: 25-29 TB/month saved
  • Cost Savings: $2,100-$2,500/month = $25K-$30K/year (AWS CloudFront pricing)
  • Performance: 3-54x faster network transfers
  • Mobile: 73-98% less data usage

Technical Implementation:

  • Modified router to parse both JSON and binary requests
  • Full content negotiation system
  • Encoder pooling for high performance

  • Error responses respect client format preferences
  • 100% backward compatible (opt-in via headers)

Examples:

  • examples/binary_protocol_client.rs - Complete compression analysis
    • 9 scenarios covering realistic to extreme cases
    • Bandwidth/cost savings calculations
    • Monthly impact projections

Use Cases:

  • High-traffic APIs with bandwidth costs
  • Mobile applications with data-sensitive users
  • E-commerce product catalogs
  • Analytics dashboards with large datasets
  • Real-time feeds and event streams
  • Microservice communication optimization

[0.7.5] - 2025-12-24

Advanced Live Query Features πŸš€

Complete implementation of sophisticated live query capabilities delivering up to 99% bandwidth reduction in optimal scenarios!

Key Features:

  1. Filtered Live Queries - Server-side filtering with custom predicates

    • Example: users(status: ONLINE) @live only sends online users
    • Reduces bandwidth by 50-90% by filtering at the source
    • Supports complex filter expressions and multiple conditions
    • Perfect for dashboards showing subsets of large datasets
  2. Field-Level Invalidation - Granular tracking of field changes

    • Only re-execute queries when specific fields are modified
    • Prevents unnecessary updates for unrelated mutations
    • Reduces update messages by 30-60%
    • Example: Status change doesn’t trigger name field queries
  3. Batch Invalidation - Intelligent merging of rapid updates

    • Configurable batching window (default: 100ms)
    • Reduces update messages by 70-95% during burst changes
    • Prevents client-side UI thrashing
    • Ideal for high-frequency data sources
  4. Client Caching Hints - Smart cache directives

    • Automatic max-age, stale-while-revalidate headers
    • Based on data volatility analysis
    • Optimizes both bandwidth and CPU usage
    • Works seamlessly with browser caching
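Batch invalidation (feature 3) can be sketched as merging a time-ordered event stream into per-window batches of unique keys; the names here are illustrative, not the crate's API:

```rust
use std::collections::BTreeSet;

// Hypothetical batching sketch: events inside one window are merged and
// deduplicated by entity key.
struct InvalidationEvent {
    key: String,
    at_ms: u64,
}

/// Merge a time-ordered event stream into batches of unique keys per window.
fn batch_events(events: &[InvalidationEvent], window_ms: u64) -> Vec<Vec<String>> {
    let mut batches = Vec::new();
    let mut current: BTreeSet<String> = BTreeSet::new();
    let mut window_start: Option<u64> = None;
    for ev in events {
        match window_start {
            Some(start) if ev.at_ms - start <= window_ms => {
                current.insert(ev.key.clone());
            }
            _ => {
                if !current.is_empty() {
                    batches.push(current.iter().cloned().collect());
                }
                current = BTreeSet::new();
                current.insert(ev.key.clone());
                window_start = Some(ev.at_ms);
            }
        }
    }
    if !current.is_empty() {
        batches.push(current.into_iter().collect());
    }
    batches
}

fn main() {
    let events = vec![
        InvalidationEvent { key: "User:1".into(), at_ms: 0 },
        InvalidationEvent { key: "User:2".into(), at_ms: 40 },
        InvalidationEvent { key: "User:1".into(), at_ms: 90 },
        InvalidationEvent { key: "User:3".into(), at_ms: 250 },
    ];
    let batches = batch_events(&events, 100); // default window: 100ms
    assert_eq!(batches, vec![
        vec!["User:1".to_string(), "User:2".to_string()],
        vec!["User:3".to_string()],
    ]);
}
```

Four rapid events collapse into two update messages, which is where the 70-95% message reduction during bursts comes from.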

Performance Impact:

  • Combined Optimization: Up to 99% bandwidth reduction
    • Filtered queries: 50-90% reduction
    • Field tracking: 30-60% fewer updates
    • Batch invalidation: 70-95% message reduction
    • GBP compression: 90-99% payload reduction

Real-World Example (Dashboard with 1000 items updating every second):

  • Without optimization: ~100 MB/min
  • With all features: ~1 MB/min or less

New Examples & Documentation:

  • advanced_features_example.rs - Complete demonstration
  • VISUAL_GUIDE.md - Architecture diagrams and flow charts
  • test_advanced_features.js - Validation test suite
  • Extended docs/src/advanced/live-queries.md

Enhanced API:

  • filter_live_query_results() - Apply server-side filtering
  • extract_filter_predicate() - Parse filter expressions
  • batch_invalidation_events() - Merge invalidation events
  • LiveQueryStore enhancements for filters and field tracking

Use Cases:

  • Real-time analytics and trading platforms
  • Collaborative editing and chat applications
  • IoT monitoring with thousands of devices
  • Gaming leaderboards and live statistics
  • Social media feeds with personalized filtering

[0.7.4] - 2025-12-21

Comprehensive Test Suite βœ…

Added thousands of unit and integration tests across the entire codebase for maximum reliability and regression prevention!

Test Coverage by Module:

  • Analytics: 122 tests (query tracking, metrics, privacy)
  • Cache: 497 tests (LRU, TTL, invalidation, Redis)
  • Circuit Breaker: 166 tests (failure detection, recovery)
  • Compression: 156 tests (Brotli, Gzip, Zstd, GBP)
  • DataLoader: 197 tests (batching, N+1 prevention)
  • Error Handling: 251 tests (conversions, formatting)
  • Federation: 144 tests (entity resolution, coordination)
  • Gateway: 435 tests (builder, configuration, runtime)
  • GBP: 267 tests (encoding/decoding, integrity)
  • Headers: 328 tests (propagation, security, CORS)
  • Health Checks: 348 tests (probes, metrics)
  • High Performance: 226 tests (SIMD, cache, pooling)
  • Live Query: 215 tests (WebSocket, invalidation)
  • Metrics: 116 tests (Prometheus, tracking)
  • Middleware: 214 tests (auth, logging, filtering)
  • REST Connector: 146 tests (API integration)
  • Router: 139 tests (federation, scatter-gather)
  • Runtime: 377 tests (HTTP/WebSocket handlers)
  • And many more…

Quality Improvements:

  • Validates edge cases and error conditions
  • Tests performance characteristics
  • Prevents future breaking changes
  • Serves as living documentation
  • Designed for reliable CI/CD execution

[0.7.0] - 2025-12-20

WebSocket Live Query Compression πŸš€

Revolutionary GBP compression support for WebSocket live queries, delivering 60-97% bandwidth reduction on real-time GraphQL subscriptions!

Key Features:

  • Compression Negotiation: Opt-in via connection_init payload with compression: "gbp-lz4"
  • Binary Frame Protocol: Two-frame system (JSON envelope + GBP binary payload)
  • Backward Compatible: Standard JSON mode still works (wscat compatible)
  • Automatic Fallback: Gracefully degrades to JSON if compression fails
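A client opts in during the WebSocket handshake. The compression: "gbp-lz4" field comes from this changelog; the surrounding message shape is the standard graphql-transport-ws connection_init:

```json
{
  "type": "connection_init",
  "payload": {
    "compression": "gbp-lz4"
  }
}
```

If the server does not acknowledge compression (or encoding fails mid-stream), messages fall back to plain JSON frames, which keeps tools like wscat working.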

Performance:

  • Small (13 users): 60.62% reduction (617 β†’ 243 bytes)
  • Medium (1K users): ~90% reduction
  • Large (100K users): 97.01% reduction (73.5 MB β†’ 2.2 MB)
  • Massive (1M users): 97.06% reduction (726.99 MB β†’ 21.37 MB)
  • Encoding: 83.38 MB/s | Decoding: 23.05 MB/s

Real-World Impact (10K connections, 1M users, 5s updates):

  • Bandwidth saved: 121.9 PB/month
  • Cost savings: $9.75M/month
  • Infrastructure: 97% fewer network links needed
  • Mobile data: 34Γ— reduction

Technical:

  • Multi-layer compression (semantic + structural + block)
  • Compression improves with dataset size
  • Field name and object deduplication
  • Full data integrity preserved

Fixed:

  • Live query example gRPC server (port 50051 β†’ 50052)
  • Server stability improvements

[0.6.9] - 2025-12-20

Comprehensive GBP Compression Benchmarks πŸ“Š

Added three benchmark tests demonstrating GBP performance across different data patterns:

  • Best-Case (test_gbp_ultra_99_percent_miracle): 99.0% reduction on highly repetitive GraphQL data

    • 27.15 MB β†’ 0.28 MB (97:1 ratio)
    • Represents typical GraphQL responses with shared values
  • Mid-Case (test_gbp_mid_case_compression): 96.1% reduction on realistic production data

    • 4.33 MB β†’ 0.17 MB (25:1 ratio)
    • Throughput: 24.79 MB/s
    • Characteristics: Limited categorical values, shared organizations, unique IDs
    • Represents real-world production APIs
  • Worst-Case (test_gbp_worst_case_compression): 56.6% reduction even on completely random data

    • 12.27 MB β†’ 5.32 MB (2.3:1 ratio)
    • Throughput: 11.21 MB/s
    • Represents theoretical limit with maximum entropy

Key Insights:

  • Production GraphQL APIs can expect 90-99% compression with GBP Ultra
  • Even pathological random data achieves >50% compression
  • Mid-case validates that realistic data compresses nearly as well as best-case
  • GBP’s semantic compression (shape pooling, value deduplication, columnar storage) provides significant advantage over traditional JSON compression

[0.6.8] - 2025-12-20

RestConnector Validation Fix πŸ”§

  • Fixed: Overly aggressive path validation that was incorrectly rejecting GraphQL queries with newlines.
  • Improvement: build_request() now only validates arguments actually used as path parameters.
  • Result: Router successfully executes federated queries through subgraphs with GBP compression.
  • Performance: Verified 99.998% compression (43.5 MB β†’ 776 bytes, 56,091:1 ratio) on federated datasets with 20,000 products.

[0.6.7] - 2025-12-20

Internal Maintenance

  • Version bump for consistency across the project.

[0.6.6] - 2025-12-20

GBP Ultra: 99% Compression Achieved 🎯

  • LZ4 High Compression: Upgraded to LZ4 HC mode (level 12) for maximum compression ratio.
  • Realistic Test Data: All subgraphs now generate 20k items with production-like nested structures.
  • Verified Results: 41.51 MB JSON β†’ 266.26 KB GBP (99.37% reduction).
  • Fixed: Empty array edge case causing β€œInvalid value reference” errors.

[0.6.5] - 2025-12-19

Live Query Auto-Push Updates πŸš€

  • Persistent Connections: @live queries keep WebSocket connections open for receiving updates.
  • Automatic Re-execution: Server re-executes queries when InvalidationEvent is triggered by mutations.
  • Global Store: global_live_query_store() singleton shared across all connections for proper invalidation propagation.
  • Zero Polling: Updates are server-initiated, no client-side polling required.

[0.6.4] - 2025-12-19

Live Query WebSocket Integration

  • WebSocket Endpoint: Dedicated /graphql/live endpoint for @live queries with full graphql-transport-ws protocol support.
  • HTTP Support: @live directive detection and stripping in HTTP POST requests.
  • Runtime Handlers: handle_live_query_ws() and handle_live_socket() for processing live subscriptions.
  • Example Script: test_ws.js demonstrating WebSocket connection, queries, and mutation integration.

[0.6.3] - 2025-12-19

Live Query Core Module

  • LiveQueryStore: Central store for managing active queries and invalidation triggers.
  • InvalidationEvent: Notify live queries when mutations occur (e.g., User.update, User.delete).
  • Proto Definitions: GraphqlLiveQuery message and graphql.live_query extension for RPC-level configuration.
  • Strategies: Support for INVALIDATION, POLLING, and HASH_DIFF modes.
  • API Functions: has_live_directive(), strip_live_directive(), create_live_query_store().
  • Example: Full CRUD implementation in examples/live_query/.
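A naive sketch of the API functions named above, operating on raw query text; the crate's real has_live_directive()/strip_live_directive() presumably work on a parsed document rather than plain string matching:

```rust
// String-level sketch only; directive handling on a real AST is more involved.
fn has_live_directive(query: &str) -> bool {
    query.contains("@live")
}

fn strip_live_directive(query: &str) -> String {
    // Remove the directive along with the space before it.
    query.replace(" @live", "")
}

fn main() {
    let q = "query { users @live { id name } }";
    assert!(has_live_directive(q));
    let stripped = strip_live_directive(q);
    assert!(!has_live_directive(&stripped));
    assert_eq!(stripped, "query { users { id name } }");
}
```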

[0.6.2] - 2025-12-19

GBP Ultra: Parallel Optimization

  • Parallel Chunk Encoding: Implemented multi-core encoding (Tag 0x0C) for massive arrays, achieving 1,749 MB/s throughput.
  • Scalability: Reduces 1GB payload encoding time to ~585ms, scaling linearly with CPU cores.

[0.6.1] - 2025-12-19

GBP Ultra: RLE Optimization

  • Run-Length Encoding: New O(1) compression for repetitive columnar data (Tag 0x0B).
  • Performance: Boosted throughput to 486 MB/s with 99.26% compression on 100MB+ payloads.
  • Integrity: Validated cross-language compatibility (Rust/TS) and pooling synchronization.

[0.6.0] - 2025-12-19

GBP Fast Block & Gzip Stability

  • Ultra-Fast Block Mode: Switched GBP to LZ4 Block compression, increasing throughput to 211 MB/s and reducing latency to <0.3ms.
  • Stable Transport: Integrated Gzip (flate2) as a stable fallback for frontend environments where LZ4 libraries are inconsistent.
  • Data Integrity: Fixed router-level data corruption by aligning binary framing with the new high-performance decoder specification.

[0.5.9] - 2025-12-19

GBP O(1) Turbo Mode

  • Massive Payload Support: Optimized GBP for 1GB+ payloads by replacing recursive hashing with O(1) shallow hashing.
  • Zero-Clone Deduplication: Switched from value cloning to positional buffer references, eliminating memory overhead.
  • Performance: Verified 195 MB/s throughput on massive datasets with 99.25% compression.

[0.5.8] - 2025-12-18

GBP LZ4 Compression

  • LZ4 Integration: Native support for LZ4 compression within the GBP pipeline for ultra-low latency server-to-server traffic.
  • Efficiency: Combined GBP’s structural deduplication with high-speed block compression for 10x smaller payloads than Gzip.

[0.5.6] - 2025-12-18

GBP Data Integrity

  • Hash Collision Protection: Enhanced GbpEncoder to resolve hash collisions by verifying value equality, ensuring 100% data integrity for large-scale datasets.
  • Safety: Fully deterministic encoding behavior even with 64-bit hash collisions.

[0.5.5] - 2025-12-18

Federation & Stability

  • Full Federation Demo: Complete 3-subgraph setup (Users, Products, Reviews) with standalone router.

  • Hardened GBP: Improved read_varint safety against malformed payloads (DoS protection).

  • Benchmarks: Updated performance tools to match the new federation schema.

  • Fixes: Resolved compilation issues in examples and build configuration.

[0.5.4] - 2025-12-18

Router Performance Overhaul

  • Sharded Response Cache: 128-shard lock-free cache with sub-microsecond lookups
  • SIMD JSON Parsing: Integrated FastJsonParser for 2-5x faster parsing
  • FuturesUnordered: True streaming parallelism - results processed as they arrive
  • Query Hash Caching: AHash-based O(1) cache key lookups
  • Atomic Metrics: Lock-free request/cache counters via stats()
  • New Methods: execute_fail_fast(), with_cache_ttl(), clear_cache()
  • Performance: Verified 33K+ RPS on local hardware (shared CPU), <2.5ms P50 latency, 100% success rate at 100 concurrent connections.

[0.5.3] - 2025-12-18

GBP Federation Router

  • GbpRouter: New scatter-gather federation router with GBP Ultra compression for subgraph communication.
    • RouterConfig - Configure subgraphs, GBP settings, and HTTP/2 connections
    • SubgraphConfig - Per-subgraph configuration (URL, timeout, GBP enable)
    • DdosConfig - Two-tier DDoS protection with global and per-IP rate limiting
    • DdosProtection - Token bucket algorithm with strict() and relaxed() presets
  • Performance: ~99% bandwidth reduction between router and subgraphs, parallel execution with latency β‰ˆ slowest subgraph.
  • Binary: New cargo run --bin router for standalone federation router deployment.
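The token bucket algorithm behind DdosProtection can be sketched as follows; the struct and method names are illustrative, and the strict()/relaxed() presets would simply be different capacity/refill parameters:

```rust
// Minimal token-bucket sketch; not the crate's DdosProtection implementation.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_ms: u64,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec, last_ms: 0 }
    }

    /// Returns true if a request arriving at `now_ms` is allowed.
    fn allow(&mut self, now_ms: u64) -> bool {
        let elapsed_s = now_ms.saturating_sub(self.last_ms) as f64 / 1000.0;
        self.last_ms = now_ms;
        // Refill proportionally to elapsed time, capped at the burst capacity.
        self.tokens = (self.tokens + elapsed_s * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // burst of 2, refill 1 req/s
    assert!(bucket.allow(0));
    assert!(bucket.allow(0));
    assert!(!bucket.allow(0));   // bucket empty: request rejected
    assert!(bucket.allow(1000)); // one second later, one token refilled
}
```

The two-tier protection runs one global bucket plus one bucket per client IP.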

[0.5.2] - 2025-12-18

DDoS Protection Enhancements

  • Added DdosConfig::strict() and DdosConfig::relaxed() presets for common use cases.
  • Improved token bucket algorithm efficiency.
  • Enhanced rate limiter cleanup for stale IP entries.

[0.5.1] - 2025-12-18

GBP Decoder & Fixes

  • GbpDecoder: Full decoder implementation for GBP Ultra payloads.
    • decode() - Decode raw GBP bytes to JSON
    • decode_lz4() - Decode LZ4-compressed GBP payloads
    • Value pool reference resolution
    • Columnar array reconstruction
  • Fixed: Value pool synchronization between encoder and decoder (Post-Order traversal).
  • Fixed: Columnar encoding for arrays with 5+ homogeneous objects.

[0.5.0] - 2025-12-18

GBP Ultra: The β€œSpeed of Light” Upgrade

  • GraphQL Binary Protocol v8: A complete reimagining of the binary layer.
    • <1ms Latency: Structural hashing eliminates allocation overhead.
    • 99.25% Compression: Intelligent deduplication makes JSON payloads vanish.
    • 176+ MB/s: Throughput that saturates 10Gbps links before maxing CPU.

[0.4.9] - 2025-12-18 (Cumulative Release: 0.3.9 - 0.4.9)

Enterprise Performance & Cost Optimization

  • High-Performance Mode: SIMD JSON parsing, lock-free sharded caching, and object pooling for 100K+ RPS per instance.
  • Cost Reduction Suite: Advanced Query Cost Analysis and Smart TTL Management for significant resource savings.
  • GBP (GraphQL Binary Protocol) v8: Achievement of 99.25% compression reduction verified on 100MB+ β€œBehemoth” payloads.
  • Binary Interoperability: New GbpDecoder and gbp-lz4 encoding for high-speed server-to-server GraphQL communication.
  • Infrastructure Upgrades: Migrated to latest stable versions of axum, tonic, and prost.

[0.4.8] - 2025-12-18

LZ4 + GBP Ultra Refinement

  • Refined the structural deduplication algorithm to achieve higher compression ratios.
  • Optimized GbpEncoder and GbpDecoder for deeper recursion.

[0.4.7] - 2025-12-18

LZ4 Block Compression

  • Integrated lz4 block compression for ultra-fast, low-CPU overhead data transfer.
  • New CompressionConfig::ultra_fast() preset.

[0.4.6] - 2025-12-18

Maintenance Release

  • Internal buffer optimizations and dependency updates.

[0.4.4] - 2025-12-17

Security Maintenance

  • Critical dependency updates to address identified security vulnerabilities.
  • Hardened gRPC metadata handling.

[0.4.2] - 2025-12-17

Vulnerability Patching

  • Fixed multiple security vulnerabilities identified in CI/CD pipeline.
  • Improved error handling in protoc-gen-graphql-template.

[0.4.0] - 2025-12-17

High Performance Foundations

  • Initial support for SIMD-accelerated data processing and sharded caching.
  • Enhanced protoc plugin capabilities.

[0.3.9] - 2025-12-16

Redis & Smart TTL

  • Redis Backend: Distributed caching support for horizontal scalability.
  • Smart TTL: Initial foundation for mutation-aware cache invalidation.

[0.3.8] - 2025-12-16

Helm & Kubernetes Deployment

  • Production-ready Helm chart (helm/grpc-graphql-gateway/)
  • Docker multi-stage builds for optimized images
  • HPA (Horizontal Pod Autoscaler) support (5-50 pods)
  • VPA (Vertical Pod Autoscaler) resource recommendations
  • Federation deployment script (deploy-federation.sh)
  • Docker Compose for local federation testing
  • AWS/GCP LoadBalancer annotations support
  • Comprehensive deployment guides (DEPLOYMENT.md, ARCHITECTURE.md)
  • Fixed rustdoc intra-doc links for docs.rs compatibility

[0.3.7] - 2025-12-16

Production Security Hardening

  • Comprehensive security headers: HSTS, CSP, X-XSS-Protection, Referrer-Policy
  • CORS preflight handling with proper OPTIONS response (204)
  • Cache-Control headers to prevent sensitive data caching
  • Query whitelist default to Enforce mode with introspection disabled
  • Improved query normalization for robust hash matching
  • Redis crate upgraded from 0.24 to 0.27
  • 31-test security assessment script (test_security.sh)

[0.3.6] - 2025-12-16

Security Fixes

  • Replaced std::sync::RwLock with parking_lot::RwLock to prevent DoS via lock poisoning
  • IP spoofing protection in middleware
  • SSRF protection in REST connectors
  • Security headers (X-Content-Type-Options, X-Frame-Options)

[0.3.5] - 2025-12-16

Redis Distributed Cache Backend

  • CacheConfig.redis_url - Configure Redis connection for distributed caching
  • Dual backend support: in-memory (single instance) or Redis (distributed)
  • Distributed cache invalidation across all gateway instances
  • Redis SETs for type and entity indexes (type:{TypeName}, entity:{EntityKey})
  • TTL synchronization with Redis SETEX
  • Automatic fallback to in-memory cache on connection failure
  • Ideal for Kubernetes deployments and horizontal scaling

[0.3.4] - 2025-12-14

OpenAPI to REST Connector

  • OpenApiParser - Parse OpenAPI 3.0/3.1 and Swagger 2.0 specs
  • Support for JSON and YAML formats
  • Automatic endpoint generation from paths and operations
  • Operation filtering by tags or custom predicates
  • Base URL override for different environments

[0.3.3] - 2025-12-14

Request Collapsing

  • RequestCollapsingConfig - Configure coalesce window, max waiters, and cache size
  • RequestCollapsingRegistry - Track in-flight requests for deduplication
  • Reduces gRPC calls by sharing responses for identical concurrent requests
  • Presets: default(), high_throughput(), low_latency(), disabled()
  • Metrics tracking: collapse ratio, leader/follower counts
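The leader/follower mechanics can be sketched like this; names are illustrative rather than the crate's API, and a real implementation would hand followers a future resolved by the leader:

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Conceptual sketch of request collapsing: the first caller for a key becomes
// the leader; identical concurrent requests become followers that reuse the
// leader's response.
#[derive(Default)]
struct Collapser {
    in_flight: HashMap<String, usize>, // key -> follower count
}

#[derive(Debug, PartialEq)]
enum Role {
    Leader,
    Follower,
}

impl Collapser {
    fn register(&mut self, key: &str) -> Role {
        match self.in_flight.entry(key.to_string()) {
            Entry::Occupied(mut e) => {
                *e.get_mut() += 1;
                Role::Follower
            }
            Entry::Vacant(e) => {
                e.insert(0);
                Role::Leader
            }
        }
    }

    /// The leader finished; returns how many followers shared its response.
    fn complete(&mut self, key: &str) -> usize {
        self.in_flight.remove(key).unwrap_or(0)
    }
}

fn main() {
    let mut c = Collapser::default();
    assert_eq!(c.register("query:users"), Role::Leader);
    assert_eq!(c.register("query:users"), Role::Follower);
    assert_eq!(c.register("query:users"), Role::Follower);
    assert_eq!(c.complete("query:users"), 2); // two gRPC calls saved
}
```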

[0.3.2] - 2025-12-14

Query Analytics Dashboard

  • Beautiful dark-themed analytics dashboard at /analytics
  • Tracking of most-used queries, slowest queries, and error patterns
  • Field usage statistics and operation distribution
  • Cache hit rate monitoring and uptime tracking
  • Privacy-focused production mode (no query text storage)
  • JSON API at /analytics/api

[0.3.1] - 2025-12-14

Bug Fixes

  • Minor bug fixes and performance improvements
  • Updated dependencies

[0.3.0] - 2025-12-14

REST API Connectors

  • RestConnector - HTTP client with retry logic, caching, and interceptor support
  • RestEndpoint - Define REST endpoints with path templates and body templates
  • Typed responses with RestResponseSchema for GraphQL field selection
  • add_rest_connector() - New GatewayBuilder method
  • Built-in interceptors: BearerAuthInterceptor, ApiKeyInterceptor
  • JSONPath response extraction (e.g., $.data.users)
  • Ideal for hybrid gRPC/REST architectures and gradual migrations
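Path templates work by substituting named arguments into placeholders; this is a dependency-free sketch, not the RestEndpoint implementation:

```rust
// Hypothetical path-template substitution (e.g. "/users/{id}").
fn fill_path(template: &str, args: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (name, value) in args {
        // Replace "{name}" with its value; unknown placeholders are left as-is.
        out = out.replace(&format!("{{{}}}", name), value);
    }
    out
}

fn main() {
    let path = fill_path("/users/{id}/posts/{post_id}", &[("id", "42"), ("post_id", "7")]);
    assert_eq!(path, "/users/42/posts/7");
}
```

A production version should also validate that substituted values cannot inject path segments (see the v0.6.8 validation fix above).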

[0.2.9] - 2025-12-14

Enhanced Middleware & Auth System

  • EnhancedAuthMiddleware - JWT support with claims extraction and context enrichment
  • AuthConfig - Required/optional modes with Bearer, Basic, ApiKey schemes
  • EnhancedLoggingMiddleware - Structured logging with sensitive data masking
  • LoggingConfig - Configurable log levels and slow request detection
  • Improved context with request_id, client_ip, and auth helpers
  • MiddlewareChain - Combine multiple middleware with builder pattern

[0.2.8] - 2025-12-13

Query Whitelisting (Stored Operations)

  • QueryWhitelistConfig - Configure allowed queries and enforcement mode
  • WhitelistMode - Enforce, Warn, or Disabled modes
  • Hash-based and ID-based query validation
  • Production security for PCI-DSS compliance
  • Compatible with APQ and GraphQL clients
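Hash-based validation with the three modes can be sketched like this. Note the hedge: the real feature hashes normalized query text with a stable cryptographic hash (APQ uses SHA-256); std's DefaultHasher is used here only to keep the sketch dependency-free, and the names are illustrative:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

// Sketch only: production whitelisting should use SHA-256, not DefaultHasher.
fn query_hash(normalized_query: &str) -> u64 {
    let mut h = DefaultHasher::new();
    normalized_query.hash(&mut h);
    h.finish()
}

enum WhitelistMode {
    Enforce,
    Warn,
    Disabled,
}

fn is_allowed(mode: &WhitelistMode, allowed: &HashSet<u64>, query: &str) -> bool {
    match mode {
        WhitelistMode::Disabled => true,
        WhitelistMode::Warn => true, // would log a warning on a miss
        WhitelistMode::Enforce => allowed.contains(&query_hash(query)),
    }
}

fn main() {
    let mut allowed = HashSet::new();
    allowed.insert(query_hash("{ me { id } }"));
    assert!(is_allowed(&WhitelistMode::Enforce, &allowed, "{ me { id } }"));
    assert!(!is_allowed(&WhitelistMode::Enforce, &allowed, "{ secret }"));
    assert!(is_allowed(&WhitelistMode::Warn, &allowed, "{ secret }"));
    assert!(is_allowed(&WhitelistMode::Disabled, &allowed, "{ secret }"));
}
```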

[0.2.7] - 2025-12-12

Multi-Descriptor Support (Schema Stitching)

  • add_descriptor_set_bytes() - Add additional descriptor sets
  • add_descriptor_set_file() - Add descriptors from files
  • Seamless merging of services from multiple sources
  • Essential for microservice architectures

[0.2.6] - 2025-12-12

Header Propagation

  • HeaderPropagationConfig - Configure header forwarding
  • Allowlist approach for security
  • Support for distributed tracing headers

[0.2.5] - 2025-12-12

Response Compression

  • Brotli, Gzip, Deflate, Zstd support
  • Configurable compression levels
  • Minimum size threshold

[0.2.4] - 2025-12-12

Response Caching

  • LRU cache with TTL expiration
  • Stale-while-revalidate support
  • Mutation-triggered invalidation

[0.2.3] - 2025-12-11

Graceful Shutdown

  • Clean server shutdown
  • In-flight request draining
  • OS signal handling

[0.2.2] - 2025-12-11

Multiplex Subscriptions

  • Multiple subscriptions per WebSocket
  • graphql-transport-ws protocol support

[0.2.1] - 2025-12-11

Circuit Breaker

  • Per-service circuit breakers
  • Automatic recovery testing
  • Cascading failure prevention

[0.2.0] - 2025-12-11

Automatic Persisted Queries

  • SHA-256 query hashing
  • LRU cache with optional TTL
  • Apollo APQ protocol support
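In the Apollo APQ protocol, a client first sends only the hash in extensions; on a cache miss it retries with the full query attached, registering it for future hash-only requests (hash value shown as a placeholder):

```json
{
  "query": "{ me { id } }",
  "extensions": {
    "persistedQuery": {
      "version": 1,
      "sha256Hash": "<hex-encoded SHA-256 of the query text>"
    }
  }
}
```

After registration, subsequent requests omit the query field entirely, trading a few bytes of hash for the full query text on every call.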

[0.1.x] - Earlier Releases

See the full changelog for earlier versions including:

  • Health checks and Prometheus metrics
  • OpenTelemetry tracing
  • Query depth and complexity limiting
  • Apollo Federation v2 support
  • File uploads
  • Middleware system

Contributing

We welcome contributions to gRPC-GraphQL Gateway!

Getting Started

  1. Fork the repository
  2. Clone your fork
  3. Create a feature branch
  4. Make your changes
  5. Submit a pull request

Development Setup

# Clone the repository
git clone https://github.com/Protocol-Lattice/grpc_graphql_gateway.git
cd grpc_graphql_gateway

# Build the project
cargo build

# Run tests
cargo test

# Run clippy
cargo clippy --all-targets

# Format code
cargo fmt

Running Examples

# Start the greeter example
cargo run --example greeter

# Start the federation example
cargo run --example federation

Project Structure

src/
β”œβ”€β”€ lib.rs              # Re-exports and module definitions
β”œβ”€β”€ gateway.rs          # Main Gateway and GatewayBuilder
β”œβ”€β”€ schema.rs           # GraphQL schema generation
β”œβ”€β”€ grpc_client.rs      # gRPC client management
β”œβ”€β”€ federation.rs       # Apollo Federation support
β”œβ”€β”€ middleware.rs       # Middleware trait and types
β”œβ”€β”€ cache.rs            # Response caching
β”œβ”€β”€ compression.rs      # Response compression
β”œβ”€β”€ circuit_breaker.rs  # Circuit breaker pattern
β”œβ”€β”€ persisted_queries.rs # APQ support
β”œβ”€β”€ health.rs           # Health check endpoints
β”œβ”€β”€ metrics.rs          # Prometheus metrics
β”œβ”€β”€ tracing_otel.rs     # OpenTelemetry tracing
β”œβ”€β”€ shutdown.rs         # Graceful shutdown
β”œβ”€β”€ headers.rs          # Header propagation
└── ...

Pull Request Guidelines

  • Follow Rust naming conventions
  • Add tests for new functionality
  • Update documentation as needed
  • Run cargo fmt before committing
  • Ensure cargo clippy passes
  • Update CHANGELOG.md for notable changes

Reporting Issues

Please include:

  • Rust version (rustc --version)
  • Gateway version
  • Minimal reproduction case
  • Expected vs actual behavior

License

By contributing, you agree that your contributions will be licensed under the MIT License.

Resources