
Performance Tips

Traffik is designed to be performant out of the box, but like all systems, there are knobs to turn when you need to squeeze out more performance. This page covers the most impactful optimizations, roughly ordered from largest to smallest gain.


Quick Wins Checklist

Before diving into specifics, here are the changes that have the most impact for the least effort:

  • Use >= 1 second rate windows (eliminates lock overhead for FixedWindow/SlidingWindowCounter)
  • Enable InMemory shard striping: InMemoryBackend(number_of_shards=32)
  • Use connection pooling: MemcachedBackend(pool_size=20)
  • Don't call DB/external APIs inside identifier functions
  • Exempt trusted internal clients with EXEMPTED (zero backend overhead)
  • Keep cache_ids=True (default) — especially for WebSockets

1. Use the Right Backend

The backend choice has a larger impact on performance than any other single factor.

| Deployment | Backend | Notes |
| --- | --- | --- |
| Single-process (dev, small apps) | InMemoryBackend | Zero network overhead, fastest possible |
| Multi-process, single machine | InMemoryBackend + replication | State isn't shared between processes |
| Distributed (multiple nodes) | RedisBackend | Network round-trip, but accurate distributed counting |
| Already have Memcached | MemcachedBackend | Comparable to Redis, no scripting support |
| Redis + memory efficiency | RedisBackend | Scripts cached on server, minimal overhead |

Don't use RedisBackend for a single-process application just because "Redis is production-grade." The network round-trip will cost you 1-5ms per request unnecessarily. InMemoryBackend is genuinely the right tool for single-process deployments.

from traffik.backends.inmemory import InMemoryBackend
from traffik.backends.redis import RedisBackend

# Development / single-process
backend = InMemoryBackend(namespace="myapp")

# Production / distributed
backend = RedisBackend("redis://localhost:6379", namespace="myapp")

2. Choose Your Strategy Wisely

Strategies differ significantly in performance. Here's the order from fastest to slowest:

| Strategy | Overhead | Accuracy | Best For |
| --- | --- | --- | --- |
| FixedWindow (>= 1s window) | Minimal — single atomic op, no lock | Good | General APIs, high throughput |
| SlidingWindowCounter | Low — two atomic ops | Better | When burst boundaries matter |
| TokenBucket | Medium — lock + read/write | Excellent (burst-aware) | APIs with natural burst patterns |
| LeakyBucket | Medium — lock + read/write | Excellent (smooth) | Enforcing constant rate |
| SlidingWindowLog | High — serialization + cleanup | Perfect | Strict correctness, lower traffic |

The default strategy is FixedWindow. For most applications, it's the right choice — it's accurate enough, and the performance is hard to beat.

from traffik import HTTPThrottle
from traffik.strategies import FixedWindow, TokenBucket

# Fastest: FixedWindow with a >= 1s window
throttle = HTTPThrottle("api", rate="1000/min", strategy=FixedWindow())

# More nuanced: TokenBucket for burst-friendly limiting
throttle = HTTPThrottle("api", rate="100/sec", strategy=TokenBucket())

3. Configure Lock Striping (InMemory)

InMemoryBackend uses internal locks to prevent race conditions. By default it uses a small number of shards. Increasing the shard count distributes lock contention across more buckets, dramatically improving throughput under high concurrency:

from traffik.backends.inmemory import InMemoryBackend

# Default: small number of shards (low concurrency overhead)
backend = InMemoryBackend(namespace="myapp")

# High concurrency: 32 shards distribute lock contention
backend = InMemoryBackend(namespace="myapp", number_of_shards=32)

A good rule of thumb: set number_of_shards to roughly your expected peak concurrent requests / 4. For most applications, 16-64 shards is a good range.
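The rule of thumb above can be sketched as a small helper. This is a hypothetical function for illustration, not part of Traffik's API:

```python
def recommended_shards(peak_concurrent_requests: int) -> int:
    """Rule of thumb: peak concurrency / 4, clamped to the suggested 16-64 range."""
    return max(16, min(64, peak_concurrent_requests // 4))

print(recommended_shards(100))   # 25
print(recommended_shards(10))    # 16 (clamped up)
print(recommended_shards(1000))  # 64 (clamped down)
```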


4. Use Connection Pooling (Memcached)

Every throttle check requires a backend operation. Without connection pooling, each operation opens and closes a network connection — this is expensive. Use pooling:

from traffik.backends.memcached import MemcachedBackend

# Default: small pool
backend = MemcachedBackend(host="localhost", port=11211, namespace="myapp")

# Production: larger pool to handle concurrent requests
backend = MemcachedBackend(
    host="localhost",
    port=11211,
    namespace="myapp",
    pool_size=20,   # Tune based on expected concurrency
)

For Redis, aioredis (used by RedisBackend) handles connection pooling internally. You can tune it via the connection URL or by passing a pre-configured client.


5. Keep Identifier Functions Cheap

The identifier function runs on every request to determine who is being throttled. A slow identifier adds its full cost to every single request, so the aggregate overhead scales with your traffic.

# Fast: read from request headers (in-memory, no I/O)
async def fast_identifier(request: Request):
    return request.headers.get("X-API-Key") or request.client.host

# Slow: database lookup on every request (don't do this!)
async def slow_identifier(request: Request):
    api_key = request.headers.get("X-API-Key")
    user = await db.fetch("SELECT id FROM users WHERE api_key = $1", api_key)
    return user["id"]  # Every request = one DB query

If you need to look up user information, cache it in the request state after the first lookup, or do the lookup once in authentication middleware and store the result in request.state.user_id.


6. Avoid Logging in Backend Operations

This sounds minor but it isn't. Logging is synchronous I/O and can block the async event loop under load.

# This is inside the hot path — every request goes through here
async def my_backend_increment(self, key: str, amount: int = 1) -> int:
    # BAD: logging on every request
    logger.debug(f"Incrementing {key} by {amount}")
    result = await self._client.incr(key, amount)
    logger.debug(f"New value: {result}")
    return result

# Better: no logging in the hot path
async def my_backend_increment(self, key: str, amount: int = 1) -> int:
    return await self._client.incr(key, amount)

According to our benchmarks, adding logger.debug() calls inside backend operations produces a ~10x slowdown on high-concurrency workloads. Log in error handlers (which only run on failures), not in the normal execution path.
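One way to keep logging out of the hot path is to confine it to the exception branch, so only failures pay the cost. A self-contained sketch, using a hypothetical `FakeClient` in place of a real backend client:

```python
import asyncio
import logging

logger = logging.getLogger("myapp.backend")


class FakeClient:
    """Stand-in for a real backend client (hypothetical)."""

    def __init__(self):
        self.counters = {}

    async def incr(self, key: str, amount: int) -> int:
        self.counters[key] = self.counters.get(key, 0) + amount
        return self.counters[key]


async def increment(client: FakeClient, key: str, amount: int = 1) -> int:
    try:
        # Hot path: no logging on success
        return await client.incr(key, amount)
    except Exception:
        # Cold path: only failures pay the logging cost
        logger.exception("increment failed for %s", key)
        raise


async def main():
    client = FakeClient()
    print(await increment(client, "user:42"))  # 1
    print(await increment(client, "user:42"))  # 2


asyncio.run(main())
```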


7. Exempt Trusted Clients with EXEMPTED

For trusted internal services, health check endpoints, or admin users, returning EXEMPTED from the identifier completely bypasses throttling — no counter read, no counter write, no lock. Zero overhead:

from traffik import EXEMPTED

INTERNAL_TOKEN = "my-internal-service-token"

async def smart_identifier(request: Request):
    # Internal services: completely bypassed
    if request.headers.get("X-Internal-Token") == INTERNAL_TOKEN:
        return EXEMPTED

    # Everyone else: throttled by API key
    return request.headers.get("X-API-Key") or request.client.host


throttle = HTTPThrottle("api", rate="1000/min", identifier=smart_identifier)

This is especially useful for health check endpoints that get hammered by load balancers. There's no point counting those requests — exempt them and save the backend round-trip.
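For instance, a path-based check can short-circuit health probes before any backend work happens. This sketch uses a stand-in sentinel and hypothetical paths rather than importing traffik, to keep it self-contained:

```python
# Stand-in sentinel; a real app would `from traffik import EXEMPTED` instead
EXEMPTED = object()

HEALTH_PATHS = frozenset({"/healthz", "/livez", "/readyz"})  # hypothetical paths


def classify(path: str, api_key, client_host: str):
    # Load-balancer health probes: skip the backend round-trip entirely
    if path in HEALTH_PATHS:
        return EXEMPTED
    return api_key or client_host


print(classify("/healthz", None, "10.0.0.1") is EXEMPTED)  # True
print(classify("/api/v1/items", "key-1", "10.0.0.1"))      # key-1
```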


8. Enable Identifier Caching (It's the Default)

cache_ids=True is the default and caches the computed identifier on the connection's state object. For WebSocket connections especially, this prevents recomputing the identifier on every message:

# cache_ids=True (default) — computes identifier once, caches on connection.state
throttle = HTTPThrottle("api", rate="100/min", cache_ids=True)

# cache_ids=False — recomputes identifier on every hit
throttle = HTTPThrottle("api", rate="100/min", cache_ids=False)

Only disable cache_ids if your identifier might change during a connection's lifetime (rare). For everything else, leave it enabled.


9. Sort Middleware Throttles Cheap First

When using multiple throttles in middleware, put the cheapest (lowest limit, most likely to fire) first. A throttle that rejects a request early prevents the more expensive throttles from running at all:

from traffik.middleware import ThrottleMiddleware

# Burst throttle (cheap, low limit) runs first
# Sustained throttle (more expensive, higher limit) only runs if burst passes
app.add_middleware(
    ThrottleMiddleware,
    throttles=[burst_throttle, sustained_throttle],
    # cheap_first=True is the default — throttles are automatically sorted
)

Traffik's middleware already applies cheap_first ordering by default, so this is mostly about being deliberate when configuring multiple throttles.


10. Avoid Sub-Second Rate Windows in Production

Sub-second windows (e.g., "100/500ms", "10/100ms") trigger locking in FixedWindow and SlidingWindowCounter — even when they would otherwise be lock-free. This is necessary for accuracy at millisecond precision, but it adds distributed lock overhead on every request.

# Lock-free: uses atomic increment_with_ttl only
throttle = HTTPThrottle("api", rate="6000/min")    # 60s window >= 1s

# Requires locking: sub-second window
throttle = HTTPThrottle("api", rate="100/500ms")   # 500ms < 1s

In most cases, you can express the same intent with a >= 1s window and get better performance:

# These allow the same sustained rate; the 1s window avoids locking:
throttle = HTTPThrottle("api", rate="20/sec")       # 1s window, no lock overhead
# vs.
throttle = HTTPThrottle("api", rate="10/500ms")     # 500ms window, adds lock overhead

Use sub-second windows when you genuinely need millisecond-level burst control. Otherwise, stick to seconds.
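The equivalence above is simple arithmetic: both configurations admit the same long-run request rate, and the difference is only burst granularity (plus the lock overhead). A hypothetical helper mirroring the doc's rate-string notation makes this concrete:

```python
def sustained_rps(rate: str) -> float:
    """Convert 'N/window' (e.g. '10/500ms', '20/sec') to long-run requests/second.

    Hypothetical helper for illustration, not part of Traffik's API.
    """
    count, window = rate.split("/")
    units = {"ms": 0.001, "sec": 1.0, "min": 60.0}
    for suffix, seconds in units.items():
        if window.endswith(suffix):
            multiplier = window[: -len(suffix)] or "1"
            return int(count) / (int(multiplier) * seconds)
    raise ValueError(f"unknown window: {window}")


print(sustained_rps("20/sec"))    # 20.0
print(sustained_rps("10/500ms"))  # 20.0 -- same sustained rate, more lock overhead
print(sustained_rps("6000/min"))  # 100.0
```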


Putting It All Together

Here's a production-optimized configuration for a high-traffic API:

from traffik import HTTPThrottle, EXEMPTED
from traffik.backends.redis import RedisBackend
from traffik.strategies import FixedWindow

# Redis backend (distributed, production-grade)
backend = RedisBackend(
    "redis://localhost:6379",
    namespace="myapp",
    on_error="allow",  # Fail open on backend errors (tune per your risk tolerance)
)

# Efficient identifier: fast header read, exempt internal services
async def api_identifier(request):
    if request.headers.get("X-Internal") == "true":
        return EXEMPTED
    return request.headers.get("X-API-Key") or request.client.host

# Fast strategy: FixedWindow with >= 1s window = lock-free path
throttle = HTTPThrottle(
    "api:v1",
    rate="1000/min",          # 60s window, no lock overhead
    strategy=FixedWindow(),
    identifier=api_identifier,
    cache_ids=True,            # Cache identifier per connection
    backend=backend,
)