Strategies¶
A strategy is the algorithm that decides how Traffik counts. Every strategy receives the same inputs, a key, a rate, and a backend, and returns a wait time in milliseconds. Zero means "go ahead". Anything positive means "slow down".
Choosing a strategy is a trade-off between accuracy, memory, and burst tolerance. The table below gives you the overview; the sections that follow go deeper.
Strategy comparison¶
| Strategy | Accuracy | Memory | Bursts | Best For |
|---|---|---|---|---|
FixedWindow |
Low | O(1) | Yes (boundary) | Simple limits, high throughput APIs |
SlidingWindowCounter |
Medium | O(1) | Minimal | General purpose, good default |
SlidingWindowLog |
Highest | O(limit) | No | Financial, security-critical, strict SLA |
TokenBucket |
High | O(1) | Yes (configurable) | Variable traffic, mobile clients |
TokenBucketWithDebt |
High | O(1) | Yes + overdraft | Gradual degradation, user-facing APIs |
LeakyBucket |
High | O(1) | No | Protecting downstream services |
LeakyBucketWithQueue |
High | O(limit) | No (strict FIFO) | Ordered processing, fairness guarantees |
GCRA |
Highest | O(1) | Configurable | Telecom, smooth pipelines, strict SLA |
FixedWindow (default)¶
The simplest possible algorithm. Time is divided into fixed windows aligned to clock boundaries (e.g., 00:00–01:00, 01:00–02:00). Each request increments a counter for the current window. When the window ends, the counter resets automatically via TTL.
The boundary problem: A user can make up to limit requests at 00:59 and another
limit at 01:00, up to 2x the limit within any two-second span. If that is
acceptable, FixedWindow is your best friend: it is fast, cheap, and requires only
one atomic counter per key.
from traffik import HTTPThrottle
from traffik.strategies import FixedWindow
throttle = HTTPThrottle(
uid="my-api",
rate="100/min",
strategy=FixedWindow(), # this is also the default if you omit strategy
)
Storage keys:
{namespace}:{key}:fixedwindow:counter— request counter (integer, TTL = window duration){namespace}:{key}:fixedwindow:start— window start timestamp (sub-second windows only)
FixedWindow is the default
You do not need to pass strategy=FixedWindow() explicitly, it is what you get
when you omit the strategy argument entirely.
SlidingWindowCounter¶
A smarter cousin of FixedWindow. Instead of snapping to fixed clock boundaries,
it tracks two consecutive windows and computes a weighted count:
As you move further through the current window, the previous window's contribution shrinks toward zero. The result is much smoother traffic flow with only two counters per key.
from traffik.strategies import SlidingWindowCounter
throttle = HTTPThrottle(
uid="my-api",
rate="100/min",
strategy=SlidingWindowCounter(),
)
Storage keys:
{namespace}:{key}:slidingcounter:{window_id}— counter per window (TTL = 2x window duration)- Two consecutive windows are read on every request.
When to use: General-purpose rate limiting where you want better accuracy than
FixedWindow without paying the memory cost of SlidingWindowLog.
SlidingWindowLog¶
The most accurate strategy. Traffik maintains a log of [timestamp, cost] pairs for
every request in the window, serialised with MessagePack. On each request it evicts
expired entries and sums the remaining costs.
from traffik.strategies import SlidingWindowLog
throttle = HTTPThrottle(
uid="payment-endpoint",
rate="100/min",
strategy=SlidingWindowLog(),
)
Storage keys:
{namespace}:{key}:slidinglog— msgpack-encoded list of[timestamp_ms, cost]tuples
Memory: O(limit) — at peak load the log holds one entry per allowed request.
With a limit of 10,000 this can add up. Choose SlidingWindowCounter if memory is a
concern.
Use SlidingWindowLog only when accuracy is mandatory
For most APIs, SlidingWindowCounter gives 95% of the accuracy at a fraction of
the memory. Reserve SlidingWindowLog for payment processing, security-critical
endpoints, and anywhere where the boundary burst of FixedWindow is genuinely
unacceptable.
TokenBucket¶
Models rate limiting as a bucket that holds tokens. Tokens refill at a constant rate
(e.g., 100 tokens/minute). Each request consumes one token (or cost tokens). If
the bucket is empty, the request must wait until enough tokens have accumulated.
The burst_size parameter controls the maximum bucket capacity. Set it higher than
rate.limit to allow occasional bursts; set it equal to rate.limit for no extra
burst allowance.
from traffik.strategies import TokenBucket
throttle = HTTPThrottle(
uid="my-api",
rate="100/min",
strategy=TokenBucket(burst_size=150), # allow bursts up to 150
)
Storage keys:
{namespace}:{key}:tokenbucket:{capacity}— msgpack{"tokens": float, "last_refill": ms}
Refill formula: new_tokens = min(current + elapsed_ms * (limit / expire_ms), capacity)
Tokens are refilled lazily on every request, there is no background process.
TokenBucketWithDebt¶
An extended token bucket that lets the bucket go negative. Requests are still
allowed when the bucket is at zero, they go into "debt" up to max_debt. The debt
is paid back through normal token refilling. This produces a softer experience: users
never hit a sudden wall; traffic degrades gradually.
from traffik.strategies import TokenBucketWithDebt
throttle = HTTPThrottle(
uid="user-facing-api",
rate="100/min",
strategy=TokenBucketWithDebt(
burst_size=150, # max positive tokens
max_debt=50, # allow up to 50 tokens of debt
),
)
Storage keys:
{namespace}:{key}:tokenbucket:{capacity}:debt:{max_debt}— same state format asTokenBucket, but tokens can be negative
When NOT to use: Strict SLA enforcement, third-party API proxying, billing and payment systems. Debt means you can temporarily exceed the nominal rate.
LeakyBucket¶
The inverse of TokenBucket. Instead of tokens filling a bucket, requests fill it,
and it drains (leaks) at a constant rate. If the bucket is full when a new request
arrives, that request is rejected.
The effect is perfectly smooth output, no bursts are ever allowed. This is ideal for protecting downstream services that cannot handle spikes.
from traffik.strategies import LeakyBucket
throttle = HTTPThrottle(
uid="downstream-proxy",
rate="100/min",
strategy=LeakyBucket(),
)
Storage keys:
{namespace}:{key}:leakybucket:state— msgpack{"level": float, "last_leak": ms}
LeakyBucketWithQueue¶
LeakyBucket with a strict FIFO guarantee. Instead of a simple fill level, Traffik
stores the full queue of [timestamp, cost] entries. Requests drain in the exact
order they arrived. No request can "cut in line".
from traffik.strategies import LeakyBucketWithQueue
throttle = HTTPThrottle(
uid="ordered-queue",
rate="100/min",
strategy=LeakyBucketWithQueue(),
)
Storage keys:
{namespace}:{key}:leakybucketqueue:state— msgpack{"queue": [[ts, cost], ...], "last_leak": ms}
Memory: O(limit) for the same reason as SlidingWindowLog.
GCRA¶
The Generic Cell Rate Algorithm, also called "leaky bucket as a meter" or "virtual scheduling". Originally designed for ATM networks, it provides the smoothest possible rate enforcement with the smallest memory footprint of any stateful algorithm: a single float (the Theoretical Arrival Time, TAT).
How it works: Every request is only allowed if now >= TAT - burst_tolerance_ms.
After an allowed request, the TAT advances by one emission interval
(window_duration / limit). The result: requests are uniformly spaced in time.
from traffik.strategies.custom import GCRA
# Perfectly smooth: 100 req/min = one request every 600ms
throttle = HTTPThrottle(
uid="realtime-pipeline",
rate="100/min",
strategy=GCRA(burst_tolerance_ms=0), # 0 = no bursts at all
)
# Allow a small burst tolerance (e.g. 1 full emission interval)
throttle = HTTPThrottle(
uid="realtime-pipeline",
rate="100/min",
strategy=GCRA(burst_tolerance_ms=600), # one request's worth of tolerance
)
Storage keys:
{namespace}:{key}:gcra:tat— a single float: the TAT in milliseconds
When to use: Telecommunications systems, real-time data pipelines, strict SLA contracts, anywhere you need requests to be evenly spread over time rather than clustered at the start of a window.
GCRA is the most memory-efficient accurate strategy
Just one float per key. If you need accuracy and have a large number of unique
keys (millions of users), GCRA beats SlidingWindowLog decisively on memory.
Advanced strategies in traffik.strategies.custom¶
Traffik ships a second, richer set of strategies for specialised scenarios. They live
in traffik.strategies.custom and are imported explicitly:
from traffik.strategies.custom import (
AdaptiveThrottle, # dynamically reduces limits under high load
TieredRate, # different multipliers per user tier (free/pro/enterprise)
PriorityQueue, # CRITICAL / HIGH / NORMAL / LOW priority classes
QuotaWithRollover, # monthly quotas where unused quota carries over
TimeOfDay, # different limits by hour of day
CostBasedTokenBucket, # refill rate adjusts based on average request cost
DistributedFairness, # deficit round-robin across multiple app instances
GeographicDistribution, # per-region capacity with optional spillover
)
Each of these strategies is a drop-in replacement, same interface, same __call__
signature. Check the API reference and the traffik.strategies.custom module for
full configuration details and examples.