Request Costs

Here's a scenario: your /upload endpoint lets users push files up to 100 MB. Your /health endpoint returns {"status": "ok"} in microseconds. Should they both tick the rate limit counter by exactly one?

Probably not.

Request costs are Traffik's answer to this. Every hit(...) consumes a configurable amount of quota instead of a flat 1. You get to decide what "expensive" means for your API.


The Default: Cost of 1

By default, every request costs 1 unit of quota. A throttle set to "100/min" allows 100 requests per minute. This is fine for most endpoints.

from traffik import HTTPThrottle

# 100 requests per minute, each costs 1 (the default)
throttle = HTTPThrottle(uid="api:default", rate="100/min")

Fixed Cost at Initialization

Pass cost=N when you create the throttle to make every request through it consume N units of quota. This is the simplest way to make an endpoint feel "heavier" to the rate limiter.

from traffik import HTTPThrottle
from fastapi import Depends

# Each request to this throttle burns 10 units — same as 10 normal requests
export_throttle = HTTPThrottle(
    uid="api:export",
    rate="100/min",
    cost=10,
)

@app.get("/export", dependencies=[Depends(export_throttle)])
async def export_data():
    ...

Thinking in units

A throttle with rate="100/min" and cost=10 effectively allows 10 export requests per minute: the effective limit is the rate's limit divided by the cost (100 / 10 = 10).
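The arithmetic is simple enough to check by hand:

```python
# Effective per-minute budget when every request carries a fixed cost.
limit = 100                # from rate="100/min"
cost = 10                  # fixed cost per request
effective = limit // cost  # floor division: a partial request doesn't fit
print(effective)           # 10
```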


Dynamic Cost via Function

When the cost depends on the request itself — file size, number of records, operation type — pass an async function instead of an integer. Traffik calls it on every hit, passing the connection and the current context.

from traffik import HTTPThrottle
from starlette.requests import Request
import typing

async def upload_cost(
    request: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    # Cost = file size in megabytes, minimum 1
    content_length = int(request.headers.get("content-length", 0))
    size_mb = content_length // (1024 * 1024)
    return max(size_mb, 1)

upload_throttle = HTTPThrottle(
    uid="api:upload",
    rate="500/hour",  # 500 MB of uploads per user per hour
    cost=upload_cost,
)

The cost function signature is:

async def my_cost(
    connection: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    ...

Both parameters are provided automatically by Traffik. context will be None unless you passed a context dict when initializing the throttle or calling hit(...).


Per-Call Override

Sometimes you know the cost only at the moment of the call — for example, after parsing a request body. Pass cost=N directly to hit(...) or __call__(...) to override the throttle's default for that one request.

throttle = HTTPThrottle(uid="api:process", rate="100/min")

@app.post("/process")
async def process(request: Request):
    body = await request.json()
    operation = body.get("operation", "read")

    # Charge more quota for mutations
    cost_map = {"read": 1, "write": 5, "delete": 10}
    cost = cost_map.get(operation, 1)

    await throttle(request, cost=cost)
    return {"status": "processed"}

Per-call cost takes priority

When you pass cost=N to hit(...), it overrides both the fixed cost set at initialization and any dynamic cost function. Think of it as the last word.
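The precedence can be pictured as a tiny resolver. This is an illustrative sketch, not Traffik's actual internals, and the dynamic cost function is shown as a synchronous callable for brevity:

```python
def resolve_cost(per_call, configured):
    # Per-call override wins; otherwise use the configured cost,
    # which may be a fixed integer or a callable (dynamic cost).
    if per_call is not None:
        return per_call
    return configured() if callable(configured) else configured

resolve_cost(5, 10)            # 5  — per-call override has the last word
resolve_cost(None, 10)         # 10 — fixed cost from initialization
resolve_cost(None, lambda: 3)  # 3  — dynamic cost function
```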


Cost of Zero: The Exemption Shortcut

A cost of 0 is special: Traffik sees it and short-circuits immediately, incrementing no counter and calling no backend. The request passes through as if it were never throttled.

from starlette.requests import Request
import typing

async def selective_cost(
    request: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    # Health checks and metrics scrapes don't count against the limit
    if request.url.path in {"/health", "/metrics", "/readyz"}:
        return 0
    return 1

Cost 0 vs EXEMPTED

Returning 0 from a cost function and returning EXEMPTED from an identifier function both skip throttling — but they do it at different stages:

  • cost=0: Skips before the identifier is resolved, as soon as hit(...) is called.
  • EXEMPTED: Skips at the identifier stage. Nothing downstream runs at all.

For exempting entire client categories (admins, internal services), EXEMPTED from the identifier is cleaner. For skipping specific paths or methods within a shared throttle, cost=0 is a pragmatic shortcut and slightly cheaper.


Real-World Examples

Charge quota proportional to the uploaded file's size. A 50 MB upload consumes 50x the quota of a 1 MB upload.

from traffik import HTTPThrottle
from starlette.requests import Request
import typing

async def file_size_cost(
    request: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    content_length = int(request.headers.get("content-length", 0))
    # Cost = ceil(bytes / 1 MB), minimum 1
    size_mb = max(1, -(-content_length // (1024 * 1024)))  # ceiling division
    return size_mb

upload_throttle = HTTPThrottle(
    uid="uploads:per-user",
    rate="500/hour",   # 500 MB of uploads per user per hour
    cost=file_size_cost,
    identifier=get_user_id,
)

Charge quota based on how many tokens an AI request consumed, reported back by the model. Use hit(...) manually after the response so you know the actual count.

from traffik import HTTPThrottle
from fastapi import Request

token_throttle = HTTPThrottle(
    uid="ai:token-budget",
    rate="100000/day",   # 100k tokens per user per day
    identifier=get_user_id,
)

@app.post("/ai/complete")
async def complete(request: Request):
    body = await request.json()

    # Call the model first — we need the actual token count
    result = await call_llm(body["prompt"])
    tokens_used = result["usage"]["total_tokens"]

    # Now consume that many units of quota
    await token_throttle(request, cost=tokens_used)

    return result

Different operations have different blast radii. A DELETE that wipes a table should cost more than a GET that reads one row.

from traffik import HTTPThrottle
from starlette.requests import Request
import typing

OPERATION_COSTS = {
    "GET":    1,
    "POST":   3,
    "PUT":    3,
    "PATCH":  3,
    "DELETE": 10,
}

async def method_cost(
    request: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    method = request.method.upper()
    return OPERATION_COSTS.get(method, 1)

api_throttle = HTTPThrottle(
    uid="api:weighted",
    rate="100/min",
    cost=method_cost,
)

# With this setup:
# - 100 GET requests per minute (100 × 1 = 100 units)
# - 33 POST/PUT/PATCH requests per minute (33 × 3 ≈ 100 units)
# - 10 DELETE requests per minute (10 × 10 = 100 units)

Endpoints that operate on multiple records at once should count each record, not each HTTP request.

from traffik import HTTPThrottle
from fastapi import Request

async def bulk_cost(
    request: Request,
    context: typing.Optional[typing.Dict[str, typing.Any]],
) -> int:
    body = await request.json()
    items = body.get("items", [])
    # Each item in the batch counts as one unit, minimum 1
    return max(len(items), 1)

bulk_throttle = HTTPThrottle(
    uid="api:bulk-write",
    rate="1000/hour",
    cost=bulk_cost,
)

Summary

| Approach | When to use |
| --- | --- |
| cost=N (integer) | All requests to this throttle are equally expensive |
| cost=async_fn | Cost depends on the request content or headers |
| await throttle(request, cost=N) | You know the cost only at call time |
| cost=0 | Skip throttling for certain requests within a shared throttle |