Per-user Rate Limiting

Overview

Rate limiting is a critical strategy for managing the load on an API. By imposing restrictions on the number of requests a unique consumer can make within a specific time frame, rate limiting prevents a small set of users from monopolizing the majority of resources on a service, ensuring fair access for all API consumers.

Aperture implements this strategy through its high-performance, distributed rate limiter. This system enforces per-key limits based on fine-grained labels, thereby offering precise control over API usage. For each unique key, Aperture maintains a token bucket of a specified bucket capacity and fill rate. The fill rate dictates the sustained requests per second (RPS) permitted for a key, while transient overages over the fill rate are accommodated for brief periods, as determined by the bucket capacity.

flowchart LR
  classDef TokenBucket fill:#F8773D,stroke:#000000,stroke-width:2px;
  classDef Service fill:#56AE89,stroke:#000000,stroke-width:2px;
  subgraph Cloud
    TB[\Token Bucket/]
    class TB TokenBucket
  end
  class Cloud Service
  TB <-- "counting tokens" --> SDK
  subgraph "SDK"
  end
  class SDK Service

The diagram shows how the Aperture SDK interacts with a global token bucket to decide whether to allow or reject a request. Each call consumes tokens from the bucket. If the bucket has run out of tokens, the rate limit has been reached and the incoming request is rejected. If tokens are available, the request is accepted. The token bucket is continually replenished at a predefined fill rate, up to the maximum number of tokens specified by the bucket capacity.
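The accept/reject decision described above can be sketched as a small per-key token bucket. This is an illustrative model only, not Aperture's distributed implementation: the class names, the `allow` method, and the explicit `now` timestamp argument are assumptions made for the example, and each request is assumed to cost one token.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """One bucket per key (illustrative; not Aperture's implementation)."""
    capacity: float      # maximum number of tokens the bucket can hold
    fill_rate: float     # tokens added per second
    tokens: float        # tokens currently available
    last_refill: float   # timestamp (seconds) of the last refill

    def try_take(self, now: float, cost: float = 1.0) -> bool:
        # Replenish for the elapsed time, never exceeding the capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.fill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True    # tokens available: accept
        return False       # bucket exhausted: reject


class PerKeyRateLimiter:
    """Keeps an independent bucket for every key (for example, a user_id)."""

    def __init__(self, capacity: float, fill_rate: float):
        self.capacity = capacity
        self.fill_rate = fill_rate
        self.buckets = {}  # key -> TokenBucket

    def allow(self, key: str, now: float) -> bool:
        bucket = self.buckets.get(key)
        if bucket is None:
            # Assumption: a key seen for the first time starts with a full bucket.
            bucket = TokenBucket(self.capacity, self.fill_rate, self.capacity, now)
            self.buckets[key] = bucket
        return bucket.try_take(now)
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real limiter would read a clock and share bucket state across instances.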

note

The following policy is based on the Rate Limiting blueprint.

Rate Limiting with Aperture SDK

The first step in using the Aperture SDK is to import and set up the Aperture Client.


Start the flow with StartFlow, passing in a Control Point and the labels needed to decide whether the request should proceed. The function Flow.ShouldRun() checks whether the flow allows the request. The Flow.End() function sends telemetry and updates the specified cache entry within Aperture.
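The start, check, end sequence described above can be illustrated with local stand-ins. The sketch below does not use the Aperture SDK: `StubClient`, `StubFlow`, and the snake_case methods `start_flow`, `should_run`, and `end` are hypothetical names that only mirror the shape of StartFlow, Flow.ShouldRun(), and Flow.End(), and the stub's fixed per-user limit is an assumption made so the example runs on its own.

```python
class StubFlow:
    """Hypothetical stand-in for the SDK's flow object."""

    def __init__(self, accepted: bool):
        self._accepted = accepted
        self.ended = False

    def should_run(self) -> bool:
        return self._accepted

    def end(self) -> None:
        # The real SDK would send telemetry here; the stub only records the call.
        self.ended = True


class StubClient:
    """Hypothetical stand-in: accepts the first `limit` flows per user_id."""

    def __init__(self, limit: int):
        self.limit = limit
        self.seen = {}  # (control_point, user_id) -> number of flows started

    def start_flow(self, control_point: str, labels: dict) -> StubFlow:
        key = (control_point, labels.get("user_id"))
        self.seen[key] = self.seen.get(key, 0) + 1
        return StubFlow(self.seen[key] <= self.limit)


def handle_request(client: StubClient, user_id: str) -> int:
    flow = client.start_flow("awesomeFeature", {"user_id": user_id})
    try:
        if flow.should_run():
            return 200   # the protected work would run here
        return 429       # rejected by the rate limiter
    finally:
        flow.end()       # always end the flow, accepted or not
```

Ending the flow in a `finally` block is a design choice of this sketch: it ensures the end step runs for both accepted and rejected requests, even if the protected work raises.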


Configuration

This policy is based on the Rate Limiting blueprint. It applies a rate limiter to the awesomeFeature control point and identifies unique users by the user_id label.

Each user is allowed 2 requests per 1s (one-second) interval, with bursts of up to 40 requests. A user can therefore send up to 40 requests at once while the bucket is full, and then 2 requests per second after that, because the bucket is replenished at the fill rate of 2 tokens per second.
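The arithmetic behind these numbers can be checked with a short calculation. This is a sketch under two assumptions that the text does not state: each request costs one token, and the bucket refills continuously. The function names are illustrative.

```python
import math

BUCKET_CAPACITY = 40   # tokens, from bucket_capacity
FILL_AMOUNT = 2        # tokens added per interval, from fill_amount
INTERVAL_S = 1.0       # seconds, from interval


def max_accepted(window_s: float) -> int:
    """Upper bound on accepted requests in a window that starts with a full bucket."""
    fill_rate = FILL_AMOUNT / INTERVAL_S              # tokens per second
    return BUCKET_CAPACITY + math.floor(fill_rate * window_s)


def seconds_to_refill() -> float:
    """Time for an empty bucket to become full again if the user stays idle."""
    return BUCKET_CAPACITY / (FILL_AMOUNT / INTERVAL_S)
```

Under these assumptions, `max_accepted(10)` gives 60 (the burst of 40 plus 2 per second for 10 seconds), and `seconds_to_refill()` gives 20.0, the idle time a user needs before another full burst is available.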

The values.yaml file below can be generated by following the steps in the Installation section.

# yaml-language-server: $schema=../../../../../blueprints/rate-limiting/base/gen/definitions.json
blueprint: rate-limiting/base
uri: ../../../../../blueprints
policy:
  policy_name: "static-rate-limiting"
  rate_limiter:
    bucket_capacity: 40
    fill_amount: 2
    selectors:
      - control_point: "awesomeFeature"
    parameters:
      limit_by_label_key: "user_id"
      interval: 1s

Generated Policy

apiVersion: fluxninja.com/v1alpha1
kind: Policy
metadata:
  labels:
    fluxninja.com/validate: "true"
  name: static-rate-limiting
spec:
  circuit:
    components:
      - flow_control:
          rate_limiter:
            in_ports:
              bucket_capacity:
                constant_signal:
                  value: 40
              fill_amount:
                constant_signal:
                  value: 2
            out_ports:
              accept_percentage:
                signal_name: ACCEPT_PERCENTAGE
            parameters:
              interval: 1s
              limit_by_label_key: user_id
            request_parameters: {}
            selectors:
              - control_point: awesomeFeature
      - decider:
          in_ports:
            lhs:
              signal_name: ACCEPT_PERCENTAGE
            rhs:
              constant_signal:
                value: 90
          operator: gte
          out_ports:
            output:
              signal_name: ACCEPT_PERCENTAGE_ALERT
      - alerter:
          in_ports:
            signal:
              signal_name: ACCEPT_PERCENTAGE_ALERT
          parameters:
            alert_name: More than 90% of requests are being rate limited
    evaluation_interval: 1s
  resources:
    flow_control:
      classifiers: []

info

Circuit Diagram for this policy.

Installation

Generate a values file for the policy using the following command:

aperturectl blueprints values --name=rate-limiting/base --version=main --output-file=values.yaml

Apply the policy using the aperturectl CLI or kubectl.

aperturectl cloud blueprints apply --values-file=values.yaml

Policy in Action

When the policy is applied at a service, no more than 2 requests per second are accepted for each user once the initial burst of 40 requests has been consumed.
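To see how a single user's accepted traffic settles at the fill rate, the behaviour can be approximated with a coarse per-second simulation. This is a simplified sketch, not a model of Aperture's internals: it assumes the bucket is topped up once at the start of each second rather than continuously, that each request costs one token, and that the user offers a constant load. The function name and parameters are illustrative.

```python
def accepted_per_second(capacity, fill_per_second, offered_per_second, seconds):
    """Discrete sketch: refill once per second, then serve that second's requests."""
    tokens = capacity              # the bucket starts full
    accepted = []
    for second in range(seconds):
        if second > 0:
            # Top up at the start of each later second, capped at capacity.
            tokens = min(capacity, tokens + fill_per_second)
        served = min(offered_per_second, tokens)
        tokens -= served
        accepted.append(served)
    return accepted


# A user offering 10 requests per second against the policy's 40 / 2 settings.
print(accepted_per_second(40, 2, 10, 8))  # [10, 10, 10, 10, 8, 2, 2, 2]
```

In this run the burst allowance is used up during the fifth second, after which only 2 requests per second are accepted.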

Static Rate Limiting