Version: 2.33.1

Rate Limiter

The Rate Limiter component can be used to ensure fair access and manage costs by regulating the number of requests made by an entity over time. It achieves this by accepting or rejecting incoming requests based on per-label limits, which are configured using the token bucket algorithm. Instead of measuring the number of requests, the Rate Limiter can also be configured to measure the number of tokens associated with a request. Tokens can be sent as flow labels using Aperture SDKs.

The Rate Limiter is a component of Aperture's policy system, and it can be configured to work with different labels and limits depending on the needs of an application.

The following example creates a Rate Limiter at the ingress control point for service checkout.default.svc.cluster.local. A rate limit of 2 requests per second with a burst capacity of 40 is applied per unique value of http.request.header.user_id flow label:

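- flow_control:
    # Excerpt only: the intermediate keys of the policy are not shown in this listing.
    value: 40 # burst capacity (bucket size)
    value: 2 # tokens added to the bucket per interval
    interval: 1s # refill interval
    label_key: http.request.header.user_id # one bucket per unique value of this label
    # Selector: where the Rate Limiter is applied
    - control_point: ingress
      service: checkout.default.svc.cluster.local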
Distributed Counters

For each configured Rate Limiter Component, every matching Aperture Agent instantiates a copy of the Rate Limiter. Although each agent has its own copy of the component, they all share counters through a distributed cache. This means that they work together as a single Rate Limiter, providing seamless coordination and control across Agents. The Agents within an agent group constantly share state and detect failures using a gossip protocol.
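The idea of several limiter copies acting as one can be illustrated with a small sketch. It is a simplification under stated assumptions: an in-process dictionary stands in for the distributed cache, there is no refill and no gossip protocol, and the class names `SharedCounters` and `AgentLimiter` are hypothetical rather than Aperture APIs.

```python
from collections import defaultdict
from typing import DefaultDict


class SharedCounters:
    """Stand-in for the distributed cache: one usage counter per label value."""

    def __init__(self) -> None:
        self._used: DefaultDict[str, int] = defaultdict(int)

    def try_increment(self, key: str, limit: int) -> bool:
        # Reject once the shared count for this key has reached the limit.
        if self._used[key] >= limit:
            return False
        self._used[key] += 1
        return True


class AgentLimiter:
    """Each agent has its own limiter copy but counts against the shared store."""

    def __init__(self, name: str, counters: SharedCounters, limit: int) -> None:
        self.name = name
        self._counters = counters
        self._limit = limit

    def allow(self, label_value: str) -> bool:
        return self._counters.try_increment(label_value, self._limit)


counters = SharedCounters()
agent_a = AgentLimiter("agent-a", counters, limit=3)
agent_b = AgentLimiter("agent-b", counters, limit=3)
# Requests for the same user alternate between agents but share one budget.
decisions = [agent_a.allow("u1"), agent_b.allow("u1"),
             agent_a.allow("u1"), agent_b.allow("u1")]
```

Because both agents consult the same counters, the fourth request for `u1` is rejected regardless of which agent receives it, while a different label value such as `u2` still has its full budget.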

Token Bucket Algorithm

This algorithm allows users to run a substantial number of requests in bursts, and then continue at a steady rate. Here are the key points to understand about the token bucket algorithm:

  • Each user (or, more generally, each unique value of the chosen flow label) has access to a bucket, which can hold, say, 60 "tokens".
  • Every second, a token is added to the bucket (if there's room). In this way, the bucket is steadily refilled over time.
  • Each API request requires the user to remove a token from the bucket.
  • If the bucket is empty, the user gets an error and has to wait for new tokens to be added to the bucket before making more requests.

This model ensures that apps that handle API calls judiciously will always have a supply of tokens for a burst of requests when necessary. For example, if users average 20 requests ("tokens") per second, below the refill rate, unused tokens accumulate in the bucket. If those users suddenly need to make 30 requests at once, they can do so provided the bucket holds at least 30 tokens at that moment.
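The key points above can be sketched as a small class. This is a minimal illustration rather than Aperture's implementation: it assumes the bucket starts full and refills continuously, and the `TokenBucket` name, its parameters, and the injectable `clock` are choices made for this example.

```python
import time
from typing import Callable


class TokenBucket:
    def __init__(self, capacity: float, fill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.fill_rate = fill_rate  # tokens added per second
        self._clock = clock
        self._tokens = capacity     # assumption: the bucket starts full
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.fill_rate)
        self._last = now

    def try_take(self, tokens: float = 1.0) -> bool:
        """Remove tokens if enough are available; otherwise reject the request."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

With a capacity of 60 and a fill rate of 1 token per second, as in the list above, a caller can make 60 requests back to back, is rejected on the next one, and regains one request per second afterwards. The `tokens` argument allows a single request to consume more than one token, mirroring the option of measuring tokens instead of requests.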

Lazy Syncing

When lazy syncing is enabled, rate-limiting counters are stored in memory and are synchronized between Aperture Agent instances only on demand. This allows fast, low-latency rate-limiting decisions, at the cost of slight inaccuracy within a small time window (the sync interval).
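The trade-off can be seen in a toy model. The sketch below is a generic illustration and not Aperture's synchronization protocol, which this page does not describe: each agent decides from in-memory state and exchanges counts only when `sync()` is called. `GlobalStore`, `LazyAgent`, and `sync` are hypothetical names.

```python
class GlobalStore:
    """Stand-in for the state shared between agents."""

    def __init__(self) -> None:
        self.used = 0


class LazyAgent:
    def __init__(self, store: GlobalStore, limit: int) -> None:
        self._store = store
        self._limit = limit
        self._seen_global = 0  # global count as of the last sync
        self._pending = 0      # local admissions not yet published

    def allow(self) -> bool:
        # Decide from in-memory state only; no remote call on the hot path.
        if self._seen_global + self._pending >= self._limit:
            return False
        self._pending += 1
        return True

    def sync(self) -> None:
        # Publish local admissions, then refresh the view of the global count.
        self._store.used += self._pending
        self._pending = 0
        self._seen_global = self._store.used


store = GlobalStore()
a, b = LazyAgent(store, limit=4), LazyAgent(store, limit=4)
# Between syncs, each agent sees only its own admissions.
admitted = sum(a.allow() for _ in range(3)) + sum(b.allow() for _ in range(3))
```

In this run the two agents together admit 6 requests against a limit of 4, because neither has seen the other's count yet. Once both have synced, each rejects further requests. A shorter sync interval narrows the window in which this overshoot can occur.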


The Rate Limiter component accepts or rejects incoming flows based on per-label limits, configured as the maximum number of requests allowed in a given period of time. The rate-limiting label is the flow label with a specific key, which enables distinct limits per user, as identified by unique values of that label.


Refer to the Per-user Rate Limiting guide for more information on how to use the Rate Limiter with the aperture-js SDK.