Skip to main content
Version: development

Rate Limiter

The Rate Limiter is a powerful component that can be used to prevent recurring overloads by proactively regulating heavy-hitters. It achieves this by accepting or rejecting incoming flows based on per-label limits, which are configured using the token bucket algorithm. The Rate Limiter is a component of Aperture's policy system, and it can be configured to work with different labels and limits depending on the needs of your application.

Distributed Counters

For each configured Rate Limiter Component, every matching Aperture Agent instantiates a copy of the Rate Limiter. Although each Agent has its own copy of the component, they all share counters through a distributed cache. This means that they work together as a single Rate Limiter, providing seamless coordination and control across agents. The agents within an agent group constantly share state and detect failures using a gossip protocol.

Token Bucket Algorithm

This algorithm allows users to execute a substantial number of requests in bursts, and then continue at a steady rate. Here are the key points to understand about the token bucket metaphor:

  • Each user (or any flow label) has access to a bucket, which can hold, say, 60 "tokens".
  • Every second, a token is added to the bucket (if there's room). In this way, the bucket is steadily refilled over time.
  • Each API request requires the user to remove a token from the bucket.
  • If the bucket is empty, the user gets an error and has to wait for new tokens to be added to the bucket before making more requests.

This model ensures that apps that handle API calls judiciously will always have a supply of tokens for a burst of requests when necessary. For example, if users average 20 requests ("tokens") per second but suddenly need to make 30 requests at once, users can do so if they have accumulated enough tokens. The basic principles of the token bucket algorithm apply to all our rate limits, regardless of the specific methods used to implement them.

Lazy Syncing

When lazy syncing is enabled, rate-limiting counters are stored in-memory and are only synchronized between Aperture Agent instances on-demand. This allows for fast and low-latency rate-limiting decisions, at the cost of slight inaccuracy within a (small) time window (sync interval).


The Rate Limiter component accepts or rejects incoming flows based on per-label limits, configured as the maximum number of requests per a given period of time. The rate-limiting label is chosen from the flow-label with a specific key, enabling you to configure separate limits for different users or flows.


The limit value is treated as a signal within the circuit and can be dynamically modified or disabled at runtime.


The override mechanism allows for increasing the limit for a particular value of a label. For instance, you might want to increase the limit for the admin user. Please refer to the reference for more details on how to use this feature.