
Per-user Rate Limiting

Overview

Note to Developers

For implementing rate limiting using Aperture SDKs, refer to the developer-centric Rate Limiting Guide.

Rate limiting is a critical strategy for managing the load on an API. By imposing restrictions on the number of requests a unique consumer can make within a specific time frame, rate limiting prevents a small set of users from monopolizing the majority of resources on a service, ensuring fair access for all API consumers.

Aperture implements this strategy through its high-performance, distributed rate limiter. This system enforces per-key limits based on fine-grained labels, thereby offering precise control over API usage. For each unique key, Aperture maintains a token bucket of a specified bucket capacity and fill rate. The fill rate dictates the sustained requests per second (RPS) permitted for a key, while transient overages over the fill rate are accommodated for brief periods, as determined by the bucket capacity.

flowchart LR
    classDef TokenBucket fill:#F8773D,stroke:#000000,stroke-width:2px;
    classDef Agent fill:#56AE89,stroke:#000000,stroke-width:2px;
    classDef Signal fill:#EFEEED,stroke:#000000,stroke-width:1px;
    classDef Service fill:#56AE89,stroke:#000000,stroke-width:2px;
    Forward("Bucket Capacity") --> TB
    Reset("Fill Amount") --> TB
    TB[\Token Bucket/]
    class TB TokenBucket
    TB <-- "Counting" --> Agents
    subgraph " "
        Client -- "req/s" --> Agents
        class Client Service
        subgraph "Agents"
        end
        class Agents Agent
        Agents --> Server
        Agents --> Server
        class Server Service
    end

The diagram depicts the distribution of tokens across Agents through a global token bucket. Each incoming request prompts the Agents to decrement tokens from the bucket. If the bucket has run out of tokens, indicating that the rate limit has been reached, the incoming request is rejected. Conversely, if tokens are available in the bucket, the request is accepted. The token bucket is continually replenished at a predefined fill rate, up to the maximum number of tokens specified by the bucket capacity.
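The token bucket mechanics described above can be sketched in a few lines. This is a single-process, illustrative version only: Aperture's real limiter is distributed across Agents and shares one global bucket per label key. The parameter names mirror the blueprint fields (`bucket_capacity`, `fill_amount`, `interval`).

```python
class TokenBucket:
    """Local sketch of the token-bucket algorithm described above.
    `fill_amount` tokens are added per `interval` seconds, capped at
    `bucket_capacity`. Not Aperture's implementation, just the idea."""

    def __init__(self, bucket_capacity: float, fill_amount: float, interval: float = 1.0):
        self.capacity = bucket_capacity
        self.fill_rate = fill_amount / interval  # sustained tokens per second
        self.tokens = bucket_capacity            # a fresh bucket starts full
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Replenish tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.fill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1  # each accepted request spends one token
            return True
        return False          # bucket empty: rate limit reached, reject

# 50 requests arriving at t=0 exhaust the 40-token burst;
# two more tokens become available one second later.
bucket = TokenBucket(bucket_capacity=40, fill_amount=2, interval=1.0)
burst = sum(bucket.allow(0.0) for _ in range(50))
print(burst)              # → 40
print(bucket.allow(1.0))  # → True (2 tokens refilled at t=1s)
```

Passing the clock in explicitly keeps the sketch deterministic; a production limiter would read a monotonic clock instead.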

note

The following policy is based on the Rate Limiting blueprint.

Configuration

This policy applies a rate limiter to the ingress control point of the service catalog-service.prod.svc.cluster.local and identifies unique users by the user_id header in the HTTP traffic. Provided by the Envoy proxy, this header is available under the flow-label key http.request.header.user_id (see Flow Labels for more information).

Each user is allowed 2 requests per 1-second period, with a burst allowance of up to 40 requests. This means a user can send up to 40 requests in the first second, and then 2 requests every second after that, as the bucket is replenished at the fill rate of 2 requests per second.
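The arithmetic above can be expressed as a simple upper bound, using the values from this policy (illustrative only; the variable names mirror the blueprint fields):

```python
# Upper bound on how many requests one user can have accepted within the
# first t seconds under this policy.
bucket_capacity = 40  # burst allowance (tokens in a full bucket)
fill_amount = 2       # tokens added per interval
interval = 1          # interval length in seconds

def max_accepted(t_seconds: int) -> int:
    # A full bucket, plus everything replenished while t_seconds elapse.
    return bucket_capacity + fill_amount * t_seconds // interval

print(max_accepted(0))   # → 40: the initial burst
print(max_accepted(10))  # → 60: 40 burst + 2 rps sustained for 10 s
```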

The values.yaml file below can be generated by following the steps in the Installation section.

# yaml-language-server: $schema=../../../../../blueprints/rate-limiting/base/gen/definitions.json
blueprint: rate-limiting/base
uri: ../../../../../../blueprints
policy:
  policy_name: "static-rate-limiting"
  rate_limiter:
    bucket_capacity: 40
    fill_amount: 2
    selectors:
      - service: "catalog-service.prod.svc.cluster.local"
        control_point: "ingress"
        agent_group: "default"
    parameters:
      limit_by_label_key: "http.request.header.user_id"
      interval: 1s

Generated Policy

apiVersion: fluxninja.com/v1alpha1
kind: Policy
metadata:
  labels:
    fluxninja.com/validate: "true"
  name: static-rate-limiting
spec:
  circuit:
    components:
      - flow_control:
          rate_limiter:
            in_ports:
              bucket_capacity:
                constant_signal:
                  value: 40
              fill_amount:
                constant_signal:
                  value: 2
            out_ports:
              accept_percentage:
                signal_name: ACCEPT_PERCENTAGE
            parameters:
              interval: 1s
              limit_by_label_key: http.request.header.user_id
            request_parameters: {}
            selectors:
              - agent_group: default
                control_point: ingress
                service: catalog-service.prod.svc.cluster.local
      - decider:
          in_ports:
            lhs:
              signal_name: ACCEPT_PERCENTAGE
            rhs:
              constant_signal:
                value: 90
          operator: gte
          out_ports:
            output:
              signal_name: ACCEPT_PERCENTAGE_ALERT
      - alerter:
          in_ports:
            signal:
              signal_name: ACCEPT_PERCENTAGE_ALERT
          parameters:
            alert_name: More than 90% of requests are being rate limited
            evaluation_interval: 1s
  resources:
    flow_control:
      classifiers: []

info

Circuit Diagram for this policy.

Installation

Generate a values file specific to the policy using the command below.

aperturectl blueprints values --name=rate-limiting/base --version=main --output-file=values.yaml

Apply the policy using the aperturectl CLI or kubectl.

aperturectl cloud blueprints apply --values-file=values.yaml

Policy in Action

When the policy is applied to a service, no more than 2 requests per second are accepted for a given user after the initial burst of 40 requests.
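This behavior can be illustrated with a self-contained local simulation (not Aperture itself; it refills the bucket once per interval for simplicity, whereas the real limiter replenishes continuously). A user sending 10 requests per second first drains the 40-token burst, then settles at the 2 rps fill rate:

```python
# Simulate 30 one-second intervals of a user sending 10 req/s against a
# bucket with capacity 40 and a fill of 2 tokens per second.
capacity, fill_per_sec, demand_per_sec = 40, 2, 10

tokens = capacity          # a fresh bucket starts full
accepted_per_sec = []
for second in range(30):
    if second > 0:
        tokens = min(capacity, tokens + fill_per_sec)  # per-interval refill
    accepted = 0
    for _ in range(demand_per_sec):
        if tokens >= 1:    # a token is available: accept and spend it
            tokens -= 1
            accepted += 1
    accepted_per_sec.append(accepted)

print(accepted_per_sec[:6])  # → [10, 10, 10, 10, 8, 2]: the burst drains
print(accepted_per_sec[-1])  # → 2: steady state equals the fill rate
```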

[Figure: Static Rate Limiting]