
Rate Limiting & Throttling

Distributed systems need to protect themselves from abuse, overload, and uneven usage.
Two key techniques are rate limiting and throttling.


1. Rate Limiting

Concept

  • Restricts the number of requests per user, IP, or service within a given time window.
  • Prevents abuse (e.g., brute-force login attempts, API spamming).

Algorithms

  1. Fixed Window

    • Allows at most X requests per fixed window (e.g., per second or per minute).
    • Simple, but bursts can slip through at window boundaries (up to 2X across two adjacent windows).
  2. Sliding Window

    • Tracks requests over a rolling interval rather than fixed boundaries.
    • Smoother enforcement than a fixed window, at the cost of more bookkeeping.
  3. Token Bucket

    • Tokens are added to a bucket at a fixed rate.
    • Each request consumes one token; requests are rejected when the bucket is empty.
    • Allows short bursts up to the bucket capacity (a minimal sketch follows this list).
  4. Leaky Bucket

    • Requests drain from a queue at a constant rate.
    • Excess requests are queued or dropped once the queue is full.
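
A minimal token bucket sketch in Python, assuming an in-process, single-threaded limiter; the TokenBucket class name and its parameters are illustrative rather than taken from any specific library:

  import time

  class TokenBucket:
      """Refills at `rate` tokens per second, up to `capacity` tokens."""

      def __init__(self, rate: float, capacity: float):
          self.rate = rate              # tokens added per second
          self.capacity = capacity     # maximum burst size
          self.tokens = capacity       # start with a full bucket
          self.last_refill = time.monotonic()

      def allow(self) -> bool:
          """Return True if a request may proceed, consuming one token."""
          now = time.monotonic()
          # Refill based on elapsed time, capped at capacity.
          self.tokens = min(self.capacity,
                            self.tokens + (now - self.last_refill) * self.rate)
          self.last_refill = now
          if self.tokens >= 1:
              self.tokens -= 1
              return True
          return False

  # Example: 5 requests per second on average, with bursts of up to 10.
  bucket = TokenBucket(rate=5, capacity=10)
  print("accepted" if bucket.allow() else "rejected")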

2. Throttling

Concept

  • Slows down requests once usage exceeds a threshold.
  • Instead of blocking requests outright, it adds delays or lowers their priority.

Example

  • A streaming service reduces video quality when bandwidth is limited.
  • An API may slow its responses under heavy load (a sketch of this pattern follows).
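
A minimal sketch of delay-based throttling in Python, assuming a single process and a hypothetical handle_request downstream handler (stubbed here); a real service would track load per client and apply back-pressure asynchronously:

  import time
  from collections import deque

  WINDOW_SECONDS = 1.0
  SOFT_LIMIT = 100        # above this, requests are delayed instead of rejected
  DELAY_SECONDS = 0.05    # back-pressure applied once the soft limit is exceeded

  recent = deque()        # timestamps of requests seen in the rolling window

  def handle_request(request):
      # Placeholder for the real downstream handler.
      return f"handled {request}"

  def throttled_handle(request):
      """Delay requests above the soft limit rather than dropping them."""
      now = time.monotonic()
      recent.append(now)
      # Evict timestamps that have fallen out of the rolling window.
      while recent and now - recent[0] > WINDOW_SECONDS:
          recent.popleft()
      if len(recent) > SOFT_LIMIT:
          time.sleep(DELAY_SECONDS)   # slow the caller down instead of erroring out
      return handle_request(request)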

3. Use Cases

  • APIs → prevent one user from hogging resources.
  • Login systems → prevent brute-force attacks.
  • Databases → protect from overload.
  • Streaming/CDNs → adaptive throttling under load.

4. Real-World Examples

  • NGINX / Envoy → built-in rate-limiting modules.
  • Cloudflare / AWS API Gateway → configurable per-client rate limits.
  • Twitter API → strict per-user and per-app limits.

5. Interview Tips

  • Say: “I’d use rate limiting to prevent abuse and throttling to gracefully handle overload.”
  • Mention algorithms (token bucket, leaky bucket).
  • Tie to security (preventing DDoS) and fairness (every client gets a fair share).

6. Diagram

[ Client Requests ] → [ Rate Limiter ] → [ Service ]
  • Excess requests are dropped (rate limiting) or delayed (throttling), as sketched below.
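
A compact sketch of this flow in Python, assuming a hypothetical forward_to_service backend call (stubbed here) and a simple fixed-window counter; real gateways such as NGINX, Envoy, or AWS API Gateway implement this in configuration rather than application code:

  import time

  HARD_LIMIT = 50     # requests per second above this are dropped (rate limit)
  SOFT_LIMIT = 40     # requests per second above this are delayed (throttle)

  window_start = time.monotonic()
  count = 0

  def forward_to_service(request):
      # Placeholder for the call to the protected backend service.
      return f"ok: {request}"

  def gateway(request):
      """Drop requests above the hard limit; delay those above the soft limit."""
      global window_start, count
      now = time.monotonic()
      if now - window_start >= 1.0:     # start a new one-second window
          window_start, count = now, 0
      count += 1
      if count > HARD_LIMIT:
          return "429 Too Many Requests"    # rate limit: drop
      if count > SOFT_LIMIT:
          time.sleep(0.1)                   # throttle: delay before forwarding
      return forward_to_service(request)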
