Rate Limiter Algorithms: A Comprehensive Guide

Master Spring Ter
4 min read · Jul 24, 2024

In the world of web services and APIs, rate limiting is a crucial technique for controlling the rate of incoming requests. It helps prevent abuse, ensures fair usage, and protects services from being overwhelmed. This article will explore various rate limiter algorithms, their implementations, and use cases.

1. Token Bucket Algorithm

The Token Bucket algorithm is one of the most popular and versatile rate limiting algorithms.

How it works:

  • Imagine a bucket that holds tokens.
  • Tokens are added to the bucket at a fixed rate.
  • When a request arrives, it must obtain a token from the bucket to proceed.
  • If the bucket is empty, the request is denied or delayed.

Implementation:

import time

class TokenBucket:
    def __init__(self, capacity, fill_rate):
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.fill_rate = fill_rate  # tokens added per second
        self.tokens = capacity
        self.last_fill = time.time()

    def allow(self):
        now = time.time()
        time_passed = now - self.last_fill
        # Refill tokens accrued since the last call, capped at capacity
        self.tokens = min(self.capacity, self.tokens + time_passed * self.fill_rate)
        self.last_fill = now

        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Usage
limiter = TokenBucket(capacity=10, fill_rate=1)  # 10 tokens, refill 1 token per second
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Allows for bursts of traffic
  • Smooth rate limiting over time

Cons:

  • Can be memory-intensive for many users
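To make the memory cost concrete, here is a minimal sketch (restating the TokenBucket class above) of per-client rate limiting. The client keys and the `defaultdict` wrapper are illustrative assumptions; the point is that each distinct client requires its own bucket's worth of state.

```python
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, capacity, fill_rate):
        self.capacity = capacity
        self.fill_rate = fill_rate
        self.tokens = capacity
        self.last_fill = time.time()

    def allow(self):
        now = time.time()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_fill) * self.fill_rate)
        self.last_fill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per client key: state grows linearly with distinct clients
buckets = defaultdict(lambda: TokenBucket(capacity=10, fill_rate=1))

for client in ["alice", "bob", "alice"]:
    buckets[client].allow()

print(len(buckets))  # 2 distinct clients -> 2 buckets held in memory
```

In production this per-client state often lives in a shared store such as Redis rather than process memory, which is where the memory concern shows up at scale.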

2. Leaky Bucket Algorithm

The Leaky Bucket algorithm is similar to Token Bucket but focuses on smoothing out bursty traffic.

How it works:

  • Imagine a bucket with a small hole in the bottom.
  • Requests are added to the top of the bucket.
  • Requests leak out of the hole at a constant rate.
  • If the bucket overflows, new requests are discarded.

Implementation:

import time

class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # maximum volume the bucket can queue
        self.leak_rate = leak_rate  # units drained per second
        self.current_volume = 0
        self.last_leak = time.time()

    def allow(self, volume=1):
        now = time.time()
        time_passed = now - self.last_leak
        # Drain the volume that has leaked out since the last call
        self.current_volume = max(0, self.current_volume - time_passed * self.leak_rate)
        self.last_leak = now

        if self.current_volume + volume <= self.capacity:
            self.current_volume += volume
            return True
        return False

# Usage
limiter = LeakyBucket(capacity=10, leak_rate=1)  # 10 unit capacity, leak 1 unit per second
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Smooths out bursty traffic
  • Good for network traffic shaping

Cons:

  • Less flexible than Token Bucket
  • Can’t handle short-term bursts well

3. Fixed Window Counter

The Fixed Window Counter is a simple algorithm that counts requests in fixed time windows.

How it works:

  • Define a fixed time window (e.g., 1 minute)
  • Count requests within each window
  • Reset the counter at the start of each new window

Implementation:

import time

class FixedWindowCounter:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.current_window = int(time.time() / window)
        self.counter = 0

    def allow(self):
        current_time = int(time.time() / self.window)
        # A new window has started: reset the counter
        if current_time > self.current_window:
            self.current_window = current_time
            self.counter = 0

        if self.counter < self.limit:
            self.counter += 1
            return True
        return False

# Usage
limiter = FixedWindowCounter(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Simple to understand and implement
  • Low memory usage

Cons:

  • Can allow twice the rate limit if requests cluster around the window boundary
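To make the boundary problem concrete, here is a small simulation of the same counter logic. The injectable `clock` parameter is an addition for testing only, so we can place requests just before and just after a window boundary.

```python
# Same fixed-window logic as above, with an injectable clock for simulation
class WindowedCounter:
    def __init__(self, limit, window, clock):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.current_window = int(clock() / window)
        self.counter = 0

    def allow(self):
        current_window = int(self.clock() / self.window)
        if current_window > self.current_window:
            self.current_window = current_window
            self.counter = 0
        if self.counter < self.limit:
            self.counter += 1
            return True
        return False

now = [59.0]  # fake clock: just before the 60-second boundary
limiter = WindowedCounter(limit=10, window=60, clock=lambda: now[0])

burst_1 = sum(limiter.allow() for _ in range(10))  # 10 allowed at t=59s
now[0] = 61.0                                      # 2 seconds later, counter resets
burst_2 = sum(limiter.allow() for _ in range(10))  # 10 more allowed at t=61s

print(burst_1 + burst_2)  # 20 requests allowed within ~2 seconds
```

Twenty requests pass within two seconds even though the nominal limit is 10 per minute, which is exactly the clustering effect described above.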

4. Sliding Window Log

The Sliding Window Log algorithm keeps a log of timestamps for each request.

How it works:

  • Store a timestamp for each request
  • Remove timestamps older than the rate limit window
  • Count the number of remaining timestamps

Implementation:

import time

class SlidingWindowLog:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.log = []         # timestamps of accepted requests

    def allow(self):
        now = time.time()
        # Drop timestamps that have fallen out of the window
        self.log = [timestamp for timestamp in self.log if timestamp > now - self.window]

        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False

# Usage
limiter = SlidingWindowLog(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • More precise than fixed window
  • Handles boundary conditions well

Cons:

  • Can be memory-intensive for high volumes

5. Sliding Window Counter

The Sliding Window Counter combines the Fixed Window Counter with a weighted previous window.

How it works:

  • Use a fixed window counter
  • Consider a portion of the previous window’s count based on the overlap
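A worked example may help before the implementation. The numbers here are hypothetical: a 60-second window with a limit of 10, where we are 15 seconds into the current window, the previous window saw 8 requests, and the current one has seen 3.

```python
window = 60
previous_counter = 8            # requests in the previous window
counter = 3                     # requests so far in the current window
elapsed = 15 / window           # fraction of the current window elapsed (0.25)

# Weight the previous window by how much of it still overlaps the sliding window
rate = counter + previous_counter * (1 - elapsed)
print(rate)  # 3 + 8 * 0.75 = 9.0, still under the limit of 10
```

The estimate assumes requests in the previous window were evenly distributed, which is the approximation this algorithm trades for its low memory use.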

Implementation:

import time

class SlidingWindowCounter:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.current_window = int(time.time() / window)
        self.counter = 0
        self.previous_counter = 0

    def allow(self):
        current_time = time.time()
        current_window = int(current_time / self.window)

        if current_window > self.current_window:
            # If more than one full window has passed, the old count is stale
            if current_window == self.current_window + 1:
                self.previous_counter = self.counter
            else:
                self.previous_counter = 0
            self.counter = 0
            self.current_window = current_window

        # Weight the previous window by its remaining overlap with the sliding window
        window_elapsed = (current_time % self.window) / self.window
        rate = self.counter + self.previous_counter * (1 - window_elapsed)

        if rate < self.limit:
            self.counter += 1
            return True
        return False

# Usage
limiter = SlidingWindowCounter(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Smooth rate limiting
  • Handles boundary conditions well

Cons:

  • Slightly more complex to implement

Conclusion

Each rate limiting algorithm has its strengths and use cases:

  1. Token Bucket: Best for allowing short bursts of traffic.
  2. Leaky Bucket: Ideal for smoothing out traffic and shaping network usage.
  3. Fixed Window Counter: Simple and memory-efficient, but less precise.
  4. Sliding Window Log: Precise, but potentially memory-intensive.
  5. Sliding Window Counter: A good balance of precision and efficiency.

When implementing rate limiting, consider factors such as the nature of your traffic, the resources you’re protecting, and the granularity of control you need. Often, a combination of these algorithms or a custom implementation might be the best solution for complex systems.

Remember, the goal of rate limiting is not just to protect your services, but also to ensure fair usage and maintain a high-quality experience for all users. Regular monitoring and adjustments to your rate limiting strategy are key to achieving this balance.

written/generated by: ChatGPT — Master Spring TER / https://claude.ai
