Rate Limiter Algorithms: A Comprehensive Guide

Master Spring Ter
4 min read · Jul 24, 2024

In the world of web services and APIs, rate limiting is a crucial technique for controlling the rate of incoming requests. It helps prevent abuse, ensures fair usage, and protects services from being overwhelmed. This article will explore various rate limiter algorithms, their implementations, and use cases.

1. Token Bucket Algorithm

The Token Bucket algorithm is one of the most popular and versatile rate limiting algorithms.

How it works:

  • Imagine a bucket that holds tokens.
  • Tokens are added to the bucket at a fixed rate.
  • When a request arrives, it must obtain a token from the bucket to proceed.
  • If the bucket is empty, the request is denied or delayed.

Implementation:

import time

class TokenBucket:
    def __init__(self, capacity, fill_rate):
        self.capacity = capacity    # maximum tokens the bucket can hold
        self.fill_rate = fill_rate  # tokens added per second
        self.tokens = capacity
        self.last_fill = time.time()

    def allow(self):
        now = time.time()
        time_passed = now - self.last_fill
        # Refill tokens accrued since the last call, capped at capacity
        self.tokens = min(self.capacity, self.tokens + time_passed * self.fill_rate)
        self.last_fill = now

        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Usage
limiter = TokenBucket(capacity=10, fill_rate=1)  # 10 tokens, refill 1 token per second
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Allows for bursts of traffic
  • Smooth rate limiting over time

Cons:

  • Can be memory-intensive for many users
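To make the memory cost concrete, here is a minimal sketch (restating the TokenBucket class above) of per-client rate limiting. The client keys and the `defaultdict` wrapper are illustrative assumptions; the point is that each distinct client requires its own bucket's worth of state.

```python
import time
from collections import defaultdict

class TokenBucket:
    def __init__(self, capacity, fill_rate):
        self.capacity = capacity
        self.fill_rate = fill_rate
        self.tokens = capacity
        self.last_fill = time.time()

    def allow(self):
        now = time.time()
        self.tokens = min(self.capacity, self.tokens + (now - self.last_fill) * self.fill_rate)
        self.last_fill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per client key: state grows linearly with distinct clients
buckets = defaultdict(lambda: TokenBucket(capacity=10, fill_rate=1))

for client in ["alice", "bob", "alice"]:
    buckets[client].allow()

print(len(buckets))  # 2 distinct clients -> 2 buckets held in memory
```

In production this per-client state often lives in a shared store such as Redis rather than process memory, which is where the memory concern shows up at scale.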

2. Leaky Bucket Algorithm

The Leaky Bucket algorithm is similar to Token Bucket but focuses on smoothing out bursty traffic.

How it works:

  • Imagine a bucket with a small hole in the bottom.
  • Requests are added to the top of the bucket.
  • Requests leak out of the hole at a constant rate.
  • If the bucket overflows, new requests are discarded.

Implementation:

import time

class LeakyBucket:
    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # maximum volume the bucket can queue
        self.leak_rate = leak_rate  # units drained per second
        self.current_volume = 0
        self.last_leak = time.time()

    def allow(self, volume=1):
        now = time.time()
        time_passed = now - self.last_leak
        # Drain the volume that has leaked out since the last call
        self.current_volume = max(0, self.current_volume - time_passed * self.leak_rate)
        self.last_leak = now

        if self.current_volume + volume <= self.capacity:
            self.current_volume += volume
            return True
        return False

# Usage
limiter = LeakyBucket(capacity=10, leak_rate=1)  # 10 unit capacity, leak 1 unit per second
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Smooths out bursty traffic
  • Good for network traffic shaping

Cons:

  • Less flexible than Token Bucket
  • Can’t handle short-term bursts well

3. Fixed Window Counter

The Fixed Window Counter is a simple algorithm that counts requests in fixed time windows.

How it works:

  • Define a fixed time window (e.g., 1 minute)
  • Count requests within each window
  • Reset the counter at the start of each new window

Implementation:

import time

class FixedWindowCounter:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.current_window = int(time.time() / window)
        self.counter = 0

    def allow(self):
        current_time = int(time.time() / self.window)
        # A new window has started: reset the counter
        if current_time > self.current_window:
            self.current_window = current_time
            self.counter = 0

        if self.counter < self.limit:
            self.counter += 1
            return True
        return False

# Usage
limiter = FixedWindowCounter(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Simple to understand and implement
  • Low memory usage

Cons:

  • Can allow twice the rate limit if requests cluster around the window boundary
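To make the boundary problem concrete, here is a small simulation of the same counter logic. The injectable `clock` parameter is an addition for testing only, so we can place requests just before and just after a window boundary.

```python
# Same fixed-window logic as above, with an injectable clock for simulation
class WindowedCounter:
    def __init__(self, limit, window, clock):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.current_window = int(clock() / window)
        self.counter = 0

    def allow(self):
        current_window = int(self.clock() / self.window)
        if current_window > self.current_window:
            self.current_window = current_window
            self.counter = 0
        if self.counter < self.limit:
            self.counter += 1
            return True
        return False

now = [59.0]  # fake clock: just before the 60-second boundary
limiter = WindowedCounter(limit=10, window=60, clock=lambda: now[0])

burst_1 = sum(limiter.allow() for _ in range(10))  # 10 allowed at t=59s
now[0] = 61.0                                      # 2 seconds later, counter resets
burst_2 = sum(limiter.allow() for _ in range(10))  # 10 more allowed at t=61s

print(burst_1 + burst_2)  # 20 requests allowed within ~2 seconds
```

Twenty requests pass within two seconds even though the nominal limit is 10 per minute, which is exactly the clustering effect described above.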

4. Sliding Window Log

The Sliding Window Log algorithm keeps a log of timestamps for each request.

How it works:

  • Store a timestamp for each request
  • Remove timestamps older than the rate limit window
  • Count the number of remaining timestamps

Implementation:

import time

class SlidingWindowLog:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.log = []         # timestamps of accepted requests

    def allow(self):
        now = time.time()
        # Drop timestamps that have fallen out of the window
        self.log = [timestamp for timestamp in self.log if timestamp > now - self.window]

        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False

# Usage
limiter = SlidingWindowLog(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • More precise than fixed window
  • Handles boundary conditions well

Cons:

  • Can be memory-intensive for high volumes

5. Sliding Window Counter

The Sliding Window Counter combines the Fixed Window Counter with a weighted previous window.

How it works:

  • Use a fixed window counter
  • Consider a portion of the previous window’s count based on the overlap
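A worked example may help before the implementation. The numbers here are hypothetical: a 60-second window with a limit of 10, where we are 15 seconds into the current window, the previous window saw 8 requests, and the current one has seen 3.

```python
window = 60
previous_counter = 8            # requests in the previous window
counter = 3                     # requests so far in the current window
elapsed = 15 / window           # fraction of the current window elapsed (0.25)

# Weight the previous window by how much of it still overlaps the sliding window
rate = counter + previous_counter * (1 - elapsed)
print(rate)  # 3 + 8 * 0.75 = 9.0, still under the limit of 10
```

The estimate assumes requests in the previous window were evenly distributed, which is the approximation this algorithm trades for its low memory use.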

Implementation:

import time

class SlidingWindowCounter:
    def __init__(self, limit, window):
        self.limit = limit    # maximum requests per window
        self.window = window  # window length in seconds
        self.current_window = int(time.time() / window)
        self.counter = 0
        self.previous_counter = 0

    def allow(self):
        current_time = time.time()
        current_window = int(current_time / self.window)

        if current_window > self.current_window:
            # If more than one full window has passed, the old count is stale
            if current_window == self.current_window + 1:
                self.previous_counter = self.counter
            else:
                self.previous_counter = 0
            self.counter = 0
            self.current_window = current_window

        # Weight the previous window by its remaining overlap with the sliding window
        window_elapsed = (current_time % self.window) / self.window
        rate = self.counter + self.previous_counter * (1 - window_elapsed)

        if rate < self.limit:
            self.counter += 1
            return True
        return False

# Usage
limiter = SlidingWindowCounter(limit=10, window=60)  # 10 requests per 60 seconds
if limiter.allow():
    print("Request allowed")
else:
    print("Request denied")

Pros:

  • Smooth rate limiting
  • Handles boundary conditions well

Cons:

  • Slightly more complex to implement

Conclusion

Each rate limiting algorithm has its strengths and use cases:

  1. Token Bucket: Best for allowing short bursts of traffic.
  2. Leaky Bucket: Ideal for smoothing out traffic and shaping network usage.
  3. Fixed Window Counter: Simple and memory-efficient, but less precise.
  4. Sliding Window Log: Precise, but potentially memory-intensive.
  5. Sliding Window Counter: A good balance of precision and efficiency.

When implementing rate limiting, consider factors such as the nature of your traffic, the resources you’re protecting, and the granularity of control you need. Often, a combination of these algorithms or a custom implementation might be the best solution for complex systems.

Remember, the goal of rate limiting is not just to protect your services, but also to ensure fair usage and maintain a high-quality experience for all users. Regular monitoring and adjustments to your rate limiting strategy are key to achieving this balance.

written/generated by: ChatGPT — Master Spring TER / https://claude.ai
