Def
In the HTTP world, a rate limiter limits the number of client requests allowed to be sent over a specified period. Any calls that exceed the configured threshold are blocked.
Benefits
- Prevent Denial of Service (DoS) attacks;
- Reduce costs;
- Prevent servers from being overloaded.
Step 1 - Understand the problem and establish design scope
#design-problems
- Client-side or Server-side
- What is the rate limiter based on (IP, user ID, etc.)?
- What is the scale of the system?
- Will it run in a distributed system?
Step 2 - Propose high-level design and get buy-in
Where to put the rate limiter
In an API gateway or in middleware.
The HTTP 429 response status code indicates a user has sent too many requests.
Algorithms
Token bucket
Pros:
• The algorithm is easy to implement.
• Memory efficient.
• Token bucket allows a burst of traffic for short periods. A request can go through as long as there are tokens left.
Cons:
• The two parameters of the algorithm, bucket size and token refill rate, can be challenging to tune properly.
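A minimal sketch of the token bucket, to make the two parameters concrete. The class name `TokenBucket`, the `allow` method, and the injectable `clock` argument are illustrative choices, not part of the notes above; tokens are refilled lazily on each call rather than by a background refiller.

```python
import time


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `refill_rate` tokens per second."""

    def __init__(self, capacity: float, refill_rate: float, clock=time.monotonic):
        self.capacity = capacity        # bucket size: the largest burst allowed
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the sketch can be tested without sleeping
        self.tokens = capacity          # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        elapsed = now - self.last_refill
        # Lazy refill: add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # bucket is empty: the request is dropped
```

With `capacity=2` and `refill_rate=1`, two back-to-back requests pass, a third is rejected, and one more passes after a second has elapsed.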
Fixed window counter algorithm
Pros:
• Easy to understand.
• Resetting available quota at the end of a unit time window fits certain use cases.
Cons:
• Spike in traffic at the edges of a window could cause more requests than the allowed quota to go through.
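A rough sketch of the fixed window counter, mainly to show where the edge-of-window spike comes from. The class and method names and the `clock` argument are assumptions made for illustration; only the current window's count is kept per client.

```python
import time


class FixedWindowCounter:
    """Allows at most `limit` requests per client in each fixed window."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> (window index, count in that window)

    def allow(self, client_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self.state.get(client_id, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started: the quota resets
        if count >= self.limit:
            self.state[client_id] = (window, count)
            return False
        self.state[client_id] = (window, count + 1)
        return True
```

With `limit=2` and a 60-second window, two requests at t=59 and two more at t=60 are all accepted, so four requests pass within about one second even though the quota is two per minute.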
Sliding window log algorithm
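Fixes the weakness of the [[#Fixed window counter algorithm]]: it keeps a log of request timestamps, so the limit holds over any rolling window, including across window edges.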
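A small sketch of the sliding window log, assuming an in-process `deque` of timestamps per client (a Redis sorted set is a common store for the same idea). The names are illustrative. Note one simplification: this variant logs only accepted requests, whereas some descriptions of the algorithm also log rejected ones.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    """Allows at most `limit` requests per client in any rolling window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.logs = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        log = self.logs[client_id]
        # Drop timestamps that have fallen out of the rolling window.
        while log and log[0] <= now - self.window_seconds:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)
        return True
```

Unlike the fixed window, two requests at t=59 still count against a request at t=60, so the edge burst is rejected. The cost is memory: one timestamp is stored per logged request.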
Sliding window counter algorithm
Pros:
• It smooths out spikes in traffic because the rate is based on the average rate of the previous window.
• Memory efficient.
Cons:
• It only works for a not-so-strict look-back window. It is an approximation of the actual rate because it assumes requests in the previous window are evenly distributed.
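The approximation can be written as: estimated count = requests in the current window + requests in the previous window × the fraction of the rolling window that still overlaps the previous window. The sketch below is one possible reading of that formula; the class name, the state layout, and the choice to reject once the estimate reaches the limit are assumptions.

```python
import time


class SlidingWindowCounter:
    """Approximates a rolling window from two fixed-window counters."""

    def __init__(self, limit: int, window_seconds: int, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> (window index, current count, previous count)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        window = int(now // self.window_seconds)
        seen, current, previous = self.state.get(client_id, (window, 0, 0))
        if window == seen + 1:
            previous, current = current, 0  # rolled into the next window
        elif window != seen:
            previous, current = 0, 0        # more than one window has passed
        elapsed = (now % self.window_seconds) / self.window_seconds
        # Assume the previous window's requests were evenly distributed.
        estimated = current + previous * (1.0 - elapsed)
        if estimated >= self.limit:
            self.state[client_id] = (window, current, previous)
            return False
        self.state[client_id] = (window, current + 1, previous)
        return True
```

For example, with a limit of 7, 5 requests in the previous window, 3 in the current one, and 30% of the current window elapsed, the estimate is 3 + 5 × 0.7 = 6.5, so one more request is allowed.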
High-level design
If the counter is larger than the limit, the request is disallowed.
Use Redis: as an in-memory cache it is fast, and it supports a time-based expiration strategy.
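A hedged sketch of the counter check. `is_allowed` only needs a store with `incr` and `expire`, which mirror the Redis INCR and EXPIRE commands; `InMemoryStore` is a hypothetical stand-in so the example runs without a Redis server, and the `rate:` key prefix is an arbitrary choice.

```python
import time


class InMemoryStore:
    """Hypothetical stand-in exposing the two Redis-like calls used below."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.data = {}  # key -> [value, expiry time or None]

    def incr(self, key: str) -> int:
        entry = self.data.get(key)
        expired = entry is not None and entry[1] is not None and entry[1] <= self.clock()
        if entry is None or expired:
            entry = [0, None]  # missing or expired keys start again from zero
            self.data[key] = entry
        entry[0] += 1
        return entry[0]

    def expire(self, key: str, seconds: float) -> bool:
        if key not in self.data:
            return False
        self.data[key][1] = self.clock() + seconds
        return True


def is_allowed(store, client_id: str, limit: int, window_seconds: int) -> bool:
    key = f"rate:{client_id}"
    count = store.incr(key)
    if count == 1:
        # First request of the window: start the expiry timer.
        # Against real Redis, INCR followed by EXPIRE is two separate commands, not one atomic step.
        store.expire(key, window_seconds)
    return count <= limit  # a counter larger than the limit means the request is disallowed
```

The expiry does the window reset for free: once the key times out, the next request starts a fresh counter.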
Step 3 - Design deep dive
How are rate limiting rules created? Where are the rules stored?
How to handle requests that are rate limited
HTTP 429
HTTP response headers
- X-Ratelimit-Remaining: The remaining number of allowed requests within the window.
- X-Ratelimit-Limit: It indicates how many calls the client can make per time window.
- X-Ratelimit-Retry-After: The number of seconds to wait until you can make a request again without being throttled.
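How the status code and headers fit together can be sketched as below. The function name `build_response` and its return shape (a status code plus a header dict) are assumptions for illustration, not tied to any particular web framework.

```python
from typing import Dict, Tuple


def build_response(allowed: bool, limit: int, remaining: int,
                   retry_after_seconds: int) -> Tuple[int, Dict[str, str]]:
    """Returns (status code, rate limit headers) for a request."""
    headers = {
        "X-Ratelimit-Limit": str(limit),
        "X-Ratelimit-Remaining": str(max(remaining, 0)),
    }
    if allowed:
        return 200, headers
    # Throttled: tell the client how long to wait before retrying.
    headers["X-Ratelimit-Retry-After"] = str(retry_after_seconds)
    return 429, headers
```

A throttled call such as `build_response(False, 100, 0, 30)` yields status 429 with `X-Ratelimit-Retry-After: 30`.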
Detailed Design
Cached rules
In a distributed environment
Race condition
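Concurrent requests can read the same counter value before either writes it back, so both get through. Locks fix this but slow the system down. Use instead: a Lua script, or the sorted sets data structure in Redis.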
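One way the Lua script and sorted set ideas can be combined, as a sketch only. The script trims old entries, counts, and adds the new request in a single atomic step, so no two requests can act on the same stale count. The script text, the `allow` helper, and the member naming are assumptions; `client` is assumed to be a redis-py style client whose `eval(script, numkeys, *keys_and_args)` method runs the script on the server.

```python
import time
import uuid

# Sorted set per client: score = request time in ms, member = unique request id.
SLIDING_WINDOW_LUA = """
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])
redis.call('ZREMRANGEBYSCORE', key, '-inf', now - window)
if redis.call('ZCARD', key) < limit then
  redis.call('ZADD', key, now, ARGV[4])
  redis.call('PEXPIRE', key, window)
  return 1
end
return 0
"""


def allow(client, key: str, limit: int, window_ms: int, now_ms=None) -> bool:
    """Runs the check-and-record step atomically on the Redis server."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # A unique member keeps two requests in the same millisecond from colliding.
    member = f"{now_ms}-{uuid.uuid4().hex}"
    result = client.eval(SLIDING_WINDOW_LUA, 1, key, now_ms, window_ms, limit, member)
    return result == 1
```

Because Redis executes a script as one unit, the read (`ZCARD`) and the write (`ZADD`) cannot be interleaved with another client's request.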
Synchronization issue
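Sticky sessions: always send a client's traffic to the same rate limiter. This is neither scalable nor flexible; a centralized data store such as Redis is the better approach.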
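Sticky routing can be pictured as hashing the client to a limiter node, as in this sketch. The function name `pick_limiter` and the node names are hypothetical, and real load balancers typically offer their own session affinity settings instead.

```python
import hashlib
from typing import List


def pick_limiter(client_id: str, limiters: List[str]) -> str:
    """Maps a client to one limiter node, so its counters live in one place."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(limiters)
    return limiters[index]
```

The same client always lands on the same node while the node list is unchanged. Adding or removing a node remaps many clients under simple modulo hashing, which is part of why this approach is inflexible.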