Rate Limiter for a distributed and scalable system.

Jinia Konar
Published in Groww Engineering · 4 min read · Jul 4, 2022


Most systems today have to be designed for a distributed environment that can scale. It is therefore equally important to think of a scalable solution even for small components like a rate limiter.

There are many rate-limiting algorithms, which we have already discussed here. Among them, in this blog we will look at the sliding window counter (also known as a rolling rate limiter) in depth, combined with Redis sorted sets, for a distributed and scalable system.

What is a sliding window counter?

It is a hybrid approach that combines the fixed window counter and the sliding window log algorithms discussed earlier.

Sliding window counter algorithm

The sliding window counter is similar to the fixed window counter, but it smooths out bursts of traffic by adding a weighted count from the previous window to the count in the current window. For example, suppose the limit is 4 requests/minute in the above diagram. There are 3 requests in window [1:00:00, 1:01:00) and 2 requests in window [1:01:00, 1:02:00). For a new request that arrives at 1:01:15, which is at the 25% position of window [1:01:00, 1:02:00), we calculate the request count with the formula: 3 x (1 - 25%) + 2 = 4.25 > 4. Thus we reject this request. Even though neither window on its own exceeds the limit, the request is rejected because the weighted sum of the previous and current windows does exceed it.

Pros: It smooths out spikes in traffic because the rate is based on the average rate of the previous window.
Cons: It is only an approximation, since it assumes requests were evenly distributed across the previous window, so it does not suit strict limits, especially over small time windows.
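The weighted-count check above can be sketched in a few lines of Java (the class and method names are mine, and the numbers come from the example above):

```java
public class SlidingWindowCounter {
    /**
     * Weighted request count: the previous window's count is scaled by the
     * fraction of the sliding window that still overlaps it.
     *
     * @param prevCount       requests in the previous fixed window
     * @param currCount       requests in the current fixed window so far
     * @param elapsedFraction how far into the current window we are (0.0 to 1.0)
     */
    static double weightedCount(int prevCount, int currCount, double elapsedFraction) {
        return prevCount * (1.0 - elapsedFraction) + currCount;
    }

    public static void main(String[] args) {
        // Example from the text: 3 requests in the previous window, 2 in the
        // current one, and a new request arriving 25% into the current window.
        double count = weightedCount(3, 2, 0.25);
        System.out.println(count);     // 4.25
        System.out.println(count > 4); // true -> the request is rejected
    }
}
```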

The above solution works in a single-node architecture, but in a distributed architecture race conditions can occur when multiple nodes update the counter value concurrently. Let's discuss a solution to this problem below 😁.

A Solution for distributed systems

Along with the sliding window counter logic, we can use the Redis sorted set data structure to provide a solution for distributed systems.

You can check out my GitHub repository, where I have created a simple employee management system in Java using Spring Boot, with rate-limited read APIs. I have used RedisTemplate with a JedisConnectionFactory, and all the Redis commands are run in an atomic fashion, which avoids those weird race conditions 😁. Let's discuss the logic in depth here.

The logic is basically the same as discussed earlier, i.e. whenever a new request comes in we do the following:

  1. Remove all elements of the set that were requested before the allowed time window. This is done with Redis's ZREMRANGEBYSCORE command:
    * Here operations is the RedisOperations instance, from which we use opsForZSet to get all the Redis sorted set functionality.
    * removeRangeByScore removes all elements for the Redis key rate-id whose score lies in the range 0 to (currentTime - slidingWindowTime).
  2. Add the current timestamp to the set using the ZADD command: the current timestamp is used as both the score and the value for the Redis key rate-id.
  3. Set a TTL equal to the rate-limiting interval on the set (to save space).
  4. Get all the entries in the set using the ZRANGE(0, -1) command, which tells us the number of calls made to the API within the time window: for the Redis key rate-id we fetch all set elements from beginning to end.
  5. Finally, wrap all the above operations in a Redis transaction (MULTI/EXEC commands) to ensure that all commands are executed together in an atomic fashion.
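The steps above can be sketched with an in-memory stand-in for the sorted set: a TreeSet of timestamps plays the role that ZREMRANGEBYSCORE, ZADD, and ZRANGE play against real Redis (the class and method names are mine; the actual repository issues these as Redis commands inside a transaction, which is what makes it safe across multiple nodes):

```java
import java.util.TreeSet;

public class SortedSetRateLimiter {
    // Stand-in for the Redis sorted set keyed by rate-id: scores are timestamps.
    private final TreeSet<Long> timestamps = new TreeSet<>();
    private final long windowMillis;
    private final int limit;

    public SortedSetRateLimiter(long windowMillis, int limit) {
        this.windowMillis = windowMillis;
        this.limit = limit;
    }

    /** Returns true if the request arriving at {@code nowMillis} is allowed. */
    public synchronized boolean allow(long nowMillis) {
        // Step 1: drop entries older than the window
        // (ZREMRANGEBYSCORE rate-id 0 (now - window), inclusive like Redis).
        timestamps.headSet(nowMillis - windowMillis, true).clear();
        // Step 2: record the current request
        // (ZADD rate-id, timestamp used as both score and value).
        timestamps.add(nowMillis);
        // Step 4: count what remains in the window (ZRANGE rate-id 0 -1).
        // Steps 3 (TTL) and 5 (MULTI/EXEC) only matter against real Redis.
        return timestamps.size() <= limit;
    }
}
```

Replaying the example from the next section (limit 2 requests/minute): requests at 1:00:01, 1:01:40, and 1:01:50 are allowed, and the one at 1:02:00 is rejected.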

Let’s get more clarity with a small example.

Sliding window counter algorithm example

In the above diagram the rate limit is 2 requests/minute, and the log is empty when a new request arrives at 1:00:01. The request is processed and logged, giving size = 1. A new request arrives at 1:01:40; all timestamps older than 1:01:40 - 60s = 1:00:40 are removed, so 1:00:01 is dropped and 1:01:40 is inserted into the log with size = 1. That is not larger than the allowed count, so the request is processed. Similarly, a new request arrives at 1:01:50; all timestamps older than 1:01:50 - 60s = 1:00:50 are removed, but there is no such timestamp, so nothing is removed and 1:01:50 is inserted with size = 2, still not larger than the allowed count, so the request is processed. Finally a new request arrives at 1:02:00; all timestamps older than 1:02:00 - 60s = 1:01:00 are removed, and again there is no such timestamp, so nothing is removed and 1:02:00 is inserted with size = 3, which is larger than the allowed count of 2. Therefore this request is rejected.

I hope I was able to share my understanding with you, thanks for your support till here. If you’ve enjoyed this story, please click the 👏 button and share it, to help others find it as well. Also, feel free to leave a comment below.😁

Groww Engineering publishes technical anecdotes, the latest technologies, and better ways to tackle common programming problems. You can subscribe here to get the latest updates.
