How to limit the number of parallel requests?

I need to limit the number of parallel requests a client can send to the API. I limited the number of requests the client can send per second to 5 (using config.second=5); however, it takes about 10 seconds to process each request. I would like to prevent the client from sending more requests while the API is already processing, let's say, 20 requests. Is there a way to do that with Kong?

Is there any community at all?

The short answer is no, it can't do that by default: limiting the number of concurrent, still-active API calls is not supported. The best you can do is limit requests to a certain number per second (using the CE rate limiting plugin).

A plugin can be made for anything, though. You could probably use some sort of shared dict counter accessible to all workers (like the generic kong_db_cache), with a key (maybe the route's UUID?) and an integer value that you increment by 1 when a transaction reaches the access phase and decrement by 1 when it finishes, in the log phase (not rewrite, which runs before access). Then you give the plugin schema a max counter value: if the shm lookup for the UUID of that service/route is already at the threshold, say 20, the new transaction is rejected because there are too many active transactions.
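To make the counting idea concrete, here is a minimal sketch in plain Python. It is not Kong plugin code (a real plugin would be written in Lua against a shared dict); it only illustrates the increment, check, and decrement logic. The class `ConcurrencyLimiter`, its method names, and the route ID are all hypothetical, and a lock-protected dict stands in for the shared dict.

```python
import threading


class ConcurrencyLimiter:
    """In-memory stand-in for a shared counter keyed by route ID (hypothetical)."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self._counts = {}
        self._lock = threading.Lock()

    def try_acquire(self, route_id):
        """Called when a request arrives: count it, or reject it if the route is full."""
        with self._lock:
            current = self._counts.get(route_id, 0)
            if current >= self.max_in_flight:
                return False  # the caller would answer with an error such as HTTP 429
            self._counts[route_id] = current + 1
            return True

    def release(self, route_id):
        """Called when an admitted request finishes: free its slot."""
        with self._lock:
            current = self._counts.get(route_id, 0)
            if current > 0:
                self._counts[route_id] = current - 1

    def in_flight(self, route_id):
        """Return the number of requests currently counted for the route."""
        with self._lock:
            return self._counts.get(route_id, 0)


limiter = ConcurrencyLimiter(max_in_flight=20)
route = "example-route-id"  # hypothetical key; the post suggests the route's UUID

# Simulate 21 requests arriving before any of them completes.
accepted = [limiter.try_acquire(route) for _ in range(21)]
print(accepted.count(True), accepted[-1])  # 20 False

# Once one request finishes, a new one is admitted again.
limiter.release(route)
print(limiter.try_acquire(route))  # True
```

One design point worth noting: only requests that were actually admitted should release a slot. In a real plugin, rejected requests would presumably still pass through the later phases, so the handler would need a per-request flag recording whether it incremented the counter.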