Kong Load Shedding Plugin

Hello, fellow Kong Community members.
We at Dream11 have been (happily) using Kong for a few months now, and in the process we have developed a number of custom plugins, some of which we have open-sourced:
kong-circuit-breaker
kong-scalable-rate-limiter
kong-host-interpolate-by-header
We also have a couple more plugins in the pipeline that we plan to open-source.

We plan to create a “Load Shedding” plugin to increase resiliency at the API Gateway layer. At a high level, the plugin would shed any additional load Kong is unable to handle, protecting Kong from crashing under overload and maintaining Quality of Service (QoS).
A load shedding plugin could also save costs by reducing the buffer infra provisioned to handle spiky traffic: if the plugin guarantees that Kong does not go down while maintaining QoS, we can reduce the amount of (safe) over-provisioning needed to absorb sudden increases in traffic.

We went through the following resources to get started:
https://vikas-kumar.medium.com/handling-overload-with-concurrency-control-and-load-shedding-part-1-1a7f76d2a1dd
https://tech.olx.com/load-shedding-with-nginx-using-adaptive-concurrency-control-part-1-e59c7da6a6df
https://netflixtechblog.com/keeping-netflix-reliable-using-prioritized-load-shedding-6cc827b02f94
https://eng.uber.com/qalm-qos-load-management-framework/
https://www.youtube.com/watch?v=XNEIkivvaV4

We thought shedding load on the basis of in-flight requests (IFR) would be a good start, so we created a custom plugin and tested it out. We found that a rigid IFR limit does not work, because IFR depends on latency as well: by Little's law, average IFR ≈ throughput × latency, so at the same request rate a workload with 20 ms average latency holds roughly four times as many in-flight requests as one with 5 ms latency. An IFR limit tuned on a load with 5 ms average latency would therefore shed requests on a load with 20 ms average latency even when the system is not overloaded (low CPU usage). A minimal sketch of this first approach is below.
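
For illustration, here is a minimal Lua sketch of that fixed-limit approach in the shape of a Kong plugin handler. The names (`ifr_shed`, `MAX_IFR`, `IfrShedHandler`) and the 503 response are placeholders, not our actual plugin code:

```lua
-- Minimal sketch of a fixed-limit IFR shedder as a Kong plugin handler.
-- Assumes an nginx shared dict is declared, e.g.:
--   lua_shared_dict ifr_shed 1m;
-- MAX_IFR and all names here are illustrative placeholders.

local MAX_IFR = 100  -- the "rigid limit" that turned out to be latency-sensitive

local counters = ngx.shared.ifr_shed

local IfrShedHandler = {
  PRIORITY = 1000,
  VERSION = "0.1.0",
}

function IfrShedHandler:access(conf)
  -- incr() with init = 0 atomically creates the counter on first use
  local in_flight = counters:incr("in_flight", 1, 0)
  ngx.ctx.ifr_counted = true  -- remember we took a slot; released in log()

  if in_flight > MAX_IFR then
    return kong.response.exit(503, { message = "Overloaded, shedding load" })
  end
end

function IfrShedHandler:log(conf)
  -- the log phase also runs for requests short-circuited in access(),
  -- so releasing the slot here keeps the counter balanced
  if ngx.ctx.ifr_counted then
    counters:incr("in_flight", -1)
  end
end

return IfrShedHandler
```

Decrementing in the log phase rather than in access keeps the counter balanced even for shed requests, since the log phase runs for short-circuited requests too.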

So we are now working on including CPU usage in the decision of whether to shed load. Using CPU usage along with IFR would ensure we only shed load when Kong is actually overloaded, countering the latency issue described above. This approach is not final, however, and could change as we gain more understanding of which approach works best for this use case.
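
To make that concrete, here is a rough sketch of what the combined check could look like. The thresholds, the shared dict name, and the /proc/stat sampling are placeholders we are still experimenting with, not the final design:

```lua
-- Sketch: shed only when BOTH signals agree the node is overloaded.
-- All thresholds and names are placeholders; /proc/stat sampling is Linux-only.

local MAX_IFR = 100         -- in-flight request ceiling (placeholder)
local CPU_THRESHOLD = 0.85  -- CPU high watermark (placeholder)

local shared = ngx.shared.load_shedding  -- lua_shared_dict load_shedding 1m;

local prev_busy, prev_total = 0, 0

-- Aggregate CPU utilisation since the previous sample, from /proc/stat.
local function cpu_usage()
  local f = io.open("/proc/stat", "r")
  if not f then return 0 end
  local line = f:read("*l")  -- "cpu  user nice system idle iowait irq ..."
  f:close()

  local total, idle, i = 0, 0, 0
  for n in line:gmatch("%d+") do
    i = i + 1
    n = tonumber(n)
    total = total + n
    if i == 4 or i == 5 then idle = idle + n end  -- idle + iowait columns
  end

  local busy = total - idle
  local d_busy, d_total = busy - prev_busy, total - prev_total
  prev_busy, prev_total = busy, total
  return d_total > 0 and d_busy / d_total or 0
end

-- Called from the plugin's init_worker phase: sample CPU once a second so
-- the per-request hot path only does a cheap shared-dict read.
local function start_cpu_sampler()
  ngx.timer.every(1, function()
    shared:set("cpu", cpu_usage())
  end)
end

-- Access-phase decision: IFR alone over-sheds on high-latency workloads,
-- so require high CPU as confirmation of genuine overload.
local function should_shed()
  local in_flight = shared:get("in_flight") or 0
  local cpu = shared:get("cpu") or 0
  return in_flight > MAX_IFR and cpu > CPU_THRESHOLD
end
```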

We would love to know your thoughts on this. Please share if you have faced similar problems while using Kong, and feel free to mention any related use cases. We look forward to our discussions and possible collaboration on this plugin.


Hey @Chirag_Manwani, I was wondering why you didn’t mention your config-by-env plugin in this list. I am actively using it, and it has worked out quite well. I have also built a Redis-based caching plugin that we plan to open-source after a few improvements.
We were also looking for sliding-window-based rate limiting solutions for our use cases. I saw it mentioned as one of the future tasks on the scalable rate limiter plugin and would love to collaborate or learn more about it.
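
For reference, a common way to do this is the sliding-window approximation, which weights the previous fixed window’s count by how much of it still overlaps the window ending now. A rough Lua sketch, where the limits, dict name, and key scheme are all placeholders:

```lua
-- Sliding-window approximation: estimate the rate over the window ending
-- "now" as prev_window_count * overlap_fraction + curr_window_count.
-- LIMIT, WINDOW, and the dict name are placeholders.

local LIMIT = 100   -- allowed requests per window (placeholder)
local WINDOW = 60   -- window size in seconds (placeholder)

local dict = ngx.shared.rate_limit  -- lua_shared_dict rate_limit 1m;

local function allow(key)
  local now = ngx.now()
  local curr_window = math.floor(now / WINDOW)
  local elapsed = (now % WINDOW) / WINDOW  -- fraction of current window elapsed

  local curr = dict:get(key .. ":" .. curr_window) or 0
  local prev = dict:get(key .. ":" .. (curr_window - 1)) or 0

  -- weight the previous window by how much of it the sliding window still covers
  local estimated = prev * (1 - elapsed) + curr
  if estimated >= LIMIT then
    return false
  end

  -- count this request; init_ttl expires stale counters after two windows
  dict:incr(key .. ":" .. curr_window, 1, 0, 2 * WINDOW)
  return true
end
```

This smooths out the burst-at-window-boundary problem of fixed windows while keeping only two counters per key.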

Hey Reetik, thanks, I’ll edit the post. Yes, we have plans to improve the scalable-rate-limiter plugin as well; we will probably start development on it in a couple of weeks. Let’s discuss more in the Issues section of the repo. I’ll create a new issue for this feature.