I have a Kong cluster sitting behind an AWS Network Load Balancer that uses the Kong rate limiting plugin with a local policy (i.e., the limit is tracked and enforced by individual members of the cluster rather than by the cluster as a whole). Because AWS NLBs balance TCP connections rather than HTTP requests, I want to make sure that a client that receives a throttling response has their TCP connection closed so that the next request they send will establish a new connection and therefore likely be handled by a separate node in the cluster. Ideally, this behavior should also occur if the Kong node handling the request is unable to connect to the upstream service.
Is there any way to direct Kong to close connections from clients that receive a certain HTTP status code (e.g., 429 or 502)? Does Kong/Nginx do this by default in any case?
You can’t do that out of the box, but you could probably write a custom plugin for it.
That said, it feels like a layering mismatch: rate limiting is an HTTP (L7) concept, and you’re trying to act on it by closing an L4 (TCP) connection.
Why do you want to close the TCP connection? You should program the client to retry later.
I’m using an AWS network load balancer (which has no visibility into L7 activity) so that the cluster is addressable via a static IP within a VPC. That particular load balancer type will evenly distribute TCP connections amongst the upstream cluster, but HTTP traffic on a given connection could be much higher or lower than on another connection.
Fair. But closing the TCP connection doesn’t really solve the problem; the odds that it will seem slim.
How high are your rate limits? A somewhat imbalanced Kong cluster should still be able to handle this fine. What’s your RPS?
Currently, rate limits are 100 TPS per client so that throttling is a rare event. Clients are retrying throttles and certain 5xx-level errors with jittered exponential backoff.
Being able to bridge the impedance mismatch between an L4 load balancer and a cluster of L7 gateways by kicking noisy client connections off a given node would let me spread load more evenly across the cluster.
In the case of a single node not being able to connect to an upstream endpoint and returning a 502, pairing this response with a TCP close message would give the client a chance to retry the request on a separate node that may not be experiencing the same connectivity issue. An unhealthy node will eventually be removed from the LB’s pool, but until then, rebalancing clients through the balancer will mitigate the blast radius of such an event.
You can probably write a custom plugin that sends a different Connection header to the client, telling it whether or not to keep the connection alive, based on the upstream status code. As @hbagdi pointed out, this requires the client to cooperate. But I don’t think there’s a clean way to forcibly terminate the TCP connection from Kong. (Maybe crashing the worker can : ) )
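A minimal sketch of what that custom plugin’s `handler.lua` might look like, assuming the standard Kong PDK (`kong.response`) and a plugin layout you’d register yourself — the plugin name, `PRIORITY`, and the status set are all my own choices, not anything Kong ships:

```
-- handler.lua for a hypothetical "close-on-throttle" plugin (name is mine).
-- Sketch only: relies on the Kong PDK being available as the `kong` global.
local CloseOnThrottle = {
  PRIORITY = 10,     -- low priority so it runs after rate-limiting has set the status
  VERSION  = "0.1.0",
}

-- Statuses that should trigger a connection teardown hint (adjust to taste).
local CLOSE_STATUSES = { [429] = true, [502] = true }

function CloseOnThrottle:header_filter(conf)
  local status = kong.response.get_status()
  if CLOSE_STATUSES[status] then
    -- Advise the client to tear down the keep-alive connection. A cooperating
    -- HTTP client will open a fresh connection for its next request, which the
    -- NLB may route to a different node.
    kong.response.set_header("Connection", "close")
  end
end

return CloseOnThrottle
```

Note the header is advisory: whether Nginx itself actually closes the socket after the response, and whether the client reconnects rather than ignoring the header, is outside the plugin’s control — which is the client-cooperation caveat above.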