Failed recreating load balancer after a few days of pods running

We have been using Kong on our production clusters for a while. While only about 20% of our applications were running behind it, we had no problems.

We recently migrated the rest of our workload, and since then we've been seeing an error on some of our Kong replicas:

failed recreating balancer for app.namespace.80.svc: timeout waiting for balancer for a9a7385e-ad84-474e-9b1c-2....

To “solve” this problem, we are forced to restart the pod that returns this error.

Unfortunately, if we don't detect the problem quickly enough, every pod eventually hits the error and 100% of our traffic is interrupted.
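As a stopgap while this is being diagnosed, detection can be automated by watching the proxy logs for the error string. A minimal sketch (the namespace, label selector, and `kubectl` commands in the comments are illustrative assumptions; here the pattern is checked against a sample log line rather than a live cluster):

```shell
#!/bin/sh
# Pattern taken from the error reported above
PATTERN='failed recreating balancer'

# In a live cluster you would feed real logs instead, e.g.:
#   kubectl logs -n kong -l app=kong --tail=500 | grep -q "$PATTERN"
# and restart the affected pod on a match:
#   kubectl delete pod -n kong <pod-name>

# Illustrative check against a sample line:
SAMPLE='failed recreating balancer for app.namespace.80.svc: timeout waiting for balancer'
if printf '%s\n' "$SAMPLE" | grep -q "$PATTERN"; then
  echo "balancer error detected"
fi
```

Wired into a liveness probe or a small cron job, a check like this would at least bound the blast radius to one replica instead of all of them.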

We did not notice any excessive RAM consumption on the pods, and no other errors were returned by the pod before the error above.

Can you file a bug on the Kong/kong GitHub issue tracker? They should be able to guide you through diagnosing this particular issue and can develop a fix if it's a code issue.

For background, our Kubernetes tooling creates Kong configuration, and the proxy then takes that and uses it to build a number of optimized internal structures for routing requests. The balancer is one of those internal structures, derived from Kong services and upstreams (roughly analogous to Kubernetes Services and Endpoints). When something in that area changes (e.g. a Pod is rescheduled, so the set of IPs in its attached Service’s Endpoints changes), the proxy has to rebuild the balancer.
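To make that mapping concrete, here is a hedged sketch of a Kong declarative configuration fragment (the names and IPs are made up; the upstream name follows the `app.namespace.80.svc` pattern from the error message). The upstream's targets play the role of Kubernetes Endpoints, so when a Pod is rescheduled and its IP changes, the target list changes and the proxy must rebuild the balancer for that upstream:

```yaml
_format_version: "2.1"
services:
- name: app-service              # roughly analogous to a Kubernetes Service
  host: app.namespace.80.svc     # resolved via the upstream below
upstreams:
- name: app.namespace.80.svc
  targets:                       # roughly analogous to Endpoints; pod IPs land here
  - target: 10.0.1.5:80
  - target: 10.0.1.6:80
```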

Balancer builds shouldn’t fail, and I’m not sure offhand what would cause a build to time out. The core team (who review the Kong repo issues) are more familiar with that codebase.

© 2019 Kong Inc.