Hello Guys,
We use an NLB as the load balancer for the Kong service with IP targets (service.beta.kubernetes.io/aws-load-balancer-type: nlb-ip), and we need to attach a security group to the pods to restrict access to our services exposed via Kong. I followed the official guide for attaching a security group to our Kong ingress controller pods, but unfortunately the ingress controller's readiness/liveness probes fail in this setup:
Readiness probe failed: Get http://10.11.21.68:10254/healthz: dial tcp 10.11.21.68:10254: connect: connection refused
Liveness probe failed: Get http://10.11.21.68:10254/healthz: dial tcp 10.11.21.68:10254: connect: connection refused
The readiness/liveness probes for the proxy container pass.
There are no error logs, but when I run netstat -ntlp from the ingress controller container, I don't see port 10254 allocated:
tcp 0 0 0.0.0.0:8443 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.1:8444 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:8000 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:8100 0.0.0.0:* LISTEN -
The AWS VPC CNI documentation doesn't have any related recommendations other than adding the DISABLE_TCP_EARLY_DEMUX=true variable, and that fixed only the probes for the proxy container.
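For readers who have not set that variable yet, here is a minimal sketch of where it goes. It assumes the standard aws-node DaemonSet in kube-system and that the init container is named aws-vpc-cni-init, as in the AWS documentation; check your own DaemonSet, since names can differ between CNI versions.

```yaml
# Strategic merge patch fragment for the aws-node DaemonSet (kube-system).
# Assumption: the init container is called aws-vpc-cni-init.
spec:
  template:
    spec:
      initContainers:
        - name: aws-vpc-cni-init
          env:
            - name: DISABLE_TCP_EARLY_DEMUX
              value: "true"   # lets the kubelet reach probes on branch network interfaces
```

This can be applied with kubectl patch against the aws-node DaemonSet. It only addresses kubelet-to-pod probe traffic, so it would not be expected to help when the container never opens the probed port in the first place.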
We use Helm chart v1.14.2 for Kong installation.
I found that the Kong ingress-controller pod can't connect to the Kubernetes API service (https://172.20.0.1:443) after the security group is attached directly to the pod. That's probably why the probes failed.
I don't know yet why it can't connect to the Kubernetes API service; still investigating.
After fixing the API connection issue, I ran into the next one:
level=info msg="retry 4 to fetch metadata from kong: making HTTP request: Get \"https://localhost:8444/\": context deadline exceeded"
The healthy pods (without an attached security group) have similar logs, but not exactly the same:
level=info msg="retry 2 to fetch metadata from kong: making HTTP request: Get \"https://localhost:8444/\": dial tcp 127.0.0.1:8444: connect: connection refused"
Both Kong installations (with and without the attached SG) have
admin:
  enabled: false
I also tried using http instead of https for the admin configuration, but it didn't help:
level=info msg="retry 4 to fetch metadata from kong: making HTTP request: Get \"http://localhost:8001/\": context deadline exceeded"
I found the issue.
The problem is that the pod can't connect to the CoreDNS service when I use NodeLocal DNSCache. After deleting NodeLocal DNSCache, everything works correctly. It looks like an issue with the security-groups-per-pod implementation.
So there is no issue with Kong, but I'll leave this here in case anyone else has the same problem.
Hi @burakovsky, I am facing the same issue. I am using the AWS CNI and trying to get Kong to create an NLB, but I am getting the same error, shown below.
Warning Unhealthy 64s (x6 over 114s) kubelet Liveness probe failed: Get "http://10.0.7.233:10254/healthz": dial tcp 10.0.7.233:10254: connect: connection refused
Normal Killing 64s (x2 over 94s) kubelet Container ingress-controller failed liveness probe, will be restarted
You mentioned adding a security group to the pod. Can you guide me through this, please?
Also, can you tell me what the fix for the API issue was?
Do you attach a security group directly to the Kong pods or not? It should work correctly with NodeLocal DNSCache if you don't attach any security group to the Kong pods.
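For context, "attaching a security group directly to the pods" on EKS is done with a SecurityGroupPolicy object from the VPC resource controller. A minimal sketch follows; the name, namespace, pod label, and security group ID are all placeholders for illustration, not values from this thread.

```yaml
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: kong-sgp                      # hypothetical name
  namespace: kong                     # assumed namespace of the Kong release
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: kong    # assumed label; match your Kong pods
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0          # placeholder security group ID
```

The policy only affects pods created after it exists, so the Kong pods need to be restarted. The attached security group also has to allow the traffic the pod itself depends on, for example outbound DNS (TCP and UDP 53) and inbound probe traffic from the node, otherwise symptoms like the ones above can appear.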
If you use NodeLocal DNSCache and attach an AWS security group directly to the Kong pods, you need to configure dnsPolicy and dnsConfig as described here.
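The exact settings from the linked page are not reproduced in this thread, so the following is only a sketch of what such a configuration commonly looks like: the pod bypasses the node-local cache and queries the cluster DNS service directly. The nameserver 172.20.0.10 is an assumption based on the 172.20.0.1 API service address mentioned earlier, and the search domains assume a namespace called kong; verify the real ClusterIP with kubectl get svc kube-dns -n kube-system.

```yaml
# Pod template fragment for the Kong Deployment (assumed values, adjust to your cluster).
spec:
  template:
    spec:
      dnsPolicy: "None"               # ignore the node's DNS settings entirely
      dnsConfig:
        nameservers:
          - 172.20.0.10               # assumed ClusterIP of the kube-dns/CoreDNS service
        searches:
          - kong.svc.cluster.local    # assumes the pod runs in namespace "kong"
          - svc.cluster.local
          - cluster.local
        options:
          - name: ndots
            value: "5"
```

With dnsPolicy set to None, Kubernetes uses only what is listed under dnsConfig, which is why the search domains and ndots option are spelled out here.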