Kong deployed in Kubernetes (DB mode): getting "DNS resolution failed"

Hi team, I am getting the error below while doing the DB-mode deployment.

Kindly suggest a fix; neither debug nor trace logs give any useful information.

2020/03/16 14:47:32 [error] 26#0: *51 [kong] kong.lua:42 [postgres] [cosocket] DNS resolution failed: dns server error: 3 name error. Tried: ["(short)postgres:(na) - cache-hit/dereferencing SRV","(short)postgres.cf-gateway-dev.svc.cluster.dt-ue:(na) - cache-miss",“postgres.cf-gateway-dev.svc.cluster.dt-ue.cf-gateway-dev.svc.cluster.dt-ue2:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.svc.cluster.dt-ue2:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.cluster.dt-ue2:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.us-east-2.compute.internal:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.ec2.internal:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue:33 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.cf-gateway-dev.svc.cluster.dt-ue2:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.svc.cluster.dt-ue2:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.cluster.dt-ue2:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.us-east-2.compute.internal:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.ec2.internal:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue:1 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.cf-gateway-dev.svc.cluster.dt-ue2:5 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.svc.cluster.dt-ue2:5 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.cluster.dt-ue2:5 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.us-east-2.compute.internal:5 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue.ec2.internal:5 - cache-hit/stale/scheduled/dns server error: 3 name error”,“postgres.cf-gateway-dev.svc.cluster.dt-ue:5 - cache-hit/stale/scheduled/dns server error: 3 name error”], client: 127.0.0.1, server: kong_admin, request: “GET / HTTP/1.1”, host: “localhost:8444”
127.0.0.1 - - [16/Mar/2020:14:47:32 +0000] “GET / HTTP/1.1” 500 42 “-” “Go-http-client/1.1”

At a cursory glance, this looks like it might be another instance of https://github.com/Kong/kong/issues/5455

Are there any observations on this issue?
I am deploying Kong in DB mode.

Now I am getting the error below. Is Kong stable on Kubernetes? Is it safe to migrate from our API management deployment on ECS to the declarative approach?

2020/03/17 07:23:06 [error] 22#0: *3 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: failed to create timer for background DNS resolution: process exiting
stack traceback:
coroutine 0:
[C]: in function ‘assert’
/usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: in function ‘new’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:404: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>, context: ngx.timer

Another error:

2020/03/17 09:38:45 [error] 25#0: *47325 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/healthcheck.lua:225: attempt to index field ‘targets’ (a nil value)
stack traceback:
coroutine 0:
/usr/local/share/lua/5.1/resty/healthcheck.lua: in function ‘get_target’
/usr/local/share/lua/5.1/resty/healthcheck.lua:399: in function ‘get_target_status’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:196: in function ‘populate_healthchecker’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:352: in function ‘create_healthchecker’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:425: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>, context: ngx.timer
2020/03/17 09:38:45 [error] 24#0: *11 [lua] events.lua:194: do_handlerlist(): worker-events: event callback failed; source=lua-resty-healthcheck [_i66in1wdo.test.upstreams.gateway], event=healthy, pid=24 error=’/usr/local/share/lua/5.1/resty/healthcheck.lua:225: attempt to index field ‘targets’ (a nil value)
stack traceback:
/usr/local/share/lua/5.1/resty/healthcheck.lua:225: in function ‘get_target’
/usr/local/share/lua/5.1/resty/healthcheck.lua:942: in function </usr/local/share/lua/5.1/resty/healthcheck.lua:940>
[C]: in function ‘xpcall’
/usr/local/share/lua/5.1/resty/worker/events.lua:185: in function ‘do_handlerlist’
/usr/local/share/lua/5.1/resty/worker/events.lua:217: in function ‘do_event_json’
/usr/local/share/lua/5.1/resty/worker/events.lua:361: in function ‘post’
/usr/local/share/lua/5.1/resty/healthcheck.lua:1021: in function ‘raise_event’
/usr/local/share/lua/5.1/resty/healthcheck.lua:281: in function ‘fn’
/usr/local/share/lua/5.1/resty/healthcheck.lua:206: in function ‘locking_target_list’
/usr/local/share/lua/5.1/resty/healthcheck.lua:247: in function ‘add_target’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:192: in function ‘populate_healthchecker’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:352: in function ‘create_healthchecker’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:425: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>’, data={“port”:80,“ip”:“34.230.193.231”,“hostname”:“httpbin.org”}, context: ngx.timer
2020/03/17 09:38:45 [error] 24#0: *11 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/healthcheck.lua:225: attempt to index field ‘targets’ (a nil value)
stack traceback:
coroutine 0:
/usr/local/share/lua/5.1/resty/healthcheck.lua: in function 'get_ta

Hi @hbagdi, kindly help me understand the issue …

  1. I have a centralized database which is being used for the Kong ECS deployment.
  2. I am trying to deploy Kong in Kubernetes with the same database (exposed as an external service), but I am getting the above error; a sketch of the service definition is shown below.
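
For reference, the postgres service in my cluster is a Kubernetes ExternalName service that points at the external database, roughly like the sketch below; the external hostname here is a placeholder, not the real endpoint:

apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: cf-gateway-dev
spec:
  # Hypothetical external database endpoint; the real hostname differs.
  type: ExternalName
  externalName: db.example.internal
  ports:
  - port: 5432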

Kindly suggest a solution.

2020/03/17 12:02:22 [error] 87#0: *779655 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.187.69:443, context: ngx.timer
2020/03/17 12:02:22 [error] 88#0: *772579 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.217.38.110:443, context: ngx.timer
2020/03/17 12:02:22 [error] 88#0: *772579 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.217.38.110:443, context: ngx.timer
2020/03/17 12:02:24 [error] 37#0: *783828 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.217.38.110:443, context: ngx.timer
2020/03/17 12:02:24 [error] 88#0: *772579 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.226.163:443, context: ngx.timer
2020/03/17 12:02:25 [error] 37#0: *788121 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.226.163:443, context: ngx.timer
2020/03/17 12:02:25 [error] 38#0: *788186 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.226.163:443, context: ngx.timer
2020/03/17 12:02:25 [error] 38#0: *788186 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 54.231.83.18:443, context: ngx.timer
2020/03/17 12:02:25 [error] 89#0: *781556 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 54.231.83.18:443, context: ngx.timer
2020/03/17 12:02:25 [error] 38#0: *789965 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 3.232.168.170:80, context: ngx.timer
2020/03/17 12:02:25 [error] 89#0: *781556 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 3.232.168.170:80, context: ngx.timer
2020/03/17 12:02:25 [error] 38#0: *789965 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 52.202.2.199:80, context: ngx.timer
2020/03/17 12:02:25 [error] 89#0: *781556 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 52.202.2.199:80, context: ngx.timer
2020/03/17 12:02:26 [error] 37#0: *791344 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 3.232.168.170:80, context: ngx.timer
2020/03/17 12:02:26 [error] 37#0: *791344 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘httpbin.org’ and address 52.202.2.199:80, context: ngx.timer
2020/03/17 12:02:27 [error] 37#0: *795235 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.251.150:443, context: ngx.timer
2020/03/17 12:02:27 [error] 38#0: *795300 [lua] balancer.lua:283: [healthchecks] failed setting peer status: no peer found by name ‘s3.amazonaws.com’ and address 52.216.251.150:443, context: ngx.timer

2020/03/17 11:59:46 [error] 21#0: *44 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: failed to create timer for background DNS resolution: process exiting
stack traceback:
coroutine 0:
[C]: in function ‘assert’
/usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: in function ‘new’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:404: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>, context: ngx.timer
2020/03/17 11:59:46 [error] 22#0: *5 [lua] timer.lua:106: resty_timer(): [resty-timer] failed to create timer: process exiting, context: ngx.timer
2020/03/17 11:59:46 [error] 22#0: *5 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: failed to create timer for background DNS resolution: process exiting
stack traceback:
coroutine 0:
[C]: in function ‘assert’
/usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: in function ‘new’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:404: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>, context: ngx.timer
2020/03/17 11:59:46 [error] 31#0: *61497 [lua] timer.lua:106: resty_timer(): [resty-timer] failed to create timer: process exiting, context: ngx.timer
2020/03/17 11:59:46 [error] 31#0: *61497 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: failed to create timer for background DNS resolution: process exiting
stack traceback:
coroutine 0:
[C]: in function ‘assert’
/usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: in function ‘new’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:404: in function ‘create_balancer’
/usr/local/share/lua/5.1/kong/runloop/balancer.lua:762: in function ‘init’
/usr/local/share/lua/5.1/kong/runloop/handler.lua:813: in function </usr/local/share/lua/5.1/kong/runloop/handler.lua:812>, context: ngx.timer
2020/03/17 11:59:47 [crit] 25#0: *42438 [lua] balancer.lua:766: init(): failed creating balancer for _xxpgx0ymg.test.upstreams.gateway: failed to get from node cache: could not acquire callback lock: timeout, context: ngx.timer
2020/03/17 11:59:47 [error] 25#0: *42438 [lua] timer.lua:106: resty_timer(): [resty-timer] failed to create timer: process exiting, context: ngx.timer
2020/03/17 11:59:47 [error] 25#0: *42438 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/ring.lua:432: failed to create timer for background DNS resolution: process exiting
stack traceback:

Can you open a GitHub issue with the above errors?
Please include your Kong settings and configuration in the issue as well.

Hi @hbagdi, thanks for your response. I am using the same manifest files from GitHub.
After debugging, I found that the Postgres DNS name was failing to resolve only for the proxy container (it resolves fine for the migrations job). After hardcoding the IP address and port, it started working.

Can you please guide me on what should be set to resolve this issue?

Problem:

  - name: KONG_DATABASE
    value: "postgres"
  - name: KONG_PG_HOST
    value: postgres
  - name: KONG_PG_PASSWORD
    value: kong

Temporary Solution:

  - name: KONG_PG_HOST
    value: "10.X.X.X"
  - name: KONG_PG_PORT
    value: "5432"

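Instead of hardcoding the IP, one option I am considering is pointing KONG_PG_HOST at the fully qualified service name so resolution no longer depends on the pod's DNS search path. This is only a sketch: cluster.local is a placeholder for whatever cluster domain kube-dns actually serves here (the error log above suggests it is not the default), and the trailing dot is intended to mark the name as fully qualified so no search suffixes get appended:

  - name: KONG_PG_HOST
    # Hypothetical FQDN; substitute the real namespace and cluster domain.
    value: "postgres.cf-gateway-dev.svc.cluster.local."
  - name: KONG_PG_PORT
    value: "5432"
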
Now I am facing the DNS resolver issue for my real production traffic as well.

Can you show all the services in Kubernetes?
kubectl get svc --all-namespaces

D:\work\gitlab\CloudGateway\cloud gateway\CloudGateway\k8-deployment\gateway-deployment\kong-test-service\services\echo-service\ingress-rules>kubectl get svc -n cf-gateway-1
NAME                      TYPE           CLUSTER-IP     EXTERNAL-IP                                                            PORT(S)                      AGE
admin                     ClusterIP      10.6.14.12     <none>                                                                 8001/TCP                     79m
kong-proxy                LoadBalancer   10.6.210.227   9f811eab8640a6373067e1-78860b5c054f2715.elb.us-east-2.amazonaws.com   80:30241/TCP,443:30741/TCP   79m
kong-validation-webhook   ClusterIP      10.6.46.3      <none>                                                                 443/TCP                      78m
postgres                  ClusterIP      10.6.55.210    <none>                                                                 5432/TCP                     78m

Sorry, I don't have cluster-level access; sharing the services of the namespace instead.

I missed the point that postgres is a Service of type ExternalName, and I’m not sure how KubeDNS resolves the DNS in this case.

You could try using the DNS name of the external postgres database directly and skipping the ExternalName service redirection to solve this.
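
In other words, something along these lines in the Kong environment, where the hostname is just a placeholder for the actual external database endpoint:

  - name: KONG_PG_HOST
    # Placeholder; use the real DNS name of the external database.
    value: "mydb.example.us-east-2.rds.amazonaws.com"
  - name: KONG_PG_PORT
    value: "5432"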

Can you check whether s3.amazonaws.com is resolvable by running the "nslookup" command from inside the Kong container?
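
For example, something like this (the pod name and namespace are placeholders):

kubectl exec -it <kong-proxy-pod> -n <namespace> -- nslookup s3.amazonaws.com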

I may be a bit late here, but @hbagdi, I’m also facing a similar issue: we have a CloudFront CDN distribution, and to add it as a Kong route/service I’m using the manifest below:

apiVersion: v1
kind: Service
metadata:
  annotations:
    configuration.konghq.com: upstream-https-proto
  labels:
    app: my-content
  name: my-content
  namespace: qa
spec:
  externalName: content.mydomain.com
  ports:
  - port: 443
    targetPort: 443
  type: ExternalName
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    configuration.konghq.com: strippath-and-dontpreservehost
    kubernetes.io/ingress.class: kong
  labels:
    app: my-content
  name: my-content
  namespace: qa
spec:
  rules:
  - host: qa.mydomain.com
    http:
      paths:
      - backend:
          serviceName: my-content
          servicePort: 443
        path: /content

But as we know, a CloudFront distribution doesn’t have a static IP, so once the IPs change I start getting the errors below in the proxy pod logs, similar to what @Ashish_Mishra is getting:

ingress-kong-lhcl4 proxy 2020/05/01 14:49:25 [error] 21#0: *52194498 [lua] balancer.lua:294: [healthchecks] failed setting peer     status: no peer found by name 'content.mydomain.com' and address 13.224.197.6:443, context: ngx.timer
ingress-kong-lhcl4 proxy 2020/05/01 14:49:25 [error] 21#0: *52194498 [lua] balancer.lua:294: [healthchecks] failed setting peer     status: no peer found by name 'content.mydomain.com' and address 13.224.197.65:443, context: ngx.timer
ingress-kong-8pmfx proxy 2020/05/01 14:49:25 [error] 22#0: *52535607 [lua] balancer.lua:294: [healthchecks] failed setting peer     status: no peer found by name 'content.mydomain.com' and address 13.224.197.107:443, context: ngx.timer
ingress-kong-8pmfx proxy 2020/05/01 14:49:25 [error] 22#0: *52535607 [lua] balancer.lua:294: [healthchecks] failed setting peer status: no peer found by name 'content.mydomain.com' and address 13.224.197.51:443, context: ngx.timer

I’m evaluating the latest version of the Kong Ingress Controller with Kong versions 1.4.3 and 2.0.
In the existing infrastructure we have Kong 0.14.1, where we use the legacy APIs entity, and that setup doesn’t show this behavior.

One more thing I noticed: with the 0.14.1 API setup, our upstreams receive the Host header set to the service domain name (e.g. myservice.svc.cluster.local), but with the 1.4.3/2.0 services/routes setup we get an IP address as the value of the Host header.
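
For completeness, the configuration.konghq.com: strippath-and-dontpreservehost annotation above refers to a KongIngress resource that I didn’t paste; I assume it looks roughly like the sketch below (my reconstruction, not verified against the cluster). Switching preserve_host to true would make Kong forward the client’s Host header instead of rewriting it, which is one thing I may experiment with for the Host header difference:

apiVersion: configuration.konghq.com/v1
kind: KongIngress
metadata:
  name: strippath-and-dontpreservehost
  namespace: qa
route:
  # Assumed values based on the annotation name; adjust to the actual resource.
  strip_path: true
  preserve_host: false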