We are using the 3.9.0 Ubuntu image available at hub.docker.com.
We are deploying the image to an Azure AKS environment. This is an upgrade, so bootstrap is not being used.
DNS resolution issues are being reported, such as:
Error: [PostgreSQL error] failed to retrieve PostgreSQL server_version_num: [cosocket] DNS resolution failed: DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 276 ms. Tried: [["psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:A","DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 276 ms"]]
We are on PostgreSQL 14. If I connect to the pod running Kong, I am able to run nslookup on the endpoint without issue.
Here are the logs from when I exec'd into the pod running the control plane (CP) and ran kong migrations --v list:
$ kong migrations --v list
2025/01/17 19:16:59 [notice] 1449#0: using the "epoll" event method
2025/01/17 19:16:59 [notice] 1449#0: openresty/1.25.3.2
2025/01/17 19:16:59 [notice] 1449#0: OS: Linux 5.15.173.1-1.cm2
2025/01/17 19:16:59 [notice] 1449#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2025/01/17 19:16:59 [notice] 1449#0: *2 [lua] client.lua:161: new(): [dns_client] supported types: srv ipv4 ipv6 , context: ngx.timer
2025/01/17 19:16:59 [verbose] Kong: 3.9.0
2025/01/17 19:16:59 [verbose] no config file found at /etc/kong/kong.conf
2025/01/17 19:16:59 [verbose] no config file found at /etc/kong.conf
2025/01/17 19:16:59 [verbose] no config file, skip loading
2025/01/17 19:16:59 [verbose] prefix in use: /kong_prefix
2025/01/17 19:16:59 [notice] 1449#0: *2 [lua] client.lua:161: new(): [dns_client] supported types: srv ipv4 , context: ngx.timer
2025/01/17 19:16:59 [verbose] preparing nginx prefix directory at /kong_prefix
2025/01/17 19:16:59 [verbose] SSL enabled on proxy, no custom certificate set: using default certificates
2025/01/17 19:16:59 [verbose] proxy SSL certificate found at /kong_prefix/ssl/kong-default.crt
2025/01/17 19:16:59 [verbose] proxy SSL certificate found at /kong_prefix/ssl/kong-default-ecdsa.crt
2025/01/17 19:16:59 [verbose] SSL enabled on admin_gui, no custom certificate set: using default certificates
2025/01/17 19:16:59 [verbose] admin_gui SSL certificate found at /kong_prefix/ssl/admin-gui-kong-default.crt
2025/01/17 19:16:59 [verbose] admin_gui SSL certificate found at /kong_prefix/ssl/admin-gui-kong-default-ecdsa.crt
2025/01/17 19:16:59 [verbose] generating trusted certs combined file in /kong_prefix/.ca_combined
2025/01/17 19:16:59 [info] 1449#0: *2 [lua] node.lua:303: new(): kong node-id: 6bcb69bd-c5a8-4a66-abea-2cbc5ee0c453, context: ngx.timer
Error:
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:101: [PostgreSQL error] failed to retrieve PostgreSQL server_version_num: [cosocket] DNS resolution failed: DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 409 ms. Tried: [["psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:A","DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 409 ms"]]
stack traceback:
[C]: in function 'assert'
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:101: in function 'cmd_exec'
/usr/local/share/lua/5.1/kong/cmd/init.lua:31: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:31>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/kong/cmd/init.lua:31: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:15>
(command line -e):5: in function 'inline_gen'
init_worker_by_lua(nginx.conf:185):44: in function <init_worker_by_lua(nginx.conf:185):43>
[C]: in function 'xpcall'
init_worker_by_lua(nginx.conf:185):52: in function <init_worker_by_lua(nginx.conf:185):50>
This more or less lines up with the earlier Kong discussion, although that one appears to have been on 3.7.1.
If I run the same command in debug mode, I get some additional info:
2025/01/17 19:19:01 [debug] 1465#0: *2 [lua] client.lua:550: resolve_all(): [dns_client] resolve_all psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:-1
2025/01/17 19:19:01 [debug] 1465#0: *2 [lua] client.lua:534: [dns_client] cache miss, try to query psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:-1
2025/01/17 19:19:02 [debug] 1465#0: *2 [lua] client.lua:362: resolve_query(): [dns_client] r:query(psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:1) ans:- t:451 ms
2025/01/17 19:19:02 [debug] 1465#0: *2 [lua] client.lua:567: resolve_all(): [dns_client] cache lookup psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:-1 ans:- hlv:fail
Error:
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:101: [PostgreSQL error] failed to retrieve PostgreSQL server_version_num: [cosocket] DNS resolution failed: DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 451 ms. Tried: [["psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com:A","DNS server error: failed to receive reply from UDP server 10.0.0.10:53: timeout, took 451 ms"]]
stack traceback:
[C]: in function 'assert'
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:101: in function 'cmd_exec'
/usr/local/share/lua/5.1/kong/cmd/init.lua:31: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:31>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/kong/cmd/init.lua:31: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:15>
(command line -e):5: in function 'inline_gen'
init_worker_by_lua(nginx.conf:185):44: in function <init_worker_by_lua(nginx.conf:185):43>
[C]: in function 'xpcall'
init_worker_by_lua(nginx.conf:185):52: in function <init_worker_by_lua(nginx.conf:185):50>
Here is the nslookup output for the endpoint in question, run from inside the same pod:
nslookup psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com
;; Got recursion not available from 10.0.0.10
;; Got recursion not available from 10.0.0.10
;; Got recursion not available from 10.0.0.10
;; Got recursion not available from 10.0.0.10
Server: 10.0.0.10
Address: 10.0.0.10#53
Non-authoritative answer:
psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com canonical name = psql-hcp-apim-dmz-cp-dev-centralus.privatelink.postgres.database.azure.com.
Name: psql-hcp-apim-dmz-cp-dev-centralus.privatelink.postgres.database.azure.com
Address: 10.15.34.69
So at the pod level the name does resolve, but when running migrations it doesn't.
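One way to narrow this down might be to send a single UDP A query to 10.0.0.10 with a short timeout and no retries, since the Kong errors all report a UDP timeout after a few hundred milliseconds while nslookup only answers after several "recursion not available" lines. The following is a rough sketch under assumptions, not a confirmed diagnosis: it assumes Python 3 is available somewhere on the same pod network (it may not be in the stock Kong image), the function names are made up for illustration, and the 0.5 second timeout is an arbitrary choice rather than Kong's actual setting.

```python
import secrets
import socket
import struct


def build_a_query(hostname: str, query_id: int) -> bytes:
    """Encode a minimal DNS query for an A record (type 1, class IN)."""
    # Header: id, flags (only RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is length-prefixed, terminated by a zero byte.
    labels = hostname.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels)
    return header + qname + b"\x00" + struct.pack("!HH", 1, 1)


def summarize_reply(reply: bytes, query_id: int) -> str:
    """Report the header fields of a reply without decoding the answers."""
    if len(reply) < 12:
        return "short reply"
    reply_id, flags, _, answer_count, _, _ = struct.unpack("!HHHHHH", reply[:12])
    if reply_id != query_id:
        return "mismatched id"
    rcode = flags & 0xF       # 0 = no error, 2 = SERVFAIL, 3 = NXDOMAIN
    tc = (flags >> 9) & 1     # 1 = truncated, a client would retry over TCP
    ra = (flags >> 7) & 1     # 0 = server says recursion is not available
    return f"rcode={rcode} tc={tc} ra={ra} answers={answer_count}"


def udp_a_query(server: str, hostname: str, timeout_s: float, port: int = 53) -> str:
    """Send one UDP query with no retry and no TCP fallback."""
    query_id = secrets.randbelow(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout_s)
        sock.sendto(build_a_query(hostname, query_id), (server, port))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
    return summarize_reply(reply, query_id)


if __name__ == "__main__":
    # Server and hostname are taken from the logs above; the timeout is a guess.
    name = "psql-hcp-apim-dmz-cp-dev-centralus.postgres.database.azure.com"
    for attempt in range(5):
        print(attempt, udp_a_query("10.0.0.10", name, timeout_s=0.5))
```

If most attempts print "timeout" while nslookup still succeeds, that would suggest the first UDP reply from 10.0.0.10 is slow or dropped and nslookup is only succeeding because it retries. If every attempt returns quickly with answers, the difference is more likely in how Kong's resolver is configured.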