Kong Kubernetes pod does not start after system restart

#1
  1. Installed the Kong API gateway on my laptop’s Docker-for-Desktop k8s cluster.
  2. It works well.
  3. When the k8s cluster is restarted, k8s comes up and the Postgres pod starts successfully.
  4. However, the Kong pod does not start. It shows a ‘Terminated: Error’ status.
  5. I deleted the pod and k8s tried to create a new one, but it stays in ‘Waiting: PodInitializing’ status forever.

Summary: after a cluster restart, the Kong pod does not start successfully.

Please help if anyone has experienced this issue before.

0 Likes

#2

Hi @Prashant_Shandilya

  1. How have you installed Kong on Kubernetes? Could you share your deployment spec?
  2. Did you check the logs of the pods which failed to come up?
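For reference, commands along these lines would capture both; the kg release name, kg namespace, and container names below are assumptions based on the rest of this thread, and the pod name is a placeholder:

# Rendered chart values and the Kong deployment spec
helm get values kg
kubectl -n kg get deployment kg-kong -o yaml

# Logs from the failing pod's kong container
kubectl -n kg logs <kong-pod-name> -c kong --previous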
0 Likes

#3

Hi @hbagdi
Please refer to the screenshot below of my k8s deployment.

I did check the logs; they read as below:

container "kong" in pod "bm-kongv1-kong-d55969f-x542v" is waiting to start: PodInitializing

I was able to reproduce the issue consistently on another instance.

  1. Installed Kong with:

helm install --name kg stable/kong

  2. It installed the Kong gateway; everything was up and running.
  3. Restarted Docker Desktop.
  4. All other k8s services started correctly, except for the Kong service (status check sketched below).
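After the restart, the stuck state can be confirmed with something like the following; the namespace is assumed and the pod name is only an example taken from the describe output further down:

kubectl -n kg get pods
# Show the init container state for the stuck pod
kubectl -n kg get pod kg-kong-6cc76cdcb9-xdlbr -o jsonpath='{.status.initContainerStatuses[*].state}'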

Logs -

2019/04/09 09:15:51 [notice] 1#0: using the "epoll" event method
2019/04/09 09:15:51 [notice] 1#0: openresty/1.13.6.2
2019/04/09 09:15:51 [notice] 1#0: built by gcc 6.3.0 (Alpine 6.3.0)
2019/04/09 09:15:51 [notice] 1#0: OS: Linux 4.9.125-linuxkit
2019/04/09 09:15:51 [notice] 1#0: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2019/04/09 09:15:51 [notice] 1#0: start worker processes
2019/04/09 09:15:51 [notice] 1#0: start worker process 37
2019/04/09 09:15:51 [notice] 1#0: start worker process 38
10.1.0.1 - - [09/Apr/2019:09:15:51 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:01 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:04 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:11 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:21 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:31 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:34 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:41 +0000] "GET /status HTTP/1.1" 200 205 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:16:51 +0000] "GET /status HTTP/1.1" 200 207 "-" "kube-probe/1.10"
192.168.65.3 - - [09/Apr/2019:09:16:51 +0000] "GET / HTTP/1.1" 200 5567 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
192.168.65.3 - - [09/Apr/2019:09:16:52 +0000] "GET /favicon.ico HTTP/1.1" 404 23 "https://localhost:31713/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
10.1.0.1 - - [09/Apr/2019:09:17:01 +0000] "GET /status HTTP/1.1" 200 208 "-" "kube-probe/1.10"
10.1.0.1 - - [09/Apr/2019:09:17:04 +0000] "GET /status HTTP/1.1" 200 208 "-" "kube-probe/1.10"
2019/04/09 09:17:10 [notice] 37#0: signal 15 (SIGTERM) received, exiting

0 Likes

#4

What’s the output if you describe the pod?
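i.e. something like the following, with the namespace assumed and the actual pod name substituted:

kubectl -n kg describe pod <kong-pod-name>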

0 Likes

#5

Name:           kg-kong-6cc76cdcb9-xdlbr
Namespace:      kg
Node:           docker-for-desktop/192.168.65.3
Start Time:     Tue, 09 Apr 2019 14:42:41 +0530
Labels:         app=kong
                component=app
                pod-template-hash=2773278765
                release=kg
Annotations:
Status:         Pending
IP:             10.1.1.147
Controlled By:  ReplicaSet/kg-kong-6cc76cdcb9
Init Containers:
  wait-for-db:
    Container ID:  docker://71dc0fab06deb5ce555b5fc9b788bb642a4939f220d0989dc596de0ae1b92347
    Image:         kong:1.0.2
    Image ID:      docker-pullable://kong@sha256:555863cf0b3cfae8fc9265f8dd36f0db30fafc0ac7791be0c29f70f8c9b130e8
    Port:
    Host Port:
    Command:
      /bin/sh
      -c
      until kong start; do echo 'waiting for db'; sleep 1; done; kong stop
    State:          Running
      Started:      Tue, 09 Apr 2019 14:51:07 +0530
    Ready:          False
    Restart Count:  1
    Environment:
      KONG_PROXY_ACCESS_LOG:  /dev/stdout
      KONG_ADMIN_ACCESS_LOG:  /dev/stdout
      KONG_PROXY_ERROR_LOG:   /dev/stderr
      KONG_ADMIN_ERROR_LOG:   /dev/stderr
      KONG_PG_HOST:           kg-postgresql
      KONG_PG_PORT:           5432
      KONG_PG_PASSWORD:       <set to the key 'postgresql-password' in secret 'kg-postgresql'>  Optional: false
      KONG_DATABASE:          postgres
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wqfg6 (ro)
Containers:
  kong:
    Container ID:  docker://d8104d677642e1c8f002405fb86e4316016f277932447dc2ce702c095052803a
    Image:         kong:1.0.2
    Image ID:      docker-pullable://kong@sha256:555863cf0b3cfae8fc9265f8dd36f0db30fafc0ac7791be0c29f70f8c9b130e8
    Ports:         8444/TCP, 8000/TCP, 8443/TCP
    Host Ports:    0/TCP, 0/TCP, 0/TCP
    State:          Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Tue, 09 Apr 2019 14:45:12 +0530
      Finished:     Tue, 09 Apr 2019 14:49:25 +0530
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:admin/status delay=30s timeout=5s period=30s #success=1 #failure=5
    Readiness:      http-get https://:admin/status delay=30s timeout=1s period=10s #success=1 #failure=5
    Environment:
      KONG_ADMIN_LISTEN:      0.0.0.0:8444 ssl
      KONG_PROXY_LISTEN:      0.0.0.0:8000,0.0.0.0:8443 ssl
      KONG_NGINX_DAEMON:      off
      KONG_PROXY_ACCESS_LOG:  /dev/stdout
      KONG_ADMIN_ACCESS_LOG:  /dev/stdout
      KONG_PROXY_ERROR_LOG:   /dev/stderr
      KONG_ADMIN_ERROR_LOG:   /dev/stderr
      KONG_DATABASE:          postgres
      KONG_PG_HOST:           kg-postgresql
      KONG_PG_PORT:           5432
      KONG_PG_PASSWORD:       <set to the key 'postgresql-password' in secret 'kg-postgresql'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wqfg6 (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  default-token-wqfg6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wqfg6
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:

0 Likes

#6

To narrow down the exact root cause, I tried out several permutations:

  1. Scale the Kong pod down and back up (from 0 to 3): works (commands sketched below).
  2. Restart the k8s cluster: works.
  3. Restart ‘Docker for Desktop’: does not work, and scaling down and back up does not recover it either.
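The scale down/up in item 1 was along these lines; the kg-kong deployment name is assumed from the ReplicaSet shown in the describe output above:

kubectl -n kg scale deployment kg-kong --replicas=0
kubectl -n kg scale deployment kg-kong --replicas=3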
0 Likes

#7

Based on that snippet, it looks like k8s hasn’t told Kong to start because the initContainer hasn’t triggered it(?)

What logs show up in that initContainer? Can you exec into it and determine why it’s hung?
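For example, something like this (namespace assumed, substitute the actual pod name):

# Logs of the init container
kubectl -n kg logs <kong-pod-name> -c wait-for-db
# Shell into the init container to see what it is waiting on
kubectl -n kg exec -it <kong-pod-name> -c wait-for-db -- /bin/sh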

0 Likes

#8

Error log -

waiting for db
database needs bootstrapping; run 'kong migrations bootstrap'
Error: /usr/local/share/lua/5.1/kong/cmd/start.lua:50: nginx: [error] init_by_lua error: /usr/local/share/lua/5.1/kong/init.lua:281: database needs bootstrap; run 'kong migrations bootstrap'
stack traceback:
[C]: in function 'error'
/usr/local/share/lua/5.1/kong/init.lua:281: in function 'init'
init_by_lua:3: in main chunk
2019-04-10T07:35:37.237272700Z
2019-04-10T07:35:37.237277900Z
Run with --v (verbose) or --vv (debug) for more details
waiting for db
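For context, the bootstrap the error asks for is a one-off kong migrations bootstrap run against the database; from the init container it would look roughly like this. The pod name is a placeholder, and whether running it is safe depends on why the schema disappeared in the first place:

kubectl -n kg exec -it <kong-pod-name> -c wait-for-db -- kong migrations bootstrap --v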

0 Likes

#9

Very interesting. Continuing down this rabbit hole: while exec’d into that container, can you connect to and introspect the database? Are all the prerequisite databases/tables present?
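For example, from the Postgres side; the pod name, user, and database below are assumptions (stable/kong chart defaults), and psql may prompt for the password stored in the kg-postgresql secret:

# List databases, then the tables in the kong database
kubectl -n kg exec -it <postgres-pod-name> -- psql -U kong -d kong -c '\l'
kubectl -n kg exec -it <postgres-pod-name> -- psql -U kong -d kong -c '\dt'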

0 Likes

#10

Tried the same setup on a k8s cluster set up on Ubuntu (AWS EC2). Had stability issues there as well :frowning:

0 Likes

#11

With the same symptoms? i.e. the same errors as above?

0 Likes

#12

I had suspected that Docker for Desktop’s Kubernetes implementation might have a bug in your previous case, but the fact that this is possible on EKS (are you using EKS?) is very odd.

I myself have Kong running in a GKE cluster and that doesn’t seem to have this problem (yet).

I’d like to point out that there are two separate problems here:

  • The first error, about the database needing a bootstrap, means that the Postgres Pod’s backing store had a problem and a database reset somehow happened.
  • The problem you see on AWS is a different error: a particular migration is missing.

Could you check your Postgres deployment?
Were there any pod restarts for Postgres around the time Kong started failing?
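A quick way to check that (namespace assumed, pod name a placeholder):

# Restart count and recent events for the Postgres pod
kubectl -n kg get pods
kubectl -n kg describe pod <postgres-pod-name>
# Logs from the previous run, if it did restart
kubectl -n kg logs <postgres-pod-name> --previous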

As @hutchic points out, could you please log into the Postgres DB and list the tables that are in the database? Additionally, please paste the contents of the schema and schema_meta tables into a GitHub Gist and post the URL here.
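For example (same assumptions about pod name, user, and database as above), the schema_meta contents could be dumped like this, and the other table the same way:

kubectl -n kg exec -it <postgres-pod-name> -- psql -U kong -d kong -c 'SELECT * FROM schema_meta;'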

1 Like