Hi guys,
We are using the Kong Ingress Controller to route all incoming requests to the several GUIs that our app has. We randomly experience an error where we get a “Kong Error” message in the browser. Looking at the Kong logs, we found this:
2019/12/03 16:35:33 [error] 36#0: *28506728 upstream timed out (110: Operation timed out) while connecting to upstream, client: X.X.X.X, server: kong, request: "GET /mat/preprod/workflow/ HTTP/1.1", upstream: "http://10.233.69.49:8080/mat/preprod/workflow/", host: "mat.iquall.net"
We suspected an issue in the pod serving the application, so we restarted the app pod, to no avail. Only after restarting the Kong pods (Kong is deployed as a DaemonSet) did we get the service up and running again.
Our Kubernetes cluster has 4 worker nodes; one of the pods in the DaemonSet worked correctly, while the other 3 were unable to contact the upstream server.
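As a next step we plan to compare the Kong pods directly by curling the upstream pod from inside each of them. A rough sketch of that check (the kong namespace, app=kong label and proxy container name are placeholders for our actual deployment, and it assumes curl is available in the image):

# Curl the current upstream pod IP from inside every Kong DaemonSet pod;
# a working pod should print the HTTP status, a broken one 000 on timeout.
for pod in $(kubectl get pods -n kong -l app=kong -o jsonpath='{.items[*].metadata.name}'); do
  echo "== $pod =="
  kubectl exec -n kong "$pod" -c proxy -- \
    curl -s -m 5 -o /dev/null -w '%{http_code}\n' \
    http://10.233.84.241:8080/mat/preprod/workflow/
done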
We are using Kong version 1.2.2, and our configuration is as follows:
########### INGRESS #################
# kubectl describe ingresses -n mat workflow-preprod-webserver
Name: workflow-preprod-webserver
Namespace: mat
Address:
Default backend: default-http-backend:80 (<none>)
Rules:
Host Path Backends
---- ---- --------
mat.iquall.net
/mat/preprod/workflow/ workflow-preprod-webserver:8080 (10.233.84.241:8080)
Annotations:
configuration.konghq.com: kongingress-webserver-preprod-workflow
kubernetes.io/ingress.class: kong
Events: <none>
########### SERVICE #################
root@mat-master2:~# kubectl describe svc workflow-preprod-webserver -n mat
Name: workflow-preprod-webserver
Namespace: mat
Labels: apps=workflow-webserver
env=preprod
Annotations: <none>
Selector: app=workflow-preprod-webserver
Type: ClusterIP
IP: 10.233.47.57
Port: webserver 8080/TCP
TargetPort: 8080/TCP
Endpoints: 10.233.84.241:8080
Session Affinity: None
Events: <none>
########### POD ###################
root@mat-master2:~# kubectl describe pods -n mat workflow-preprod-webserver-6566cf6f46-6d89m
Name: workflow-preprod-webserver-6566cf6f46-6d89m
Namespace: mat
Priority: 0
Node: mat-worker1/10.48.72.39
Start Time: Tue, 03 Dec 2019 13:26:56 -0300
Labels: app=workflow-preprod-webserver
pod-template-hash=6566cf6f46
Annotations: <none>
Status: Running
IP: 10.233.84.241
Controlled By: ReplicaSet/workflow-preprod-webserver-6566cf6f46
Containers:
workflow:
.....
############# KONG CONFIGURATION #############
kong=# SELECT * FROM routes WHERE "paths" = '{/mat/preprod/workflow/}';
-[ RECORD 1 ]--------------+-------------------------------------
id | 07fd60f7-78cc-4327-982e-d53591e528aa
created_at | 2019-12-03 16:51:07+00
updated_at | 2019-12-03 16:51:07+00
service_id | 2edac309-4c14-4e78-b471-6e1c09bd3b0b
protocols | {http,https}
methods |
hosts | {mat-operaciones.claro.amx}
paths | {/mat/preprod/workflow/}
regex_priority | 0
strip_path | f
preserve_host | t
name | mat.workflow-preprod-webserver.00
snis |
sources |
destinations |
tags | {managed-by-ingress-controller}
https_redirect_status_code | 426
kong=# SELECT * FROM services WHERE "id" = '2edac309-4c14-4e78-b471-6e1c09bd3b0b';
-[ RECORD 1 ]---+-------------------------------------
id | 2edac309-4c14-4e78-b471-6e1c09bd3b0b
created_at | 2019-10-10 18:43:04+00
updated_at | 2019-10-10 18:43:04+00
name | mat.workflow-preprod-webserver.8080
retries | 5
protocol | http
host | workflow-preprod-webserver.mat.svc
port | 80
path | /
connect_timeout | 60000
write_timeout | 60000
read_timeout | 60000
tags | {managed-by-ingress-controller}
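Since the upstream IP in the error log (10.233.69.49) is not the current pod endpoint (10.233.84.241), we also want to see which targets Kong is actually holding at runtime and compare them against the Service endpoints. A sketch, assuming the Admin API listens on localhost:8001 inside the pod and the upstream is named after the service host shown above (both may differ in other deployments):

# List the targets Kong has for the upstream backing this service.
kubectl exec -n kong <kong-pod> -c proxy -- \
  curl -s http://localhost:8001/upstreams/workflow-preprod-webserver.mat.svc/targets
# Compare against what Kubernetes currently reports as the endpoints.
kubectl get endpoints -n mat workflow-preprod-webserver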
What caught my attention is that the IP shown in the Ingress backend is the Pod's IP (10.233.84.241) rather than the Service's ClusterIP (10.233.47.57). Is this expected?
Any clues on where to look, or further troubleshooting steps we could take?
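We also came across the service-upstream annotation, which (if our controller version supports it) should make Kong forward to the Service ClusterIP instead of balancing across pod IPs itself. Would something like this be a sensible workaround?

# Hypothetical workaround: route via the Service ClusterIP rather than
# per-pod targets (annotation name per the KIC docs; needs controller support).
kubectl annotate service workflow-preprod-webserver -n mat \
  ingress.kubernetes.io/service-upstream="true"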
Thanks in advance.
Regards,
Diego