Kong Nginx 503 error


#1

Hi all,
We just set up a Kong cluster with two Kong nodes in our production environment. We didn’t put a load balancer in front of Kong; instead, an Nginx service on one of the Kong nodes proxies requests to the Kong nodes.

While running a load test with the Apache ab tool, we saw a lot of failed requests.

[work@DWD-BETA ~]$ab -n 1000 -c 100 "http://www.abc.com/api/"
This is ApacheBench, Version 2.3 <$Revision: 1430300 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking www.abc.com (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests


Server Software:
Server Hostname:        www.abc.com
Server Port:            80

Document Path:          /api
Document Length:        206 bytes

Concurrency Level:      100
Time taken for tests:   7.894 seconds
Complete requests:      1000
Failed requests:        599
   (Connect: 0, Receive: 0, Length: 599, Exceptions: 0)
Write errors:           0
Non-2xx responses:      401
Total transferred:      599127 bytes
HTML transferred:       365933 bytes
Requests per second:    126.67 [#/sec] (mean)
Time per request:       789.428 [ms] (mean)
Time per request:       7.894 [ms] (mean, across all concurrent requests)
Transfer rate:          74.12 [Kbytes/sec] received
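
A note on reading this report: as far as I know, ab counts a request as a "Length" failure whenever its body size differs from the first response it received, so the 599 "failed" requests are not necessarily the ones that returned errors. The sketch below (Python; `summarize_ab` is a hypothetical helper name, not part of ab) pulls the counters out of the report above and checks that the Length failures plus the non-2xx responses account for all 1000 requests. That would be consistent with the first response being the 206-byte error body shown under "Document Length", but it is only a reading of the numbers, not a confirmed diagnosis.

```python
import re

# Counters copied from the ab report above.
AB_REPORT = """\
Complete requests:      1000
Failed requests:        599
   (Connect: 0, Receive: 0, Length: 599, Exceptions: 0)
Non-2xx responses:      401
"""

def summarize_ab(report: str) -> dict:
    """Extract the main counters from an ab summary (hypothetical helper)."""
    patterns = {
        "complete": r"Complete requests:\s+(\d+)",
        "failed": r"Failed requests:\s+(\d+)",
        # Only match the Length counter inside the parenthesised breakdown,
        # not the "Document Length" line.
        "length_failures": r"\(Connect:.*?Length:\s+(\d+)",
        "non_2xx": r"Non-2xx responses:\s+(\d+)",
    }
    counts = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, report)
        counts[name] = int(match.group(1)) if match else 0
    return counts

counts = summarize_ab(AB_REPORT)
total = counts["length_failures"] + counts["non_2xx"]
print(counts)
print("length failures + non-2xx =", total, "of", counts["complete"])
```

If this reading is right, the number to watch is "Non-2xx responses", not "Failed requests".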

After checking the Kong proxy access logs, I found a lot of 503 errors:

xx.xx.xx.xx - - [21/Dec/2018:16:24:47 +0800] "GET /api/ HTTP/1.0" 503 206 "-" "-" "ApacheBench/2.3 "xx.xx.xx.xx"-" 0.001 "0.001"
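
To see how many requests actually returned 503 compared with other codes, the status codes in the access log can be tallied. This is a minimal Python sketch that assumes every line has the same shape as the sample above (quoted request line followed by the status code); `count_statuses` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the quoted request line followed by the status code, e.g.
#   "GET /api/ HTTP/1.0" 503 206
STATUS_RE = re.compile(r'"[A-Z]+ \S+ HTTP/[\d.]+" (\d{3}) ')

def count_statuses(lines):
    """Return a Counter of HTTP status codes found in access-log lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The sample line from the log above.
sample = [
    'xx.xx.xx.xx - - [21/Dec/2018:16:24:47 +0800] "GET /api/ HTTP/1.0" 503 206 '
    '"-" "-" "ApacheBench/2.3 "xx.xx.xx.xx"-" 0.001 "0.001"',
]
print(count_statuses(sample))
```

For a real log, pass the open file instead of the sample list, for example `count_statuses(open(path))`, where `path` points to the proxy access log.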

I tried tuning Kong's Nginx. The nginx.conf file is below:

[root@kong-node2 nginx]$cat /data/kong/nginx.conf
worker_processes auto;
daemon on;

pid pids/nginx.pid;
error_log logs/error.log notice;

worker_rlimit_nofile 65535;


events {

use epoll;
worker_connections 65535;
multi_accept on;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" "$http_cookie" "$http_user_agent"'
                      '$remote_addr $server_addr $upstream_addr $host'
                      '"$http_x_forwarded_for" $upstream_response_time "$request_time"';

    default_type  application/octet-stream;
    access_log  /data/logs/nginx/access.log  main;

    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;

    fastcgi_connect_timeout 5;
    fastcgi_send_timeout 600;
    fastcgi_read_timeout 600;
    fastcgi_buffer_size 64k;
    fastcgi_buffers 4 64k;
    fastcgi_busy_buffers_size 128k;
    fastcgi_temp_file_write_size 128k;

    keepalive_timeout  60;
    keepalive_requests 1024;
    client_header_buffer_size 4k;
    large_client_header_buffers 4 32k;
    types_hash_max_size 2048;

    client_body_buffer_size 2m;
    client_body_timeout 180;
    client_header_timeout 10;
    send_timeout 240;

    proxy_connect_timeout   1000ms;
    proxy_send_timeout      5000ms;
    proxy_read_timeout      5000ms;
    proxy_buffers           64 8k;
    proxy_busy_buffers_size    128k;
    proxy_temp_file_write_size 64k;
    proxy_redirect off;
    proxy_next_upstream off;

    gzip on;
    gzip_min_length 1k;
    gzip_buffers 4 16k;
    gzip_http_version 1.0;
    gzip_comp_level 2;
    gzip_types text/plain application/x-javascript text/css application/xml;
    gzip_vary on;

    server_tokens  off;
    include conf.d/*.conf;
    include 'nginx-kong.conf';
}

But it didn’t help.

Then I tried using a load balancer to proxy requests to the Kong nodes, but the problem still happens.

The problem disappears if the load balancer proxies requests directly to the real servers on the backend.

The Kong version is the latest (v1.14) and the Cassandra database is 3.11.3.

I have been troubleshooting this problem for a whole day, and I could not find any helpful information on Google.

Could anyone help me, please?

Thank you very much in advance. Forgive my bad English!


#2

Hi @jesse,

I’m not sure I get exactly what your configuration is, but my tips to debug this:

  • Check error.log as well; there might be some indication of where the problem is.
  • Do you get failures only under high load, or do single requests also fail?
  • At least in the beginning, try to configure Kong through kong.conf to keep it as standard as possible.
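
For the second tip, one way to check whether single requests also fail is to send them one at a time, with no concurrency, and tally the results. This is only a rough Python sketch using the standard library; `fetch_status` and `probe` are hypothetical helper names, and the URL in the comment is the placeholder from the ab command in the first post.

```python
from collections import Counter
from urllib.error import HTTPError
from urllib.request import urlopen

def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for one GET, or a short error label."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # 4xx/5xx responses raise HTTPError but still carry a status code.
        return err.code
    except OSError as err:
        # DNS failures, refused connections and timeouts end up here.
        return f"error: {err}"

def probe(url, attempts=20, fetch=fetch_status):
    """Send requests sequentially and count each outcome."""
    return Counter(fetch(url) for _ in range(attempts))

# Example call (placeholder URL):
#   print(probe("http://www.abc.com/api/", attempts=50))
```

If 503s show up even here, the load level is probably not the trigger.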

#3

Hi Raimon_Grau,

Thank you for your kind response. I am sorry for replying so late.

  1. check error.log also there might be some indication of where the problem is

During the load test, Kong’s error log didn’t record anything. But there are some errors under real traffic:

2018/12/24 03:50:52 [error] 27097#0: *14469262 upstream timed out (110: Connection timed out) while reading response header from upstream
2018/12/24 03:50:52 [warn] 27097#0: *14471407 upstream server temporarily disabled while reading response header from upstream
2018/12/24 03:50:52 [error] 27097#0: *14471407 no live upstreams while connecting to upstream
2018/12/24 07:41:56 [warn] 27097#0: *14611324 a client request body is buffered to a temporary file /data/kong/client_body_temp/0000000427
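
To get a feel for how often each of these messages occurs, the error log can be grouped by message. The Python sketch below assumes the standard Nginx error-log line shape seen above; `summarize_errors` is a hypothetical helper name. As far as I understand, in stock Nginx the "temporarily disabled" and "no live upstreams" messages usually appear when passive failure tracking on an upstream block has marked its servers as failed, but I have not confirmed that this is what happens here.

```python
import re
from collections import Counter

# Nginx error-log lines look like:
#   2018/12/24 03:50:52 [error] 27097#0: *14469262 message ...
LINE_RE = re.compile(r"^\S+ \S+ \[(\w+)\] \d+#\d+: (?:\*\d+ )?(.*)$")

def summarize_errors(lines):
    """Count error-log entries by (level, message)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        level, message = match.groups()
        # Drop the trailing " while ..." context so similar entries group together.
        counts[(level, message.split(" while ")[0])] += 1
    return counts

# The first three lines from the log above.
sample_log = [
    "2018/12/24 03:50:52 [error] 27097#0: *14469262 upstream timed out (110: Connection timed out) while reading response header from upstream",
    "2018/12/24 03:50:52 [warn] 27097#0: *14471407 upstream server temporarily disabled while reading response header from upstream",
    "2018/12/24 03:50:52 [error] 27097#0: *14471407 no live upstreams while connecting to upstream",
]
for (level, message), n in summarize_errors(sample_log).items():
    print(n, level, message)
```
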
  2. Do you get failures only when using high load? or single requests also fail?

Our production traffic runs through Kong, and it basically works normally except for the occasional 503 errors.

  3. At least in the beginning, try to configure kong through kong.conf to keep it as standard as possible.

Below is my kong.conf configuration:

[work@kong-node2 kong]$sed -n '/^#/!p' kong.conf | sed -rn '/^[[:space:]]+#/!p' | sed '/^$/d'
prefix = /data/kong/       # Working directory. Equivalent to Nginx's
proxy_access_log = logs/access.log       # Path for proxy port request access
proxy_error_log = logs/error.log         # Path for proxy port request error
admin_listen = xx.xx.xx.xx:8001, 127.0.0.1:8444 ssl
database = cassandra
cassandra_contact_points = xx.xx.xx.xx  # A comma-separated list of contact
cassandra_port = 9042           # The port on which your nodes are listening
cassandra_keyspace = kong       # The keyspace to use in your cluster.
cassandra_username = username       # Username when using the
cassandra_password = password          # Password when using the
cassandra_consistency = QUORUM     # Consistency setting to use when reading/
db_update_frequency = 5         # Frequency (in seconds) at which to check for
db_update_propagation = 2       # Time (in seconds) taken for an entity in the