Kong performance test: why do 10 cores of a 20-core CPU show 40-50% utilization, while the remaining 10 cores sit at 0% or near idle?


#1

Additional Details & Logs

  • Kong version: 0.14.1
  • Kong debug-level startup: kong start (error log level)
  • Kong configuration:

/usr/local/kong/nginx.conf

worker_processes auto;
worker_cpu_affinity auto;
daemon on;

pid pids/nginx.pid;
error_log /opt/data/kong/logs/error.log error;

worker_rlimit_nofile 300000;

events {
worker_connections 102400;
multi_accept on;
}

http {
include 'nginx-kong.conf';
}


/usr/local/kong/nginx-kong.conf

charset UTF-8;

error_log syslog:server=kong-hf.mashape.com:61828 error;

error_log /opt/data/kong/logs/error.log error;

server_tokens off;
include mime.types;
default_type application/octet-stream;
client_header_buffer_size 32k;

keepalive_timeout 300s 300s;
keepalive_requests 10000;

log_format main '$time_local $remote_addr $status $server_addr $http_host "$request" $body_bytes_sent "$http_referer" $http_user_agent $upstream_addr $request $upstream_addr $upstream_response_time $request_time $http_x_forwarded_for $http_tenant_id';

sendfile on;
tcp_nopush on;
tcp_nodelay on;

#reset_timedout_connection on;
#limit_conn_zone $binary_remote_addr zone=addr:5m;
#limit_conn addr 100;
gzip on;
gzip_disable "msie6";

gzip_static on;

gzip_proxied any;
gzip_min_length 1000;
gzip_comp_level 4;
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

open_file_cache max=100000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
client_max_body_size 50m;
client_body_buffer_size 128k;
proxy_connect_timeout 300;
proxy_send_timeout 300;
proxy_read_timeout 300;
proxy_buffer_size 64k;
proxy_buffers 4 32k;
proxy_busy_buffers_size 64k;
proxy_temp_file_write_size 64k;
proxy_ignore_client_abort on;

proxy_ssl_server_name on;
underscores_in_headers on;

lua_package_path './?.lua;./?/init.lua;;;';
lua_package_cpath ';;';
lua_socket_pool_size 30;
lua_max_running_timers 4096;
lua_max_pending_timers 16384;
lua_shared_dict kong 50m;
lua_shared_dict kong_db_cache 256m;
lua_shared_dict kong_db_cache_miss 120m;
lua_shared_dict kong_locks 80m;
lua_shared_dict kong_process_events 500m;
lua_shared_dict kong_cluster_events 50m;
lua_shared_dict kong_healthchecks 50m;
lua_shared_dict kong_rate_limiting_counters 120m;
lua_socket_log_errors off;

# injected nginx_http_* directives

lua_shared_dict prometheus_metrics 5m;


#2

My server's CPU is hyper-threaded.

[root@local_11p21 ~]# kong start
[root@local_11p21 ~]# ps -ef | grep nginx
root 2603 2038 0 10:25 pts/3 00:00:00 grep --color=auto nginx
root 21153 1 0 Nov26 ? 00:00:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /usr/local/kong -c nginx.conf
nobody 21154 21153 7 Nov26 ? 01:07:27 nginx: worker process
nobody 21155 21153 8 Nov26 ? 01:16:40 nginx: worker process
nobody 21156 21153 7 Nov26 ? 01:07:59 nginx: worker process
nobody 21157 21153 7 Nov26 ? 01:07:59 nginx: worker process
nobody 21158 21153 7 Nov26 ? 01:10:20 nginx: worker process
nobody 21159 21153 7 Nov26 ? 01:07:28 nginx: worker process
nobody 21160 21153 7 Nov26 ? 01:10:00 nginx: worker process
nobody 21161 21153 6 Nov26 ? 01:02:26 nginx: worker process
nobody 21162 21153 6 Nov26 ? 01:01:11 nginx: worker process
nobody 21163 21153 6 Nov26 ? 01:04:26 nginx: worker process
nobody 21164 21153 6 Nov26 ? 01:05:01 nginx: worker process
nobody 21165 21153 7 Nov26 ? 01:13:22 nginx: worker process
nobody 21166 21153 7 Nov26 ? 01:12:34 nginx: worker process
nobody 21167 21153 7 Nov26 ? 01:14:17 nginx: worker process
nobody 21168 21153 8 Nov26 ? 01:15:09 nginx: worker process
nobody 21170 21153 8 Nov26 ? 01:19:40 nginx: worker process
nobody 21171 21153 8 Nov26 ? 01:22:30 nginx: worker process
nobody 21172 21153 8 Nov26 ? 01:24:04 nginx: worker process
nobody 21173 21153 9 Nov26 ? 01:27:30 nginx: worker process
nobody 21174 21153 10 Nov26 ? 01:34:18 nginx: worker process

Total number of nginx worker processes: 20
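One quick way to check whether the load really is landing on only half the cores (a diagnostic sketch, assuming a Linux box with procps; run it while the performance test is in progress):

```shell
# PSR is the CPU each process last ran on; %CPU is its utilization.
# The awk filter keeps the header row plus any nginx processes, so the
# command succeeds even when nginx is not running.
ps -eo pid,psr,pcpu,comm | awk 'NR==1 || /nginx/'
```

If the workers' PSR values only ever cover 10 of the 20 logical CPUs, the pinning produced by worker_cpu_affinity auto is worth a closer look.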


#3

Try setting accept_mutex on; in the events block.
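For context, that directive lives in nginx's events block; a sketch of where it would go (accept_mutex makes workers take turns accepting new connections instead of all waking at once, and it has defaulted to off since nginx 1.11.3):

```nginx
events {
    worker_connections 102400;
    multi_accept on;
    accept_mutex on;           # workers take turns calling accept()
    accept_mutex_delay 500ms;  # how long an idle worker waits before retrying
}
```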


#4

What percentage are you calling "very few"? Something like 1-10%, or another estimate from your monitoring? This may honestly be a better question for the nginx project itself, since Kong doesn't manage workload distribution across workers, although I doubt nginx has an inherent flaw that prevents machines with 10+ cores from spreading the load evenly. Your configs look fine to me at a glance with:

worker_processes auto;
worker_cpu_affinity auto;

But I am still more of a beginner when it comes to tuning nginx for performance.

I personally configured ours with an explicit number of worker processes, because I actually saw better performance that way than by letting nginx work it out itself:

nginxConf
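A minimal sketch of what pinning the worker count explicitly can look like (the value 20 is illustrative, matching the 20 logical cores discussed above; it is not the poster's actual file):

```nginx
worker_processes 20;       # one worker per logical core, set explicitly
worker_cpu_affinity auto;  # still let nginx bind each worker to its own core
```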