Kong timeouts and duplicate key warnings in log with new docker swarm setup

#1

Hello Kong people,

First of all, thank you for this great product! We love the way Kong manages our API calls!

We changed our setup from a single Kong node to a Docker Swarm setup.
For the first few days everything was fine, but then we had to roll back to the single node because of two problems: timeouts and duplicate-key errors in the database.
We are now planning to run load tests before going live again, but we do not know what the underlying problem is. Am I missing something? Do I have to use sticky sessions on the load balancer? Or do I need extra settings on the Kong nodes?
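For the load tests, this is roughly the kind of script I have in mind: a minimal sketch only, not something we have run against the swarm yet. The helper names `timed_get` and `run_load`, the request counts, and the URL in the usage comment are all placeholders I made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def timed_get(url, timeout=5.0):
    """Fetch url once; return (status, seconds). status is None on timeout or connection error."""
    start = time.monotonic()
    try:
        with urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        status = exc.code
    except (URLError, OSError):
        # Timeout, refused connection, DNS failure, and similar.
        status = None
    return status, time.monotonic() - start


def run_load(fetch, total=200, concurrency=20):
    """Call fetch() `total` times from `concurrency` threads and summarise the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: fetch(), range(total)))
    # Count requests that got no answer at all or a 5xx answer.
    failures = sum(1 for status, _ in results if status is None or status >= 500)
    slowest = max(seconds for _, seconds in results)
    return {"total": total, "failures": failures, "slowest": slowest}


# Example usage (hypothetical route behind the load balancer):
# print(run_load(lambda: timed_get("http://localhost:8000/some-route")))
```

The idea would be to run this once against the single node and once against the swarm, and compare the failure counts and the slowest response.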

We see a lot of timeouts in our apps' error log.

In the Kong log:
insert(): ERROR: duplicate key value violates unique constraint "acls_cache_key_key"

In the PostgreSQL log:
ERROR: duplicate key value violates unique constraint "acls_cache_key_key"

DETAIL: Key (cache_key)=(acls:02bb8f64-dd67-459d-a163-1d36dcdb8d18:allroles:::) already exists.
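To make the error concrete: as far as I understand it, PostgreSQL raises this whenever a second insert reuses a cache_key that already exists. Below is a minimal sketch of that behaviour, using an in-memory SQLite table as a stand-in for PostgreSQL. The table layout and the helper `insert_acl` are made up for illustration and are not Kong's real schema or code; I also do not know whether this is what our nodes are actually doing.

```python
import sqlite3


def insert_acl(conn, cache_key):
    """Try to insert one row; return False if the unique constraint rejects it."""
    try:
        conn.execute("INSERT INTO acls (cache_key) VALUES (?)", (cache_key,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Same situation as "duplicate key value violates unique constraint".
        conn.rollback()
        return False


# Simplified stand-in table: only the unique cache_key column matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acls (id INTEGER PRIMARY KEY, cache_key TEXT UNIQUE)")

key = "acls:02bb8f64-dd67-459d-a163-1d36dcdb8d18:allroles:::"
first = insert_acl(conn, key)   # the row does not exist yet
second = insert_acl(conn, key)  # same cache_key again, so it is rejected
print(first, second)
```

So the question for me is why the same ACL would be inserted twice after moving to four nodes.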

Our configuration

Docker swarm

  • 4 load-balanced Kong nodes
  • 1 PostgreSQL DB with these settings (generated with PGTune)
    • max_connections=200
    • shared_buffers=1GB
    • effective_cache_size=1536MB
    • maintenance_work_mem=256MB
    • checkpoint_completion_target=0.7
    • wal_buffers=16MB
    • default_statistics_target=100
    • random_page_cost=1.1
    • effective_io_concurrency=300
    • work_mem=2621kB
    • min_wal_size=1GB
    • max_wal_size=2GB
    • max_worker_processes=4
    • max_parallel_workers_per_gather=2
  • Kong settings (version 1.0.3)
    • KONG_CLIENT_BODY_BUFFER_SIZE=80

We use the OAuth and ACL plugins for all users. We currently have 20000 plugins installed.

I hope somebody can point me in the right direction. If more info is needed, just let me know.

Docker Swarm compose file: https://gist.github.com/guushamann/ed1e365c9bcb126428246eb45c837114

Thanks
Guus Hamann
Xaurum