Hi all,
I’m experiencing a blocking issue using Kong 1.0.3 on a DataStax DSE 6.0 (Cassandra) cluster.
While deploying Kong 1.0.3 on a single-node DSE 6.0 cluster works fine, deploying it on a 3-node cluster always ends with the following error during the database bootstrap process:
$ kong migrations bootstrap --vv -c ./kong.conf --db-timeout 120 --lock-timeout 120
2019/03/21 13:50:28 [verbose] Kong: 1.0.3
2019/03/21 13:50:28 [debug] ngx_lua: 10013
2019/03/21 13:50:28 [debug] nginx: 1013006
2019/03/21 13:50:28 [debug] Lua: LuaJIT 2.1.0-beta3
2019/03/21 13:50:28 [verbose] reading config file at ../etc/kong.conf
2019/03/21 13:50:28 [debug] reading environment variables
2019/03/21 13:50:28 [debug] KONG_ADMIN_LISTEN ENV found with "0.0.0.0:8001"
2019/03/21 13:50:28 [debug] KONG_PREFIX ENV found with "/opt/test/var"
2019/03/21 13:50:28 [debug] KONG_CASSANDRA_USERNAME ENV found with "admtest"
2019/03/21 13:50:28 [debug] KONG_CASSANDRA_PASSWORD ENV found with "******"
2019/03/21 13:50:28 [debug] KONG_DB_UPDATE_PROPAGATION ENV found with "3"
2019/03/21 13:50:28 [debug] KONG_CASSANDRA_PORT ENV found with "1524"
2019/03/21 13:50:28 [debug] KONG_DATABASE ENV found with "cassandra"
2019/03/21 13:50:28 [debug] KONG_CASSANDRA_CONTACT_POINTS ENV found with "10.10.163.1, 10.10.162.255, 10.10.162.254"
2019/03/21 13:50:28 [debug] KONG_LOG_LEVEL ENV found with "debug"
2019/03/21 13:50:28 [debug] KONG_NGINX_DAEMON ENV found with "off"
2019/03/21 13:50:28 [debug] admin_access_log = "logs/admin_access.log"
2019/03/21 13:50:28 [debug] admin_error_log = "logs/error.log"
2019/03/21 13:50:28 [debug] admin_listen = {"0.0.0.0:8001"}
2019/03/21 13:50:28 [debug] admin_ssl_cert = "/opt/test/etc/certs/changeme-kong_test_com-super-bundle.pem"
2019/03/21 13:50:28 [debug] admin_ssl_cert_key = "/opt/test/etc/certs/changeme-kong_test_com-key.pem"
2019/03/21 13:50:28 [debug] anonymous_reports = false
2019/03/21 13:50:28 [debug] cassandra_consistency = "LOCAL_QUORUM"
2019/03/21 13:50:28 [debug] cassandra_contact_points = {"10.10.163.1","10.10.162.255","10.10.162.254"}
2019/03/21 13:50:28 [debug] cassandra_data_centers = {"dc1:1"}
2019/03/21 13:50:28 [debug] cassandra_keyspace = "kong"
2019/03/21 13:50:28 [debug] cassandra_lb_policy = "RequestRoundRobin"
2019/03/21 13:50:28 [debug] cassandra_local_datacenter = "dc1"
2019/03/21 13:50:28 [debug] cassandra_password = "******"
2019/03/21 13:50:28 [debug] cassandra_port = 1524
2019/03/21 13:50:28 [debug] cassandra_repl_factor = 1
2019/03/21 13:50:28 [debug] cassandra_repl_strategy = "NetworkTopologyStrategy"
2019/03/21 13:50:28 [debug] cassandra_schema_consensus_timeout = 120000
2019/03/21 13:50:28 [debug] cassandra_ssl = false
2019/03/21 13:50:28 [debug] cassandra_ssl_verify = false
2019/03/21 13:50:28 [debug] cassandra_timeout = 5000
2019/03/21 13:50:28 [debug] cassandra_username = "admtest"
2019/03/21 13:50:28 [debug] client_body_buffer_size = "8k"
2019/03/21 13:50:28 [debug] client_max_body_size = "0"
2019/03/21 13:50:28 [debug] client_ssl = false
2019/03/21 13:50:28 [debug] database = "cassandra"
2019/03/21 13:50:28 [debug] db_cache_ttl = 0
2019/03/21 13:50:28 [debug] db_resurrect_ttl = 30
2019/03/21 13:50:28 [debug] db_update_frequency = 5
2019/03/21 13:50:28 [debug] db_update_propagation = 3
2019/03/21 13:50:28 [debug] dns_error_ttl = 1
2019/03/21 13:50:28 [debug] dns_hostsfile = "/etc/hosts"
2019/03/21 13:50:28 [debug] dns_no_sync = false
2019/03/21 13:50:28 [debug] dns_not_found_ttl = 30
2019/03/21 13:50:28 [debug] dns_order = {"LAST","SRV","A","CNAME"}
2019/03/21 13:50:28 [debug] dns_resolver = {}
2019/03/21 13:50:28 [debug] dns_stale_ttl = 4
2019/03/21 13:50:28 [debug] error_default_type = "application/json"
2019/03/21 13:50:28 [debug] headers = {"server_tokens","latency_tokens"}
2019/03/21 13:50:28 [debug] log_level = "debug"
2019/03/21 13:50:28 [debug] lua_package_cpath = ""
2019/03/21 13:50:28 [debug] lua_package_path = "/opt/test/lib/?.lua;;"
2019/03/21 13:50:28 [debug] lua_socket_pool_size = 30
2019/03/21 13:50:28 [debug] lua_ssl_trusted_certificate = "/opt/test/etc/combined-serverCA.crt"
2019/03/21 13:50:28 [debug] lua_ssl_verify_depth = 9
2019/03/21 13:50:28 [debug] mem_cache_size = "128m"
2019/03/21 13:50:28 [debug] nginx_admin_directives = {{value="/opt/test/lib/TEST_nginx_admin.conf",name="include"}}
2019/03/21 13:50:28 [debug] nginx_admin_include = "/opt/test/lib/TEST_nginx_admin.conf"
2019/03/21 13:50:28 [debug] nginx_daemon = "off"
2019/03/21 13:50:28 [debug] nginx_http_directives = {{value="/opt/test/lib/TEST_nginx_http.conf",name="include"}}
2019/03/21 13:50:28 [debug] nginx_http_include = "/opt/test/lib/TEST_nginx_http.conf"
2019/03/21 13:50:28 [debug] nginx_optimizations = true
2019/03/21 13:50:28 [debug] nginx_proxy_directives = {{value="/opt/test/lib/TEST_nginx_proxy.conf",name="include"}}
2019/03/21 13:50:28 [debug] nginx_proxy_include = "/opt/test/lib/TEST_nginx_proxy.conf"
2019/03/21 13:50:28 [debug] nginx_user = "bob test"
2019/03/21 13:50:28 [debug] nginx_worker_processes = "auto"
2019/03/21 13:50:28 [debug] origins = {}
2019/03/21 13:50:28 [debug] pg_database = "kong"
2019/03/21 13:50:28 [debug] pg_host = "127.0.0.1"
2019/03/21 13:50:28 [debug] pg_port = 5432
2019/03/21 13:50:28 [debug] pg_ssl = false
2019/03/21 13:50:28 [debug] pg_ssl_verify = false
2019/03/21 13:50:28 [debug] pg_timeout = 5000
2019/03/21 13:50:28 [debug] pg_user = "kong"
2019/03/21 13:50:28 [debug] plugins = {"bundled"}
2019/03/21 13:50:28 [debug] prefix = "/opt/test/var"
2019/03/21 13:50:28 [debug] proxy_access_log = "logs/access.log"
2019/03/21 13:50:28 [debug] proxy_error_log = "logs/error.log"
2019/03/21 13:50:28 [debug] proxy_listen = {"127.0.0.1:8000","0.0.0.0:8443 ssl"}
2019/03/21 13:50:28 [debug] real_ip_header = "X-Real-IP"
2019/03/21 13:50:28 [debug] real_ip_recursive = "off"
2019/03/21 13:50:28 [debug] ssl_cert = "/opt/test/etc/certs/changeme-kong_test_com-super-bundle.pem"
2019/03/21 13:50:28 [debug] ssl_cert_key = "/opt/test/etc/certs/changeme-kong_test_com-key.pem"
2019/03/21 13:50:28 [debug] ssl_cipher_suite = "modern"
2019/03/21 13:50:28 [debug] ssl_ciphers = "ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256"
2019/03/21 13:50:28 [debug] stream_listen = {"off"}
2019/03/21 13:50:28 [debug] trusted_ips = {}
2019/03/21 13:50:28 [debug] upstream_keepalive = 60
2019/03/21 13:50:28 [verbose] prefix in use: /opt/test/var
2019/03/21 13:50:29 [debug] loading subsystems migrations...
2019/03/21 13:50:29 [verbose] retrieving keyspace schema state...
2019/03/21 13:50:29 [verbose] schema state retrieved
2019/03/21 13:50:29 [info] bootstrapping database...
2019/03/21 13:50:29 [debug] creating 'kong' keyspace if not existing...
2019/03/21 13:50:29 [debug] successfully created 'kong' keyspace
2019/03/21 13:50:29 [debug] creating 'schema_meta' table if not existing...
2019/03/21 13:50:29 [debug] successfully created 'schema_meta' table
2019/03/21 13:50:29 [debug] creating 'locks' table if not existing...
2019/03/21 13:50:29 [debug] successfully created 'locks' table
2019/03/21 13:50:29 [verbose] waiting for Cassandra schema consensus (120000ms timeout)...
2019/03/21 13:50:30 [verbose] Cassandra schema consensus: reached
Error:
/usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:75: [Cassandra error] failed to insert cluster lock: [Write timeout] Operation timed out - received only 0 responses.
stack traceback:
[C]: in function 'error'
/usr/local/share/lua/5.1/kong/cmd/utils/migrations.lua:75: in function 'bootstrap'
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:118: in function 'cmd_exec'
/usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
/usr/local/bin/kong:7: in function 'file_gen'
init_worker_by_lua:54: in function <init_worker_by_lua:52>
[C]: in function 'xpcall'
init_worker_by_lua:61: in function <init_worker_by_lua:59>
The keyspace is created, the early tables are created, and schema consensus is reached after these updates. Then it looks like the cluster lock, which has to be taken before the overall schema creation starts, cannot be obtained.
The issue is systematic: it happens on every attempt.
I’ve been looking for similar issues and went through https://github.com/Kong/kong/issues/4226, https://github.com/Kong/kong/issues/4228, https://github.com/Kong/kong/issues/4229 and https://github.com/Kong/kong/issues/4335, but found no way to make it work.
I also tried passing --db-timeout 120 --lock-timeout 120, with no better result. In any case, the error is raised after only a couple of seconds of execution, not after 120 seconds.
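As a quick sanity check on that timing, here is a minimal Python sketch that computes the elapsed time between the first and the last timestamped lines of the log above. The helper name elapsed_seconds is only illustrative, and the error line itself carries no timestamp, so this only bounds the time up to the last logged step.

```python
from datetime import datetime

# Timestamps copied from the log above: the first line and the last
# timestamped line before the error is printed.
FIRST_LINE = "2019/03/21 13:50:28"
LAST_LINE = "2019/03/21 13:50:30"

# Value passed on the command line via --lock-timeout (in seconds).
LOCK_TIMEOUT_S = 120


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two Kong CLI log timestamps."""
    fmt = "%Y/%m/%d %H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


elapsed = elapsed_seconds(FIRST_LINE, LAST_LINE)
print(f"elapsed up to last log line: {elapsed:.0f}s, lock timeout: {LOCK_TIMEOUT_S}s")
```

The gap between the two values is why I do not think the --lock-timeout setting is what triggers the failure.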
I encounter the error specifically on DataStax DSE 6.0; it works well on DSE 5.
Is there any positive feedback from the Kong team or the Kong community regarding the use of DSE 6.0?
Any hint on what could be the root cause of this issue?
Thanks a lot for your help.