Having issues with Kong v6

hi.

we were early adopters of Kong, that's why we are still on v6 and have not upgraded.
it helped us worry less about cross-cutting concerns for our services.

now that traffic/usage has picked up, we are encountering issues.

we have a single-node Kong deployment. we run Kong on a 4-core, 16 GB VM on-premise.
the only plugin currently running is the oauth2 plugin.

when testing with ab (ApacheBench), we see that we are not able to handle 50 concurrent requests.
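
for anyone who wants to reproduce this without ab, here is a rough Python sketch of the same kind of test: a batch of POSTs to the token endpoint with at most 50 in flight at once. it is only a sketch under assumptions: `post_token` and `run_concurrent` are names made up for this post, and the URL and form fields you pass in are yours to fill in (nothing here is our real config). `send` is passed in as a callable so the driver can be tried without a live Kong.

```python
import urllib.error
import urllib.parse
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def post_token(url, form):
    """POST a form to the given URL and return the HTTP status code."""
    data = urllib.parse.urlencode(form).encode()
    req = urllib.request.Request(url, data=data, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses still carry a status code.
        return exc.code
    except OSError:
        # Connection refused, reset, timed out, and similar failures.
        return "connection error"


def run_concurrent(send, total=500, concurrency=50):
    """Call send() `total` times with at most `concurrency` calls in flight,
    and return a Counter of the results."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send) for _ in range(total)]
        return Counter(f.result() for f in futures)
```

usage would be something like `run_concurrent(lambda: post_token(url, form))`, then comparing the tally of status codes at concurrency 10, 25 and 50 to see where the failures start.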

we get the following types of errors in /usr/local/kong/logs/error.log:

2018/11/20 12:13:06 [error] 26498#0: *79910 [lua] responses.lua:97: generate_token(): Cassandra error: ResponseError: [Write timeout] Operation timed out - received only 0 responses., server: _, request: "POST /accounts/oauth2/token HTTP/1.1"

2018/06/22 12:41:53 [error] 25342#0: *204611 [lua] responses.lua:97: generate_token(): Cassandra error: NoHostAvailableError: All hosts tried for query failed. 127.0.0.1:9042: Host considered DOWN., , server: _, request: "POST /accounts/oauth2/token HTTP/1.1",
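
to get a feel for how often each of these happens, a small Python sketch like the one below can tally the Cassandra errors in the log. the regex is only based on the two lines above, so it is an assumption that other Cassandra errors follow the same shape, and `tally_cassandra_errors` is just a name made up for this post.

```python
import re
from collections import Counter

# Matches e.g. "Cassandra error: ResponseError: [Write timeout] ..." and
# "Cassandra error: NoHostAvailableError: ...". The bracketed detail is optional.
CASSANDRA_ERROR = re.compile(r"Cassandra error: (\w+)(?:: \[([^\]]+)\])?")


def tally_cassandra_errors(lines):
    """Count Cassandra errors by type (plus bracketed detail, when present)."""
    counts = Counter()
    for line in lines:
        match = CASSANDRA_ERROR.search(line)
        if match:
            kind, detail = match.group(1), match.group(2)
            counts[f"{kind} [{detail}]" if detail else kind] += 1
    return counts
```

it takes any iterable of lines, so an open file handle on /usr/local/kong/logs/error.log can be passed in directly.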

any suggestions to mitigate the issue before we upgrade would be quite helpful.

thanks.

@apollojess Are you referring to Kong 0.6.0 that was released almost 3 years ago?

yeah. 0.6.0

I know right!!! upgrade to latest is in the pipeline.

but we are currently experiencing order-of-magnitude traffic growth, so we are under a lot of pressure.

hoping someone here with experience can chime in and help. we are now considering taking Kong out of our stack just to keep the services up.

Sounds like you are load testing OAuth 2.0 token generation specifically and receiving errors? I imagine this would be likely given all the fast writes and reads that action requires, especially when running on only one Cassandra node. I would suggest switching to HS256 client JWT authentication, which would not be as intensive on your DB, if that’s an option. Otherwise, upgrading Kong versions will likely be the only path forward.
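
To illustrate why HS256 is lighter: the client signs the token itself with a shared secret, and the verifying side only has to recompute an HMAC, so nothing new needs to be written to the datastore for each token. Below is a minimal standard-library Python sketch of that signing and checking. The function names are made up for this post, it checks the signature only (a real verifier would also validate claims such as `exp`), and how your Kong version expects the key and claims to be set up is something to confirm in the plugin docs for that release.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact JWT (header.payload.signature) signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    sig = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(sig)}"


def verify_hs256(token: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time.

    Signature check only: claims such as expiry are NOT validated here.
    """
    try:
        signing_input, sig = token.rsplit(".", 1)
    except ValueError:
        return False  # no "." in the token, so it cannot be a JWT
    expected = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected).encode("ascii"), sig.encode("utf-8"))
```

A client would call something like `sign_hs256({"iss": "<consumer key>", "exp": <unix time>}, secret)` and send the result as a bearer token; the claim names here are placeholders for illustration.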

hey jeremy! thanks for taking the time to reply.

We are not load testing. We are getting the errors with actual production load.
It's a happy problem since the service is getting traction, but at the same time it's horrible, as the user experience is degrading.

We have now resorted to automatically restarting Kong and Cassandra whenever the load goes over 8.

server specs:
4 cores / 16 GB memory / 100 GB spinning-disk storage

help???