I see that in the `basic-auth` plugin, the primary key is `id` and there is an index on `username`. While this may work well with Postgres, secondary indexes are costly in Cassandra. Why isn't `username` a primary/partition key when it's also a
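To illustrate the concern, here is a minimal Python sketch of a toy partitioned store. It is not Cassandra or Kong code: `ToyCluster`, the 5-node count, and the `user-N` naming are made up for illustration, and real Cassandra index reads are more nuanced. It only models the idea that a partition-key read goes straight to the node that owns the key, whereas a read through a node-local secondary index may have to ask several nodes.

```python
import hashlib


class ToyCluster:
    """Toy partitioned store (hypothetical, not a Cassandra client)."""

    def __init__(self, node_count):
        # Every node keeps its own rows and its own local index on username.
        self.nodes = [{"rows": {}, "username_index": {}} for _ in range(node_count)]

    def _owner(self, partition_key):
        # Hash the partition key to pick the node that owns it.
        digest = hashlib.sha256(partition_key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.nodes)

    def insert(self, partition_key, row):
        node = self.nodes[self._owner(partition_key)]
        node["rows"][partition_key] = row
        node["username_index"][row["username"]] = partition_key

    def get_by_partition_key(self, partition_key):
        # One hop: only the owning node is contacted.
        node = self.nodes[self._owner(partition_key)]
        return node["rows"].get(partition_key), 1

    def get_by_indexed_username(self, username):
        # The username does not say which node holds the row, so nodes are
        # asked one by one until a local index has a match.
        contacted = 0
        for node in self.nodes:
            contacted += 1
            key = node["username_index"].get(username)
            if key is not None:
                return node["rows"][key], contacted
        return None, contacted


keyed_by_id = ToyCluster(5)        # current layout: id key + username index
keyed_by_username = ToyCluster(5)  # proposed layout: username as the key
for n in range(1000):
    row = {"id": f"id-{n}", "username": f"user-{n}"}
    keyed_by_id.insert(row["id"], row)
    keyed_by_username.insert(row["username"], row)

index_hops = sum(keyed_by_id.get_by_indexed_username(f"user-{n}")[1] for n in range(1000))
key_hops = sum(keyed_by_username.get_by_partition_key(f"user-{n}")[1] for n in range(1000))
print("node contacts via username index:", index_hops)
print("node contacts via partition key:", key_hops)
```

In this toy model the partition-key layout always contacts exactly one node per lookup, while the index layout contacts anywhere from one to all five, and all five when the username does not exist.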
I would like to share results of a few performance runs that we did, and why I suspect the lookups in basic auth plugin to be inefficient.
- 3 kong nodes, running on c5.2xlarge, backed by cassandra, with
- 5 node cassandra cluster on
- A dummy upstream service with basic auth plugin enabled.
- 1 million consumers and 1 million credentials pre populated, with 1 credential per consumer.
- Test runs made calls to the endpoint with random users/credentials within this range.
- Scenario 1: Set of runs without cache warmup.
- Scenario 2: Set of runs after `basicauth_credentials` were warmed up in the cache, but not
- Scenario 3: Set of runs after `consumers` were warmed up in the cache, but not
- Scenario 1 had the worst performance, with kong proxy latency in seconds. We’ll keep it out of the discussion here.
- Scenario 2 showed an initial spike in kong proxy latency (p99 around 100ms) for around 5 minutes, after which it stabilized to ~5ms. The test gave ~6K rps.
- Scenario 3 was run with same load as that of scenario 2, but kong proxy latency was consistently high (p99 going up to 1s) and didn’t stabilize during the duration of the run.
- During scenario 3, the Cassandra nodes showed high CPU usage (~90%), high network utilisation, and a high number of threadpool operations (with pending ops touching ~5K).
- The above parameters were under limits in scenario 2.
- Also, a spike was seen in the Cassandra metrics while the cache was warming up for `basicauth_credentials`. It took ~3 minutes for the cache to be populated in each run.
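One way to read the difference between scenarios 2 and 3 is sketched below in Python. It assumes (this is an assumption, not taken from the plugin source) that each request first resolves the credential by `username` and then the consumer by `id`, both through a cache. `CountingStore`, `authenticate`, and `run_scenario` are hypothetical names, and the counters only show which kind of query a cache miss would send to the database.

```python
class CountingStore:
    """Hypothetical stand-in for the database; counts queries by kind."""

    def __init__(self, credentials, consumers):
        self.credentials = credentials  # username -> credential row
        self.consumers = consumers      # consumer id -> consumer row
        self.username_index_queries = 0
        self.primary_key_queries = 0

    def credential_by_username(self, username):
        self.username_index_queries += 1
        return self.credentials[username]

    def consumer_by_id(self, consumer_id):
        self.primary_key_queries += 1
        return self.consumers[consumer_id]


def authenticate(username, store, credential_cache, consumer_cache):
    # Cache-aside: go to the store only on a miss, then remember the result.
    credential = credential_cache.get(username)
    if credential is None:
        credential = store.credential_by_username(username)
        credential_cache[username] = credential
    consumer_id = credential["consumer_id"]
    consumer = consumer_cache.get(consumer_id)
    if consumer is None:
        consumer = store.consumer_by_id(consumer_id)
        consumer_cache[consumer_id] = consumer
    return consumer


def run_scenario(warm_credentials, warm_consumers, users=1000):
    # Assumed data shape: one credential per consumer, as in the test setup.
    credentials = {f"user-{n}": {"consumer_id": f"c-{n}"} for n in range(users)}
    consumers = {f"c-{n}": {"id": f"c-{n}"} for n in range(users)}
    store = CountingStore(credentials, consumers)
    credential_cache = dict(credentials) if warm_credentials else {}
    consumer_cache = dict(consumers) if warm_consumers else {}
    for username in credentials:
        authenticate(username, store, credential_cache, consumer_cache)
    return store.username_index_queries, store.primary_key_queries


scenario_2 = run_scenario(warm_credentials=True, warm_consumers=False)
scenario_3 = run_scenario(warm_credentials=False, warm_consumers=True)
print("scenario 2 (index queries, primary-key queries):", scenario_2)
print("scenario 3 (index queries, primary-key queries):", scenario_3)
```

In this toy model, scenario 2 only misses on primary-key lookups while scenario 3 only misses on the `username` lookup, which would be consistent with the suspicion that the index lookup is the expensive path.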
Attached are some graphs for reference. Scenario 2 ran from ~11:40 till ~11:55, while scenario 3 started at ~12:57.
The performance tests were run again after forking the `basic-auth` plugin and making the above-mentioned changes. The performance bottleneck went away completely, and the plugin performed flawlessly with over 1 million non-cached consumers and credentials. p99 proxy latency was observed to be around 12ms until the cache got warmed up.
A proposal to update the `basic-auth` plugin with the suggested changes is made here: https://github.com/Kong/kong/pull/5914