OAuth2/JWT under high load: ideas for improvement


Realistically, people should cache their OAuth2 Bearer tokens for their ___ TTL lifespan. Unfortunately, too many teams do not. Here are some options I would like to run by Kong and the community, to see if we could add some resiliency to the plugin to prevent abuse and improve perf:

Ideas for the OAuth2 client_credentials flow:

  1. Add a per-consumer configurable rate limit to the token generation logic, so a consumer (or IP) generating tokens at more than, say, 20-30 tps gets 429'ed. I know I figured out a way to log the consumer at token creation time, so adding a little lookup cache and counter for that consumer should be reasonable to do, eh?

  2. Could Kong support DB keepalive per worker process for OAuth2 token generation calls? I suspect another problem might be that Kong, on this hot path (kong.db.*), is actually opening and closing DB connections on every token generation. Over SSL and under concurrency, against say a Cassandra DB with full cluster replication, it doesn't take long before someone is hammering you at 500 tps and token generation creeps up to 5-10 seconds (which would impact all other OAuth2 users!). If it already reuses an active connection on every call, then this idea goes out the window :sob: .

  3. Maybe some smart logic that caches what a consumer has generated and, for roughly the first half of the token's validity (kind of arbitrary, I know; it could even be up until the last 30 seconds of validity, or some configurable value), returns a misbehaving client the same token from a lookup cache, with the TTL in the response set to the remaining lifetime (original TTL minus the time elapsed since issuance, based on the nginx cached timestamp). Then it's like Kong is doing the proper caching for them :laughing: (yeah, yeah, it's probably not in the RFC, but it would prevent tons of DB reads/writes!).
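To make ideas 1 and 3 a bit more concrete, here is a rough sketch in Python rather than actual Kong plugin Lua. Everything in it is hypothetical: the `TokenEndpointGuard` name, the fixed one-second counting window, the `reuse_fraction` setting and the in-memory dicts are my own assumptions for illustration, not anything the OAuth2 plugin exposes today. A real version would also need a cache shared across workers and bounded in size.

```python
import secrets
import time
from dataclasses import dataclass


@dataclass
class IssuedToken:
    access_token: str
    issued_at: float
    ttl: float


class TokenEndpointGuard:
    """Hypothetical guard in front of token generation (illustration only)."""

    def __init__(self, max_per_second=20, ttl=3600.0, reuse_fraction=0.5,
                 clock=time.monotonic):
        self.max_per_second = max_per_second  # idea 1: per-consumer limit
        self.ttl = ttl
        self.reuse_fraction = reuse_fraction  # idea 3: reuse window
        self.clock = clock
        self._windows = {}  # consumer -> (window start second, request count)
        self._tokens = {}   # consumer -> last IssuedToken
        self.db_writes = 0  # stands in for the real token insert

    def _allow(self, consumer, now):
        # Fixed one-second window counter per consumer.
        window = int(now)
        start, count = self._windows.get(consumer, (window, 0))
        if start != window:
            start, count = window, 0
        count += 1
        self._windows[consumer] = (start, count)
        return count <= self.max_per_second

    def request_token(self, consumer):
        now = self.clock()
        if not self._allow(consumer, now):
            return 429, None

        cached = self._tokens.get(consumer)
        if cached is not None:
            age = now - cached.issued_at
            if age < cached.ttl * self.reuse_fraction:
                # Hand back the same token with only its remaining lifetime.
                return 200, {"access_token": cached.access_token,
                             "expires_in": int(cached.ttl - age)}

        # Cache miss, or the token is past its reuse window: mint a new one.
        token = IssuedToken(secrets.token_urlsafe(24), now, self.ttl)
        self._tokens[consumer] = token
        self.db_writes += 1
        return 200, {"access_token": token.access_token,
                     "expires_in": int(token.ttl)}
```

In this sketch the rate limit is checked before the reuse cache, so even cheap cache hits count against the consumer's budget. Swapping the order would only throttle requests that actually reach the DB, which might be the friendlier choice.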

These are all things I am mulling over while thinking about how to make OAuth2 a bit more bulletproof. I will try to come up with some better ideas as well.

For a JWT perf improvement, I was thinking this:

Right now the logic always does the crypto validation/decoding of the JWT on every pass, right? What if we do the validation once, take the expiry (exp) into consideration, and drop the JWT into a local cache with a TTL on it? Then, when subsequent requests come in carrying the same JWT, we can do a cache lookup to see if that JWT is present and, if so, proxy the request (saving the crypto/decoding validation work). Under real circumstances I am not sure how useful this would be; it's likely only a perf/CPU win if users are re-using their JWT up to its exp.
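Roughly what I have in mind, again sketched in Python with only the standard library rather than as plugin code. The `VerifiedJwtCache` class, the HS256-only helpers and the unbounded dict are assumptions made for the example; a real cache would need a size limit, and the savings should be more noticeable with RSA/ECDSA signatures than with a cheap HMAC.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def _b64url_encode(raw):
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims, secret):
    """Helper for the example: build an HS256-signed JWT."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256)
    return f"{header}.{payload}.{_b64url_encode(mac.digest())}"


def verify_hs256(token, secret):
    """Full (expensive) path: check the signature and decode the claims."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported alg")
    mac = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256)
    if not hmac.compare_digest(mac.digest(), _b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload_b64))


class VerifiedJwtCache:
    """Hypothetical cache: remember verified JWTs until their exp."""

    def __init__(self, secret, clock=time.time):
        self.secret = secret
        self.clock = clock
        self._cache = {}  # raw token -> (claims, exp)
        self.full_verifications = 0

    def authenticate(self, token):
        now = self.clock()
        hit = self._cache.get(token)
        if hit is not None:
            claims, exp = hit
            if now < exp:
                return claims  # same JWT seen before: skip the crypto
            del self._cache[token]

        claims = verify_hs256(token, self.secret)
        self.full_verifications += 1
        exp = claims.get("exp")
        if isinstance(exp, (int, float)):
            if exp <= now:
                raise ValueError("token expired")
            self._cache[token] = (claims, exp)  # cache only until exp
        return claims
```

Tokens without an exp claim are verified every time and never cached here, since there would be no natural TTL for the cache entry.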

Just the weird things I ponder before bed :sleeping: . Gotta always get better/move faster!
