OAuth2/JWT under high load: ideas for improvement

Realistically, people should cache their OAuth2 Bearer tokens for their full TTL lifespan. Unfortunately, too many teams don't, so here are some options I would like to run by Kong and the community to see if we could add some resiliency to the plugin to prevent abuse and improve performance:

Ideas for the OAuth2 client_credentials flow:

  1. Add a per-consumer configurable rate limit to the token generation logic, so a consumer (or IP) generating tokens at greater than, say, 20-30 TPS gets 429’ed. I know I figured out a way to log the consumer at token creation time, so adding a little lookup cache and counter for that user should be reasonable to do, eh? (Rough sketch after this list.)

  2. Could Kong support DB keepalive per worker process in the context of OAuth2 token generation calls? I am thinking another problem might be that Kong, on this hot path (kong.db.*), is actually opening and closing DB connections on every token generation. Over SSL, and under concurrency against, say, a Cassandra DB with full cluster replication, it doesn’t take long before someone is wrecking you at 500 TPS and token generation creeps to 5-10 seconds (which would impact all other OAuth2 users!). If Kong already reuses an active connection on every call, then this one goes out the window :sob: .

  3. Maybe some super smart logic that caches what a user has generated and, for the first half of the token’s validity (kinda arbitrary, I know; maybe even up until the last 30 seconds of its validity, or some configurable value), returns a misbehaving client their same token via a lookup cache, with expires_in adjusted to (original TTL - seconds elapsed since generation). Then it’s like Kong is doing the proper caching for them :laughing: (yeah yeah, it’s probably not in the RFC, but it would prevent tons of DB reads/writes!).
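
To make idea 1 concrete, here’s a minimal sketch of the counter I have in mind. Everything in it is hypothetical: the kong_oauth_ratelimit shared dict, the token_rate_limited helper, and the 20 TPS threshold are placeholders for illustration, not existing plugin code:

-- Hypothetical per-consumer token-generation rate limiter. Assumes a
-- `lua_shared_dict kong_oauth_ratelimit 1m;` declared in the nginx conf.
local TOKEN_TPS_LIMIT = 20  -- illustrative threshold

local function token_rate_limited(consumer_id)
  local shm = ngx.shared.kong_oauth_ratelimit
  -- key on consumer id plus the current second, giving a 1-second window
  local key = "token_rl:" .. consumer_id .. ":" .. ngx.time()

  -- incr with init=0 creates the counter if absent; init_ttl=1 expires
  -- each window so stale keys don't pile up in the dict
  local count, err = shm:incr(key, 1, 0, 1)
  if not count then
    ngx.log(ngx.ERR, "token rate limit counter failed: ", err)
    return false  -- fail open if the shm misbehaves
  end

  return count > TOKEN_TPS_LIMIT
end

-- then, in the token endpoint path before any DB work:
-- if token_rate_limited(credential.consumer.id) then
--   return kong.response.exit(429, { error_description = "too many token requests" })
-- end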

These are all things I am mulling over while thinking about how to make OAuth2 a bit more bulletproof; I will try to come up with some better ideas as well.

For JWT perf improvement I was thinking this:

Right now the logic does the crypto validation/decoding of the JWT on every pass, right? What if we do the validation once, take the TTL (exp) into consideration, and drop the JWT into a local cache with a TTL on it? Then when subsequent requests come in with the same JWT, we can do a cache compare to see if the JWT is present and, if so, proxy on (saving the crypto/decoding validation work). Under real circumstances I am not sure how useful this would be; it’s likely only a perf/CPU win IF users are re-using their JWT up to its exp validity. A rough sketch below.
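
This sketch assumes a dedicated shared dict for the cache and a parsed JWT object loosely modeled on Kong’s jwt_parser (verify_signature() and claims); the dict name and helper function are made up for illustration:

-- Hypothetical "verified JWT" cache keyed by the raw token string.
-- Assumes `lua_shared_dict kong_jwt_cache 5m;` in the nginx conf.
local function is_jwt_valid(token_str, jwt, key)
  local jwt_cache = ngx.shared.kong_jwt_cache

  -- cache hit: this exact token string already passed crypto validation
  if jwt_cache:get(token_str) then
    return true
  end

  -- cache miss: pay for the signature verification once
  if not jwt:verify_signature(key) then
    return false
  end

  -- cache the positive result only for the token's remaining validity,
  -- so a cached "valid" can never outlive the exp claim
  local remaining = (jwt.claims.exp or 0) - ngx.time()
  if remaining > 0 then
    jwt_cache:set(token_str, true, remaining)
  end

  return true
end

Any other per-request checks the plugin does (claim verification, ACLs, etc.) would still have to run; only the signature work gets skipped, which is why this only pays off when clients re-use the same JWT.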

Just the weird things I ponder before bed :sleeping: . Gotta always get better/move faster!

Well I gave this a go in the OAuth2 plugin logic:

access.lua relevant snippet:

local function oauth2_produce_token(credential, service_id, authenticated_userid, token_expiration, refresh_token, scope, refresh_token_ttl)
  
  local token, err = kong.db.oauth2_tokens:insert({
    service = service_id and { id = service_id } or nil,
    credential = { id = credential.id },
    authenticated_userid = authenticated_userid,
    expires_in = token_expiration,
    refresh_token = refresh_token,
    scope = scope
  }, {
    -- Access tokens (and their associated refresh token) are
    -- permanently deleted after 'refresh_token_ttl' seconds
    ttl = token_expiration > 0 and refresh_token_ttl or nil
  })
  
  if not token then
    return nil, err
  end
 
  -- Record creation time so cached reads can compute remaining validity
  token.cache_time = ngx.time()
  --ngx.log(ngx.ERR, "Timestamp Generated: ", ngx.time())
  return token, err
end

local function generate_token(conf, service, credential, authenticated_userid,
                              scope, state, expiration, disable_refresh)

  local token_expiration = expiration or conf.token_expiration

  local refresh_token
  if not disable_refresh and token_expiration > 0 then
    refresh_token = random_string()
  end

  local refresh_token_ttl
  if conf.refresh_token_ttl and conf.refresh_token_ttl > 0 then
    refresh_token_ttl = conf.refresh_token_ttl
  end

  local service_id
  if not conf.global_credentials then
    service_id = service.id
  end
  
  -- Get the token from the cache if we have seen this credential id; on a
  -- miss, mlcache invokes oauth2_produce_token with the trailing arguments
  local token, err = kong.oauth_cache:get(credential.id, { ttl = 3300 },
                                          oauth2_produce_token, credential,
                                          service_id, authenticated_userid,
                                          token_expiration, refresh_token,
                                          scope, refresh_token_ttl)

  -- If the cache lookup did not fail, adjust the reported TTL so the
  -- client only ever sees the token's remaining validity
  if token then
    local time_elapsed = ngx.time() - token.cache_time -- seconds since generated
    --ngx.log(ngx.ERR, "Time Elapsed: ", time_elapsed)
    token.expires_in = token_expiration - time_elapsed
    --ngx.log(ngx.ERR, "Time Till expire: ", token.expires_in)
  end

  -- Authenticate the consumer and credential generating the token
  local cred, consumer, err2 =
    load_credential_and_consumer_into_memory_from_credential(credential)

  if not err2 then
    kong.client.authenticate(consumer, cred)
  end

  if err then
    return internal_server_error(err)
  end

  return {
    access_token = token.access_token,
    token_type = "bearer",
    expires_in = token_expiration > 0 and token.expires_in or nil,
    refresh_token = refresh_token,
    state = state -- If state is nil, this value won't be added
  }
end
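
To make the dynamic TTL concrete: with token_expiration = 3600, a token generated at some time T and requested again from the cache 120 seconds later comes back with expires_in = 3600 - 120 = 3480, so every caller sees the token’s true remaining validity instead of a fresh full hour.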

handler.lua, add this:

local mlcache = require "resty.mlcache"


function OAuthHandler:init_worker()
  OAuthHandler.super.init_worker(self)
  
  local oauthcache, err = mlcache.new("kong_oauth_cache", "kong_oauth_cache", {
    shm_miss         = "kong_oauth_cache_miss",
    shm_locks        = "kong_oauth_cache_locks",
    shm_set_retries  = 3,
    lru_size         = 1000,  -- size of the L1 (Lua VM) cache
    ttl              = 3300,  -- 55 minute TTL
    neg_ttl          = 5,     -- 5 second TTL for misses
    resty_lock_opts  = { exptime = 10, timeout = 5 },
  })
  
  if not oauthcache then
    ngx.log(ngx.ERR, "failed to instantiate oauth mlcache: " .. err)
    return
  end
  
  kong.oauth_cache = oauthcache
end
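
One thing worth knowing about mlcache here: it layers a per-worker Lua-land LRU (the lru_size above) on top of the shared dict, so all workers on the node share cached tokens through the shm while each keeps its own hot copy, and the shm_locks dict dedupes concurrent misses so only one worker runs the DB insert per credential.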

Then in a custom nginx conf add:

# exclusive oauth shm caches
lua_shared_dict kong_oauth_cache       5m;
lua_shared_dict kong_oauth_cache_miss  2m;
lua_shared_dict kong_oauth_cache_locks 1m;
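
Note these dict names have to line up with what gets passed to mlcache.new() above (the second argument plus shm_miss and shm_locks), otherwise mlcache.new() will fail at init_worker with a missing-dict error.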

Analysis results:

No OAuth token cache logic:
1.12 cores of the node used @ 200 threads = 208 TPS; 25,400 tokens generated during the 2-minute window; max latency 5 s; p99 270 ms (HTTP logger report); 161 ms average latency in SoapUI over VPN.

After the change:
0.79 cores of the node used @ 200 threads = 239 TPS; 28,707 transactions during the 2-minute window; max latency 1 s (the first call); p99 5 ms (HTTP logger report); 73.87 ms average latency in SoapUI over VPN.

Strengths:

  1. No DB token bloat, and dramatically fewer reads/writes to the DB in general.
  2. The OAuth2 token endpoint becomes resilient to token-abusing clients, who can no longer threaten DB availability.
  3. Faster OAuth2 performance, since a cached token is returned with a dynamically adjusted TTL for its remaining life (per Kong node).
  4. Saves CPU, plus the local node memory that would otherwise go to caching each of those unused tokens (since they just go to waste).

Weaknesses:

  1. Could break clients that don’t properly follow OAuth2 token-handling standards around caching (i.e., honoring what expires_in says).

I don’t really think Kong would want to do something so specific, not to mention this doesn’t handle the refresh token flow or invalidating the old token if it’s still valid. But if you’re just following the client credentials pattern and users are always generating a token per request or something equally awful, this can actually be a nice way to do the caching they should be doing themselves (and everyone knows the customer is always right and you can’t show them the error of their ways :laughing: ).

If Kong did drop logic like this in, I would probably gate it behind a boolean flag like oauth2_cache_perf_dynamic_ttl: true/false (rough sketch below). With a little extra work you could technically make it work for refresh tokens too. Maybe this snippet will help some others who want to insulate their OAuth2 calls and DB from legitimate clients that just implement bad practice (which happens more often than anyone would like, with lots of clients or people who load test and write that token call in a nice big for loop over and over :grinning: ).
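
For what it’s worth, gating it would be a small schema change. A rough sketch against the record-style plugin schema, with the oauth2_ prefix dropped since it would live inside the plugin’s own config (nothing here exists in the plugin today):

-- schema.lua (sketch): hypothetical flag, default off so existing
-- RFC-faithful behavior is unchanged
{ config = {
    type = "record",
    fields = {
      { cache_perf_dynamic_ttl = { type = "boolean", default = false } },
    },
}, },

access.lua would then only take the mlcache path when conf.cache_perf_dynamic_ttl is true, falling back to the stock insert-per-request behavior otherwise.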