The cache in question is part of client-go. We use the most basic version, which doesn’t have any mechanism for expiring objects other than deleting them entirely. It will load resources of interest at startup, add new ones as they’re created, and remove them if they’re deleted.
I didn’t dig into the implementation, but from a brief read of the library docs it just uses a Go map; there’s no external cache implementation like memcached involved. (Not sure where you saw that; as far as I know, the only place we use memcached anywhere in our product suite is the OIDC plugin, where it’s an optional store for session data.)
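For illustration, this is roughly what that watch-and-cache pattern looks like with a client-go shared informer (a minimal sketch, not the controller’s actual wiring; the Service type and resync interval here are arbitrary choices):

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/cache"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// The informer lists all Services at startup and then keeps its local,
	// map-backed store in sync with watch events: adds on create, removals
	// on delete. Nothing ever expires on its own.
	factory := informers.NewSharedInformerFactory(client, 10*time.Minute)
	svcInformer := factory.Core().V1().Services().Informer()

	svcInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			key, _ := cache.MetaNamespaceKeyFunc(obj)
			fmt.Println("cached:", key)
		},
		DeleteFunc: func(obj interface{}) {
			// Deletes may arrive as tombstones wrapping the last known state.
			key, _ := cache.DeletionHandlingMetaNamespaceKeyFunc(obj)
			fmt.Println("removed:", key)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop)
	cache.WaitForCacheSync(stop, svcInformer.HasSynced)
	<-stop
}
```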
There’s no way to size-limit that cache, and I don’t think it’d be advisable to impose one: the purpose of the controller cache is to reduce hits to etcd, and evicting cached resources the controller actually needs would force it to fetch them from etcd again, which could destabilize the broader cluster if it’s fetching too much. The only stock expiration policy is TTL-based, which is designed for applications that only operate on a resource briefly. The ingress controller rebuilds its configuration from the full set of relevant resources on every pass, so that wouldn’t work for us.
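For reference, the stock TTL-backed store looks roughly like this (a sketch only; the entry type, key function, and TTL are placeholders). Anything that expired would simply be missing the next time we build configuration, which is exactly the problem:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/tools/cache"
)

// entry stands in for a cached Kubernetes object in this sketch.
type entry struct {
	Name string
	Spec string
}

func keyFunc(obj interface{}) (string, error) {
	return obj.(*entry).Name, nil
}

func main() {
	// Items silently disappear 30 seconds after being added: fine for code
	// that touches an object once, useless for rebuilding full config.
	store := cache.NewTTLStore(keyFunc, 30*time.Second)
	_ = store.Add(&entry{Name: "example-ingress", Spec: "..."})

	obj, exists, _ := store.Get(&entry{Name: "example-ingress"})
	fmt.Println(exists, obj) // true now, false once the TTL lapses
}
```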
2.5GB for 100k objects does seem unusually high based on our testing. We observed 356MiB of usage with 30k resources (split evenly between 3 types of resources) and 560MiB for 40k, so roughly 12-14KiB per resource, versus the roughly 25KB per resource implied by your numbers. We don’t expect the distribution across resource types to matter much: they’re all roughly the same size, and we cache only the Go struct built from each resource’s YAML spec (we don’t generate any persistent derived resources, and the ephemeral objects we build while generating configuration use very similar structs).
While your usage does seem beyond what we’d expect, most future performance optimization will land in the upcoming 2.x version of the controller, which we’ve refactored around newer Kubernetes client libraries created since our controller’s original implementation. We observed a small reduction in memory usage with it (~80% of the 1.x usage in the larger test), which is about what we expected: it doesn’t use a different caching strategy, so the change is likely due to other, non-scaling memory consumption elsewhere. If you’re interested in testing against it, the latest alpha release isn’t expected to change much before release; we’re still finalizing documentation ahead of a beta soon-ish, but it’s mostly a drop-in replacement for 1.x (a few more obscure flags are the only breaking changes we know of). Alternatively, if you can provide your test resource YAMLs, we can try to reproduce your results independently.
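If it’s easier than sharing your actual manifests, a throwaway generator along these lines would also let us compare like for like (entirely hypothetical names, hosts, and counts; adjust to mirror your setup):

```go
package main

import "fmt"

func main() {
	// Emit n minimal Ingress manifests to stdout; redirect to a file and
	// `kubectl apply -f` it against a test cluster.
	const n = 1000
	for i := 0; i < n; i++ {
		fmt.Printf(`---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: load-test-%d
  annotations:
    kubernetes.io/ingress.class: kong
spec:
  rules:
  - host: app-%d.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: load-test-backend
            port:
              number: 80
`, i, i)
	}
}
```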
In database-backed mode, the proxy uses an internal Lua cache of the entities stored in Postgres. The general forum can probably speak to the specifics better, but broadly it functions as you describe, just without any external caching system. It’s better able to impose a size limit because it can judge how useful an entity is from the requests it actually serves, and discard the entries it hasn’t used recently on a least-recently-used basis.
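To illustrate the difference in strategy, this is the eviction idea in miniature (a toy LRU in Go, not the proxy’s actual Lua implementation):

```go
package main

import (
	"container/list"
	"fmt"
)

// lruCache is a toy size-bounded cache: when the limit is hit, the
// least-recently-used entry is discarded.
type lruCache struct {
	limit int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element whose Value is a *pair
}

type pair struct {
	key   string
	value string
}

func newLRU(limit int) *lruCache {
	return &lruCache{limit: limit, order: list.New(), items: map[string]*list.Element{}}
}

func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el) // a hit refreshes recency
	return el.Value.(*pair).value, true
}

func (c *lruCache) Add(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*pair).value = value
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.limit {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*pair).key)
	}
	c.items[key] = c.order.PushFront(&pair{key, value})
}

func main() {
	c := newLRU(2)
	c.Add("consumer-a", "...")
	c.Add("consumer-b", "...")
	c.Get("consumer-a")        // touching a keeps it fresh
	c.Add("consumer-c", "...") // evicts consumer-b, the least recently used
	_, ok := c.Get("consumer-b")
	fmt.Println(ok) // false
}
```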
The controller can’t really use the same strategy, since it always generates a config representing all resources and then either applies it in its entirety (in DB-less mode) or diffs it against the current Kong configuration (in DB-backed mode). We can’t easily get away from that without risking config drift, so our recommendation to users with larger configurations has been to move some of it out of Kubernetes resources and manage it in the Kong database alone; consumers are the usual candidate, since it’s common to have an order of magnitude more consumers than routes/Ingresses.
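Very roughly, the update flow looks like the sketch below; all the type and function names are invented for illustration, not the controller’s real internals:

```go
package main

import "fmt"

// resource and kongConfig are hypothetical stand-ins for the real types.
type resource struct{ Kind, Name string }
type kongConfig struct{ Routes []string }

// buildFullConfig renders the complete desired state from everything in the
// cache on every pass, which is why all relevant resources must stay resident.
func buildFullConfig(cached []resource) kongConfig {
	cfg := kongConfig{}
	for _, r := range cached {
		if r.Kind == "Ingress" {
			cfg.Routes = append(cfg.Routes, r.Name)
		}
	}
	return cfg
}

func main() {
	cached := []resource{{"Ingress", "a"}, {"Ingress", "b"}, {"Service", "b-backend"}}
	desired := buildFullConfig(cached)

	dbless := true
	if dbless {
		fmt.Println("DB-less: apply the entire declarative config:", desired)
	} else {
		fmt.Println("DB-backed: diff desired against current Kong config, sync only the delta")
	}
}
```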
In your case, are you able to reduce the number of Ingresses by using more rules per Ingress? I’m not sure how much that would reduce total size, as we haven’t researched it much, but it should buy some reduction by cutting down on repetitive boilerplate metadata in favor of meaningful configuration (see the sketch below). Another strategy would be to separate the controller from your Deployment: in database-backed mode there’s no reason to run one in each Pod alongside Kong, and we only default to that because it’s simple and lets us hide the admin API as a basic zero-configuration security measure. Running the controller independently wouldn’t reduce its memory consumption, but it would mean you don’t have to account for that limit on every Kong replica. In database mode, only one controller replica actively submits configuration updates, but every replica still pulls all resources into its cache.
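On the first point, this is the sort of consolidation I mean: one Ingress carrying several rules pays the per-object metadata (and the cached struct wrapping it) once, instead of once per host. The names, hosts, and services here are hypothetical:

```go
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

// rule builds one host rule routing "/" to the named backend Service.
func rule(host, svc string) networkingv1.IngressRule {
	pathType := networkingv1.PathTypePrefix
	return networkingv1.IngressRule{
		Host: host,
		IngressRuleValue: networkingv1.IngressRuleValue{
			HTTP: &networkingv1.HTTPIngressRuleValue{
				Paths: []networkingv1.HTTPIngressPath{{
					Path:     "/",
					PathType: &pathType,
					Backend: networkingv1.IngressBackend{
						Service: &networkingv1.IngressServiceBackend{
							Name: svc,
							Port: networkingv1.ServiceBackendPort{Number: 80},
						},
					},
				}},
			},
		},
	}
}

func main() {
	// A single Ingress with several rules, rather than one Ingress per host.
	ing := networkingv1.Ingress{
		TypeMeta: metav1.TypeMeta{APIVersion: "networking.k8s.io/v1", Kind: "Ingress"},
		ObjectMeta: metav1.ObjectMeta{
			Name:        "consolidated",
			Annotations: map[string]string{"kubernetes.io/ingress.class": "kong"},
		},
		Spec: networkingv1.IngressSpec{
			Rules: []networkingv1.IngressRule{
				rule("app-a.example.com", "svc-a"),
				rule("app-b.example.com", "svc-b"),
				rule("app-c.example.com", "svc-c"),
			},
		},
	}
	out, _ := yaml.Marshal(ing)
	fmt.Println(string(out))
}
```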
I’m not sure our planned optimizations would help much in your case. The first issue we want to tackle is making our Kubernetes resource cache less naive: it currently pulls in all resources of a given type and only filters for relevance when building configuration. We’ll pull a Service into cache even if no Ingress references it, simply because we lack any filter, and so we pull in many resources we’ll never incorporate into config. Filtering out irrelevant resources before they reach the cache is an easy win for most environments, but it wouldn’t make much difference if you genuinely have a large number of resources that end up rendered into config.
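To give a sense of what “filter before it reaches cache” means in client-go terms, something like the following restricts what the informer lists and watches in the first place. The label selector here is hypothetical, and a selector alone can’t express “only Services referenced by an Ingress”, so the real fix needs more plumbing than this sketch:

```go
package main

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Only objects matching the list/watch selector ever reach the local
	// cache; everything else is filtered out server-side.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 10*time.Minute,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.LabelSelector = "example.com/kong-relevant=true" // hypothetical label
		}),
	)
	_ = factory.Core().V1().Services().Informer()

	stop := make(chan struct{})
	factory.Start(stop)
	<-stop
}
```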