This is a great discussion topic, and all the comments in this thread offer great insights!
Jeremy’s comment is on point as it focuses on the architecture design: deciding whether or not to use Hybrid Mode is not so much a matter of features and limitations as it is an architectural choice.
This is not me trying to sugar-coat the limitations: what we call Hybrid Mode in Kong is really an architecture where you have a clear separation of Control Plane and Data Plane concerns. The Control Plane takes care of configuration, from its interface (for humans or other machines) to its storage (database); and the Data Plane takes care of the requests in flight. Seen in this light, the restrictions on what DP nodes can do are not only understandable, but desirable: you don’t want configuration work happening on those nodes.
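For concreteness, here is a minimal sketch of what that separation looks like in kong.conf terms, as of Kong 2.x (the hostname, port and certificate paths are placeholders):

```
# Control Plane node: owns configuration, exposes the Admin API
role = control_plane
cluster_cert = /path/to/cluster.crt        # certificate shared by CP and DP
cluster_cert_key = /path/to/cluster.key

# Data Plane node: proxies traffic; no database, no Admin API
role = data_plane
database = off
cluster_control_plane = cp.example.com:8005   # placeholder CP address
cluster_cert = /path/to/cluster.crt
cluster_cert_key = /path/to/cluster.key
```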
One thing to keep in mind is that, at peak performance, even in traditional mode Kong does not use the database: you get Kong’s full performance when all the objects needed to serve requests fit in its in-memory cache. DB-less mode (which powers Hybrid Mode’s DP nodes) is this observation taken to the extreme: “well, just pre-fill all objects into the memory cache and get rid of the database, then!”
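In practice, a DB-less node boots straight from a declarative file that is loaded into the in-memory cache at startup. A minimal sketch (the service and route are just examples):

```yaml
# kong.conf:  database = off
#             declarative_config = /path/to/kong.yml

# kong.yml: the whole configuration, loaded into memory at startup
_format_version: "1.1"
services:
- name: example-service
  url: http://httpbin.org
  routes:
  - name: example-route
    paths:
    - /example
```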
Some of the advantages of this DB-less mode are immediate: peak performance without having to warm up the cache! No database migrations on DP nodes! The limitation is that DP requests cannot write to the “mock database” that the DB-less cache implements. Why? One main reason is that DP nodes are fully independent from each other, and that is key to their performance, scalability and simplicity of operation. Plus, doing otherwise would amount to implementing an ad-hoc distributed database inside Kong, which doesn’t sound like the best of ideas.
If you need an actual database attached to your DP nodes, then we already support that in traditional non-Hybrid mode with industry-proven Postgres and Cassandra. And even in Hybrid Mode, some plugins may have limited DB needs which can be solved on a case-by-case basis. For example, the rate-limiting plugin has an option to hook into Redis. The Kong data plane as a whole still runs DB-less (no migrations in proxy nodes, etc.), but you can still get cluster-wide rate limits, coordinated by all DP nodes sharing Redis counters.
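As a sketch, hooking the rate-limiting plugin into Redis looks something like this in declarative config (the limit and the Redis host are placeholders):

```yaml
plugins:
- name: rate-limiting
  config:
    minute: 100                      # example limit
    policy: redis                    # share counters across DP nodes
    redis_host: redis.example.com    # placeholder
    redis_port: 6379
```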
A specific advantage of Hybrid Mode as opposed to traditional mode is that the Hybrid DP is more resilient to CP outages than a traditional node is to DB outages. At peak performance, a traditional node will run off the cache, and short DB outages might even go unnoticed, but if it tries to write to the database while it is down you’ll run into errors (and it also requires extra care about which nodes can or cannot issue writes during live migrations). In Hybrid Mode, when the CP is unavailable, the DP keeps running off the memory cache, and it also persists a copy of its configuration on disk so that it survives restarts.
Having said all that, Hybrid Mode is still new compared to traditional mode, having been introduced in version 2.0 of Kong OSS and 2.1 of Kong Enterprise. Similarly to when we introduced DB-less mode back in Kong 1.1, we’ve been building out the feature set mostly driven by user feedback. DB-less has been a big success among our users; it is a proven pattern, and that has given us confidence to use it as the foundation for Hybrid Mode.
In my observation, adoption of Hybrid Mode among our OSS user base has so far been slower than that of DB-less, and I find that completely understandable: it is, in a sense, an even greater architectural shift than going DB-less for green-field installations – one needs to shift their mindset towards this whole “planes” stuff. However, Jeremy is on point when observing that Hybrid Mode does give you a more uniform Kong experience: from an Admin API standpoint, you manage your Kong cluster pretty much the same way you always did.
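In other words, you keep aiming the same Admin API calls at the Control Plane, and the configuration gets pushed out to the DP nodes for you. For example (the hostname is a placeholder):

```
curl -i -X POST http://cp.example.com:8001/services \
  --data name=example-service \
  --data url=http://httpbin.org
```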
But this is not the only pro. Consider what it would entail to go DB-less as opposed to Hybrid Mode, which Jeremy described very well:
> That would save me 1/2 the infra nodes not needing that CP node and DB to store the persistent configs (could store each DP yaml in some other independent file storage and push it out myself and restart Kong with custom orchestration).
At this point, you are pretty much implementing your own Control Plane by hand: managing the storage of your configuration yourself, implementing custom orchestration and monitoring to push updates out to the data plane and ensure it is up-to-date, taking care of persistent caching on the DP side, and handling major migrations (any changes in the format of Kong objects) by updating YAML files yourself… these are all things that a Control Plane node can do for you. (A small correction on the “1/2 the infra nodes”: you don’t need one CP node per DP node; one CP node can handle many DP nodes. You can scale your CP cluster and DP cluster independently, according to your configuration vs. proxying needs for scaling and availability – that’s another benefit of CP/DP separation: those needs are often different.)
In a Kubernetes environment, some of these concerns get moved around: the Kong Ingress Controller handles config synchronization, and config storage moves into the Kubernetes paradigm, but the configuration and maintenance of the Kong entities themselves in YAML would still be up to you. (And even though K8s+KIC solves many of these problems, of course, not everybody wants to run K8s.)
As for the data segregation question: as of Kong 2.1, we don’t have automatic support for that; every DP node gets an identical configuration. This means that, to an outside observer, the nodes essentially behave as if you had a traditional DB-backed Kong cluster, except that on the “CP nodes” you would disable the proxy listener ports, and on the “DP nodes” you would disable the Admin API ports. Perhaps thinking of it this way makes it easier to compare the Hybrid deployment option to a traditional DB-backed Kong deployment. In that sense, I suppose that if you were doing a traditional DB-based Kong deployment and supporting multiple public clouds with nodes running different sets of routes, the solution would be to have separate clusters (possibly doing auth using a plugin that supports an external identity provider, for example).
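In kong.conf terms, that analogy on a traditional cluster would be a sketch like this (both listen directives accept off):

```
# "CP-like" node in a traditional deployment: no proxy traffic
proxy_listen = off

# "DP-like" node in a traditional deployment: no Admin API
admin_listen = off
```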
We are very eager to gather feedback on how people want to use Hybrid Mode, of course, and we’re putting significant dev resources into its evolution, so the points raised here are very welcome!