Cassandra being deprecated?

I noticed the following note in the unreleased section of the Kong changelog:

Could someone please clarify or confirm whether this (the deprecation of Cassandra) is true?
If so, will Kong be providing a migration path off Cassandra?

The linked blog article doesn’t exist, so it is a little hard to understand what this means, but I would certainly like to start planning if the direction of Kong moving forward is “no Cassandra”.

Kong version 2.7 is now released, the link to the blog post was removed, and there is still no clarification about what is meant by that statement. No link to an associated GitHub issue or commit.

If Cassandra is being deprecated, I would like to know as early in the process as possible, so we can start planning our migration strategy.

Could someone point me to more information about this deprecation?

edit: found this similar question: Cassandra to postgres migration

I went through the GitHub commits and found the following:

However, there is no additional information and, as one user commented, the blog post link that was originally going to be included in the changelog was removed.

I attended the Kong 2.7 release user call on Jan 11 2022 and asked about this issue during the Q&A. I was told that the missing blog post explaining what is going on would be available within the 2 weeks following the call. That blog post is still missing, but I will continue to wait for it to appear. Hopefully it will explain the path forward.

The blog post has now been posted:

While it is a good rundown, there are few technical details. The blog post also fails to address what the expected migration paths are or should be.

For people running Cassandra (who require a database), what should we migrate to, and how can Kong assure us of equivalent functionality?

In particular, it seems that the only migration off Cassandra (in cases where a database is still required) is to PostgreSQL. If that is the case, then someone needs to explain to me how I achieve equivalent operational status.
How does one address the same operational concerns with PostgreSQL as one did with Cassandra? Namely: horizontal scalability, failover, multi-DC, and redundancy.
None of those things appear to be easy to achieve with PostgreSQL (PostgreSQL: Documentation: 12: Chapter 26. High Availability, Load Balancing, and Replication): there does not appear to be a built-in way of achieving a true clustered environment (multiple nodes handling reads and writes). Because of this, there also appears to be no easy way to scale up a “cluster” when or if needed. This all leads me to wonder why Cassandra is being dropped when it is the most robust clustered database available. When your gateway forms part of the core of your application and network, it is worrying to have to rely on tech that cannot achieve some level of HA and scaling.

The blog post ended by suggesting that readers connect with Kong if they still had questions:

Please feel free to connect with us if you have any questions or concerns about deprecation.

I hope at some point this thread gets enough visibility that someone from Kong will reply to my question regarding the path forward for companies using Kong with Cassandra (that need a database).

Hi @jgrammen-agilitypr -

Thank you for taking the time to voice your concerns around the expected migration paths for Cassandra deployments and for reviewing the rationale behind why Kong is taking the path to deprecate Cassandra as a configuration datastore. We recognize that the fog of operational concerns surrounding Postgres can be complex with respect to horizontal scalability, failover, multi DC, and redundancy.

To start, I’m curious to learn more about your Gateway deployment and the places where it intersects with Cassandra. Could you describe your reference architecture with respect to the geo-distribution requirement, and can you also share a rough estimate of the amount of traffic you’re serving? Secondly, could you give a sense of the plugins, either custom or provided by Kong, that are in use for this deployment? More generally, I’m curious whether you’re using custom entities in a custom plugin that makes use of Cassandra-specific schemas. As a point of further clarification, the planned deprecation and removal of Cassandra is scoped only to its use as a primary configuration datastore for Kong Gateway; it will not be removed as a storage dependency for Kong plugins.

Scaling out databases that are only ever read can be relatively simple to manage, but operations tend to get more challenging as updates are made to the database - particularly when handling distributed writes. With that in mind, we are planning to investigate, test and validate alternative databases that are more Postgres-like than Cassandra (that is, having “Postgres compatibility”). This would enable a future where Gateway deployments using PostgreSQL can instead use a Postgres-compatible database such as YugabyteDB, CockroachDB or others, and would let us give operators the assurance that Kong Gateway will be compatible. Gateway users can then take advantage of those databases’ enterprise features around high availability, horizontal scalability, and multi-geo support.
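As a rough illustration of what “Postgres compatibility” could mean for an operator, the sketch below builds the `KONG_*` environment variables that override the `pg_*` settings in `kong.conf`, once for stock PostgreSQL and once for a Postgres-compatible candidate. The hostnames, user and database name are hypothetical, 26257 is CockroachDB's default SQL port, and nothing here implies that Kong Gateway has been validated against such a database; that validation is exactly the work described above.

```python
def kong_pg_env(host: str, port: int, user: str, database: str) -> dict[str, str]:
    """Build KONG_* environment overrides for a PostgreSQL-protocol datastore."""
    return {
        "KONG_DATABASE": "postgres",  # Kong's datastore strategy stays "postgres"
        "KONG_PG_HOST": host,
        "KONG_PG_PORT": str(port),
        "KONG_PG_USER": user,
        "KONG_PG_DATABASE": database,
    }


# Hypothetical hosts, for illustration only.
stock = kong_pg_env("pg.internal.example", 5432, "kong", "kong")
candidate = kong_pg_env("crdb.internal.example", 26257, "kong", "kong")

# Only the connection endpoint differs between the two configurations.
changed = sorted(key for key in stock if stock[key] != candidate[key])
print(changed)
```

If compatibility were validated, the intent is that the Gateway-side change would be limited to connection settings like these, with replication and scaling handled by the database itself.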

Lastly, having the technology side mapped is only one piece of the puzzle - establishing processes for planning, executing, documenting and monitoring a broader migration program for the existing configuration store are all required activities to make the transition a successful one. Given these realities and the planning required, we have scheduled Cassandra’s removal from Kong Gateway in 4.0 (12-18 months after our 3.0 series launch this Spring), to give the Kong teams and Gateway operators/developers ample time to investigate and navigate this transition. Looking forward to the continued conversation as we plot a path forward.

Hello Paul,

Thank you for taking the time to engage with me and the community on this issue. Kong’s willingness to talk about these concerns is reassuring and encouraging.
It sounds like Kong plans to release a lot more information about the migration off Cassandra, but that it is still in the works. I am a little frustrated that Kong did not anticipate the questions that would pop up after announcing such a huge change and have the materials ready in advance. Being able to say “we are deprecating Cassandra, but here is exactly how and where you should go” would have been a much nicer and smoother approach. Yes, the actual removal is a long way off (12-18 months), but for many businesses it may take nearly that long to plan the migration, purchase or reallocate resources (hardware or otherwise), and then execute the plan.

AgilityPr’s use of Kong and Cassandra may be considered quite small by most, but it has become fundamental to our microservice architecture, and with a small operations team, the ease of operational management (while being open source) has been critical to our success with Kong. We have about 13 services all interacting with Kong, with about 1 million connections handled per day (7 million per 7-day week), serving about 7.5 GB per day of egress bandwidth. We have a local 3-node Cassandra cluster with a single geo-distributed replica. The three nodes are physical servers located in a colocation datacenter, and the single geo replica is in Amazon AWS EC2 in the Canada region. The distributed node is our disaster recovery, with a full replica of the local datacenter cluster’s data. This allows us to fail over to that single node and then easily (because of Cassandra’s very simple scaling) add nodes to that geo-distributed cluster to scale it up to the required capacity (read/write requests per second). We only have about 300 MB of data stored in Cassandra, but considering that a large part of that is customer credentials, that data is very important.
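For a sense of scale, those figures can be turned into rough averages. This is a minimal back-of-the-envelope sketch, assuming traffic is spread evenly across the day (real traffic will have peaks) and that "GB" means decimal gigabytes; the variable names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

connections_per_day = 1_000_000  # figure quoted above
egress_bytes_per_day = 7.5e9     # 7.5 GB/day, assuming decimal GB

# Averages only: these say nothing about peak load.
avg_conn_per_sec = connections_per_day / SECONDS_PER_DAY
avg_bytes_per_conn = egress_bytes_per_day / connections_per_day
avg_egress_mbit_per_sec = egress_bytes_per_day * 8 / SECONDS_PER_DAY / 1e6

print(f"{avg_conn_per_sec:.1f} connections/s on average")
print(f"{avg_bytes_per_conn:.0f} bytes egress per connection on average")
print(f"{avg_egress_mbit_per_sec:.2f} Mbit/s egress on average")
```

That works out to roughly 12 connections per second and about 7.5 KB per connection on average.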

We are using about 10 plugins, a mix of provided, provided-but-customized, and fully custom. We are storing custom entities for user authentication, so we 100% need a database and could not migrate to a DB-less deployment of Kong.

acl
auth-agility
rate-limiting
aws-lambda
upstream-path-transformer
syslog-agility
request-transformer
cors
key-auth
prometheus-agility

It sounds like our only path forward from Cassandra is PostgreSQL, but you seem to be suggesting that there might be other PostgreSQL-compatible databases (YugabyteDB, CockroachDB) that better match the Cassandra model operationally. This is intriguing, since stock PostgreSQL, while very powerful, has limited built-in replication, which makes a database with even a modest amount of writes tricky to scale and geo-distribute. These solutions would need to be open source for us to adopt them, though.

Overall I am encouraged by this start to the conversation, and I hope I have provided enough information about how we use Kong and Cassandra that you can start to provide more detailed answers on how we migrate off Cassandra and onto a new database while retaining the operational ease and flexibility, especially around clustering and scaling, that Cassandra provides.

@Paul_Fischer

I hope the information in my previous reply was sufficient to give you a better understanding of my use of Kong, and that it will allow you to provide more feedback about my migration off Cassandra.