Prometheus Plugin - Cardinality

flowdopip · April 8, 2019, 4:52pm

Hi,

I’ve been using Kong with the Prometheus plugin from the beginning and everything works fine. As the system evolves, I will have around 1000 services, and that number is likely to grow.

Right now I have these metrics for a service:

kong_latency_bucket{type="kong",service="auth-runtime",le="00001.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00002.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00005.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00007.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00010.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00015.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00020.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00025.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00030.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00040.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00050.0"} 2
kong_latency_bucket{type="kong",service="auth-runtime",le="00060.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00070.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00080.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00090.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00100.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00200.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00300.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00400.0"} 3
kong_latency_bucket{type="kong",service="auth-runtime",le="00500.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="01000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="02000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="05000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="10000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="30000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="60000.0"} 4
kong_latency_bucket{type="kong",service="auth-runtime",le="+Inf"} 4

The cardinality will increase a lot:

bucket type * bucket service * latency
1 * 1000 * 27 = 27000
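
As a rough check of that estimate, here is a minimal Python sketch. The function name `bucket_series` is hypothetical; the bucket list is copied from the plugin's `DEFAULT_BUCKETS` quoted further down, and the `+Inf` bucket is counted separately. It covers only the `_bucket` series shown above, assuming a single `type` value.

```python
# Bucket boundaries copied from the DEFAULT_BUCKETS table quoted in this post.
DEFAULT_BUCKETS = [1, 2, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70,
                   80, 90, 100, 200, 300, 400, 500, 1000,
                   2000, 5000, 10000, 30000, 60000]


def bucket_series(num_services, num_types=1, buckets=DEFAULT_BUCKETS):
    """Estimate the number of *_bucket series: types * services * buckets.

    One extra bucket is added for the implicit le="+Inf" boundary.
    """
    return num_types * num_services * (len(buckets) + 1)


print(bucket_series(1000))  # 27000
```

If more than one `type` value is exported per service, pass it as `num_types` and the total scales linearly.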

A cardinality explosion is the sudden, rapid creation of new series due to one or more labels on one or more metrics being populated with high-cardinality data.
High-cardinality data is any data that, when placed into a proper set, has a high number of discrete elements. In this context, we care about cardinalities in the tens of thousands and up.

Is there any way to control this?

Can I set this through configuration?
local DEFAULT_BUCKETS = { 1, 2, 5, 7, 10, 15, 20, 25, 30, 40, 50, 60, 70,
80, 90, 100, 200, 300, 400, 500, 1000,
2000, 5000, 10000, 30000, 60000 }

hbagdi · April 17, 2019, 4:02pm

Hello @flowdopip,

This is indeed a problem that exists currently.
The solution here will be to introduce configuration in the plugin, where the buckets, and even the metrics that should be recorded and exposed, can be configured.

Meanwhile, if you don’t need metrics for each of your services, you can apply the plugin to only a subset of your services, which will reduce the cardinality a little.

flowdopip · April 17, 2019, 9:14pm

Hi,

I want metrics for my services, but I need to reduce the cardinality because I have around 500 services and this will create a lot of variations.

I’ve forked the original Prometheus repo, and now I’m able to set up the bucket list per service. I can also add a feature to activate/deactivate each metric by configuration.
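
To illustrate the idea of a per-service bucket list, here is a minimal Python sketch. It is not the code from the fork and not the plugin's API: `SERVICE_BUCKETS`, `LatencyHistogram`, `observe`, the bucket values, and the `billing` service name are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left

# Hypothetical configuration: a short default list plus per-service overrides.
DEFAULT_BUCKETS = [10, 100, 1000, 10000]
SERVICE_BUCKETS = {"auth-runtime": [50, 500, 5000]}


class LatencyHistogram:
    """Histogram with configurable upper bounds and an implicit +Inf bucket."""

    def __init__(self, buckets):
        self.buckets = sorted(buckets)
        # One counter per boundary, plus a final slot for +Inf.
        self.counts = [0] * (len(self.buckets) + 1)

    def observe(self, latency_ms):
        # Index of the smallest boundary that is >= the observed value.
        self.counts[bisect_left(self.buckets, latency_ms)] += 1

    def cumulative(self):
        """Return (le, count) pairs in cumulative form, as Prometheus expects."""
        total, out = 0, []
        for le, n in zip(self.buckets + [float("inf")], self.counts):
            total += n
            out.append((le, total))
        return out


histograms = {}


def observe(service, latency_ms):
    # Create the histogram lazily, using the service override if one exists.
    if service not in histograms:
        buckets = SERVICE_BUCKETS.get(service, DEFAULT_BUCKETS)
        histograms[service] = LatencyHistogram(buckets)
    histograms[service].observe(latency_ms)


observe("auth-runtime", 55)
observe("auth-runtime", 450)
observe("billing", 7)
print(histograms["auth-runtime"].cumulative())
```

With this layout the number of series per service is the length of its own bucket list plus one, so busy services can keep fine-grained buckets while the rest fall back to a short default.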

Another thing I will try to do is expose Lua/OpenResty metrics.

Does this make sense to you? Could this be a pull request to the original repo?

Thanks

hbagdi · April 18, 2019, 7:28am

What do you have in mind? Could you elaborate?

flowdopip · April 18, 2019, 12:01pm

Hi,

The idea is to expose all the metrics from the Kong stack: NGINX, OpenResty and Lua.

hbagdi · April 18, 2019, 3:54pm

I see, but what metrics are you referring to?
Could you please be more specific as to which exact metrics you’d like to add?
I’ve thought about adding shm related metrics that we can get from OpenResty.

I’m very interested in knowing more about what you have in mind.

hbagdi · April 19, 2019, 10:57pm

@flowdopip I wanted to correct an oversight. I thought I had mentioned using StatsD, but I didn’t.

If having too many metrics is a concern, you can use the statsd plugin in Kong to emit StatsD events and have statsd-exporter slurp them in and expose those metrics in Prometheus Exposition Format.

flowdopip · April 22, 2019, 7:33am

Thanks for your feedback.

I haven’t thought about this a lot, but the idea is to collect the most critical memory information from Lua and OpenResty.

If something goes wrong, I want to know the current memory usage of the Lua modules, OpenResty and NGINX.

Daniel_Lamando · May 28, 2021, 2:00pm

Hello from 2 years after the original post. Was there an upstream conclusion for this?
I have a somewhat lower number of services in my largest environment (~250) and regardless I have no need for so many histogram buckets (27 per service as above). In fact I don’t need the histogram buckets at all, but having a small number such as [10,100,1000,10000] would be nice to have and help reduce the cardinality without removing all benefit.

I’ve checked the plugin configuration documentation (Prometheus plugin | Kong) and do not see any available option to address this.
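
Because Prometheus histogram buckets are cumulative, a coarser set such as [10, 100, 1000, 10000] can be obtained from the existing buckets by keeping only those `le` boundaries, without recomputing any counts. A minimal Python sketch, using the auth-runtime sample counts from the first post (the function name `coarsen` is hypothetical and not part of the plugin):

```python
import math

# Cumulative bucket counts (le -> count) from the auth-runtime sample above.
fine = {1: 2, 2: 2, 5: 2, 7: 2, 10: 2, 15: 2, 20: 2, 25: 2, 30: 2,
        40: 2, 50: 2, 60: 3, 70: 3, 80: 3, 90: 3, 100: 3, 200: 3,
        300: 3, 400: 3, 500: 4, 1000: 4, 2000: 4, 5000: 4,
        10000: 4, 30000: 4, 60000: 4, math.inf: 4}


def coarsen(cumulative, keep):
    """Keep only the requested le boundaries; +Inf is always retained."""
    wanted = set(keep) | {math.inf}
    return {le: count for le, count in cumulative.items() if le in wanted}


coarse = coarsen(fine, [10, 100, 1000, 10000])
print(coarse)  # {10: 2, 100: 3, 1000: 4, 10000: 4, inf: 4}
```

For 250 services that would be 5 * 250 = 1250 bucket series per latency type instead of 27 * 250 = 6750. On the Prometheus side, a similar effect may be achievable by dropping unwanted `le` values with `metric_relabel_configs`, although the plugin would still compute and expose every bucket.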
