Hey @traines, thanks for the detailed reply. I'd already run into the whole fight with nginx custom log format config. After some trial and error, these helm values seem to result in only actual user traffic getting logged, in a custom JSON format:
```yaml
env:
  nginx_http_log_format: >-
    json_upstream_data escape=json
    '{ "remote_addr": "$remote_addr",
    "request": "$request", "status": "$status",
    [....] }'
  proxy_access_log: 'logs/access.log json_upstream_data'
```
Internal traffic (e.g. the admin API) is still logged in a different format, but that's not much of a bother.
The 'last 10%', getting namespace/name info included, turned out to be a fair bit harder.
So the disconnect does seem to be that, yes, the Kubernetes ecosystem values stdout logging. In effect, the source of logs (the application) becomes decoupled from the log aggregator when every application uses stdout. If an application writes JSON to stdout, a Kubernetes user will be able to grok those structures.
I did actually check out the logging plugins. Several notes there…
Unfortunately it's not clear whether any of the plugins can deliver logs to Datadog, the logging vendor I'm working with. I can transmit logs by (the first two intake shapes are sketched after this list):
- Sending one JSON object per line over TCP to a server
  – prefixed by the API key, which tcp-log cannot do
- POSTing one or more logs at a time to an HTTPS endpoint along with the API key
  – a payload structure that http-log cannot adopt
- Printing the log to stdout and letting the agents ingest it
  – possible with file-log pointing at /dev/stdout, but file-log's docs say it shouldn't be used in production due to the synchronous I/O model
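For concreteness, here's roughly what those first two intake shapes look like, sketched in Python. The hostnames and port are the Datadog US-site endpoints as I understand them and worth double-checking against their docs; the API key and payload fields are placeholders.

```python
import json
import socket
import ssl

import requests  # pip install requests

DD_API_KEY = "REDACTED"  # placeholder
entry = {"message": "upstream responded", "service": "kong-proxy"}  # illustrative payload

# Shape 1: TCP intake, one API-key-prefixed JSON object per line.
# tcp-log can send the JSON, but has no way to prepend the key.
ctx = ssl.create_default_context()
with socket.create_connection(("intake.logs.datadoghq.com", 10516)) as raw_sock:
    with ctx.wrap_socket(raw_sock, server_hostname="intake.logs.datadoghq.com") as tls_sock:
        tls_sock.sendall(f"{DD_API_KEY} {json.dumps(entry)}\n".encode())

# Shape 2: HTTPS intake, a JSON array of log objects plus an API key header.
# http-log POSTs JSON, but not in this envelope.
requests.post(
    "https://http-intake.logs.datadoghq.com/api/v2/logs",
    headers={"DD-API-KEY": DD_API_KEY},
    json=[entry],
)
```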
Even if I were to use one of these logging plugins, the logs still would not include separate fields for namespace, svcName, & port, and I believe those three fields would be enough information for most cases. It makes sense that plain Kong puts them all in one string, but Kong "Ingress Controller" really should know how to at least break that string back up into its components, because they have real meaning in Kubernetes.
For example, if there's a sudden pile of 500s coming from related services on the gateway, the engineers working in the associated namespace are the ones who should be alerted. (This is actually a wider problem with Kong + Kubernetes: the metrics plugins have the same issue.)
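As an illustration: assuming KIC keeps its usual dotted `<namespace>.<service>.<port>` naming for the Kong service (an assumption worth verifying against your KIC version), the split back into components is trivial, which makes it all the more frustrating that nothing ships it. `split_kong_service` and the example name are hypothetical:

```python
# Hypothetical helper: recover Kubernetes fields from the combined name
# KIC gives the Kong service, e.g. "payments.checkout-api.80".
# Assumes the "<namespace>.<service>.<port>" convention; verify per version.
def split_kong_service(name: str) -> dict:
    parts = name.split(".")
    if len(parts) < 3:
        return {"svcName": name}  # not a KIC-generated name, pass through
    return {"namespace": parts[0], "svcName": parts[1], "port": parts[-1]}

print(split_kong_service("payments.checkout-api.80"))
# -> {'namespace': 'payments', 'svcName': 'checkout-api', 'port': '80'}
```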
Finally, the logs sent by the plugin seem to just have too much info. I pay per byte, so the whole route block feels like a repetitive waste. And it seems to include full HTTP headers, session ID cookies included… I don't want everyone's authorization strings sitting on a log server.
I can doctor the logs in post-processing, but there are just so many different papercuts and minor concerns no matter how I look at it, so it's good to hear that this is a known friction point.
It kinda seems like the most Kubernetes-friendly option is writing a custom sidecar container that sits next to each Kong, receives tcp-log packets from localhost, and writes a reformatted subset of each log to its own stdout to be indexed normally.
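A minimal sketch of that sidecar, assuming tcp-log emits newline-delimited JSON in the shape of Kong's default log serializer (field names like `response.status` and `service.name`, plus the connection-reuse behavior, should be verified against your Kong version); the listen port and kept fields are placeholders:

```python
import asyncio
import json

KEEP_TOP_LEVEL = ("client_ip", "latencies", "started_at")  # illustrative subset

def reshape(entry: dict) -> dict:
    out = {k: entry[k] for k in KEEP_TOP_LEVEL if k in entry}
    out["status"] = entry.get("response", {}).get("status")
    out["uri"] = entry.get("request", {}).get("uri")  # no headers, no cookies
    # Same dotted-name split as sketched above.
    parts = entry.get("service", {}).get("name", "").split(".")
    if len(parts) >= 3:
        out["namespace"], out["svcName"], out["port"] = parts[0], parts[1], parts[-1]
    return out

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    # Assumes one JSON entry per line and that tcp-log may reuse connections.
    while True:
        line = await reader.readline()
        if not line:
            break
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # drop anything that isn't JSON
        print(json.dumps(reshape(entry)), flush=True)  # stdout -> node agents
    writer.close()

async def main() -> None:
    server = await asyncio.start_server(handle, "127.0.0.1", 9999)  # tcp-log target
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```

Each Kong pod would then point tcp-log at 127.0.0.1:9999, and the node's normal log agent picks the reshaped lines up off stdout like any other container.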
Another related topic: Nginx custom template http log with service name