Why is my request considered a Mesh request?

(aka “Kong in API Gateway mode is mutating to Kong in Mesh mode” :smile:)

Hi all,

I am seeing very strange behavior: my single Kong 1.1.1 instance, deployed in db-less mode, sometimes behaves as an API Gateway and sometimes as a service Mesh, depending on how it is called!

I’m deploying Kong as a pure API Gateway (I understood that there is nothing special to configure in order to activate one mode or the other… but I have not set any parameter that would make me think Mesh mode could be up and running), and the startup logs say:

2019/04/08 16:29:24 [notice] 35#0: *1 [kong] init.lua:278 declarative config loaded from /tmp/kong-endpoints.yaml, context: init_worker_by_lua*
2019/04/08 16:29:24 [info] 35#0: *1 [lua] mesh.lua:64: init(): initialising cluster ca..., context: init_worker_by_lua*
2019/04/08 16:29:24 [warn] 35#0: *1 [lua] mesh.lua:86: init(): no cluster_ca in declarative configuration: cannot use node in mesh mode, context: init_worker_by_lua*

So it looks like my instance is not in mesh mode (the logs say “cannot use node in mesh mode”), which is what I expect! This is confirmed when I curl the Route: the plugin declared on the Route (with run_on left at its default value, first) is correctly executed.

However, making the exact same call with wget triggers an error in Kong:

2019/04/08 16:45:08 [debug] 35#0: *8705 [lua] certificate.lua:18: log(): [ssl] no SNI registered for client-provided name: 'a.b.c'
2019/04/08 16:45:08 [debug] 35#0: *8705 [lua] init.lua:175: rebuilding plugins map
2019/04/08 16:45:08 [error] 35#0: *8704 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: missing X-Forwarded-Host
stack traceback:
coroutine 0:
        [C]: in function 'assert'
        /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: in function 'rewrite'
        /usr/local/share/lua/5.1/kong/runloop/handler.lua:577: in function 'before'
        /usr/local/share/lua/5.1/kong/init.lua:676: in function 'rewrite'
        rewrite_by_lua(nginx-kong.conf:97):2: in function <rewrite_by_lua(nginx-kong.conf:97):1>, client: 10.13.67.163, server: kong, request: "GET /ui/aota HTTP/1.1", host: "a.b.c:30443"

This is definitely a log generated in the context of a Mesh deployment! (https://github.com/Kong/kong/blob/bc48efeabc5d7460271d09d20e8e95b1b2e97a04/kong/runloop/mesh.lua#L219)
Why did my Kong “jump” into this mode? Let’s put it differently: why did Kong consider this request a Mesh request?

Accessing the same URL from a web browser produces yet another result :stuck_out_tongue:: the request is correctly processed by Kong without any error… but without executing the plugin declared on the Route. My interpretation is that:

  • Thanks to the presence of a front-end reverse proxy, the X-Forwarded-Host header is set in the request, so the HTTP 500 case above does not occur
  • But since my plugin is declared with run_on set to first, the plugin is not executed: again, the call is treated as a Mesh call and not an API Gateway call…
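One way to check this interpretation would be to send the same request with and without the X-Forwarded-Host header and compare the status codes. Below is a minimal sketch in Python using only the standard library. The URL is an assumption built from the host, port and path seen in the error log above (HTTPS is assumed because of the SNI message), and build_request and probe are hypothetical helper names, not part of Kong:

```python
import ssl
import urllib.error
import urllib.request

# Assumption: host, port and path are taken from the error log above.
URL = "https://a.b.c:30443/ui/aota"


def build_request(url, forwarded_host=None):
    """Build a GET request, optionally adding X-Forwarded-Host
    the way a front-end reverse proxy would."""
    req = urllib.request.Request(url, method="GET")
    if forwarded_host is not None:
        req.add_header("X-Forwarded-Host", forwarded_host)
    return req


def probe(req, cafile=None):
    """Send the request and return the HTTP status code.

    cafile is an optional path to the CA bundle that signed the
    custom TLS certificate; error statuses are returned, not raised.
    """
    ctx = ssl.create_default_context(cafile=cafile)
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling probe(build_request(URL)) and then probe(build_request(URL, "a.b.c")) should show whether the header alone changes the outcome.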

Can someone help with this?

Some information about my setup (as it may help :wink:):

  • Kong 1.1.1
  • Db-less mode, but no cluster_ca data set
  • Kong is run in a Kubernetes Pod, and the Pod also contains an init-container
  • I have set my own TLS certificates (I’m not using the default ones)
  • This worked correctly on Kong 1.0.3 with a DB (but I have not had time to test Kong 1.1.1 with a DB)

Thanks for the report @pamiel! I was able to reproduce it. I’ll keep investigating and will continue the discussion in issue #4497.

Thanks a lot @hisham.

I fear I have a similar case of API Gateway/Mesh mix-up when using the Postgres database (i.e. there might be issues not only in db-less mode).

This time it does not raise an HTTP 500, but I’ve seen cases where the Route exists but I receive a 404 “no route…”, and sometimes the call to the Route appears in the access logs (so it happened!) but the request is never proxied to the upstream server (it just hangs, as if waiting for a network timeout!).
I will try to reduce this to simpler, reproducible examples in order to confirm it or not.

Seems like @hisham already proposed a PR:

Yes, thanks, the PR solved the issue described in my first post… but apparently not the one I mentioned in my second post. I am still trying to find a simple way to reproduce it.

Hi all

I’ve upgraded my Kong setup from Kong 1.1.2 to Kong 1.2.1, and something like this now appears:

[lua] mesh.lua:86: init(): no cluster_ca in declarative configuration: cannot use node in mesh mode, context: init_worker_by_lua*

Any idea?

I didn’t change anything in my configuration.

@flowdopip, for Kong to detect mesh requests (e.g. where one Kong calls another Kong in the same cluster), it needs cluster_ca so that nodes can do mTLS (that is part of this detection). If you don’t specify cluster_ca (with a database we can autogenerate it, as nodes share the database), the detection will not work and all interactions with Kong will be treated as normal gateway traffic (even calls from one Kong node to another in the same cluster, as is common in service mesh scenarios where every service ships a sidecar Kong in front of it and also uses that Kong as a forward proxy). If you don’t need or care about this, then just ignore the error.
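To make the intended behavior explicit, here is a toy model of the detection rule as described in the reply above. It is only an illustration of that description, not Kong's actual code; classify_request and both of its parameters are hypothetical names:

```python
def classify_request(cluster_ca_configured: bool, peer_is_cluster_node: bool) -> str:
    """Toy model of mesh detection as described in this thread.

    cluster_ca_configured: whether the node has a cluster_ca available.
    peer_is_cluster_node: whether the caller authenticated over mTLS
        as another Kong node of the same cluster.
    """
    if not cluster_ca_configured:
        # Without cluster_ca the nodes cannot do mTLS, so detection is
        # off and every request is handled as plain gateway traffic.
        return "gateway"
    return "mesh" if peer_is_cluster_node else "gateway"
```

Under this model, a db-less node with no cluster_ca always returns "gateway", which is why the startup message can be ignored when mesh mode is not wanted.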