Why is my request considered a Mesh request?

#1

(aka “Kong in API Gateway mode is mutating into Kong in Mesh mode” :smile:)

Hi all,

I’m seeing very strange behavior: my single Kong 1.1.1 instance, deployed in db-less mode, sometimes behaves as an API Gateway and sometimes as a service Mesh, depending on how it is called!

I’m deploying Kong as a pure API Gateway. As I understand it, there is nothing special to configure to activate one mode or the other, but I have not set any parameter that would suggest Mesh mode is up and running. The startup logs say:

2019/04/08 16:29:24 [notice] 35#0: *1 [kong] init.lua:278 declarative config loaded from /tmp/kong-endpoints.yaml, context: init_worker_by_lua*
2019/04/08 16:29:24 [info] 35#0: *1 [lua] mesh.lua:64: init(): initialising cluster ca..., context: init_worker_by_lua*
2019/04/08 16:29:24 [warn] 35#0: *1 [lua] mesh.lua:86: init(): no cluster_ca in declarative configuration: cannot use node in mesh mode, context: init_worker_by_lua*

So it looks like my instance is not in mesh mode (the logs say "cannot use node in mesh mode"), which is what I expect! This is confirmed when I curl the Route: the plugin declared on the Route (with run_on left at its default value, first) is correctly executed.

However, making the exact same call with wget triggers an error in Kong:

2019/04/08 16:45:08 [debug] 35#0: *8705 [lua] certificate.lua:18: log(): [ssl] no SNI registered for client-provided name: 'a.b.c'
2019/04/08 16:45:08 [debug] 35#0: *8705 [lua] init.lua:175: rebuilding plugins map
2019/04/08 16:45:08 [error] 35#0: *8704 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: missing X-Forwarded-Host
stack traceback:
coroutine 0:
        [C]: in function 'assert'
        /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: in function 'rewrite'
        /usr/local/share/lua/5.1/kong/runloop/handler.lua:577: in function 'before'
        /usr/local/share/lua/5.1/kong/init.lua:676: in function 'rewrite'
        rewrite_by_lua(nginx-kong.conf:97):2: in function <rewrite_by_lua(nginx-kong.conf:97):1>, client: 10.13.67.163, server: kong, request: "GET /ui/aota HTTP/1.1", host: "a.b.c:30443"

This is definitely a log generated in the context of a Mesh deployment! (https://github.com/Kong/kong/blob/bc48efeabc5d7460271d09d20e8e95b1b2e97a04/kong/runloop/mesh.lua#L219)
Why did my Kong “jump” into this mode? To put it differently: why did Kong consider this request a Mesh request?
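
To see how often requests end up in the mesh code path, the error log can be scanned for entries that reference mesh.lua. The following is only a minimal sketch, assuming the log line format shown above; the function name mesh_entries is made up for illustration and is not part of Kong.

```python
import re

# Source location as reported in the Kong log, e.g. "mesh.lua:219".
MESH_LOCATION = re.compile(r"mesh\.lua:(\d+)")
# nginx log level in square brackets, e.g. "[error]".
LEVEL = re.compile(r"\[(debug|info|notice|warn|error)\]")


def mesh_entries(lines):
    """Return (level, line_number) pairs for log lines that mention mesh.lua.

    Stack traceback lines carry no log level and are skipped.
    """
    found = []
    for line in lines:
        location = MESH_LOCATION.search(line)
        level = LEVEL.search(line)
        if location and level:
            found.append((level.group(1), int(location.group(1))))
    return found


# Sample lines copied from the logs in this post.
sample = [
    "2019/04/08 16:29:24 [info] 35#0: *1 [lua] mesh.lua:64: init(): initialising cluster ca..., context: init_worker_by_lua*",
    "2019/04/08 16:45:08 [error] 35#0: *8704 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: missing X-Forwarded-Host",
    "        /usr/local/share/lua/5.1/kong/runloop/mesh.lua:219: in function 'rewrite'",
]
print(mesh_entries(sample))  # [('info', 64), ('error', 219)]
```

An [error] entry pointing at mesh.lua:219 is the assertion on X-Forwarded-Host; entries from mesh.lua:64 and mesh.lua:86 only come from the startup check.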

Accessing the same URL from a web browser produces yet another result :stuck_out_tongue:: the request is correctly processed by Kong without any error… but the plugin declared on the Route is not executed. My interpretation is that:

  • Thanks to the front-end reverse proxy, the X-Forwarded-Host header is set in the request, so the HTTP 500 case above does not occur
  • But since my plugin is declared with run_on set to first, it is not executed: again, the call is treated as a Mesh call and not as an API Gateway call…
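
The first point could be checked in isolation by sending the same request directly to Kong with and without the X-Forwarded-Host header. This is only a sketch under stated assumptions: the host a.b.c:30443 and the path /ui/aota are taken from the error log above, the function names and the header value are made up for illustration, and certificate verification is switched off on the assumption that the client does not trust my certificates. It does not reproduce whatever curl and wget do differently; it only isolates the header.

```python
import ssl
import urllib.error
import urllib.request

# Host and path copied from the error log above; adjust to the deployment.
BASE_URL = "https://a.b.c:30443"
PATH = "/ui/aota"


def build_probe(forwarded_host=None):
    """Build a GET request, optionally carrying an X-Forwarded-Host header."""
    req = urllib.request.Request(BASE_URL + PATH, method="GET")
    if forwarded_host is not None:
        req.add_header("X-Forwarded-Host", forwarded_host)
    return req


def send(req):
    """Send the request and return the HTTP status code."""
    # Assumption: the certificate is not trusted by this client,
    # so hostname and certificate checks are disabled for the test.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; return their code.
        return err.code


def compare():
    """Print the status code obtained with and without the header."""
    print("without header:", send(build_probe()))
    print("with header:   ", send(build_probe("a.b.c")))
```

Calling compare() prints both status codes. If only the request without the header fails, that would support the interpretation above; identical results would suggest the difference lies elsewhere.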

Can someone help with this?

Some information about my setup (as it may help :wink:):

  • Kong 1.1.1
  • Db-less mode, but no cluster_ca data set
  • Kong runs in a Kubernetes Pod, and the Pod also contains an init-container
  • I have set my own TLS certificates (I’m not using the default ones)
  • This worked correctly on Kong 1.0.3 with a DB (I have not had time to test Kong 1.1.1 with a DB)

#2

Thanks for the report @pamiel! I was able to reproduce it. I’ll keep investigating and will continue the discussion in issue #4497.

#3

Thanks a lot @hisham.

I fear I’m seeing a similar API Gateway/Mesh mix-up when using the Postgres database (i.e. the issue might not be limited to db-less mode).

This time, no HTTP 500 is raised, but I’ve seen cases where the Route exists yet I receive a 404 “no route…”. Sometimes the call to the Route does appear in the access logs (so it happened!), but the request is never proxied to the upstream server: it just hangs, as if waiting for a network timeout!
I will try to reduce this to a simpler, reproducible example in order to confirm it or rule it out.

#4

Seems like @hisham already proposed a PR:

#5

Yes, thanks, the PR solved the issue described in my first post… but apparently not the one I mentioned in my second post. I’m still trying to find a simple way to reproduce it.
