Kong declarative config upstream healthcheck fails

Hi, I'm having an issue with upstream health being reported as "health":"HEALTHCHECKS_OFF", and I'm not sure why. If I actually send a request through, it seems to work, but I don't understand why HEALTHCHECKS_OFF is returned by the query.

    services:
      - name: my_service
        host: my_service_host
        routes:
          - name: my_service
            hosts:
              - my_service_host
            strip_path: false

    upstreams:
      - name: my_service.upstream
        hash_fallback: none
        hash_on: header
        hash_on_header: X-User-ID
        targets:
          - target: my_service.route53.srv
            weight: 100

/ # curl -X GET http://localhost:8001/upstreams/my_service.upstream/health

When I send the curl to check the health, I see this error in the Kong logs:

2019/08/06 11:56:51 [error] 292#0: *394962 [lua] targets.lua:240: page_collection(): failed getting upstream health: balancer not found, client:, server: kong_admin, request: "GET /upstreams/my_service.upstream/health HTTP/1.1", host: "localhost:8001"

Also, none of the targets get listed under targets; there should be about 20. In Kong 0.14 there is a different issue (DNS_ERROR), but it does at least list all the targets.

/ # curl -X GET http://localhost:8001/upstreams/my_service.upstream/targets/all

Are there any issues with the declarative configuration used to create the service/route/upstream targets?

After kong reload or kong start, I get this error:

2019/08/06 13:45:40 [error] 584#0: *434182 lua entry thread aborted: runtime error: /usr/local/share/lua/5.1/resty/dns/balancer/base.lua:798: expected a hostname (string), got nil

Hi, mnk0.
Your configuration is missing the target port, so Kong can’t use it. It should be something like:

    - target: my_service.route53.srv:80

Also, this configuration does not define any health check, that’s why Kong is reporting HEALTHCHECKS_OFF. Here is an example of a passive health check:

    upstreams:
      - name: my_service.upstream
        healthchecks:
          passive:
            healthy:
              http_statuses:
                - 200
                - 304
              successes: 1
            type: http
            unhealthy:
              http_failures: 5
              http_statuses:
                - 429
                - 500
                - 503
              tcp_failures: 2
              timeouts: 2
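Note that passive checks only observe live traffic, so they can mark a target unhealthy but never bring it back on their own. If you want Kong to probe targets itself, you can add an active check alongside the passive one. The following is a sketch based on the 1.2 schema; the probe path and the interval values are placeholders you'd adjust for your service:

    upstreams:
      - name: my_service.upstream
        healthchecks:
          active:
            type: http
            http_path: /status
            healthy:
              interval: 5
              successes: 1
            unhealthy:
              interval: 5
              http_failures: 5
              tcp_failures: 2
              timeouts: 2

With `interval` set to a non-zero value, Kong periodically sends requests to `http_path` on each target and updates their health accordingly.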

You can read more about health checks on https://docs.konghq.com/1.2.x/health-checks-circuit-breakers/

Thanks for the reply!! I've been blocked on this for several days.

I tried some variations of adding the health checks and it still failed, but if I don't add them explicitly in the config.yaml, my understanding was that default values would be used? Testing this out now.

Also, the target upstream is an AWS ECS service discovery endpoint, so shouldn't Kong pick up the port automatically here? My understanding is that it should retrieve the port from the lua-resty-dns-client, although I've seen some issues related to this, for example: https://github.com/Kong/kong/issues/4781

For example, the ports are returned by a dig query:

/etc/kong # dig staging-falcon-reader.awsvpc-private ANY

; <<>> DiG 9.14.3 <<>> staging-falcon-reader.awsvpc-private ANY
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7095
;; flags: qr rd ra; QUERY: 1, ANSWER: 8, AUTHORITY: 4, ADDITIONAL: 8

;staging-falcon-reader.awsvpc-private. IN ANY

staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35946 ccf0e092-5ac8-4c60-b9cd-2a310988ae53.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35804 934e7d8a-8d66-49b3-ae1d-a660991a3a55.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35453 dcebbf0c-48bb-44db-a333-432360e2b7b3.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35679 16ab2c0b-18d6-4011-b47f-3c18d239244c.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 32867 0bef3aa7-3b68-4032-98ed-089e72578edc.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35894 b7807815-f982-4cca-b43d-a6f567323e98.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35811 056f7f32-21f1-47a3-a3f3-f3258b4bb118.staging-falcon-reader.awsvpc-private.
staging-falcon-reader.awsvpc-private. 3 IN SRV 1 1 35760 ed0402f3-234f-4d9b-84e0-c043ba2db2bb.staging-falcon-reader.awsvpc-private.

Additionally, I tried the configuration with and without that additional block defining the health check parameters, but got the same result.

/etc/kong # curl -X GET http://localhost:8001/upstreams/falcon.upstream

/etc/kong # curl -X GET http://localhost:8001/upstreams/falcon.upstream/health

Is there any configuration needed to change the DNS acceptable payload size? Or a way to verify that the lua-resty-dns client can actually resolve the SRV DNS records?
