Health check fails to upgrade nodes from UNHEALTHY to HEALTHY

I am trying out the new healthcheck functionality in 0.12.3 and I’m having trouble getting it to move nodes from UNHEALTHY back to HEALTHY.
Detecting that a node has become unavailable and marking it UNHEALTHY works fine, but when an upstream node returns to service, its health state does not change back to HEALTHY as I would expect.
I have manually verified that the endpoint returns a 401, which is listed as a healthy status in my configuration. The health check also verifies the node correctly when it first detects it.
I assume there is a mistake in my configuration, but I’m not sure what it could be.

Is anyone able to provide assistance? Thanks for your help.
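
For anyone wanting to repeat the manual check described above, here is a minimal sketch. The function names `probe_status` and `is_listed_status` and the `TARGET` address are made up for illustration, and this only mimics the status comparison by hand; it is not how the health checker itself is implemented.

```shell
#!/bin/sh
# Hypothetical helpers for checking a health endpoint by hand.

# Print the HTTP status code returned by a URL. The 1-second limit
# mirrors "timeout": 1 in the active healthcheck config.
probe_status() {
  curl -s -o /dev/null --max-time 1 -w '%{http_code}' "$1"
}

# Return 0 if the first argument appears among the remaining arguments,
# 1 otherwise.
is_listed_status() {
  code="$1"
  shift
  for s in "$@"; do
    if [ "$s" = "$code" ]; then
      return 0
    fi
  done
  return 1
}

# Usage (TARGET is an assumed host:port of one upstream node):
#   TARGET="10.0.0.5:8080"
#   code=$(probe_status "http://$TARGET/health")
#   if is_listed_status "$code" 200 302 401; then
#     echo "$code is in the healthy list"
#   else
#     echo "$code is not in the healthy list"
#   fi
```

Running the probe directly against the node (not through the proxy) shows what the active check would see on `/health`.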

{
  "created_at": 1521219518968,
  "hash_on": "none",
  "id": "58320191-173b-49c8-b0c2-19f15d8c845e",
  "healthchecks": {
    "active": {
      "unhealthy": {
        "http_statuses": {},
        "tcp_failures": 3,
        "timeouts": 3,
        "http_failures": 3,
        "interval": 5
      },
      "http_path": "/health",
      "healthy": {
        "http_statuses": [
          200,
          302,
          401
        ],
        "interval": 5,
        "successes": 1
      },
      "timeout": 1,
      "concurrency": 10
    },
    "passive": {
      "unhealthy": {
        "http_failures": 0,
        "http_statuses": [
          429,
          500,
          503
        ],
        "tcp_failures": 0,
        "timeouts": 0
      },
      "healthy": {
        "http_statuses": [
          200,
          201,
          202,
          203,
          204,
          205,
          206,
          207,
          208,
          226,
          300,
          301,
          302,
          303,
          304,
          305,
          306,
          307,
          308
        ],
        "successes": 0
      }
    }
  },
  "name": "teststream1",
  "hash_fallback": "none",
  "slots": 100
}

At first glance, the configuration looks correct. Is your service using HTTP or HTTPS? As of lua-resty-healthcheck 0.4.0, active health checks can only probe the health endpoint over HTTP.

HTTP. I ended up stripping the healthcheck right back and using v13, and it worked in the end. I’m not sure whether it was the version change or the modification to the healthcheck parameters, but I couldn’t see anything wrong with the initial config.
If I have time, I’ll try to replicate the new config (below) on the old v12 to isolate the cause of the issue. But at least it works for now :)

curl -X PATCH http://localhost:8001/upstreams/$STREAMNAME --data "healthchecks.active.healthy.interval=5"
curl -X PATCH http://localhost:8001/upstreams/$STREAMNAME --data "healthchecks.active.healthy.successes=3"
curl -X PATCH http://localhost:8001/upstreams/$STREAMNAME --data "healthchecks.active.unhealthy.interval=5"
curl -X PATCH http://localhost:8001/upstreams/$STREAMNAME --data "healthchecks.active.unhealthy.tcp_failures=3"
 
curl -X PATCH \
http://localhost:8001/upstreams/$STREAMNAME \
-H 'content-type: application/json' \
-d '{"healthchecks.active.healthy.http_statuses":[200, 302, 401]}'
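
The form-encoded PATCH calls above can also be wrapped in a small helper, sketched here under assumptions: `patch_upstream` and `ADMIN_URL` are names invented for this example, and the helper does nothing beyond issuing one `curl -X PATCH ... --data key=value` per argument, exactly as in the commands above.

```shell
#!/bin/sh
# Admin API base URL; defaults to the address used in the commands above.
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"

# Hypothetical helper: send one form-encoded PATCH per key=value pair.
# Stops at the first curl invocation that fails.
patch_upstream() {
  name="$1"
  shift
  for kv in "$@"; do
    curl -s -X PATCH "$ADMIN_URL/upstreams/$name" --data "$kv" || return 1
  done
}

# Usage:
#   patch_upstream "$STREAMNAME" \
#     "healthchecks.active.healthy.interval=5" \
#     "healthchecks.active.healthy.successes=3" \
#     "healthchecks.active.unhealthy.interval=5" \
#     "healthchecks.active.unhealthy.tcp_failures=3"
#
# Read the upstream back to confirm the stored values:
#   curl -s "$ADMIN_URL/upstreams/$STREAMNAME"
```

The `http_statuses` array is left out of the helper on purpose, since it was sent as a JSON body in the last command and not as form data.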