Kong Gateway plugin for routing OpenAI-compatible requests to local Ollama nodes

Hi Kong community,

We’ve published a Kong Gateway plugin for routing OpenAI-compatible chat completion traffic to one or more ollama-agent-router runtime nodes.

The plugin is available on LuaRocks: kong-plugin-ollama-agent-router

The README covers the setup flow, configuration examples, and security details: kong-ollama-agent-router

What it provides:

  • routes OpenAI-compatible /v1/chat/completions traffic through Kong to one or more ollama-agent-router nodes

  • keeps routing decisions, model selection, runtime state, node capabilities, and async job handling behind the Kong Gateway layer

  • supports static node-router discovery with weighted nodes

  • can prefer already loaded models, respect node weights, and fail over on execution errors

  • fetches model specs, runtime capabilities, GPU/VRAM state, queues, and job policies from each node-router instead of duplicating them in Kong config

  • supports secured runtime-agent communication using bearer tokens, custom headers, per-node credentials, TLS verification, and optional client certificates

  • preserves an OpenAI-compatible gateway surface for clients while delegating local machine state and execution details to the runtime nodes
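
To make the node selection behaviour described above a bit more concrete, here is a rough Python sketch of the idea: prefer nodes that already have the requested model loaded, order the rest by weight, and fail over when execution errors out. This is only an illustration and not the plugin's actual Lua code. The names `Node`, `order_candidates`, and `route` are hypothetical, and the sketch assumes every node has a positive weight and that a failed execution surfaces as a `RuntimeError`.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical view of a runtime node, for illustration only.
    name: str
    weight: int  # assumed to be a positive integer
    loaded_models: set = field(default_factory=set)


def _weighted_shuffle(nodes, rng):
    """Order nodes randomly, with higher weights more likely to come first."""
    pool = list(nodes)
    ordered = []
    while pool:
        weights = [n.weight for n in pool]
        index = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        ordered.append(pool.pop(index))
    return ordered


def order_candidates(nodes, model, rng=random):
    """Nodes with the model already loaded are tried before the others."""
    warm = [n for n in nodes if model in n.loaded_models]
    cold = [n for n in nodes if model not in n.loaded_models]
    return _weighted_shuffle(warm, rng) + _weighted_shuffle(cold, rng)


def route(nodes, model, execute, rng=random):
    """Try candidates in order and fail over to the next node on error."""
    last_error = None
    for node in order_candidates(nodes, model, rng):
        try:
            return execute(node, model)
        except RuntimeError as err:
            last_error = err  # remember the failure and try the next node
    raise RuntimeError(f"all nodes failed for model {model!r}") from last_error
```

In this sketch, `execute` stands in for whatever call actually forwards the chat completion to a node. A real deployment would also take queue depth and GPU/VRAM state into account, which the plugin fetches from each node-router.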

It’s free to use and MIT licensed. Enjoy, and all feedback is welcome.