Skip to content

KAITOΒΆ

KAITO is a Kubernetes operator that supports deploying and serving LLMs with vLLM. It offers managing large models via container images with built-in OpenAI-compatible inference, auto-provisioning GPU nodes and curated model presets.

Please refer to quick start for more details.