High availability for Open WebUI.

This guide is for Site Reliability Engineers (SREs) who want to deploy Open WebUI in a high-availability architecture that goes beyond a single container. It outlines the components and configuration needed for a distributed system that can handle significant user traffic and keep downtime to a minimum. The focus is on keeping the application containers stateless while relying on external services for persistence and state management.

Key points

  • Open WebUI is a chat interface for local and hosted LLMs with a built-in inference engine for RAG.
  • Quickstart options for Open WebUI include Python and Docker for single-container deployments.
  • High Availability deployment requires multiple stateless WebUI containers (Kubernetes Pods, Swarm services, ECS tasks).
  • A highly-available Redis tier (stand-alone, Cluster, or Sentinel), such as Amazon’s ElastiCache, is necessary for app state and WebSocket fan-out.
  • The database needs either an external SQL server (PostgreSQL preferred, with AWS Aurora as an example) or a PVC that holds the default SQLite file.
  • A load balancer that understands long-lived WebSocket upgrades, like an AWS Application Load Balancer (ALB), is required.
  • Everything is stateless except JWT/session cookies, app state & WebSocket fan-out (persisted in Redis), the database (Postgres), and persistent storage for uploaded files (AWS EFS/EBS or S3).
  • Mandatory environment variables for high availability include WEBUI_SECRET_KEY, REDIS_URL, ENABLE_WEBSOCKET_SUPPORT, WEBSOCKET_MANAGER, WEBSOCKET_REDIS_URL, DATABASE_URL, UVICORN_WORKERS, THREAD_POOL_SIZE, and WEBUI_URL.
  • Load balancer configuration should include an HTTP idle timeout of ≥ 65 s and health checks to /healthz.
  • Scaling considerations include addressing the in-container ChromaDB bottleneck with an external VECTOR_DB, choosing a RAG Content Extraction Engine, and configuring the RAG_EMBEDDING_ENGINE and TASK_MODEL settings for better performance.
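The mandatory environment variables listed above can be checked before a container is rolled out. The following is a minimal sketch, not part of Open WebUI: the variable names come from the list above, while the function name check_ha_env and the rule that URL-style variables must contain "://" are assumptions made for illustration.

```python
# Hypothetical pre-deployment check; the variable names are taken from the
# key points above, everything else is an illustrative assumption.
REQUIRED_VARS = [
    "WEBUI_SECRET_KEY",
    "REDIS_URL",
    "ENABLE_WEBSOCKET_SUPPORT",
    "WEBSOCKET_MANAGER",
    "WEBSOCKET_REDIS_URL",
    "DATABASE_URL",
    "UVICORN_WORKERS",
    "THREAD_POOL_SIZE",
    "WEBUI_URL",
]

# Assumption: these variables are expected to hold a URL with a scheme.
URL_VARS = ["REDIS_URL", "WEBSOCKET_REDIS_URL", "DATABASE_URL", "WEBUI_URL"]


def check_ha_env(env):
    """Return a list of human-readable problems found in the mapping `env`."""
    problems = []
    # Every mandatory variable must be present and non-empty.
    for name in REQUIRED_VARS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is not set")
    # URL-style variables that are set must at least contain a scheme separator.
    for name in URL_VARS:
        value = env.get(name, "").strip()
        if value and "://" not in value:
            problems.append(f"{name} does not look like a URL")
    return problems
```

Calling check_ha_env(os.environ) in a container entrypoint or CI step returns an empty list when nothing is missing. The check only confirms that values are present and roughly well-formed; it does not confirm that Redis or the database is reachable.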

Takeaway

So, you’ve decided to ditch the cozy single container life for the thrilling world of high availability. Good for you! Just remember, with great uptime comes great responsibility… and a whole lot more configuration. But hey, at least you won’t be getting paged at 3 AM because someone accidentally deleted a pod. Probably. Now go forth and make your Open WebUI deployment as resilient as your coffee tolerance!
