Chris's LLM Gateway
Self-hosted, OpenAI-compatible API for chat, embeddings, and audio transcription. Runs on a homelab RTX 5080 and an NVIDIA DGX Spark. Access is by API key — no accounts, no web signup.
TL;DR
Base URL: https://api.chris.hellotopia.io/v1
Auth: Authorization: Bearer <your-api-key>
Protocol: OpenAI-compatible — any OpenAI SDK works unchanged
Drop-in replacement for https://api.openai.com/v1 in any client that supports configurable base URLs.
What's here
- Chat completions — from a 3B ultra-fast model up to 80B MoE. See Models.
- Embeddings —
nomic-embed-text, 768-dim vectors. - Audio transcription — Whisper Large v3, CUDA-accelerated.
- OpenAI SDK compatible —
openai-python,openai-node, LangChain, LlamaIndex, Continue.dev, Cursor, Zed, etc.
What's not here
- No fine-tuning endpoints.
- No image generation.
- No function-calling/tool-use guarantees beyond what the underlying Ollama model supports.
- No SLA. This is a homelab. It may go down when Chris is traveling.
Start
- Get an API key from Chris (see Onboarding).
- Follow Getting Started for your first request.
- Browse Models and Examples.
- For the full endpoint reference, see the API Reference.