Chris's LLM Gateway

Self-hosted, OpenAI-compatible API for chat, embeddings, and audio transcription. Runs on a homelab RTX 5080 and an NVIDIA DGX Spark. Access is by API key — no accounts, no web signup.

TL;DR

Base URL:   https://api.chris.hellotopia.io/v1
Auth:       Authorization: Bearer <your-api-key>
Protocol:   OpenAI-compatible — any OpenAI SDK works unchanged

Drop-in replacement for https://api.openai.com/v1 in any client that supports configurable base URLs.

What's here

Chat completions — from a 3B ultra-fast model up to 80B MoE. See Models.
Embeddings — nomic-embed-text, 768-dim vectors.
Audio transcription — Whisper Large v3, CUDA-accelerated.
OpenAI SDK compatible — openai-python, openai-node, LangChain, LlamaIndex, Continue.dev, Cursor, Zed, etc.

What's not here

No fine-tuning endpoints.
No image generation.
No function-calling/tool-use guarantees beyond what the underlying Ollama model supports.
No SLA. This is a homelab. It may go down when Chris is traveling.

Start

Get an API key from Chris (see Onboarding).
Follow Getting Started for your first request.
Browse Models and Examples.
For the full endpoint reference, see the API Reference.