IDE Integration

Every IDE / plugin listed here speaks the OpenAI-compatible API; point each one at this gateway's base URL (https://api.chris.hellotopia.io/v1) and a valid API key.
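Before configuring any editor, you can sanity-check the gateway from a shell. A quick smoke test, assuming the standard OpenAI-style /v1/models route (substitute your real key for the sk-... placeholder):

```shell
# List the models the gateway exposes (OpenAI-compatible route)
curl -s https://api.chris.hellotopia.io/v1/models \
  -H "Authorization: Bearer sk-..."
```

If you get a JSON model list back, every client below should work with the same base URL and key.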

Continue.dev (VS Code, JetBrains)

Edit your ~/.continue/config.json:

{
  "models": [
    {
      "title": "Qwen Coder 14B (5080)",
      "provider": "openai",
      "model": "coder/qwen2.5-coder:14b",
      "apiBase": "https://api.chris.hellotopia.io/v1",
      "apiKey": "sk-..."
    },
    {
      "title": "Llama 3.3 70B (Spark)",
      "provider": "openai",
      "model": "llama3.3:70b",
      "apiBase": "https://api.chris.hellotopia.io/v1",
      "apiKey": "sk-..."
    }
  ],
  "tabAutocompleteModel": {
    "title": "Qwen Coder 7B (5080)",
    "provider": "openai",
    "model": "coder/qwen2.5-coder:7b",
    "apiBase": "https://api.chris.hellotopia.io/v1",
    "apiKey": "sk-..."
  },
  "embeddingsProvider": {
    "provider": "openai",
    "model": "embed/nomic-embed-text",
    "apiBase": "https://api.chris.hellotopia.io/v1",
    "apiKey": "sk-..."
  }
}

Cursor

Settings → Models → Add custom model.

  • Model name: coder/qwen2.5-coder:14b
  • Base URL: https://api.chris.hellotopia.io/v1
  • API Key: sk-...

Toggle off the OpenAI-hosted models if you want to force-route everything here.

Zed

Add to Zed's settings.json:

{
  "language_models": {
    "openai": {
      "api_url": "https://api.chris.hellotopia.io/v1",
      "available_models": [
        { "name": "llama3.3:70b", "max_tokens": 16384 },
        { "name": "coder/qwen2.5-coder:14b", "max_tokens": 16384 }
      ]
    }
  }
}

Set OPENAI_API_KEY in Zed's secret store.

Aider

export OPENAI_API_KEY=sk-...
export OPENAI_API_BASE=https://api.chris.hellotopia.io/v1
aider --model openai/llama3.3:70b

Open WebUI (the team chat UI)

If you're using Chris's hosted chat UI at llm.chrisremote.performance-software.com, it's already wired through this gateway — no config needed.

Choosing the right model for IDE work

  • Tab completion / inline suggestions: coder/qwen2.5-coder:7b — latency matters more than quality.
  • Chat / "explain this code": coder/qwen2.5-coder:14b or llama3.3:70b — quality matters more than latency.
  • Large-context refactor / multi-file edit: llama3.3:70b on the Spark.
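If you script your editor or tooling setup, the routing rules above can live in one place. A small sketch; pick_model is a hypothetical helper, not something the gateway provides:

```shell
# Hypothetical helper: map an IDE task type to the model names above
pick_model() {
  case "$1" in
    autocomplete) echo "coder/qwen2.5-coder:7b"  ;;  # latency-sensitive
    chat)         echo "coder/qwen2.5-coder:14b" ;;  # quality over latency
    refactor)     echo "llama3.3:70b"            ;;  # large context, on the Spark
    *)            echo "coder/qwen2.5-coder:14b" ;;  # reasonable default
  esac
}

pick_model autocomplete   # prints coder/qwen2.5-coder:7b
```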