POST /v1/chat/completions
Generate a chat response from a model.
Request
```
POST https://api.chris.hellotopia.io/v1/chat/completions
Authorization: Bearer <api_key>
Content-Type: application/json
```
Required
| Field | Type | Notes |
|---|---|---|
| model | string | Model ID from Models. |
| messages | array | `{"role": "system\|user\|assistant", "content": "..."}` |
Common options
| Field | Type | Default | Notes |
|---|---|---|---|
| stream | bool | false | SSE streaming. |
| temperature | number | 0.7 | 0–2. |
| max_tokens | int | model default | Cap on output tokens. |
| top_p | number | 1.0 | Nucleus sampling. |
| stop | string \| array | — | Stop sequences. |
| n | int | 1 | Number of choices. |
| seed | int | — | Reproducible sampling (best-effort). |
Example — non-streaming
```shell
curl https://api.chris.hellotopia.io/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1:8b",
    "messages": [
      {"role": "system", "content": "You are terse."},
      {"role": "user", "content": "What is 2+2?"}
    ],
    "max_tokens": 20
  }'
```
Example — streaming
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.chris.hellotopia.io/v1",
    api_key="sk-...",
)

stream = client.chat.completions.create(
    model="llama3.3:70b",
    messages=[{"role": "user", "content": "Write a haiku about compilers."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
```
Example — vision (image input)
```python
import base64
import pathlib

from openai import OpenAI

img = base64.b64encode(pathlib.Path("photo.jpg").read_bytes()).decode()

client = OpenAI(base_url="https://api.chris.hellotopia.io/v1", api_key="sk-...")
resp = client.chat.completions.create(
    model="5080/llama3.2-vision:11b",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is in this image?"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{img}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```
Response
Standard OpenAI shape:
```json
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1713456789,
  "model": "llama3.1:8b",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "4"},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 23, "completion_tokens": 1, "total_tokens": 24}
}
```
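Once parsed, the fields above can be read straight off the JSON body; a minimal sketch using the example response (the `id` is a placeholder, as above):

```python
import json

# The example response body from this section, parsed as plain JSON.
body = json.loads('''{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1713456789,
  "model": "llama3.1:8b",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "4"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 23, "completion_tokens": 1, "total_tokens": 24}
}''')

# The generated text lives under choices[i].message.content.
answer = body["choices"][0]["message"]["content"]
usage = body["usage"]
```

With `n > 1`, each extra choice appears as another entry in `choices`, indexed by `index`.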
Tool / function calling
Supported only for models whose underlying Ollama build exposes tool calling (llama3.1+, qwen3+, llama3.3). The payload format matches OpenAI's `tools` / `tool_choice` fields. A given model is not guaranteed to honor `tool_choice: "required"`; test before relying on it.
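As a sketch of the payload shape (the `get_weather` function, its parameters, and the prompt are illustrative, not part of this API):

```python
# One tool definition in OpenAI's `tools` format: a JSON-Schema description
# of the function's parameters, wrapped in {"type": "function", ...}.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}]

# Sent alongside the usual fields; tool_choice="auto" lets the model decide
# whether to call the tool or answer directly.
payload = {
    "model": "llama3.1:8b",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": tools,
    "tool_choice": "auto",
}
```

When the model does call the tool, the choice's `message.tool_calls` carries the function name and JSON-encoded arguments, and `finish_reason` is `"tool_calls"`.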
Errors
| HTTP | Meaning |
|---|---|
| 401 | Missing/invalid API key. |
| 404 | Unknown model ID. |
| 408/504 | Upstream Ollama didn't respond within 120s — usually a cold load on a busy model. Retry. |
| 429 | Rate-limited (budget/limit on your key, if configured). |
| 5xx | Gateway or backend error. Check status with Chris. |
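A minimal retry sketch for the transient statuses above, using only the standard library (the helper names and backoff values are illustrative; only 408/504, 429, and 5xx are retried):

```python
import json
import time
import urllib.error
import urllib.request

RETRYABLE = {408, 429, 504}

def should_retry(status: int) -> bool:
    # Retry explicitly transient statuses; >= 500 covers gateway/backend errors.
    return status in RETRYABLE or status >= 500

def backoff(attempt: int) -> float:
    # Exponential backoff: 1s, 2s, 4s, ... capped at 30s.
    return min(2 ** attempt, 30)

def chat(payload: dict, api_key: str, retries: int = 3) -> dict:
    url = "https://api.chris.hellotopia.io/v1/chat/completions"
    body = json.dumps(payload).encode()
    for attempt in range(retries):
        req = urllib.request.Request(
            url,
            data=body,
            headers={"Authorization": f"Bearer {api_key}",
                     "Content-Type": "application/json"},
        )
        try:
            # Timeout slightly above the gateway's 120s upstream limit.
            with urllib.request.urlopen(req, timeout=130) as resp:
                return json.load(resp)
        except urllib.error.HTTPError as e:
            if should_retry(e.code) and attempt < retries - 1:
                time.sleep(backoff(attempt))
                continue
            raise
```

401 and 404 are not retried: a bad key or unknown model ID will not fix itself.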