Chat completions

Generate text from a conversation. Supports streaming, system prompts, and multi-turn conversations.

POST /v1/chat/completions

Request body

Parameter    Type     Required  Description
model        string   Yes       Model to use, for example qwen or gemma.
messages     array    Yes       Array of message objects, each with a role and content.
stream       boolean  No        Stream partial responses as server-sent events. Default: false.
temperature  number   No        Sampling temperature, from 0 to 2. Lower values give more deterministic output. Default: 1.
max_tokens   integer  No        Maximum number of tokens to generate. If not set, the model decides.
top_p        number   No        Nucleus sampling threshold, from 0 to 1. Default: 1.
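
To illustrate how these parameters fit together, here is a minimal sketch that assembles a request body and checks the documented ranges before anything is sent. The helper name build_chat_request is hypothetical and not part of any SDK. The range checks simply mirror the table; whether the server itself rejects out-of-range values is an assumption.

```python
import json


def build_chat_request(model, messages, stream=False, temperature=1.0,
                       max_tokens=None, top_p=1.0):
    """Assemble a JSON-serializable body for POST /v1/chat/completions.

    Hypothetical helper for illustration; the checks mirror the documented ranges.
    """
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")

    body = {
        "model": model,
        "messages": messages,
        "stream": stream,
        "temperature": temperature,
        "top_p": top_p,
    }
    # Leave max_tokens out entirely when unset so the model decides the length.
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body


body = build_chat_request(
    model="qwen",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.2,
    max_tokens=64,
)
print(json.dumps(body, indent=2))
```

Validating on the client side gives a clear error before a network round trip; it is a convenience, not a substitute for handling errors returned by the API.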

Message roles

Each message in the array has a role that determines how it’s treated.

Role       Description
system     Sets context for the conversation. Place it first in the messages array.
user       A message from the user. The model generates a response to it.
assistant  A previous model response. Include these for multi-turn conversations.
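
A multi-turn conversation is built by sending the earlier turns back with each request. The sketch below shows one way to keep that history in order; add_turn is a hypothetical helper, and rejecting a late system message is a choice made here for illustration, not a documented API rule.

```python
def add_turn(messages, role, content):
    """Return a new message list with one more turn appended."""
    if role not in ("system", "user", "assistant"):
        raise ValueError(f"unknown role: {role}")
    if role == "system" and messages:
        raise ValueError("the system message should come first")
    return messages + [{"role": role, "content": content}]


history = []
history = add_turn(history, "system", "You are a helpful assistant.")
history = add_turn(history, "user", "What is the capital of the Netherlands?")
# Feed the model's earlier reply back in as an assistant message...
history = add_turn(history, "assistant", "The capital of the Netherlands is Amsterdam.")
# ...so that the follow-up question can refer to it.
history = add_turn(history, "user", "How many people live there?")

print([m["role"] for m in history])
# → ['system', 'user', 'assistant', 'user']
```

The resulting list can be passed as the messages parameter of the next request.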

Example request

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.appelon.ai/v1",
    api_key=os.environ["APPELON_API_KEY"]
)

response = client.chat.completions.create(
    model="qwen",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of the Netherlands?"}
    ]
)

print(response.choices[0].message.content)
# → "The capital of the Netherlands is Amsterdam."

Response

Returns a completion object with the generated message.

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "The capital of the Netherlands is Amsterdam."
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 12,
    "total_tokens": 40
  }
}
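
As a rough sketch of how the fields above might be read, the snippet below parses the sample response with the standard library and pulls out the reply text, the finish reason, and the token counts. Only the finish_reason value "stop" appears in this reference; treating any other value as a possibly incomplete reply is an assumption.

```python
import json

# The sample response from above, as it would arrive over the wire.
raw = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "Qwen/Qwen3.6-35B-A3B-FP8",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant",
                "content": "The capital of the Netherlands is Amsterdam."},
    "finish_reason": "stop"
  }],
  "usage": {"prompt_tokens": 28, "completion_tokens": 12, "total_tokens": 40}
}
"""

completion = json.loads(raw)
choice = completion["choices"][0]

text = choice["message"]["content"]
# Assumption: anything other than "stop" may mean the reply was cut short.
finished_cleanly = choice["finish_reason"] == "stop"

usage = completion["usage"]
tokens_used = usage["prompt_tokens"] + usage["completion_tokens"]

print(text)
print(finished_cleanly, tokens_used)
```

With the Python client from the example request, the same values are available as attributes, for example response.choices[0].finish_reason and response.usage.total_tokens.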

Streaming

Set stream to true (stream=True in the Python client) to receive tokens as they are generated.

stream = client.chat.completions.create(
    model="qwen",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

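for chunk in stream:
    # Some chunks carry no text (for example the final one), so skip them
    # rather than printing "None".
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)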
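
When the full text is needed after streaming finishes, the deltas can be collected and joined. The sketch below is a minimal, offline illustration: collect_text and fake_chunk are hypothetical names, and the stand-in chunks only imitate the shape used in the loop above (choices[0].delta.content). That some chunks carry no content or no choices is an assumption based on common OpenAI-compatible behavior.

```python
from types import SimpleNamespace


def collect_text(stream):
    """Join the text deltas from an iterable of streamed chunks."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # assumed: some chunks may carry no choices at all
        delta = chunk.choices[0].delta
        if delta.content:
            parts.append(delta.content)
    return "".join(parts)


def fake_chunk(content):
    # Stand-in for a streamed chunk, shaped like the objects in the loop above.
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


fake_stream = [fake_chunk("Once "), fake_chunk("upon a time"), fake_chunk(None)]
print(collect_text(fake_stream))
# → Once upon a time
```

In real use, the stream object returned by client.chat.completions.create(..., stream=True) would be passed to collect_text in place of fake_stream.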