Responses API

Generate responses using language models. Compatible with the OpenAI Responses API.

Base URL

https://api.getkawai.com/v1

Authentication

When authentication is enabled, include your token in the Authorization header:

Authorization: Bearer API_KEY
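
As a sketch, the header can be attached like this in Python using only the standard library. The token and helper name are placeholders, not part of the API:

```python
import json
import urllib.request

API_KEY = "sk-example"  # placeholder; use your real token
BASE_URL = "https://api.getkawai.com/v1"

def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Construct an authenticated JSON POST request for the API."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```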

Responses

Create responses with language models using the Responses API format.

POST /responses

Create a response. Supports streaming responses with Server-Sent Events.

Authentication: Required when auth is enabled. Token must have 'responses' endpoint access.

Headers

| Header | Required | Description |
|---|---|---|
| Authorization | Yes | Bearer token for authentication |
| Content-Type | Yes | Must be application/json |

Request Body

Content-Type: application/json

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | ID of the model to use |
| input | array | Yes | Array of input messages (same format as chat messages) |
| stream | boolean | No | Stream the response as Server-Sent Events (default: false) |
| instructions | string | No | System instructions for the model |
| tools | array | No | List of tools the model can use |
| tool_choice | string | No | How the model should use tools: auto, none, or required |
| parallel_tool_calls | boolean | No | Allow parallel tool calls (default: true) |
| store | boolean | No | Whether to store the response (default: true) |
| truncation | string | No | Truncation strategy: auto or disabled (default: disabled) |
| temperature | float32 | No | Controls randomness of output (default: 0.8) |
| top_k | int32 | No | Limits the token pool to the K most probable tokens (default: 40) |
| top_p | float32 | No | Nucleus sampling threshold (default: 0.9) |
| min_p | float32 | No | Dynamic sampling threshold (default: 0.0) |
| max_tokens | int32 | No | Maximum output tokens (default: context window) |
| repeat_penalty | float32 | No | Penalty for repeated tokens (default: 1.1) |
| repeat_last_n | int32 | No | Recent tokens to consider for the repetition penalty (default: 64) |
| dry_multiplier | float32 | No | DRY sampler multiplier for n-gram repetition penalty (default: 0.0, disabled) |
| dry_base | float32 | No | Base for exponential penalty growth in DRY (default: 1.75) |
| dry_allowed_length | int32 | No | Minimum n-gram length before DRY applies (default: 2) |
| dry_penalty_last_n | int32 | No | Recent tokens DRY considers; 0 = full context (default: 0) |
| xtc_probability | float32 | No | XTC probability for extreme token culling (default: 0.0, disabled) |
| xtc_threshold | float32 | No | Probability threshold for XTC culling (default: 0.1) |
| xtc_min_keep | uint32 | No | Minimum tokens to keep after XTC culling (default: 1) |
| enable_thinking | boolean | No | Enable model thinking for non-GPT models (default: true) |
| reasoning_effort | string | No | Reasoning level for GPT models: none, minimal, low, medium, high (default: medium) |
| return_prompt | boolean | No | Include the prompt in the response (default: false) |
| include_usage | boolean | No | Include token usage information in streaming responses (default: true) |
| logprobs | boolean | No | Return log probabilities of output tokens (default: false) |
| top_logprobs | int32 | No | Number of most likely tokens to return at each position, 0-5 (default: 0) |
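
To illustrate how these fields combine, here is a minimal Python sketch that assembles a request body. Only model and input are required; every other field falls back to the server-side default listed above. The helper name is hypothetical:

```python
import json

def build_response_request(model: str, input_messages: list, **options) -> str:
    """Assemble a Responses API request body as a JSON string.

    Optional keyword arguments (e.g. stream, temperature, top_p, tools)
    are passed through verbatim; omitted fields use server defaults.
    """
    payload = {"model": model, "input": input_messages}
    payload.update(options)
    return json.dumps(payload)

body = build_response_request(
    "qwen3-8b-q8_0",
    [{"role": "user", "content": "Hello"}],
    temperature=0.8,
    top_p=0.9,
)
```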

Response

Returns a response object, or streams Server-Sent Events if stream=true.

Content-Type: application/json or text/event-stream

Examples

Basic response:

curl -X POST https://api.getkawai.com/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "input": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'

Streaming response:

curl -X POST https://api.getkawai.com/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "input": [
      {"role": "user", "content": "Write a short poem about coding"}
    ],
    "stream": true
  }'

With tools:

curl -X POST https://api.getkawai.com/v1/responses \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-8b-q8_0",
    "input": [
      {"role": "user", "content": "What is the weather in London?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get the current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
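
The same tool definition can be expressed as a Python structure, which is easier to extend than inline JSON in a shell command. The names mirror the curl example above; treating auto as the tool_choice is an assumption, since the table does not state a default:

```python
# Tool definition mirroring the get_weather example above.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
            },
            "required": ["location"],
        },
    },
}

payload = {
    "model": "qwen3-8b-q8_0",
    "input": [{"role": "user", "content": "What is the weather in London?"}],
    "tools": [weather_tool],
    "tool_choice": "auto",  # assumed default; see the request body table
}
```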

Response Format

The Responses API returns a structured response object with output items.

Response Object

The response object contains metadata, output items, and usage information.

Examples

{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1234567890,
  "status": "completed",
  "model": "qwen3-8b-q8_0",
  "output": [
    {
      "type": "message",
      "id": "msg_xyz789",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Hello! I'm doing well, thank you for asking.",
          "annotations": []
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 15,
    "total_tokens": 27
  }
}
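
To consume a response like the one above, walk the output array and collect the output_text parts of each message item. A minimal sketch:

```python
def extract_output_text(response: dict) -> str:
    """Concatenate every output_text part from every message output item."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip function_call and other item types
        for content in item.get("content", []):
            if content.get("type") == "output_text":
                parts.append(content["text"])
    return "".join(parts)
```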

Streaming Events

When stream=true, the API returns Server-Sent Events with different event types.

Examples

event: response.created
data: {"type":"response.created","response":{...}}

event: response.in_progress
data: {"type":"response.in_progress","response":{...}}

event: response.output_item.added
data: {"type":"response.output_item.added","item":{...}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","delta":"Hello"}

event: response.output_text.done
data: {"type":"response.output_text.done","text":"Hello! How are you?"}

event: response.completed
data: {"type":"response.completed","response":{...}}
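
A client can reconstruct the full text by concatenating the delta payloads. This sketch parses raw text/event-stream lines like those above and keeps only the response.output_text.delta events:

```python
import json

def collect_stream_text(lines) -> str:
    """Accumulate output text from the data lines of an SSE stream."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip event: lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") == "response.output_text.delta":
            text.append(event["delta"])
    return "".join(text)
```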

Function Call Output

When the model calls a tool, the output contains a function_call item instead of a message.

Examples

{
  "output": [
    {
      "type": "function_call",
      "id": "fc_abc123",
      "call_id": "call_xyz789",
      "name": "get_weather",
      "arguments": "{\"location\":\"London\"}",
      "status": "completed"
    }
  ]
}
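
Note that arguments is a JSON-encoded string, not an object, so decode it before dispatching to your tool. A minimal sketch:

```python
import json

def decode_function_call(item: dict) -> tuple:
    """Return (name, parsed_arguments) from a function_call output item."""
    assert item["type"] == "function_call"
    return item["name"], json.loads(item["arguments"])

name, args = decode_function_call({
    "type": "function_call",
    "id": "fc_abc123",
    "call_id": "call_xyz789",
    "name": "get_weather",
    "arguments": "{\"location\":\"London\"}",
    "status": "completed",
})
```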