Tools API
Manage libraries, models, and the model catalog. These endpoints handle server administration tasks.
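Base URL
The request example under POST /models/pull on this page uses https://api.getkawai.com/v1 as the base URL.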
Authentication
When authentication is enabled, include your token in the Authorization header:
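A minimal sketch of attaching the header with curl. `KAWAI_TOKEN` and `auth_curl` are hypothetical names chosen for illustration; the `Bearer` scheme follows the header descriptions in the tables on this page.

```shell
# KAWAI_TOKEN is a hypothetical variable name; export your real token under it.
# auth_curl forwards all of its arguments to curl and adds the bearer header.
auth_curl() {
  curl -sS -H "Authorization: Bearer ${KAWAI_TOKEN:-}" "$@"
}
```

After `export KAWAI_TOKEN=...`, any request can be sent as `auth_curl <url>`.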
Libs
Manage the installation and updates of llama.cpp libraries.
GET /libs
Get information about installed llama.cpp libraries.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns version information including arch, os, processor, latest version, and current version.
Content-Type: application/json
Examples
Get library information:
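A possible curl request, wrapped in a shell function. The base URL is taken from the POST /models/pull example on this page; `KAWAI_TOKEN` and `get_libs` are hypothetical names.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_TOKEN is a hypothetical variable; the header is optional for this endpoint.
get_libs() {
  curl -sS "$BASE_URL/libs" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```

Running `get_libs` should print the JSON version information described above.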
POST /libs/pull
Download and install the latest llama.cpp libraries. Returns streaming progress updates.
Authentication: Required when auth is enabled. Admin token required.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for admin authentication |
Response
Streams download progress as Server-Sent Events.
Content-Type: text/event-stream
Examples
Pull latest libraries:
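One way to issue this request with curl, sketched as a shell function. `KAWAI_ADMIN_TOKEN` and `pull_libs` are hypothetical names; `-N` turns off curl's output buffering so that progress events print as they arrive.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_ADMIN_TOKEN is a hypothetical variable holding an admin token.
pull_libs() {
  curl -sS -N -X POST "$BASE_URL/libs/pull" \
    -H "Authorization: Bearer ${KAWAI_ADMIN_TOKEN:-}"
}
```

The function returns when the server closes the event stream.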
Models
Manage models: list, pull, show, and remove them from the server.
GET /models
List all available models on the server.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns a list of model objects with id, owned_by, model_family, size, and modified fields.
Content-Type: application/json
Examples
List all models:
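A hedged curl sketch for this listing. The base URL comes from the POST /models/pull example on this page; `KAWAI_TOKEN` and `list_models` are hypothetical names.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_TOKEN is a hypothetical variable; the header is optional for this endpoint.
list_models() {
  curl -sS "$BASE_URL/models" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```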
GET /models/{model}
Show detailed information about a specific model.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns model details including metadata, capabilities, and configuration.
Content-Type: application/json
Examples
Show model details:
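A sketch that takes the model ID as its first argument, so no particular ID is assumed. `show_model` and `KAWAI_TOKEN` are hypothetical names, and the ID is inserted into the path as given, without URL encoding.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# $1 is the model ID, for example one returned by GET /models.
show_model() {
  curl -sS "$BASE_URL/models/$1" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```

Call it as `show_model <model-id>`.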
GET /models/ps
List currently loaded/running models in the cache.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns a list of running models with id, owned_by, model_family, size, expires_at, and active_streams.
Content-Type: application/json
Examples
List running models:
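A possible curl call for checking what is loaded right now. `running_models` and `KAWAI_TOKEN` are hypothetical names; the base URL is the one used in the POST /models/pull example on this page.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_TOKEN is a hypothetical variable; the header is optional for this endpoint.
running_models() {
  curl -sS "$BASE_URL/models/ps" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```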
POST /models/index
Rebuild the model index for fast model access.
Authentication: Required when auth is enabled. Admin token required.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for admin authentication |
Response
Returns an empty response on success.
Content-Type: application/json
Examples
Rebuild model index:
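A minimal curl sketch, assuming an admin token is available in the hypothetical `KAWAI_ADMIN_TOKEN` variable. `rebuild_index` is likewise an illustrative name.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_ADMIN_TOKEN is a hypothetical variable holding an admin token.
rebuild_index() {
  curl -sS -X POST "$BASE_URL/models/index" \
    -H "Authorization: Bearer ${KAWAI_ADMIN_TOKEN:-}"
}
```

Since the response body is empty on success, adding curl's `-i` flag to print the status line can help confirm the result.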
POST /models/pull
Download (pull) a model from a URL. Returns streaming progress updates.
Authentication: Required when auth is enabled. Admin token required.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for admin authentication |
| `Content-Type` | Yes | Must be application/json |
Request Body
Content-Type: application/json
| Field | Type | Required | Description |
|---|---|---|---|
| `model_url` | `string` | Yes | URL to the model GGUF file |
| `proj_url` | `string` | No | URL to the projection file (for vision/audio models) |
Response
Streams download progress as Server-Sent Events.
Content-Type: text/event-stream
Examples
Pull a model from HuggingFace:
curl -X POST https://api.getkawai.com/v1/models/pull \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model_url": "https://huggingface.co/Qwen/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-Q8_0.gguf"
  }'
DELETE /models/{model}
Remove a model from the server.
Authentication: Required when auth is enabled. Admin token required.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | Yes | Bearer token for admin authentication |
Response
Returns an empty response on success.
Content-Type: application/json
Examples
Remove a model:
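A sketch of the delete request with the model ID passed as an argument. `remove_model` and `KAWAI_ADMIN_TOKEN` are hypothetical names; the ID is placed in the path as given, without URL encoding.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# $1 is the ID of the model to remove; KAWAI_ADMIN_TOKEN is a hypothetical variable.
remove_model() {
  curl -sS -X DELETE "$BASE_URL/models/$1" \
    -H "Authorization: Bearer ${KAWAI_ADMIN_TOKEN:-}"
}
```

Call it as `remove_model <model-id>`.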
Catalog
Browse and pull models from the curated model catalog.
GET /catalog
List all models available in the catalog.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns a list of catalog models with id, category, owned_by, model_family, and capabilities.
Content-Type: application/json
Examples
List catalog models:
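A possible curl request for browsing the catalog. `list_catalog` and `KAWAI_TOKEN` are hypothetical names, and the base URL is the one from the POST /models/pull example on this page.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# KAWAI_TOKEN is a hypothetical variable; the header is optional for this endpoint.
list_catalog() {
  curl -sS "$BASE_URL/catalog" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```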
GET /catalog/filter/{filter}
List catalog models filtered by category.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns a filtered list of catalog models.
Content-Type: application/json
Examples
Filter catalog by category:
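A sketch that takes the category as an argument, since the valid category names are not listed on this page. `filter_catalog` and `KAWAI_TOKEN` are hypothetical names.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# $1 is the category to filter by, such as a category value seen in GET /catalog output.
filter_catalog() {
  curl -sS "$BASE_URL/catalog/filter/$1" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```

Call it as `filter_catalog <category>`.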
GET /catalog/{model}
Show detailed information about a catalog model.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Returns full catalog model details including files, capabilities, and metadata.
Content-Type: application/json
Examples
Show catalog model details:
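A hedged curl sketch with the catalog model ID supplied by the caller. `show_catalog_model` and `KAWAI_TOKEN` are hypothetical names; the ID is used in the path as given.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# $1 is a catalog model ID, for example one returned by GET /catalog.
show_catalog_model() {
  curl -sS "$BASE_URL/catalog/$1" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```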
POST /catalog/pull/{model}
Pull a model from the catalog by ID. Returns streaming progress updates.
Authentication: Optional when auth is enabled.
Headers
| Header | Required | Description |
|---|---|---|
| `Authorization` | No | Bearer token for authentication |
Response
Streams download progress as Server-Sent Events.
Content-Type: text/event-stream
Examples
Pull a catalog model:
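One way to start a catalog pull and watch its progress with curl. `pull_catalog_model` and `KAWAI_TOKEN` are hypothetical names; `-N` disables curl's output buffering so the Server-Sent Events appear as they are sent.

```shell
# Base URL taken from the POST /models/pull example on this page; override as needed.
BASE_URL="${BASE_URL:-https://api.getkawai.com/v1}"

# $1 is the catalog model ID to pull; KAWAI_TOKEN is a hypothetical variable.
pull_catalog_model() {
  curl -sS -N -X POST "$BASE_URL/catalog/pull/$1" \
    -H "Authorization: Bearer ${KAWAI_TOKEN:-}"
}
```

Call it as `pull_catalog_model <model-id>`.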