API Reference
Complete reference for the LLMAI API — endpoint, authentication, request format, and response structure.
Endpoint
Every request goes to:
```
https://api.llmai.dev/v1
```

LLMAI implements the OpenAI Chat Completions interface, so any library or tooling built for that format works out of the box — just swap the base URL and API key.
Authentication
Attach your LLMAI key to every request using the Authorization header:
```
Authorization: Bearer YOUR_API_KEY
```

Keys are generated in the developer console. A missing or expired key returns 401 Unauthorized.
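As a minimal Python sketch, the two headers every request needs can be assembled like this (the helper name and placeholder key are illustrative, not part of the API):

```python
def auth_headers(api_key: str) -> dict:
    """Build the headers every LLMAI request needs.

    `api_key` is your key from the developer console; a missing or
    expired key makes the server answer 401 Unauthorized.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```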
Compatibility Summary
| Parameter | Value |
|---|---|
| Base URL | https://api.llmai.dev/v1 |
| Key format | sk-llmai-... |
| Chat endpoint | /v1/chat/completions |
| Model list endpoint | /v1/models |
| Wire protocol | OpenAI Chat Completions |
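Putting the table together, a complete request can be sketched with only the standard library. `build_chat_request` is a hypothetical helper, and the base URL and endpoint path come straight from the table above:

```python
import json
import urllib.request

BASE_URL = "https://api.llmai.dev/v1"  # from the compatibility table

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Assemble a POST to /v1/chat/completions with the required fields."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending it requires a valid key and network access:
# with urllib.request.urlopen(build_chat_request(key, "gpt-5.4", msgs)) as resp:
#     reply = json.load(resp)
```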
Sending a Chat Request
Required Fields
```json
{
"model": "gpt-5.4",
"messages": [
{ "role": "system", "content": "You are a precise technical writer." },
{ "role": "user", "content": "Describe API rate limits in two sentences." }
]
}
```

| Field | Type | Description |
|---|---|---|
| model | string | Exact model slug — see the Models page |
| messages | array | List of message objects, each with role and content |
Optional Fields
| Field | Type | Default | Notes |
|---|---|---|---|
| temperature | float | 1.0 | Controls output randomness; valid range 0–2 |
| max_tokens | int | model max | Hard cap on tokens generated in this response |
| stream | bool | false | Set true to receive incremental SSE chunks |
| top_p | float | 1.0 | Nucleus sampling — alternative to temperature |
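One way to assemble a body with optional fields is to include only what you explicitly set, so server defaults apply otherwise. `chat_payload` is an illustrative helper, not part of any SDK; the range check mirrors the temperature bounds in the table:

```python
def chat_payload(model, messages, *, temperature=None, max_tokens=None,
                 stream=None, top_p=None):
    """Build a request body, sending only the optional fields you set."""
    payload = {"model": model, "messages": messages}
    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be in [0, 2]")
        payload["temperature"] = temperature
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    if stream is not None:
        payload["stream"] = stream
    if top_p is not None:
        payload["top_p"] = top_p
    return payload
```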
Response Structure
```json
{
"id": "chatcmpl-xyz789",
"object": "chat.completion",
"created": 1714000000,
"model": "gpt-5.4",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Rate limits cap the number of requests per unit time..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 30,
"completion_tokens": 18,
"total_tokens": 48
}
}
```

The usage block reflects the exact token counts billed for this call.
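Reading the fields above from a decoded response is a couple of dictionary lookups; a sketch (the helper name is illustrative):

```python
def read_completion(response: dict):
    """Pull the assistant text and billed token total from a response."""
    content = response["choices"][0]["message"]["content"]
    total = response["usage"]["total_tokens"]
    return content, total
```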
Streaming
Pass "stream": true to receive a stream of server-sent events (SSE). The connection stays open and each chunk arrives as a data: line:
```bash
curl https://api.llmai.dev/v1/chat/completions \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-3-flash-preview",
"messages": [{"role": "user", "content": "List five programming languages."}],
"stream": true
}'
```

The stream terminates with a final data: [DONE] line. Handle it in your client to know when the response is complete.
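A minimal sketch of consuming such a stream, assuming you already have an iterable of decoded SSE lines (`iter_chunks` is a hypothetical helper, not part of any SDK):

```python
import json

def iter_chunks(lines):
    """Yield parsed JSON chunks from an SSE stream, stopping at [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return    # end-of-stream sentinel
        yield json.loads(payload)
```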
Error Reference
| Status | Meaning | Action |
|---|---|---|
| 400 | Malformed request body | Check JSON syntax and required fields |
| 401 | Invalid or missing API key | Verify the Authorization header |
| 402 | Insufficient balance | Add credit at console.llmai.dev/billing |
| 404 | Unknown model slug | Cross-check slug spelling in the Models reference |
| 429 | Request rate exceeded | Reduce request frequency or add retry backoff |
| 500 | Upstream error | Retry; open a support ticket if it persists |
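For 429 and 500, the table suggests retrying with backoff. One way to do that, sketched below; the helper names, jitter, and delay schedule are illustrative choices, not LLMAI requirements:

```python
import random
import time

RETRYABLE = {429, 500}  # per the error table: rate limit and upstream error

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with jitter: base * 2^attempt, capped."""
    return min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)

def call_with_retries(send, max_attempts: int = 5):
    """Call send() -> (status, body), retrying on retryable statuses."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE:
            return status, body
        time.sleep(backoff_delay(attempt))
    return status, body
```

Errors like 400, 401, and 404 are deliberately not retried: the same request will fail the same way until you fix it.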