LLMAI is an API gateway for AI inference. Point your existing code at our endpoint, pick a model, and start making calls — no extra accounts, no separate billing, no provider-specific SDKs.
Send a standard OpenAI chat-completion request to api.llmai.dev/v1. The model field determines which provider handles it, and your LLMAI API key is the only auth credential needed.
We authenticate with the upstream provider, forward the request, and stream the response back to your client. The wire format stays identical on both ends.
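A minimal sketch of that wire format — a standard OpenAI-style chat-completion body. The model name here is illustrative; use any model from the catalog:

```python
import json

# Standard OpenAI chat-completion request body. The only routing signal
# the gateway needs is the "model" field (name below is an example).
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Say hello."}],
}

# This is POSTed to https://api.llmai.dev/v1/chat/completions with your
# LLMAI key in the Authorization header; the response comes back in the
# same OpenAI-compatible shape your client already parses.
print(json.dumps(payload))
```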
Input and output tokens are tallied from the response, multiplied by the published per-token rates for that model, and the result is deducted from your prepaid credit balance. No surprises.
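The cost of a call is simple arithmetic over the token counts. The rates below are made-up placeholders, not LLMAI's actual pricing:

```python
# Hypothetical rates in USD per million tokens — for illustration only;
# check the console for each model's real published rate.
RATE_INPUT = 2.50
RATE_OUTPUT = 10.00

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one call: tokens times per-token rate, summed."""
    return (input_tokens * RATE_INPUT + output_tokens * RATE_OUTPUT) / 1_000_000

# 1200 input + 350 output tokens at the placeholder rates:
# 1200 * 2.50/1e6 + 350 * 10.00/1e6 = 0.003 + 0.0035 = 0.0065
print(call_cost(1200, 350))
```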
Your balance, per-call token breakdown, and cumulative usage are visible in the console the moment the request completes.
Experiment across multiple models without committing to a paid tier on each. Deposit $10, test everything, pay only for what you use.
Keep your codebase clean — one endpoint, one key, one line in your config. Swap the underlying model any time without touching application code.
Issue separate API keys per project or per environment. Track spending at the key level. No shared credentials, no blended costs.
Per-token billing with no monthly floor means you only pay for what actually runs. Route high-volume tasks to efficient models without a separate account to manage.
Every provider below is accessible from the same endpoint. The catalog expands as we validate new upstream access.
Create an account, generate a key, and point your first request at api.llmai.dev/v1. The integration is a one-line config change if you're already on an OpenAI SDK.