// about llmai

One Key. Every Model.

LLMAI is an API gateway for AI inference. Point your existing code at our endpoint, pick a model, and start making calls — no extra accounts, no separate billing, no provider-specific SDKs.

65% average savings vs. direct provider list rates
🎁 Free $2 trial credit — just ask
// the problem we solve

Why LLMAI Exists

Before
  • Separate account per provider
  • Four different API keys to rotate
  • Four billing dashboards to watch
  • Different SDK quirks for each model
  • Full list-price rates, no leverage
The Gap
  • You want OpenAI today
  • DeepSeek tomorrow
  • Gemini Flash for cost control
  • Without rewriting config every time
  • Without tracking four spending tabs
With LLMAI
  • One endpoint, all providers
  • One API key in your codebase
  • One prepaid balance, all models
  • OpenAI-compatible — zero SDK changes
  • Per-token pricing, no subscriptions
// under the hood

How Requests Flow

[1] your app sends a request

Your app sends a standard OpenAI chat-completion request to api.llmai.dev/v1. The model field determines which provider handles it. Your LLMAI API key is the only auth credential needed.
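The request body is ordinary OpenAI chat-completions JSON. A minimal sketch — the model name below is illustrative, not a statement of LLMAI's catalog:

```python
import json

# Minimal OpenAI-style chat-completions payload. The "model"
# field is what selects the upstream provider; "deepseek-v3"
# here is an illustrative name.
payload = {
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "Hello!"}],
}

# POST this body to https://api.llmai.dev/v1/chat/completions
# with the header:  Authorization: Bearer <your LLMAI key>
body = json.dumps(payload).encode("utf-8")
```

Because the format is unchanged, any client that already speaks the OpenAI wire protocol can produce this payload as-is.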

[2] llmai routes it upstream

We authenticate with the upstream provider, forward the request, and stream the response back to your client. The wire format stays identical on both ends.
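Conceptually, the routing step is a lookup from model name to upstream provider. LLMAI's actual routing logic is internal; the prefix table below is an assumption made purely for illustration:

```python
# Hypothetical prefix -> provider table. The real mapping is
# internal to LLMAI and may work differently.
ROUTES = {
    "gpt-": "openai",
    "o1": "openai",
    "gemini-": "google",
    "deepseek-": "deepseek",
    "glm-": "zai",
}

def route(model: str) -> str:
    """Pick the upstream provider for a given model name."""
    for prefix, provider in ROUTES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model: {model}")
```

The point of the sketch is that your client never sees this step: the same request shape goes in, and the same response shape streams back, whichever provider is chosen.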

[3] tokens are counted and billed

Input and output tokens are tallied from the response. The resulting cost, tokens multiplied by the published per-token rate for that model, is deducted from your prepaid credit balance. No surprises.
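The billing step is plain arithmetic on the token counts. The rates below are placeholders for illustration, not LLMAI's published prices:

```python
# Placeholder per-token rates in USD per 1M tokens -- real
# rates come from LLMAI's published catalog, not this sketch.
RATES = {
    "deepseek-v3": {"input": 0.27, "output": 1.10},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Amount deducted from the prepaid balance for one call."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A call with 1,200 input tokens and 350 output tokens:
cost = call_cost("deepseek-v3", 1200, 350)
```

Per-token billing with no monthly floor means this per-call figure is the whole cost model.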

[4] console reflects it instantly

Your balance, per-call token breakdown, and cumulative usage are visible in the console the moment the request completes.

// who uses llmai

Built for Builders

// use case 01

Solo developers & side projects

Experiment across multiple models without committing to a paid tier on each. Deposit $10, test everything, pay only for what you use.

// use case 02

Startups shipping AI features

Keep your codebase clean — one endpoint, one key, one line in your config. Swap the underlying model any time without touching application code.
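One way to make the model swappable without touching application code is to read it from configuration. The environment-variable name and default below are examples, not part of LLMAI's API:

```python
import os

# LLMAI_MODEL is a hypothetical env var name -- any config
# mechanism works, because only the "model" string changes
# between providers.
MODEL = os.environ.get("LLMAI_MODEL", "gpt-4o")

def build_request(prompt: str) -> dict:
    """Assemble a chat-completions payload for the configured model."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
```

Switching from one provider to another is then a deployment config change; the request-building code never changes.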

// use case 03

Teams managing multiple projects

Issue separate API keys per project or per environment. Track spending at the key level. No shared credentials, no blended costs.
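Because each project has its own key, spend attribution is a simple group-by. The record shape below is a hypothetical sketch of what a per-call console export might contain:

```python
from collections import defaultdict

# Hypothetical per-call usage records keyed by API key; the
# actual console export format is an assumption here.
usage = [
    {"api_key": "proj-web",   "cost_usd": 0.004},
    {"api_key": "proj-batch", "cost_usd": 0.120},
    {"api_key": "proj-web",   "cost_usd": 0.006},
]

# Aggregate spend per key -- no blended costs.
spend = defaultdict(float)
for record in usage:
    spend[record["api_key"]] += record["cost_usd"]
```

With one key per project or environment, this aggregation is all it takes to see which workload is spending what.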

// use case 04

Cost-conscious production workloads

Per-token billing with no monthly floor means you only pay for what actually runs. Route high-volume tasks to efficient models without a separate account to manage.

// current model catalog

Providers Available Today

Every provider below is accessible from the same endpoint. The catalog expands as we validate new upstream access.

OpenAI: GPT-4o, o1, o3
Google: Gemini 2.0, Gemini Flash
DeepSeek: V3, R1
Z.AI: GLM series

Start Making Calls

Create an account, generate a key, and point your first request at api.llmai.dev/v1. The integration is a one-line config change if you're already on an OpenAI SDK.
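Using only the standard library, the "one-line change" is the base URL; the key and model values below are placeholders:

```python
from urllib.request import Request

# Same wire format as OpenAI -- only the host changes. The
# key and model here are placeholders.
req = Request(
    "https://api.llmai.dev/v1/chat/completions",
    data=b'{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}',
    headers={
        "Authorization": "Bearer YOUR_LLMAI_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
# Send with urllib.request.urlopen(req) -- not executed here.
```

If you are on an OpenAI SDK instead, the equivalent change is pointing the client's base URL at api.llmai.dev/v1 and passing your LLMAI key.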

🎁 New here? Request a free $2 trial credit before your first deposit.