Quillium routes your prompts to OpenAI, Anthropic, or Google AI and shows you exactly what every word costs. No opaque credit systems. No surprise invoices.
POST /v1/generate
{
  "prompt": "Write a product description for...",
  "model": "auto",
  "max_tokens": 500
}

// Response includes a cost breakdown
{
  "content": "Introducing the...",
  "model_used": "claude-4-sonnet",
  "cost": 0.0023,
  "tokens": { "in": 42, "out": 487 }
}
Type what you need or hit the API. Stories, product copy, emails, code docs, anything.
Set model: "auto" and Quillium routes to the fastest, cheapest model that nails your task. Or choose manually.
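A toy version of what "auto" routing could look like: among models whose capability score meets the task's requirement, pick the cheapest. The model names (other than claude-4-sonnet), scores, and prices below are invented for illustration; this is a sketch of the idea, not Quillium's actual routing logic.

```python
MODELS = [
    # (name, capability score 0-10, hypothetical USD per 1K output tokens)
    ("fast-small", 5, 0.0004),
    ("claude-4-sonnet", 8, 0.0150),
    ("frontier-large", 10, 0.0600),
]

def route(required_capability: int) -> str:
    """Cheapest model that is capable enough for the task."""
    eligible = [m for m in MODELS if m[1] >= required_capability]
    return min(eligible, key=lambda m: m[2])[0]

print(route(7))  # -> claude-4-sonnet
```

The key property: an easy task never pays frontier-model prices, and a hard task never lands on a model that can't handle it.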
Every response shows exact token count, model used, and cost in dollars. Not credits. Not "words." Dollars.
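A dollar figure like the `cost` field in the example above is simple arithmetic: input and output token counts times the provider's per-token rates. The rates below are illustrative placeholders, not Quillium's or any provider's actual pricing.

```python
RATES_PER_MTOK = {
    # hypothetical (input, output) USD per million tokens
    "claude-4-sonnet": (3.00, 15.00),
}

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Dollar cost of one request at the model's per-token rates."""
    rate_in, rate_out = RATES_PER_MTOK[model]
    return round(tokens_in * rate_in / 1e6 + tokens_out * rate_out / 1e6, 6)

print(request_cost("claude-4-sonnet", 42, 487))
```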
OpenAI, Anthropic, and Google AI, all behind one endpoint. Switch models per request or let Quillium optimize automatically.
See cost-per-request in real time. Usage-based pricing via Stripe. Your invoice matches your usage, always.
REST API with SDKs for Node, Python, and Go. Build Quillium into your product, your workflow, your pipeline.
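Even without an SDK, the endpoint shown above is a plain HTTP POST. A minimal sketch using only Python's standard library; the base URL and Bearer-token header are assumptions, while the JSON fields mirror the request example.

```python
import json
import urllib.request

API_BASE = "https://api.quillium.example"  # hypothetical base URL

def build_request(prompt: str, model: str = "auto", max_tokens: int = 500,
                  api_key: str = "sk-...") -> urllib.request.Request:
    """Assemble the POST; send it with urllib.request.urlopen()."""
    body = json.dumps({"prompt": prompt, "model": model,
                       "max_tokens": max_tokens}).encode()
    return urllib.request.Request(
        f"{API_BASE}/v1/generate",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="POST",
    )

req = build_request("Write a product description for...")
print(req.full_url)  # https://api.quillium.example/v1/generate
```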
Identical prompts hit cache instead of the API. Same output, zero cost. Automatic, no config required.
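One way caching like this can work: key the cache on a hash of the canonicalized request body, so identical requests return the stored response without touching a provider. A sketch of the idea under that assumption, not Quillium's internals.

```python
import hashlib
import json

_cache: dict[str, dict] = {}

def cache_key(request: dict) -> str:
    # sort_keys canonicalizes, so key order in the request doesn't matter
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def generate_cached(request: dict, call_provider) -> dict:
    key = cache_key(request)
    if key not in _cache:            # miss: pay for one real provider call
        _cache[key] = call_provider(request)
    return _cache[key]               # hit: same output, zero cost
```

Keying on the full request (prompt, model, max_tokens) rather than the prompt alone matters: the same prompt sent to a different model is a different request and should not share a cache entry.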
You wouldn't pay for electricity without seeing the meter. Quillium treats AI the same way: transparent costs, predictable bills, and the freedom to use any model without vendor lock-in.