FinOps
FinOps turns your traces into financial and performance insight: what your AI costs, where the spend goes, how fast it responds, and how to spend less without sacrificing quality.
Note: FinOps relies on accurate cost data. Costs are computed from your generations' token usage and your model pricing definitions, so keep the pricing definitions current and make sure token usage is recorded for trustworthy numbers.
What's in FinOps
| Page | Answers |
|---|---|
| AI Usage | How much AI are we using, and what does it cost? |
| Cost & Latency Dashboards | Where do spend and latency go, by model and over time? |
| Cost & Performance Metrics | How does cost trade off against quality and speed? |
| AI Optimizer | Which cheaper/faster model could we switch to safely? |
| Models & Pricing | How is cost calculated, and how do I price custom models? |
How cost is calculated
For each LLM call (generation), cost = (input tokens × input price) + (output tokens × output price), using the price from the matching model definition. Per-trace cost is the sum across its generations; everything else (per model, user, agent, day) rolls up from there.
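The calculation above can be sketched in a few lines of Python. This is an illustration only, not the product's implementation: the `ModelPrice` and `Generation` types, the `rollup` helper, the model names, and the prices are all hypothetical, and prices are assumed to be per single token (if your pricing definitions use a different unit, such as per million tokens, scale accordingly).

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ModelPrice:
    # Assumption: prices are expressed per single token.
    input_price: Decimal
    output_price: Decimal


@dataclass(frozen=True)
class Generation:
    # Hypothetical record of one LLM call inside a trace.
    trace_id: str
    model: str
    input_tokens: int
    output_tokens: int


def generation_cost(gen, prices):
    """cost = (input tokens x input price) + (output tokens x output price)."""
    price = prices[gen.model]  # look up the matching model definition
    return gen.input_tokens * price.input_price + gen.output_tokens * price.output_price


def rollup(generations, prices, key):
    """Sum generation costs grouped by key(gen), e.g. per trace or per model."""
    totals = defaultdict(Decimal)  # Decimal() starts each group at 0
    for gen in generations:
        totals[key(gen)] += generation_cost(gen, prices)
    return dict(totals)


# Made-up prices and usage, purely for illustration.
prices = {
    "model-a": ModelPrice(Decimal("0.000003"), Decimal("0.000015")),
    "model-b": ModelPrice(Decimal("0.000001"), Decimal("0.000002")),
}
generations = [
    Generation("trace-1", "model-a", 1000, 200),
    Generation("trace-1", "model-b", 500, 100),
    Generation("trace-2", "model-a", 2000, 0),
]

per_trace = rollup(generations, prices, key=lambda g: g.trace_id)
per_model = rollup(generations, prices, key=lambda g: g.model)
```

The same `rollup` call works for any other dimension (user, agent, day) as long as the generation record carries that field. `Decimal` is used here to avoid floating-point drift when summing many small amounts.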
Related deep dives
- AI Cost — token monitoring, budgets, ROI analytics
- AI Resilient — performance, SLOs, alerting