AI Optimizer
The AI Optimizer answers the hardest FinOps question with data: can we move this agent to a cheaper (or faster) model without hurting quality? It runs your current model against alternatives, scores their outputs with LLM-as-a-Judge, and recommends a swap only when quality holds. Each recommendation includes the projected savings.
Open it from FinOps → AI Optimizer. Creating optimizer jobs requires the Owner or AI Steward role; AI Developers have read-only access.
How it works
The optimizer takes a real agent, runs the baseline model and each alternative, has the evaluator grade their outputs for quality, and compares cost and latency. An alternative is recommended only if its graded quality passes the quality bar.
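As a rough illustration of the comparison step, the sketch below averages graded results into one summary per model. The `Sample` and `ModelSummary` types, the 0 to 1 score scale, and the use of a simple mean are assumptions made for this example; they are not the optimizer's actual data model or scoring method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    """One graded response (hypothetical shape, for illustration only)."""
    model: str
    judge_score: float  # assumed 0.0-1.0 score from the LLM-as-a-Judge evaluator
    cost_usd: float     # cost of producing this response
    latency_ms: float   # time taken to produce this response


@dataclass
class ModelSummary:
    """Per-model averages used to compare quality, cost, and latency."""
    model: str
    quality: float
    cost_usd: float
    latency_ms: float


def summarize(samples: list[Sample]) -> dict[str, ModelSummary]:
    """Group samples by model and average quality, cost, and latency."""
    by_model: dict[str, list[Sample]] = {}
    for s in samples:
        by_model.setdefault(s.model, []).append(s)
    return {
        name: ModelSummary(
            model=name,
            quality=mean(s.judge_score for s in rows),
            cost_usd=mean(s.cost_usd for s in rows),
            latency_ms=mean(s.latency_ms for s in rows),
        )
        for name, rows in by_model.items()
    }
```

With summaries like these, the baseline and each alternative can be compared side by side on all three dimensions.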
Creating an optimizer job
Select an agent
Choose the agent (an LLM-using workflow) you want to optimize. Its real traffic defines the comparison.
Choose models
Pick your baseline (the current model) and one or more comparison models (cheaper or faster candidates).
Configure the evaluator
Choose the LLM-as-a-Judge evaluator that scores output quality, and set the execution settings.
Run and review
The optimizer evaluates each model and produces recommendations as results complete.
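The four steps above amount to a small set of inputs. The sketch below models them as a plain Python data class with basic validation. `OptimizerJob`, its field names, and the rule that the baseline may not also appear as a comparison model are hypothetical; they only mirror the steps on this page and are not a documented API.

```python
from dataclasses import dataclass, field


@dataclass
class OptimizerJob:
    """Hypothetical job description mirroring the four setup steps."""
    agent: str                    # Select an agent
    baseline_model: str           # Choose models: the current model
    comparison_models: list[str]  # Choose models: cheaper or faster candidates
    evaluator: str                # Configure the evaluator: the LLM-as-a-Judge evaluator
    execution_settings: dict[str, object] = field(default_factory=dict)

    def validate(self) -> None:
        """Check the inputs before running (assumed sanity checks)."""
        if not self.comparison_models:
            raise ValueError("pick at least one comparison model")
        if self.baseline_model in self.comparison_models:
            raise ValueError("the baseline cannot also be a comparison model")
```

A job with one baseline and at least one distinct comparison model passes `validate()`; an empty candidate list raises `ValueError`.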
Reading recommendations
Each recommendation is one of:
| Outcome | Meaning |
|---|---|
| Cost-optimized | A cheaper model maintains acceptable quality — switch to save. |
| Latency-optimized | A faster model maintains acceptable quality — switch to speed up. |
| Already optimal | No cheaper or faster alternative passed the quality bar. |
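One way to read the table is as a decision rule over per-model results. The sketch below is a minimal version of that rule, assuming the quality bar is an absolute score threshold and that the cheapest or fastest passing model wins. `ModelResult`, `recommend`, and the threshold semantics are illustrative assumptions; the product may define the bar differently, for example relative to the baseline.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    """Averaged results for one model (hypothetical shape)."""
    model: str
    quality: float     # assumed judge score, higher is better
    cost_usd: float    # average cost per request
    latency_ms: float  # average latency per request


def recommend(
    baseline: ModelResult,
    candidates: list[ModelResult],
    quality_bar: float,
) -> list[tuple[str, str]]:
    """Return (outcome, model) pairs; only models passing the bar are eligible."""
    passing = [c for c in candidates if c.quality >= quality_bar]
    cheaper = [c for c in passing if c.cost_usd < baseline.cost_usd]
    faster = [c for c in passing if c.latency_ms < baseline.latency_ms]

    outcomes: list[tuple[str, str]] = []
    if cheaper:
        best = min(cheaper, key=lambda c: c.cost_usd)
        outcomes.append(("Cost-optimized", best.model))
    if faster:
        best = min(faster, key=lambda c: c.latency_ms)
        outcomes.append(("Latency-optimized", best.model))
    if not outcomes:
        # Nothing cheaper or faster cleared the quality bar: keep the baseline.
        outcomes.append(("Already optimal", baseline.model))
    return outcomes
```

Note that a model failing the bar is filtered out before cost or latency is considered, so a very cheap but low-quality candidate can never produce a recommendation.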
Recommendations show projected savings (per period and annualized) based on the agent's actual volume. The optimizer also has Analytics (savings trends, recommendation distribution) and Execution History tabs.
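The savings arithmetic can be sketched as follows. This assumes savings are the per-request cost difference multiplied by the agent's request volume, and that the annualized figure scales the period linearly to 365 days. The function name, parameters, and annualization method are assumptions for illustration, not the optimizer's documented formula.

```python
def projected_savings(
    baseline_cost_per_request: float,
    candidate_cost_per_request: float,
    requests_in_period: int,
    period_days: int,
) -> tuple[float, float]:
    """Return (savings for the period, annualized savings) in the same currency."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    per_request = baseline_cost_per_request - candidate_cost_per_request
    per_period = per_request * requests_in_period
    # Assumed linear scaling: a 30-day period is extrapolated to 365 days.
    annualized = per_period * 365 / period_days
    return per_period, annualized
```

For example, moving from $0.05 to $0.01 per request at 10,000 requests over 30 days projects to about $400 for the period and roughly $4,867 annualized under this linear assumption.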
The optimizer never recommends a downgrade that fails the quality bar — savings are always quality-gated by the evaluator you chose.
Next steps
- LLM-as-a-Judge — the quality gate behind recommendations.
- Models & Pricing — make sure candidate models are priced for accurate savings.
- Cost & Performance Metrics — find agents worth optimizing.