
AI Optimizer

The AI Optimizer answers the hardest FinOps question with data: can we move this agent to a cheaper (or faster) model without hurting quality? It runs your current model against alternatives, scores their outputs with LLM-as-a-Judge, and recommends a swap only when quality holds, along with the projected savings.

Open it from FinOps → AI Optimizer. Creating optimizer jobs requires the Owner or AI Steward role; AI Developers have read-only access.
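The access rule above can be sketched as a small permission check. This is an illustration only: the `Role` enum and both helper functions are hypothetical names, not part of the platform's API.

```python
from enum import Enum


class Role(Enum):
    """Roles named in this page (hypothetical enum, for illustration)."""
    OWNER = "Owner"
    AI_STEWARD = "AI Steward"
    AI_DEVELOPER = "AI Developer"


# Roles that may create optimizer jobs, per the rule above.
JOB_CREATOR_ROLES = frozenset({Role.OWNER, Role.AI_STEWARD})
# Roles that may at least view jobs: creators plus read-only AI Developers.
JOB_VIEWER_ROLES = JOB_CREATOR_ROLES | {Role.AI_DEVELOPER}


def can_create_job(role: Role) -> bool:
    """True if the role may create optimizer jobs."""
    return role in JOB_CREATOR_ROLES


def can_view_jobs(role: Role) -> bool:
    """True if the role may open the optimizer and read its results."""
    return role in JOB_VIEWER_ROLES
```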

How it works

pick agent → baseline model + comparison models → evaluator (LLM-as-a-Judge) → run → per-model quality + cost + latency → recommendation

The optimizer takes a real agent, runs the baseline and the alternatives on it, has the evaluator grade their outputs for quality, and compares cost and latency. A model is recommended only if its quality stays acceptable.
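As a rough sketch of the "per-model quality + cost + latency" step, the snippet below averages individual run results per model. The record fields, the 0 to 1 judge score scale, and the use of a plain mean are all assumptions made for illustration; the page does not specify how the optimizer aggregates results.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class RunRecord:
    """One graded model run (hypothetical shape)."""
    model: str
    judge_score: float  # quality score from the evaluator; 0..1 scale is an assumption
    cost_usd: float     # cost of this single run
    latency_ms: float   # end-to-end latency of this single run


@dataclass(frozen=True)
class ModelSummary:
    """Per-model averages used for the comparison."""
    model: str
    quality: float
    cost_usd: float
    latency_ms: float


def summarize(records: list[RunRecord]) -> dict[str, ModelSummary]:
    """Group run records by model and average quality, cost, and latency."""
    by_model: dict[str, list[RunRecord]] = {}
    for record in records:
        by_model.setdefault(record.model, []).append(record)
    return {
        model: ModelSummary(
            model=model,
            quality=mean(r.judge_score for r in runs),
            cost_usd=mean(r.cost_usd for r in runs),
            latency_ms=mean(r.latency_ms for r in runs),
        )
        for model, runs in by_model.items()
    }
```

Averaging is only one possible choice; percentiles may be more informative for latency.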

Creating an optimizer job

Select an agent

Choose the agent (LLM-using workflow) you want to optimize — its real traffic defines the comparison.

Choose models

Pick your baseline (the current model) and one or more comparison models (cheaper or faster candidates).

Configure the evaluator

Choose the LLM-as-a-Judge evaluator that scores output quality, plus execution settings.

Run and review

The optimizer evaluates each model and produces recommendations as results complete.
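The choices made in the steps above can be pictured as one job specification. The sketch below is hypothetical: the class name, field names, and validation rules are assumptions, and jobs are created in the UI rather than through this code. The only rule taken from this page is that a job needs one or more comparison models; rejecting a baseline that also appears as a comparison model is an added assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizerJobSpec:
    """Hypothetical container for the inputs of one optimizer job."""
    agent: str                          # the agent whose real traffic defines the comparison
    baseline_model: str                 # the model the agent uses today
    comparison_models: tuple[str, ...]  # cheaper or faster candidates
    evaluator: str                      # the LLM-as-a-Judge evaluator that scores quality

    def __post_init__(self) -> None:
        # The page requires one or more comparison models.
        if not self.comparison_models:
            raise ValueError("at least one comparison model is required")
        # Assumption: comparing the baseline against itself is not meaningful.
        if self.baseline_model in self.comparison_models:
            raise ValueError("baseline model cannot also be a comparison model")
        # Assumption: each candidate should be listed once.
        if len(set(self.comparison_models)) != len(self.comparison_models):
            raise ValueError("comparison models must be unique")
```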

Reading recommendations

Each recommendation is one of:

Outcome | Meaning
Cost-optimized | A cheaper model maintains acceptable quality; switch to save.
Latency-optimized | A faster model maintains acceptable quality; switch to speed up.
Already optimal | No cheaper/faster alternative passed the quality bar.
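The three outcomes above can be expressed as a quality-gated comparison against the baseline. This is a minimal sketch, not the optimizer's actual logic: the numeric `quality_bar` threshold, the field names, and the choice to emit both outcomes for a candidate that is both cheaper and faster are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    COST_OPTIMIZED = "Cost-optimized"
    LATENCY_OPTIMIZED = "Latency-optimized"
    ALREADY_OPTIMAL = "Already optimal"


@dataclass(frozen=True)
class ModelResult:
    """Per-model evaluation result (hypothetical shape)."""
    model: str
    quality: float     # evaluator score; scale is an assumption
    cost_usd: float    # average cost per request
    latency_ms: float  # average latency per request


def classify(
    baseline: ModelResult,
    candidates: list[ModelResult],
    quality_bar: float,
) -> list[tuple[Outcome, str]]:
    """Return (outcome, model) pairs; candidates below the quality bar are never recommended."""
    recommendations: list[tuple[Outcome, str]] = []
    for candidate in candidates:
        if candidate.quality < quality_bar:
            continue  # quality gate: a failing candidate is skipped regardless of savings
        if candidate.cost_usd < baseline.cost_usd:
            recommendations.append((Outcome.COST_OPTIMIZED, candidate.model))
        if candidate.latency_ms < baseline.latency_ms:
            recommendations.append((Outcome.LATENCY_OPTIMIZED, candidate.model))
    # No passing alternative is cheaper or faster: keep the baseline.
    return recommendations or [(Outcome.ALREADY_OPTIMAL, baseline.model)]
```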

Recommendations show projected savings (per period and annualized) based on the agent's actual volume. The optimizer also has Analytics (savings trends, recommendation distribution) and Execution History tabs.
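To make the per-period and annualized figures concrete, here is one plausible way to compute them. The formula is an assumption, not the optimizer's documented method: it multiplies the per-request cost difference by the agent's request volume in the period, then scales linearly to 365 days.

```python
def projected_savings(
    baseline_cost_per_request: float,
    candidate_cost_per_request: float,
    requests_in_period: int,
    period_days: float,
) -> tuple[float, float]:
    """Return (savings for the period, annualized savings), both in the cost currency."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    if requests_in_period < 0:
        raise ValueError("requests_in_period must not be negative")
    # Savings over the observed period at the agent's actual volume.
    per_period = (baseline_cost_per_request - candidate_cost_per_request) * requests_in_period
    # Linear extrapolation to a year (assumes volume stays constant).
    annualized = per_period * 365 / period_days
    return per_period, annualized
```

For example, a drop from 0.50 to 0.25 per request over 1,000 requests in a 73-day period gives 250.0 for the period and 1250.0 annualized under these assumptions.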

The optimizer never recommends a downgrade that fails the quality bar — savings are always quality-gated by the evaluator you chose.


© 2026 ANTS Platform, Inc. · Docs v1.0 · Last updated June 2026