Cost Spike Runbook

Steps to diagnose and reduce LLM cost spikes quickly and safely.

Published: February 8, 2026

Admin User

Updated: February 9, 2026

What is the immediate containment step?
Freeze all routing and prompt changes, then apply spend budgets or request throttles to stop runaway spend.
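
As a rough illustration of a budget-based throttle, the sketch below tracks spend within a window and refuses requests that would exceed a cap. The class name `SpendGuard`, its fields, and the dollar amounts are hypothetical; a real deployment would likely enforce this in the gateway or provider settings rather than in application code.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    """Hypothetical per-window budget guard (illustrative names and limits)."""

    budget_usd: float        # hard cap for the current window
    spent_usd: float = 0.0   # running total of recorded spend

    def allow(self, estimated_cost_usd: float) -> bool:
        """Return True if the request fits within the remaining budget."""
        return self.spent_usd + estimated_cost_usd <= self.budget_usd

    def record(self, actual_cost_usd: float) -> None:
        """Add the actual cost of a completed request to the running total."""
        self.spent_usd += actual_cost_usd
```

A caller would check `allow()` before sending a request and call `record()` once the actual cost is known. Rejecting at the cap is a deliberate fail-closed choice for containment; whether to queue, degrade, or reject is a decision for the team.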

How do we find the cause fast?
Identify the top callers, the most token-heavy prompts, and any retry patterns, then compare each against its baseline.
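
A minimal sketch of that triage, assuming usage logs are available as a list of dictionaries. The field names (`caller`, `prompt_tokens`, `completion_tokens`, `is_retry`) and the helper names are assumptions for illustration, not a real log schema.

```python
from collections import defaultdict


def top_callers(records, n=3):
    """Rank callers by total tokens (prompt + completion), highest first."""
    totals = defaultdict(int)
    for r in records:
        totals[r["caller"]] += r["prompt_tokens"] + r["completion_tokens"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def retry_rate(records):
    """Fraction of requests flagged as retries (0.0 for an empty log)."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["is_retry"]) / len(records)


def ratio_to_baseline(current, baseline):
    """How many times larger the current value is than its baseline."""
    if baseline == 0:
        return float("inf")  # no baseline to compare against
    return current / baseline
```

Running `top_callers` and `retry_rate` on both the spike window and a known-good window, then comparing with `ratio_to_baseline`, is usually enough to tell whether the spike comes from one caller, from larger prompts, or from a retry storm.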

When do we roll back?
When cost signals breach their thresholds and verification confirms a regression in routing or prompt behavior.
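
The two-part condition can be written as a small predicate. This is a sketch only: the function name, the use of cost per task as the signal, and the default `threshold_ratio` of 1.5 are assumptions, not values taken from this runbook.

```python
def should_roll_back(cost_per_task, baseline_cost_per_task,
                     regression_confirmed, threshold_ratio=1.5):
    """Roll back only when BOTH conditions hold:
    1. the cost signal exceeds baseline * threshold_ratio, and
    2. verification has confirmed a routing or prompt regression.
    """
    breached = cost_per_task > baseline_cost_per_task * threshold_ratio
    return breached and regression_confirmed
```

Requiring both conditions avoids rolling back on a cost spike that comes from legitimate traffic growth rather than from a change in routing or prompts.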

What evidence should we capture?
Routing versions, prompt versions, top endpoints, cost-per-task graphs, and actions taken.
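
One way to keep that evidence consistent across incidents is a small structured record. The sketch below is illustrative: the class `IncidentEvidence` and its field names are hypothetical and simply mirror the items listed above.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class IncidentEvidence:
    """Hypothetical evidence record for a cost spike incident."""

    routing_version: str
    prompt_version: str
    top_endpoints: list                 # endpoint names, highest cost first
    cost_per_task_graphs: list          # links or file paths to saved graphs
    actions_taken: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record so it can be attached to the incident ticket."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Appending to `actions_taken` as each containment step happens, rather than reconstructing the list afterwards, tends to give a more reliable timeline.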

What’s the prevention step?
Add budget gates, canary rollouts, and monitoring alerts on unit cost.
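
A canary gate on unit cost might look like the sketch below, which compares the canary's cost per task with the control group's before a rollout proceeds. The function name, its parameters, and the 10% default tolerance are assumptions chosen for illustration.

```python
def canary_passes(control_cost_usd, control_tasks,
                  canary_cost_usd, canary_tasks,
                  max_increase=0.10):
    """Return True if the canary's unit cost stays within the allowed increase.

    Unit cost is total cost divided by completed tasks. With no completed
    tasks on either side there is nothing to compare, so the gate fails closed.
    """
    if control_tasks == 0 or canary_tasks == 0:
        return False
    control_unit = control_cost_usd / control_tasks
    canary_unit = canary_cost_usd / canary_tasks
    return canary_unit <= control_unit * (1 + max_increase)
```

The same comparison can back a monitoring alert: evaluate it on a schedule against a rolling baseline and page when it returns False.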
