A model swap looks neutral in offline tests but causes cost blowups in production because its outputs are longer. How would you prevent that in the future?

Instruction: Explain how you would guard against cost regressions that appear only in production behavior.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

Answer (excerpt):

I would add output-length and cost analysis to the release process. A model can pass quality checks and still be...
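One way to make "output-length and cost analysis" part of the release process is a gate that compares token usage for the current and candidate models on the same sampled traffic. The following is a minimal sketch, not a definitive implementation: the `Sample` type, the `release_gate` function, the per-token prices, and the 10% and 25% thresholds are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from statistics import mean, quantiles


@dataclass
class Sample:
    """Token counts for one request (hypothetical record shape)."""
    input_tokens: int
    output_tokens: int


def mean_cost(samples, price_in, price_out):
    """Mean cost per request, given assumed per-token prices."""
    return mean(
        s.input_tokens * price_in + s.output_tokens * price_out
        for s in samples
    )


def p95(values):
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(values, n=20, method="inclusive")[-1]


def release_gate(baseline, candidate, price_in, price_out,
                 max_cost_ratio=1.10, max_p95_ratio=1.25):
    """Compare candidate against baseline on cost and output length.

    The thresholds are illustrative assumptions, not recommended values.
    """
    cost_ratio = (mean_cost(candidate, price_in, price_out)
                  / mean_cost(baseline, price_in, price_out))
    p95_ratio = (p95([s.output_tokens for s in candidate])
                 / p95([s.output_tokens for s in baseline]))
    return {
        "cost_ratio": cost_ratio,
        "p95_output_ratio": p95_ratio,
        "passed": cost_ratio <= max_cost_ratio and p95_ratio <= max_p95_ratio,
    }


# Hypothetical token counts: same prompts, noticeably longer candidate outputs.
baseline = [Sample(500, n) for n in (90, 100, 110, 100)]
candidate = [Sample(500, n) for n in (170, 180, 190, 180)]
result = release_gate(baseline, candidate, price_in=1e-6, price_out=4e-6)
```

With these made-up numbers the gate fails, because the candidate's mean cost per request is well above the allowed ratio even though nothing about answer quality was measured. Checking the 95th percentile alongside the mean is a deliberate choice in this sketch: a small share of very long responses can dominate spend without moving the average much. The same comparison can be run on shadow or canary traffic, since offline prompt sets may not reflect the inputs that trigger long outputs in production.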
