A high-autonomy agent is safe in sandbox tests but unsafe on live data. How would you explain the gap and close it?

Instruction: Explain how you would reason about a safety gap between sandbox and production reality.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

I would explain the gap as an environment realism problem. Sandbox tests often have cleaner data, simpler incentives, and fewer weird edge conditions than live traffic. High-autonomy systems can look disciplined in that world and become risky once the real environment pushes back.
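
One way to make the realism problem concrete is to measure how much of live traffic falls outside what the sandbox ever exercised. The sketch below is illustrative only: the names `Coverage`, `sandbox_coverage`, `out_of_coverage`, and `realism_gap` are hypothetical, and it assumes inputs can be reduced to flat dictionaries of numeric fields, which real agent inputs often cannot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    """Range of values the sandbox suite exercised for one numeric field."""
    low: float
    high: float


def sandbox_coverage(samples: list[dict[str, float]]) -> dict[str, Coverage]:
    """Record the min and max seen per field across all sandbox inputs."""
    fields = {name for sample in samples for name in sample}
    return {
        name: Coverage(
            low=min(s[name] for s in samples if name in s),
            high=max(s[name] for s in samples if name in s),
        )
        for name in fields
    }


def out_of_coverage(live: dict[str, float], coverage: dict[str, Coverage]) -> list[str]:
    """Return the fields of one live input that the sandbox never exercised."""
    flagged = []
    for name, value in live.items():
        seen = coverage.get(name)
        # A field is uncovered if it never appeared in the sandbox,
        # or if its value lies outside the range the sandbox tested.
        if seen is None or not (seen.low <= value <= seen.high):
            flagged.append(name)
    return sorted(flagged)


def realism_gap(live_inputs: list[dict[str, float]], coverage: dict[str, Coverage]) -> float:
    """Fraction of live inputs with at least one field outside sandbox coverage."""
    if not live_inputs:
        return 0.0
    misses = sum(1 for item in live_inputs if out_of_coverage(item, coverage))
    return misses / len(live_inputs)


# Hypothetical data: a clean sandbox versus messier live traffic.
sandbox = [{"amount": 10.0, "items": 1.0}, {"amount": 50.0, "items": 3.0}]
live = [
    {"amount": 20.0, "items": 2.0},
    {"amount": 5000.0, "items": 2.0},                # value far outside tested range
    {"amount": 30.0, "items": 2.0, "refund": 1.0},   # field the sandbox never saw
    {"amount": 40.0, "items": 3.0},
]
coverage = sandbox_coverage(sandbox)
gap = realism_gap(live, coverage)
```

A high `gap` value would suggest the sandbox verdict says little about live behavior. Range checks like this only catch simple numeric drift; they say nothing about the differences in incentives or adversarial pressure described above.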

To close the gap, I would bring...
