An agent passes step-level checks and still fails end-to-end tasks. How would you investigate?

Instruction: Describe how you would debug an agent that looks healthy at each step but still fails to complete the workflow.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery. Describe how you would debug an agent that looks healthy at each step but still fails to complete the workflow.

Official answer available

Preview the opening of the answer, then unlock the full walkthrough.

Step-level health can create false comfort. I would look at how the steps interact, because small local mistakes often compound...

Related Questions