A tool call succeeds technically and still creates the wrong business outcome. What do you inspect?

Instruction: Describe how you would debug a tool call that was valid at the API level but wrong for the user.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

The way I'd think about it is this: I inspect the semantic contract, not just the API result. If the call succeeded technically but created the wrong business outcome, then something upstream failed in intent interpretation, argument meaning, workflow state, or policy logic....
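As a rough illustration of checking the semantic contract rather than the API result, the sketch below walks one recorded call through the four upstream layers named above. Everything in it is an assumption made for illustration: the `ToolCallRecord` fields, the `INTENT_TO_TOOL` and `ALLOWED_STATES` tables, and the `diagnose` helper are hypothetical and do not come from any particular agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCallRecord:
    """Hypothetical trace of one tool call; every field name is an assumption."""
    user_intent: str      # what the user asked for, e.g. "refund"
    tool_name: str        # tool the agent actually invoked
    args: dict            # arguments sent to the tool
    api_status: int       # transport-level result, e.g. 200
    workflow_state: str   # state the workflow was in when the call ran
    policy_flags: list = field(default_factory=list)  # policy rules that fired


# Hypothetical lookup tables standing in for real business rules.
INTENT_TO_TOOL = {"refund": "issue_refund", "cancel": "cancel_order"}
ALLOWED_STATES = {
    "issue_refund": {"delivered", "returned"},
    "cancel_order": {"pending"},
}


def diagnose(record: ToolCallRecord, expected_args: dict) -> list[str]:
    """Return the upstream layers where the call diverged from the user's goal."""
    findings = []
    # 1. Intent interpretation: did the agent pick the tool the user meant?
    if INTENT_TO_TOOL.get(record.user_intent) != record.tool_name:
        findings.append("intent interpretation")
    # 2. Argument meaning: valid types can still carry the wrong value or unit.
    mismatched = sorted(
        k for k, v in expected_args.items() if record.args.get(k) != v
    )
    if mismatched:
        findings.append("argument meaning: " + ", ".join(mismatched))
    # 3. Workflow state: was the call legal at this point in the process?
    if record.workflow_state not in ALLOWED_STATES.get(record.tool_name, set()):
        findings.append("workflow state")
    # 4. Policy logic: did any business rule fire that should have blocked it?
    if record.policy_flags:
        findings.append("policy logic")
    return findings


if __name__ == "__main__":
    record = ToolCallRecord(
        user_intent="refund",
        tool_name="issue_refund",
        args={"order_id": "A1", "amount": 5000, "currency": "USD"},
        api_status=200,  # the API call itself succeeded
        workflow_state="pending",
    )
    expected = {"order_id": "A1", "amount": 50.0, "currency": "USD"}
    print(diagnose(record, expected))
```

Note that `api_status` is recorded but never consulted: in this sketch a 200 response says nothing about whether the amount was meant in cents or dollars, or whether a refund made sense for an order that had not shipped yet.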
