A customer says the assistant hallucinated, but the trace shows the answer was stitched from partially relevant evidence. How would you explain the failure?

Instruction: Describe how you would reason about a grounded-looking answer that still overreaches.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery. Describe how you would reason about a grounded-looking answer that still overreaches.

Official answer available

Preview the opening of the answer, then unlock the full walkthrough.

I would explain that this was not a pure fabrication from nowhere. The system found nearby evidence, but it over-assembled it into a stronger claim than the documents actually supported. From the customer’s perspective, that still feels like a hallucination, and that is a fair reaction.

Internally, I would label it as...

Related Questions