Instruction: Explain how you would reason about a gap between internal retrieval metrics and user perception.
Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.
I would explain that the eval improved one layer while the user experienced the whole product. Retrieval can get better on a metric like recall@k and still hurt trust if the new candidate set is noisier, more redundant, harder to read, or more likely to support overconfident synthesis.
For example,...
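One way to make the recall@k point concrete is a toy comparison of two candidate sets. This is a minimal sketch, not a production eval: the document IDs, the source labels, and the helper names (`recall_at_k`, `redundancy_at_k`, `relevant_sources_covered`) are all hypothetical and invented for illustration. It assumes each retrieved chunk can be mapped to a source, so near-duplicate chunks from one source count as redundant.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of all relevant chunks that appear in the top k."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def redundancy_at_k(retrieved: List[str], source_of: Dict[str, str], k: int) -> float:
    """Share of the top k that repeats a source already present in the top k."""
    top = retrieved[:k]
    if not top:
        return 0.0
    distinct_sources = {source_of[doc] for doc in top}
    return 1 - len(distinct_sources) / len(top)


def relevant_sources_covered(
    retrieved: List[str], relevant: Set[str], source_of: Dict[str, str], k: int
) -> int:
    """Number of distinct sources that contribute a relevant chunk to the top k."""
    return len({source_of[doc] for doc in retrieved[:k] if doc in relevant})


# Hypothetical toy data: a1, a2, a3 are near-duplicate chunks from source A.
source_of = {
    "a1": "A", "a2": "A", "a3": "A",
    "b1": "B", "c1": "C",
    "x": "X", "y": "Y", "z": "Z",
}
relevant = {"a1", "a2", "a3", "b1", "c1"}
K = 4

old_ranking = ["a1", "b1", "x", "y"]    # fewer hits, but two distinct relevant sources
new_ranking = ["a1", "a2", "a3", "z"]   # more hits, all from the same source


def summarize(ranking: List[str]) -> Dict[str, float]:
    return {
        "recall": recall_at_k(ranking, relevant, K),
        "redundancy": redundancy_at_k(ranking, source_of, K),
        "sources_covered": relevant_sources_covered(ranking, relevant, source_of, K),
    }


old_metrics = summarize(old_ranking)
new_metrics = summarize(new_ranking)

if __name__ == "__main__":
    print("old:", old_metrics)
    print("new:", new_metrics)
```

In this toy setup the new ranking wins on recall@4 while half of its slots repeat a source and it covers fewer distinct relevant sources. That is the kind of layer-level gain a user could experience as a worse answer. Tracking a redundancy or coverage number next to recall is one possible way to surface the gap before launch.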
easy