Instruction: Describe how you would investigate a user-facing latency problem when the median metric looks healthy.
Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.
I would look beyond the median immediately. A chatbot can feel slow because of tail latency, queueing, retrieval overhead, or client rendering delays, or because the first useful token arrives too late even when total inference time is acceptable.
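As a rough illustration of why the median can hide the problem, here is a minimal Python sketch that compares p50 with p95 and p99 on a set of time-to-first-token samples. The sample values, the `percentile` and `summarize` helpers, and the nearest-rank method are all assumptions made for this example; they are not taken from the original answer or from any particular monitoring stack.

```python
from typing import Dict, List, Sequence


def percentile(values: Sequence[float], q: int) -> float:
    """Nearest-rank percentile for an integer q in [0, 100]."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Integer ceiling of q * n / 100, clamped to at least rank 1.
    rank = max(1, (q * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(samples_ms: Sequence[float]) -> Dict[str, float]:
    """Report the median next to the tail percentiles."""
    return {f"p{q}": percentile(samples_ms, q) for q in (50, 95, 99)}


# Hypothetical time-to-first-token samples in milliseconds:
# 18 fast requests and 2 very slow ones.
ttft_ms: List[float] = [200.0] * 18 + [4000.0, 6000.0]

summary = summarize(ttft_ms)
print(summary)
```

With these made-up samples the median stays at 200 ms while p95 and p99 come out at 4000 ms and 6000 ms, which is the pattern where a dashboard looks healthy but one user in ten waits several seconds. The same summary could be computed per stage (queueing, retrieval, first token, rendering) to see where the tail comes from.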
I would break down time to first token,...
easy