Tail latency comes from a small percentage of long-running requests with tool calls and retrieval. How would you reduce it?

Instruction: Describe how you would reduce tail latency in complex requests.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

I would attack the long path directly instead of tuning the median. The right fix is often to bound or...
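One way to picture "bounding" the long path is a per-step deadline with a degraded fallback. The sketch below is only an illustration under assumptions: `call_with_deadline`, `slow_retrieval`, `handle_request`, and the 50 ms budget are hypothetical names and values, not part of the original answer, and the truncated answer may bound something else.

```python
import asyncio


async def call_with_deadline(coro, timeout_s, fallback=None):
    """Await coro, returning fallback if it runs longer than timeout_s seconds."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the slow call before raising, so it stops consuming resources.
        return fallback


async def slow_retrieval(delay_s):
    # Hypothetical stand-in for a retrieval or tool call.
    await asyncio.sleep(delay_s)
    return ["doc-1", "doc-2"]


async def handle_request(retrieval_delay_s, budget_s=0.05):
    # Assumed budget: give retrieval at most budget_s, then answer without it.
    docs = await call_with_deadline(slow_retrieval(retrieval_delay_s), budget_s)
    return {"docs": docs or [], "degraded": docs is None}


fast = asyncio.run(handle_request(0.0))  # finishes inside the budget
slow = asyncio.run(handle_request(1.0))  # exceeds the budget, falls back
print(fast)
print(slow)
```

The design choice here is that a request on the slow path returns a degraded answer at the deadline instead of holding the caller for the full duration, which caps the worst case without touching the median.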
