Instruction: Discuss considerations for model selection, training, and deployment to achieve low latency.
Context: This question assesses the candidate's ability to balance model complexity and performance, particularly in applications requiring quick responses.
I would begin with a hard latency budget and profile the system end to end before I changed the model. In low-latency systems, the bottleneck is often not the model itself. It can be feature fetching, serialization, network hops, or an overly expensive fallback path.
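As a rough illustration of profiling against a budget, the sketch below times each stage of a single request and reports which stage dominates. The class `LatencyProfile`, the stage names, the sleep durations, and the 20 ms budget are all hypothetical stand-ins, not part of the original answer. A real service would more likely aggregate tail percentiles (p95, p99) across many requests than inspect a single one.

```python
import time
from contextlib import contextmanager


class LatencyProfile:
    """Accumulates wall-clock time per named stage of one request (hypothetical helper)."""

    def __init__(self, budget_ms):
        self.budget_ms = budget_ms
        self.stages_ms = {}

    @contextmanager
    def stage(self, name):
        # Time the enclosed block and add it to the stage's running total.
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.stages_ms[name] = self.stages_ms.get(name, 0.0) + elapsed_ms

    def total_ms(self):
        return sum(self.stages_ms.values())

    def over_budget(self):
        return self.total_ms() > self.budget_ms

    def slowest_stage(self):
        # Name of the stage with the largest accumulated time.
        return max(self.stages_ms, key=self.stages_ms.get)


def handle_request(profile):
    # Assumed stages; the sleeps simulate work and are not real measurements.
    with profile.stage("feature_fetch"):
        time.sleep(0.03)
    with profile.stage("model_inference"):
        time.sleep(0.005)
    with profile.stage("serialization"):
        time.sleep(0.002)
```

Calling `handle_request(LatencyProfile(budget_ms=20))` and then checking `slowest_stage()` would, in this simulated setup, point at feature fetching rather than the model, which is the situation the answer warns about.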
Once I know...