How do you decide which requests deserve the expensive model?

Instruction: Explain how you would choose when to route traffic to a higher-cost model.

Context: Checks whether the candidate can explain the core concept clearly and connect it to real production decisions.

Example Answer

I decide based on the expected value of the extra quality: how likely the cheaper model is to get the request wrong, how much that mistake costs, and how likely the expensive model is to fix it. Requests that are ambiguous, high-stakes, multi-step, or historically error-prone are the best candidates for the expensive model. Straightforward or repetitive requests often do fine on the cheaper path.
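To make the expected-value framing concrete, here is a minimal sketch of the comparison I have in mind. The function name, its parameters, and the numbers in the usage lines are all hypothetical; in practice the probabilities would come from offline evaluation, and the costs would be expressed in whatever unit the business uses.

```python
def should_escalate(p_cheap_fails: float,
                    p_expensive_fixes: float,
                    error_cost: float,
                    extra_model_cost: float) -> bool:
    """Return True when the expected gain from escalating exceeds its cost.

    All inputs are assumptions supplied by the caller:
      p_cheap_fails     - estimated chance the cheap model gets this request wrong
      p_expensive_fixes - estimated chance the expensive model fixes that failure
      error_cost        - cost of a wrong answer, same unit as extra_model_cost
      extra_model_cost  - added cost of the expensive call over the cheap one
    """
    expected_gain = p_cheap_fails * p_expensive_fixes * error_cost
    return expected_gain > extra_model_cost


# Hypothetical numbers: a risky, costly request versus a routine one.
risky = should_escalate(0.3, 0.5, 2.0, 0.05)
routine = should_escalate(0.02, 0.5, 0.5, 0.05)
```

With these made-up inputs the risky request clears the bar and the routine one does not. The point is the shape of the comparison, not the specific values.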

I also want the routing rule to be observable. If the expensive model is being used, I should be able to explain why. That makes cost control and fairness review much easier later.
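One way to keep the rule observable is to have the router return its reasons along with its choice, so every expensive call can be explained from the logs. The sketch below is illustrative only: the signal names, thresholds, and the "cheap"/"expensive" labels are assumptions, not a recommended configuration.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Dict, List


@dataclass
class RoutingDecision:
    model: str                                   # "cheap" or "expensive" (hypothetical labels)
    reasons: List[str] = field(default_factory=list)


def route(signals: Dict[str, float]) -> RoutingDecision:
    """Pick a model and record every rule that fired.

    Signal names and thresholds are placeholders for illustration.
    """
    reasons = []
    if signals.get("ambiguity", 0.0) >= 0.7:
        reasons.append("ambiguity>=0.7")
    if signals.get("stakes", 0.0) >= 0.8:
        reasons.append("stakes>=0.8")
    if signals.get("estimated_steps", 0) >= 4:
        reasons.append("estimated_steps>=4")
    if signals.get("historical_error_rate", 0.0) >= 0.15:
        reasons.append("historical_error_rate>=0.15")

    # Escalate only when at least one named rule fired.
    model = "expensive" if reasons else "cheap"
    return RoutingDecision(model=model, reasons=reasons)


decision = route({"ambiguity": 0.9, "stakes": 0.2, "estimated_steps": 5})
log_line = json.dumps(asdict(decision))  # structured record for later cost or fairness review
```

Because the reasons are stored as data, a later review can count how often each rule fires and how much spend it drives.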

The wrong way to do this is to route everything that feels important in the moment. The right way is to use measurable signals tied to quality lift.
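To tie signals to quality lift, I would measure the lift per request segment on a labeled evaluation set where both models answered the same kinds of requests. This is a rough sketch: the record layout, the segment names, and the pass/fail grading are assumptions made for illustration.

```python
from collections import defaultdict


def lift_by_segment(records):
    """Compute expensive-minus-cheap pass rate for each segment.

    records: iterable of (segment, model, passed) tuples, where model is
    "cheap" or "expensive" and passed is a bool from some grading step.
    Segments missing results for either model are skipped.
    """
    # counts[segment][model] = [number passed, number graded]
    counts = defaultdict(lambda: {"cheap": [0, 0], "expensive": [0, 0]})
    for segment, model, passed in records:
        counts[segment][model][0] += int(passed)
        counts[segment][model][1] += 1

    lift = {}
    for segment, by_model in counts.items():
        cheap_pass, cheap_n = by_model["cheap"]
        exp_pass, exp_n = by_model["expensive"]
        if cheap_n == 0 or exp_n == 0:
            continue
        lift[segment] = exp_pass / exp_n - cheap_pass / cheap_n
    return lift


# Tiny made-up evaluation set.
records = [
    ("multi_step", "cheap", False),
    ("multi_step", "cheap", True),
    ("multi_step", "expensive", True),
    ("multi_step", "expensive", True),
    ("faq", "cheap", True),
    ("faq", "expensive", True),
]
lift = lift_by_segment(records)
```

In this toy data the multi-step segment shows a lift and the FAQ segment shows none, which is the kind of evidence that should drive the routing rule. A real evaluation would need far more samples per segment before anyone trusts the difference.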

What I try to avoid is an answer that sounds clean in theory but falls apart once the data, users, or production constraints get messy.

Common Poor Answer

A weak answer simply says that complex requests should use the expensive model. A better answer defines how complexity is detected and checks whether the expensive model actually changes outcomes.

Related Questions