A model change improves average score but increases policy violations. What do you do?

Instruction: Describe how you would respond when a model improves overall quality but hurts safety or policy compliance.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery. Describe how you would respond when a model improves overall quality but hurts safety or policy compliance.

Official answer available

Preview the opening of the answer, then unlock the full walkthrough.

I would block the change until I understand the failure pattern. If policy violations went up, the system got better in a way that may not be safe or shippable, even if the average score looks nicer.

Then I would break the result apart. Did the new model become...

Upgrade to view official answer

A model change improves average score but increases policy violations. What do you do?

Related Questions