An evaluator model gives higher scores to longer answers. How would you fix it?

Instruction: Describe how you would respond when a grader model is clearly biased by style rather than substance.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

I would treat this as grader bias and verify it explicitly before changing anything. If the evaluator rewards length, the score is confounded by a surface trait that may have little to do with usefulness or correctness.
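One way to make the "verify it explicitly" step concrete is a paired padding check: score each answer as written, score it again after appending filler that adds no information, and look at the average difference. The sketch below is illustrative only. The `grade` callable, the `FILLER` text, and the toy grader are hypothetical stand-ins, not part of any real evaluator API.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple

# Hypothetical content-free padding; a real check might use several paraphrased fillers.
FILLER = " To restate the point above in other words, the same conclusion holds."


def pad_answer(answer: str, repeats: int = 3) -> str:
    """Lengthen an answer without adding any new information."""
    return answer + FILLER * repeats


def length_bias_gap(
    grade: Callable[[str, str], float],
    items: Sequence[Tuple[str, str]],
    repeats: int = 3,
) -> float:
    """Mean score change when each (question, answer) pair is padded.

    A gap near zero suggests the grader is not reacting to length alone;
    a clearly positive gap is evidence that length itself is rewarded.
    """
    gaps = [grade(q, pad_answer(a, repeats)) - grade(q, a) for q, a in items]
    return mean(gaps)


def toy_length_biased_grader(question: str, answer: str) -> float:
    # Stand-in for a real evaluator call: scores purely by word count, capped at 10.
    return min(10.0, len(answer.split()) / 5)


def toy_length_blind_grader(question: str, answer: str) -> float:
    # Stand-in for a grader that ignores length entirely.
    return 5.0


items = [
    ("What is 2 + 2?", "4."),
    ("What is the capital of France?", "Paris is the capital of France."),
]

print("biased grader gap:", length_bias_gap(toy_length_biased_grader, items))
print("length-blind grader gap:", length_bias_gap(toy_length_blind_grader, items))
```

Because the padded and unpadded answers carry the same substance, any consistent score increase can be attributed to length rather than quality, which is harder to conclude from a raw length-versus-score correlation alone.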

To fix it, I would recalibrate the rubric...
