Design an evaluation program for correctness, reviewability, and engineering trust.

Instruction: Describe how you would evaluate a coding agent across technical and human dimensions.

Context: Assesses whether the candidate can design a practical architecture and explain the main tradeoffs.

I would treat those as separate but connected metrics. Correctness covers whether the change actually solves the task without regression. Reviewability covers diff size, clarity, rationale, and validation evidence. Engineering trust covers downstream signals like accept rate, revert rate, repeat usage, and whether reviewers...
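As a rough illustration of keeping the three dimensions separate, here is a minimal sketch in Python. The `ChangeRecord` fields and the `summarize` function are hypothetical names chosen for this example, not part of any real tool, and the specific metrics are only one plausible reading of the signals listed above.

```python
from dataclasses import dataclass
from statistics import median
from typing import Iterable


@dataclass
class ChangeRecord:
    """One agent-authored change. All fields are illustrative assumptions."""
    solved_task: bool              # correctness: task-level checks passed
    caused_regression: bool        # correctness: broke something that worked
    diff_lines: int                # reviewability: size of the diff
    has_rationale: bool            # reviewability: change explains its reasoning
    has_validation_evidence: bool  # reviewability: tests or logs attached
    accepted: bool                 # trust: reviewer merged the change
    reverted: bool                 # trust: change was later rolled back


def _rate(flags: Iterable[bool]) -> float:
    """Fraction of True values; 0.0 for an empty input."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def summarize(records: list[ChangeRecord]) -> dict:
    """Report each dimension on its own instead of collapsing to one score."""
    accepted = [r for r in records if r.accepted]
    return {
        "correctness": {
            "solve_rate": _rate(
                r.solved_task and not r.caused_regression for r in records
            ),
            "regression_rate": _rate(r.caused_regression for r in records),
        },
        "reviewability": {
            "median_diff_lines": median(r.diff_lines for r in records)
            if records else 0,
            "rationale_rate": _rate(r.has_rationale for r in records),
            "evidence_rate": _rate(r.has_validation_evidence for r in records),
        },
        "trust": {
            "accept_rate": _rate(r.accepted for r in records),
            # Revert rate is measured only over changes that were merged.
            "revert_rate": _rate(r.reverted for r in accepted),
        },
    }
```

Reporting a nested summary rather than a single blended number makes it visible when the dimensions diverge, for example when a change passes its checks but arrives as a large diff with no rationale.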
