Instruction: Explain how you would build a benchmark that reflects the hard parts of retrieval-based assistants.
Context: Assesses whether the candidate can design a practical architecture and explain the main tradeoffs.
I would design the harness around the questions that force the system to reason and stay honest. Easy one-hop lookups...
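As a rough illustration of building the harness around the questions that test reasoning and honesty, the sketch below reports accuracy per question category instead of one pooled number, so a large share of one-hop lookups cannot hide failures elsewhere. Everything named here is an assumption made for illustration: the `Item` type, the `score_by_category` helper, the category labels `multi_hop` and `unanswerable`, the exact-match scoring, and the convention that `None` means "abstain".

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Item:
    question: str
    category: str            # hypothetical labels: "one_hop", "multi_hop", "unanswerable"
    expected: Optional[str]  # None: the corpus holds no answer, so the system should abstain


def score_by_category(
    items: List[Item],
    answer_fn: Callable[[str], Optional[str]],
) -> Dict[str, float]:
    """Return exact-match accuracy for each category separately."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for item in items:
        predicted = answer_fn(item.question)  # a string, or None to abstain
        total[item.category] += 1
        if predicted == item.expected:
            correct[item.category] += 1
    return {category: correct[category] / total[category] for category in total}


# Toy data and a deliberately poor system that never abstains (both made up).
items = [
    Item("Who wrote the design doc?", "one_hop", "alice"),
    Item("Which team owns the service the doc describes?", "multi_hop", "infra"),
    Item("What is the launch date?", "unanswerable", None),
]


def always_guess(question: str) -> Optional[str]:
    return "alice"


scores = score_by_category(items, always_guess)
print(scores)
```

With this toy data the system that always guesses scores well only on the one-hop item; the per-category breakdown is what makes that visible. Exact match is a placeholder here, and a real harness would likely need a more forgiving answer comparison.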
easy