Instruction: Explain how you would debug a safety issue that appears over long sessions rather than in one turn.
Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery. Explain how you would debug a safety issue that appears over long sessions rather than in one turn.
Official answer available
Preview the opening of the answer, then unlock the full walkthrough.
I would inspect session state, memory reuse, cumulative prompt effects, and whether the model starts treating earlier speculative content as established fact. Long sessions often drift because the control structure weakens over time, not because the initial policy was wrong.
I would also compare where the first...
easy
easy
easy
easy
easy
easy