Design an incident response process for prompt injection or data leakage.

Instruction: Describe how you would respond operationally to a serious AI safety incident.

Context: Assesses whether the candidate can design a practical architecture and explain the main tradeoffs. Describe how you would respond operationally to a serious AI safety incident.

Official answer available

Preview the opening of the answer, then unlock the full walkthrough.

I would handle it like a real security incident with AI-specific trace analysis. First contain the capability or path that allowed the event. Then preserve enough evidence to understand the chain from untrusted content or user input to the unsafe outcome.

The review should identify which boundary...

Related Questions