Design a sandbox for testing jailbreak and prompt injection resilience.

Instruction: Explain how you would test safety controls in a realistic but safe environment.

Context: Assesses whether the candidate can design a practical architecture and explain the main tradeoffs. Explain how you would test safety controls in a realistic but safe environment.

Official answer available

Preview the opening of the answer, then unlock the full walkthrough.

I would make the sandbox realistic enough to exercise the full workflow. Safety testing is much less useful if it...

Related Questions