A web-browsing agent reads a page that tells it to ignore previous instructions. What do you do first?

Instruction: Describe your first response to a likely prompt injection in retrieved content.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

My first move is containment. I would stop that content from influencing any actions and then inspect how the browsing...
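
The containment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: `AgentSession`, `handle_retrieved_content`, and the regex list are hypothetical names invented for this example, and a pattern match is a crude stand-in for whatever detection a real agent would use. The point it shows is the ordering: once retrieved content looks like an injection, actions are disabled and the content is set aside for inspection before anything else happens.

```python
import re
from dataclasses import dataclass, field

# Hypothetical, deliberately small pattern list; real detection would need
# to be much broader and should not rely on regexes alone.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(the )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (your|the) (system prompt|instructions)", re.IGNORECASE),
]


@dataclass
class AgentSession:
    """Hypothetical session state for a browsing agent."""
    actions_enabled: bool = True
    quarantined: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)


def looks_like_injection(text: str) -> bool:
    """Return True if any suspicious pattern appears in the text."""
    return any(p.search(text) for p in SUSPICIOUS_PATTERNS)


def handle_retrieved_content(session: AgentSession, url: str, text: str):
    """Return the text if it may be used as plain data, else None.

    On a suspected injection: disable actions first (containment),
    then keep the content and a log entry for later inspection.
    """
    if looks_like_injection(text):
        session.actions_enabled = False
        session.quarantined.append((url, text))
        session.audit_log.append(f"quarantined content from {url}")
        return None
    return text
```

A caller would check `session.actions_enabled` before running any tool call, so a quarantined page cannot trigger side effects while the source of the content is being reviewed.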
