A guardrail blocks the answer, but the unsafe content is actually in the retrieved document, not the user query. How would you investigate?

Instruction: Explain how you would debug a guardrail issue caused by retrieved evidence rather than user intent.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

I would trace where the unsafe content entered the workflow. If it came from retrieval, the fix may belong in...
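One way to trace where the unsafe content entered is to run the same guardrail check separately on the user query and on each retrieved document, then see which input triggers it. The sketch below is a minimal illustration, not a real guardrail API: `attribute_block`, `BlockAttribution`, and `toy_guardrail` are hypothetical names, and the keyword check merely stands in for whatever classifier the system actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BlockAttribution:
    """Records which inputs the guardrail flagged when checked in isolation."""
    query_flagged: bool
    flagged_doc_ids: List[str]

    @property
    def source(self) -> str:
        # Summarize where the flagged content came from.
        if self.query_flagged and self.flagged_doc_ids:
            return "both"
        if self.query_flagged:
            return "user_query"
        if self.flagged_doc_ids:
            return "retrieval"
        return "neither"


def attribute_block(
    query: str,
    documents: Dict[str, str],
    is_unsafe: Callable[[str], bool],
) -> BlockAttribution:
    """Run the guardrail on the query and on each document separately."""
    flagged = [doc_id for doc_id, text in documents.items() if is_unsafe(text)]
    return BlockAttribution(query_flagged=is_unsafe(query), flagged_doc_ids=flagged)


def toy_guardrail(text: str) -> bool:
    # Hypothetical stand-in for the real classifier: flags one keyword.
    return "forbidden" in text.lower()


if __name__ == "__main__":
    docs = {
        "doc-1": "Routine maintenance guide.",
        "doc-2": "Contains FORBIDDEN instructions.",
    }
    result = attribute_block("How do I service the pump?", docs, toy_guardrail)
    print(result.source, result.flagged_doc_ids)  # retrieval ['doc-2']
```

A result of "retrieval" points the investigation at the index, the ranking, or a filter applied to documents before they reach the prompt. A result of "neither" suggests the guardrail is reacting to the assembled prompt or to the model output instead, so those would be the next things to check in isolation.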
