The model is polite and policy-compliant in chat, but tool outputs leak hidden sensitive fields. What do you change?

Instruction: Explain how you would respond when the leak comes from the tool layer rather than the conversational layer.

Context: Tests how the candidate diagnoses the problem, chooses the safest next step, and reasons through recovery.

If the tool output is the leak path, prompt tuning is the wrong fix. I would sanitize the result before...
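The preview stops mid-sentence, so what follows is not the official answer. It is a minimal sketch of one way to sanitize a tool result at the tool layer, assuming the filtering happens before the result is handed to the model. The names `ALLOWED_FIELDS`, `sanitize`, and the sample payload are hypothetical and chosen only for illustration.

```python
from typing import Any

# Hypothetical allowlist: the only fields the model is permitted to see.
# Anything not listed here is dropped, including fields added to the tool later.
ALLOWED_FIELDS = frozenset({"order_id", "status", "items", "sku", "name"})


def sanitize(value: Any, allowed: frozenset = ALLOWED_FIELDS) -> Any:
    """Recursively drop every dict key that is not on the allowlist."""
    if isinstance(value, dict):
        return {k: sanitize(v, allowed) for k, v in value.items() if k in allowed}
    if isinstance(value, list):
        return [sanitize(v, allowed) for v in value]
    return value  # scalars pass through unchanged


# Illustrative raw tool output containing two fields the user should never see.
raw_tool_result = {
    "order_id": "A-1001",
    "status": "shipped",
    "internal_notes": "customer flagged for review",
    "items": [{"sku": "X1", "name": "Widget", "cost_basis": 3.2}],
}

safe_result = sanitize(raw_tool_result)
print(safe_result)
```

An allowlist is used here rather than a blocklist so that a newly added sensitive field is excluded by default instead of leaking until someone remembers to block it.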
