Instruction: Describe the signals you would look at when evaluating agent memory.
Context: Checks whether the candidate can explain the core concept clearly and connect it to real production decisions.
I would compare the agent with and without memory on the workflows memory is supposed to improve. The key metrics are task success, number of clarification turns, consistency across turns, stale-memory incidents, and whether memory changes tool or routing decisions in a useful way.
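A minimal sketch of that with/without comparison, assuming each evaluation run is logged as a record. The `RunRecord` schema, its field names, and the `summarize` and `compare` helpers are hypothetical and only illustrate the idea; consistency across turns and routing changes would need their own scoring and are left out.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunRecord:
    """One evaluation run of the agent on one task (hypothetical schema)."""
    task_id: str
    memory_enabled: bool
    success: bool
    clarification_turns: int
    stale_memory_incidents: int


def summarize(runs):
    """Aggregate the metrics for one arm of the comparison."""
    if not runs:
        raise ValueError("need at least one run to summarize")
    return {
        "task_success_rate": mean(1.0 if r.success else 0.0 for r in runs),
        "avg_clarification_turns": mean(r.clarification_turns for r in runs),
        "stale_incidents_per_run": mean(r.stale_memory_incidents for r in runs),
    }


def compare(runs):
    """Return memory-on minus memory-off deltas for each metric."""
    on = summarize([r for r in runs if r.memory_enabled])
    off = summarize([r for r in runs if not r.memory_enabled])
    return {name: on[name] - off[name] for name in on}
```

Running the same task set in both arms keeps the deltas attributable to memory rather than to task mix. A positive success delta paired with a rising stale-incident delta is the case that deserves a closer look.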
I also inspect failure shape. Memory often hurts by surfacing outdated facts, overpersonalizing the response, or anchoring the agent on earlier assumptions that should have been replaced by fresh evidence.
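One of these failure shapes, outdated facts, can be checked mechanically. The sketch below assumes memory entries and fresh observations are both stored as keyed, timestamped facts; the `Fact` type and `find_stale_memories` helper are hypothetical. Overpersonalization and anchoring usually need transcript review or a rubric and are not covered here.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    key: str            # what the fact is about, e.g. "plan"
    value: str
    observed_at: float  # Unix timestamp in seconds


def find_stale_memories(memories, fresh_evidence):
    """Return memory facts contradicted by newer evidence for the same key."""
    # Keep only the most recent piece of evidence per key.
    latest = {}
    for fact in fresh_evidence:
        current = latest.get(fact.key)
        if current is None or fact.observed_at > current.observed_at:
            latest[fact.key] = fact

    stale = []
    for mem in memories:
        newer = latest.get(mem.key)
        # Stale means: newer evidence exists and it disagrees with memory.
        if (
            newer is not None
            and newer.observed_at > mem.observed_at
            and newer.value != mem.value
        ):
            stale.append(mem)
    return stale
```

Counting how often a stale fact was actually surfaced in a response, not just stored, gives the stale-memory incident rate.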
So I do not judge memory by whether the conversation feels smoother. I judge it by whether it improves the job without increasing stale context risk.
I try to avoid a process answer that sounds clean in theory but falls apart once the data, users, or production constraints get messy.
A weak answer measures memory mainly by whether users feel the agent remembers them. That can hide whether memory is actually improving or corrupting task execution.
easy