Draft note
Agent demos are good at showing the happy path.
Real workflows are less polite.
Fields change. Labels are ambiguous. The page loads slowly. A required answer depends on context the agent does not have. The user wants help, but not surrender.
That is why I keep coming back to reliability as a product problem, not only a model problem.
An agent that works in the real world needs state tracking, failure memory, recovery paths, and a clear sense of when to stop and ask.
The handoff is not a weakness. Sometimes it is the feature that keeps the system trustworthy.
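To make those four pieces concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration, not a reference to any real agent framework: the names `AgentRun`, `fill_field`, `resolve`, and `max_attempts` are assumptions, as is the convention that a slow page surfaces as `TimeoutError` and that missing context surfaces as `None`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class AgentRun:
    """Hypothetical record of one form-filling run."""
    filled: Dict[str, str] = field(default_factory=dict)    # state tracking
    failures: Dict[str, int] = field(default_factory=dict)  # failure memory
    handoffs: List[str] = field(default_factory=list)       # fields to ask the user about


def fill_field(
    run: AgentRun,
    name: str,
    resolve: Callable[[str], Optional[str]],
    max_attempts: int = 2,
) -> bool:
    """Try to fill one field. Return True if filled, False if handed off.

    Assumed convention: `resolve` raises TimeoutError on a transient
    failure (for example a slow page) and returns None when the agent
    lacks the context to answer.
    """
    if name in run.filled:
        return True  # already done; do not redo work

    # Recovery path: retry transient failures, but only up to the budget
    # remembered in run.failures.
    while run.failures.get(name, 0) < max_attempts:
        try:
            value = resolve(name)
        except TimeoutError:
            run.failures[name] = run.failures.get(name, 0) + 1
            continue
        if value is None:
            break  # missing context: retrying will not help
        run.filled[name] = value
        return True

    # Stop and ask: record the field for the user instead of guessing.
    run.handoffs.append(name)
    return False
```

The design choice worth noticing is that the handoff is an ordinary return path, not an exception. A field the agent cannot answer ends up in `run.handoffs`, where the product can turn it into a question for the user, while the fields it could answer stay filled.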