Why human review still matters, even with AI
Descartes taught us the concept of methodological doubt. He believed that the way to reach certainty is by systematically, though tentatively, doubting everything.
It is good practice to apply this principle to statements generated by AI.
AI agents often outsmart us, but we can still review their output, because it is easier to recognize something than to recall it. Generating a solution from scratch is hard; once it is in front of us, we can test it, compare it, and question it.
Reviews are not optional. We know agents tend to misbehave: they hallucinate, overstate their confidence, and have gaps in their knowledge. When errors are this predictable, skipping the review step means outsourcing judgment entirely.
Reviews remain necessary even with perfect agents. The flaw may not be in the system but in the input we provide: our prompt may be ambiguous, our assumptions incomplete, and our goals poorly framed.
By reviewing the output, we often discover weaknesses in the input.
Reviews remain necessary, but repeating them is not. When agents perform tasks under sufficiently similar conditions, a single review may be enough. What counts as “sufficiently similar” is debatable, since context is never identical; but the goal is not perfection, it is convergence toward something good enough.
Depending on the task, reviews do not always need to happen upfront; they can also be conducted after execution. This applies when the potential harm an agent can cause is limited in time and scope, and it especially applies to tasks that are reversible.
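As an illustration, here is a minimal sketch, in Python, of what post-hoc review of reversible actions could look like. Every name in it (ReversibleAction, ReviewQueue, and so on) is hypothetical, invented for this example rather than taken from any existing library.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ReversibleAction:
    """An agent task bundled with a way to undo it."""
    description: str
    apply: Callable[[], None]
    revert: Callable[[], None]

@dataclass
class ReviewQueue:
    """Execute first, review later: rejected actions are rolled back."""
    pending: list[ReversibleAction] = field(default_factory=list)

    def execute(self, action: ReversibleAction) -> None:
        action.apply()               # the agent acts immediately ...
        self.pending.append(action)  # ... and the review happens later

    def review(self, approve: Callable[[ReversibleAction], bool]) -> None:
        for action in self.pending:
            if not approve(action):
                action.revert()      # reversibility makes post-hoc review safe
        self.pending.clear()
```

Because each action carries its own revert, the agent can act immediately while human judgment arrives later, without permanent damage.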
Trial and error is not a flaw in the process; it is the process. Recognizing the need for review reflects responsible use. When applied proportionately to the stakes, this loop steadily improves both the output and our own judgment, guiding us toward results that are reliably good enough.
