The Flattering Self-Image: When We Compare Ourselves to the Machine, We Idealize the Human

Jun 02, 2026

There is a popular way of arguing that artificial intelligence does not really think. You draw two columns.

On one side you list what a human mind does when it judges: it takes in a rich sensory world, parses a situation, draws on lived experience, is moved by values and goals, reasons about cause and effect, monitors its own uncertainty, and arrives at a judgment it can be held accountable for.

On the other side you list what a large language model does: it ingests text, breaks it into tokens, matches patterns, runs the numbers, predicts the next likely words, and produces a confident answer whether or not that answer is true. Set side by side, the contrast is meant to be devastating.

The human reasons. The machine only predicts.

I think the diagram is wrong. Not about the machine. About us.

The human column is not a description of how clinicians actually judge. It is a description of how we would like to believe we judge. It is the idealized physician, rested and unhurried, free of bias, perfectly calibrated, reasoning from first principles toward an accountable conclusion. That physician does not work on my labor and delivery unit at three in the morning.

Consider the column line by line against real obstetric practice.

We are told that humans ground judgment in a rich perceptual world. Often we do not. A resident accepts a one-line sign-out and makes a call about a patient she has never seen. A consultant renders an opinion from a triage note and a single number. That is judgment grounded in thin text, which is precisely the limitation the diagram reserves for the machine.

We are told that humans reason about cause and effect. Obstetrics is a graveyard of plausible causal stories that the evidence later demolished. We monitored every low-risk labor continuously because the causal story was irresistible: watch the heart rate, prevent the catastrophe. Decades of data then showed more cesareans and operative deliveries with no reduction in cerebral palsy or neonatal death. We performed routine episiotomy on the same kind of reasoning, and prescribed bed rest, and sustained a long list of interventions on a confident narrative rather than on outcomes. Mistaking a good story for a real mechanism is not a machine problem. It is a human one.

We are told, above all, that humans monitor their own uncertainty and can withhold judgment, while the machine is built to project confidence even when it is wrong.

This is the most flattering line in the whole image, and the least accurate.

Physician overconfidence is among the best-documented phenomena in medicine, and diagnostic error driven by it is a leading source of preventable harm.

We rarely say the words “I do not know.”

We anchor on the first impression and then defend it.

A single catastrophic shoulder dystocia reshapes a clinician’s practice for years, far out of proportion to the real risk, because one vivid memory overwhelms the base rate.

Forced confidence, the exact phrase used to indict the machine, describes a great deal of human clinical behavior.

So the image does something quietly unfair.

It holds the machine to its worst behavior and the human to her best ideal.

It compares the language model as it actually is against the clinician as she wishes she were.

Judged honestly, several of the supposed fault lines between us are not fault lines at all. They are shared faults.

This matters for how we think about these tools, and it is where professional responsibility enters.

The relevant ethical question is never whether a tool matches an idealized epistemic agent who does not exist. It is whether the tool improves the balance of clinical benefit to harm compared with the realistic alternative.

The honest comparator for an AI counseling aid is not the perfectly calibrated professor in the diagram. It is a tired, biased, overconfident, time-pressured clinician, or very often no counseling at all.

None of this argues for handing judgment to the machine. The danger the diagram worries about is real. A fluent, confident answer can quietly substitute for the work of evaluating it, and the patient is left with the feeling of an answer rather than a justified one.

But that danger does not live only in silicon. A confident colleague, a confident guideline, and a confident memory of one bad night can each do the same thing. The verification, the humility, and the accountability the diagram calls irreducibly human are real and they matter. They are not properties we automatically possess by being human. They are obligations we choose to exercise, or neglect.

That names the actual professional responsibility in front of us. It is not to congratulate ourselves that we are the thoughtful column and the machine is the mechanical one. It is to supply the judgment that neither a pattern-completing model nor a pattern-matching, overconfident clinician reliably supplies on its own. Learning to use these tools well, and to check them, is part of that obligation now.

The first step is to take the diagram down off the wall and notice that the human in it is a stranger. None of us is that person at three in the morning. The sooner we admit it, the more honest, and the safer, our use of these tools becomes.


ObGyn Intelligence: The Evidence of Women’s Health
