A new national survey just reported that public trust in AI in healthcare has fallen from 52% to 42% in two years. The headline ran everywhere. The commentary flowed. Physicians worried. Hospital administrators convened meetings. And not one story stopped to ask the obvious question: what exactly did those 1,007 Americans think they were being asked about?
The survey never defined AI. Not once.
This is not a minor oversight. It is the whole problem.
What Is AI, Exactly?
Artificial intelligence is not one thing. It is a broad category that includes tools so different from each other that lumping them together is like asking whether Americans trust “medicine” and then drawing conclusions about surgery.
A large language model, or LLM, is a system trained on vast amounts of text that can read, write, reason, and respond in natural language. ChatGPT is an LLM. The AI that helped a patient understand her lab results last week is an LLM. The tool I use to review evidence and write for this publication is an LLM.
A diagnostic algorithm is something else entirely. It is a set of rules or a statistical model trained on specific clinical data to flag a pattern: this imaging scan looks like cancer, this fetal heart rate tracing warrants attention, this lab value is outside the safe range. These systems do not read or write. They match patterns. The code sketch at the end of this section makes the contrast concrete.
An administrative AI is different again. These are the tools that process insurance claims, predict patient no-shows, flag billing codes, and in some notorious cases, deny Medicare coverage at scale. One such system was found to have a 90% error rate in its denials, while the company relied on patients being too sick or too overwhelmed to appeal. That is not a technology story. That is an accountability story. But it gets filed under AI.
An AI scribe is yet another category: a tool that listens to a clinical encounter and generates a note. Some of these are genuinely useful. Some have been documented to introduce errors in up to 70% of the notes they produce. In some systems, the tool's original output is not preserved in the chart, which means the evidence you would need to audit it disappears with each encounter.
These are not four versions of the same thing. They are four different technologies with different architectures, different failure modes, different use cases, and different standards of evidence for deployment. Asking the public whether they trust AI without distinguishing between them is not a survey. It is a word association test.
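The difference between the first two categories is easiest to see in code. What follows is a minimal sketch, not a clinical tool: the LLM call assumes the OpenAI Python SDK with an API key in the environment, and the model name, prompt, and threshold values are illustrative placeholders of my own. The point is the shape of each interface.

```python
# Two of the four categories, side by side. Illustrative only.
from openai import OpenAI

# Category 1: a large language model. Free-form text in, free-form text out.
# Assumes the OpenAI Python SDK and an OPENAI_API_KEY in the environment;
# the model name is a placeholder.
client = OpenAI()
reply = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{
        "role": "user",
        "content": "My potassium is 5.8 mEq/L. What questions should I ask my doctor?",
    }],
)
print(reply.choices[0].message.content)  # paragraphs of natural language

# Category 2: a diagnostic-style algorithm. Structured value in, a flag out.
# It does not read or write; it matches a pattern against a threshold.
def flag_potassium(value_meq_per_l: float) -> bool:
    SAFE_LOW, SAFE_HIGH = 3.5, 5.0  # hypothetical bounds, not clinical advice
    return not (SAFE_LOW <= value_meq_per_l <= SAFE_HIGH)

print(flag_potassium(5.8))  # True: out of range, flagged for review
```

Two interfaces this different should not be sharing a survey question.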
Definition Is Not Bureaucracy. It Is Science.
In medicine, we do not publish a study on “drugs” and draw conclusions about all pharmacology. We specify the agent, the dose, the mechanism, the population, the outcome. The precision is not pedantry. It is what makes the finding mean something. Without it, you have noise, not evidence.
The same standard applies here. When a critic says AI is dangerous in healthcare, the first question should be: which AI, in which clinical context, deployed how, with what safeguards, and evaluated against what outcome? When a proponent says AI will transform medicine, the same questions apply.
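If it helps to make those questions concrete, here is a hedged sketch of the minimum record any claim about a healthcare AI should carry before it means anything. The field names and example values are mine, for illustration only.

```python
# A minimal, illustrative schema for a well-specified claim about healthcare AI.
# Field names and example values are hypothetical.
from dataclasses import dataclass

@dataclass
class AIClaim:
    system_type: str       # "LLM", "diagnostic algorithm", "claims processor", "scribe"
    clinical_context: str  # e.g., "mammography screening"
    deployment: str        # e.g., "second reader, physician-in-the-loop"
    safeguards: str        # e.g., "all flags reviewed before patient contact"
    outcome_measure: str   # e.g., "sensitivity/specificity vs. double reading"

claim = AIClaim(
    system_type="diagnostic algorithm",
    clinical_context="mammography screening",
    deployment="second reader, physician-in-the-loop",
    safeguards="all flags reviewed before patient contact",
    outcome_measure="sensitivity/specificity vs. double reading",
)
print(claim)
```

A claim that cannot fill in those five fields is not yet a claim about anything.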
Conflation is not neutral. It does real damage. When a claims-denial algorithm with a 90% error rate gets bundled into the same conversation as an LLM helping a patient prepare questions for her oncologist, both tools get tarred with the same brush. The failures of one become the assumed failures of all. And the genuinely useful tools get caught in a backlash they did not earn.
The reverse is also true. When AI proponents point to impressive diagnostic accuracy in radiology research, they are describing a narrow, well-validated tool in a specific domain. Using that success to defend the unmonitored rollout of AI scribes to 600 health systems is not an argument. It is sleight of hand.
What the Survey Actually Tells Us
The Ohio State survey found that 51% of adults had used AI to make an important health decision without consulting a physician. This number should give us pause, but not for the reasons most commentators cited. The worry is not that patients used AI. The worry is that no one knows which AI, for which decision, with what quality of information, and with what outcome.
The finding that 62% use AI to understand symptoms before seeking care is, depending on the tool and the symptom, either reassuring or concerning. A well-designed LLM helping a patient decide whether chest pain warrants an emergency room visit is not the same as an unvetted chatbot telling a pregnant woman her elevated blood pressure is nothing to worry about. The survey cannot tell us which world we are living in, because it never asked.
This is not a critique of the researchers. It is a critique of the discourse. We have built an entire public debate around a word that nobody has agreed to define. And then we wonder why trust is sliding.
My Take
I have spent the last year pushing my colleagues to take LLMs seriously as clinical and intellectual tools. I have published on it. I have argued that the physician who refuses to learn this technology is making the same mistake as the physicians who dismissed Semmelweis's evidence on handwashing. I stand by that position.
But I am equally impatient with the AI critics who write screeds against a word. If your argument is that AI is dangerous in healthcare, you owe your readers a sentence that begins: specifically, I mean.
An LLM with no clinical validation deployed without physician oversight? Say that.
A claims-denial system trained to minimize payouts at the expense of sick patients? Say that.
A diagnostic algorithm validated in one population and sold to another? Say that.
Precision is not a technicality. It is the difference between evidence and noise. Public trust is not slipping because people understand AI and reject it. It is slipping because they are watching a field deploy tools faster than it can explain them, and no one in a position of authority is offering definitions, standards, or accountability.
You cannot hold something accountable that you cannot define. That goes for the technology. And it goes for the people writing about it.

