ObGyn Intelligence: The Evidence of Women’s Health

The Prevention Files

What AI Got Right About Us — And What It Cannot Be

Patients rate AI responses as more empathic than physician responses. Before we draw any conclusions from that finding, we need to ask whether empathy is even what obstetric patients need most.

Amos Grünebaum, MD
Jul 01, 2026

Patients rated a chatbot as more empathic than their own physicians. I’ve been in the room when a fetal heart rate dropped to 60 and we had four minutes. Here is what that study actually reveals — and why we are asking the wrong question.

A patient in active labor at 38 weeks asked her nurse why the fetal heart rate was doing what it was doing. The nurse had three other patients. The answer she gave was accurate, brief, and insufficient. The patient remembered it for years.

That gap is real. It predates AI by decades. A 2023 study in JAMA Internal Medicine made it measurable: patients shown physician responses and AI chatbot responses to medical questions rated the chatbot as more empathic. The finding has since been replicated in oncology and patient portal research. [1]

The reflexive response in medicine is to treat that finding as a problem of tone, of training, of communication skills. I want to argue that we are solving for the wrong thing. Empathy is not what obstetric patients need most. Compassion is. And the two are not different words for the same concept.

Compassion and Empathy Are Not Synonyms

  • Empathy is the capacity to perceive and resonate with another person’s emotional state. It is cognitive and affective: you recognize what someone is feeling, and you feel some version of it yourself. It is also, as it turns out, learnable by a machine. An LLM trained on millions of human conversations can produce empathic-sounding language with impressive reliability. It has learned the form.

  • Compassion goes further. It is empathy plus the motivation to act and the act itself. The word comes from the Latin: to suffer with. A compassionate clinician does not just recognize that her patient is frightened; she is moved by that recognition to do something about it. She explains. She stays an extra two minutes. She calls back. She changes her language because this particular patient, in this particular moment, needs a different kind of communication. Compassion is a moral act, not a communicative one.

You can score high on an empathy scale and be a compassionless clinician. You can sound warm and still be absent. Patients know the difference, even when they cannot name it. Birth experiences are recalled with unusual fidelity for years. The nurse who gave the accurate, brief, insufficient answer was not cruel. She was stretched beyond the conditions that allow compassion to function.

What the Study Actually Measured

Bioethicist John Lantos argues that most empathy scales capture communicative empathy: warm tone, verbal acknowledgment, scripted validation. Those are real things. They matter. They are also reproducible by a language model. What the scales do not capture is what philosopher and psychiatrist Jodi Halpern calls emotional reasoning: a disciplined, medically informed attunement to what illness means in a specific patient’s life. [2, 3]

The JAMA study measured patients’ perceptions of responses to written medical questions. That is a useful measurement. It is not the same as what happens when a woman has been laboring for 22 hours, her epidural is wearing off, and the team is discussing whether to proceed to cesarean. In that room, what the patient experiences as compassion is inseparable from whether she trusts the clinician’s judgment, whether that clinician knows her history, and whether the clinician is genuinely present or performing presence.

There is also a structural problem the AI finding actually reveals. Lantos notes that medical training systematically erodes empathy, with the sharpest decline in the third year of medical school. The hidden curriculum rewards detachment. Evaluation systems measure diagnostic accuracy and procedural competence. The environment extinguishes what the curriculum claims to cultivate. If a chatbot outscores residents on empathy metrics, the finding tells us something about what residency does to residents. It tells us nothing about whether the chatbot can be compassionate.
