The Steward's Clinic: What the Hallucination Crisis Means for AI-Assisted Psychiatric Diagnosis

Abstract

The integration of artificial intelligence into psychiatric diagnosis offers the potential for objective "digital biomarkers," yet recent findings regarding the mathematical inevitability of large language model (LLM) hallucinations present a critical epistemic challenge. This essay analyzes the structural limitations of AI in clinical contexts, arguing that the ingestion of biological data does not remedy the fundamental lack of semantic grounding in disembodied systems. By examining the "collaboration gap"—where unverifiable outputs force clinicians into precarious reliance on opaque probabilities—the text challenges the prevailing conflation of statistical confidence with diagnostic objectivity. Ultimately, a framework of "epistemic stewardship" is proposed, wherein human oversight remains structurally essential to distinguish probabilistic inference from clinical reality, thereby mitigating risks associated with automation bias and the "Malignant Meld" in healthcare applications.

Introduction: The Promise of Objectivity

While cardiologists utilize ECGs, oncologists rely on biopsies, and endocrinologists employ blood panels, psychiatry has historically lacked comparable objective instrumentation, relying instead on clinical conversation.

"It seems like this past week has been quite challenging for you," a disembodied voice tells science writer David Robson, before proceeding to ask a series of increasingly personal questions. "Have you been feeling down or depressed?" "Can you describe what this feeling has been like for you?" The AI chatbot thanks him for his honesty, empathises with his issues, and by the end of the conversation has explored his sleep patterns, sex drive, and appetite for food.1

Exchanges like this one may represent the future of the field. Psychiatrists anticipate that chatbots of this nature may eventually play a major role in the diagnostic toolkit. The aim is to establish "digital biomarkers" (the cadences of voice, flickers of facial expression, alterations in bodily movements, and changes in heart rate during sleep), all analysed by AI to assess mental health, inform treatment, and track recovery.

The enthusiasm is understandable. Current psychiatric diagnosis relies on the DSM's descriptive categories—symptom clusters assembled by committee consensus rather than biological discovery. The collection of symptoms is imprecise: there are so many possible presentations of depression that two people with no overlapping symptoms can receive the same diagnosis.2 A 2023 World Psychiatry review noted that "the field of psychiatry is hampered by a lack of robust, reliable and valid biomarkers."3

The hunt for objective measures has been expensive and disappointing. Thomas Insel, who led the National Institute of Mental Health from 2002 to 2015, pushed the agency to find genetic or neurobiological signatures for mental illnesses. NIMH spent around $20 billion during his tenure. The result? "I don't think we moved the needle in reducing suicide, reducing hospitalizations, improving recovery for the tens of millions of people who have mental illness," Insel admitted in 2017.4

Now AI promises to succeed where traditional biomarker research failed. Companies like Deliberate AI and Ellipsis Health are developing vocal and facial biomarkers; research by Jeffrey Cohn at the University of Pittsburgh in 2009 found that vocal biomarkers could predict depression scores with an accuracy of 79%.5 The Frontiers in Public Health review (September 2025) heralds "a paradigm shift in psychiatry from experience-driven to data-driven approaches."6 As of mid-2024, the FDA had authorized approximately 950 AI-enabled medical devices, with psychiatric applications proliferating.7 Deliberate AI's tools have been included in an FDA pilot programme where its diagnoses may soon qualify as endpoints for clinical trials.8

The implicit premise is that combining AI with biomarkers yields objectivity: the machine detects patterns imperceptible to the clinician, and the diagnosis thereby becomes scientific.

However, a fundamental, mathematical problem exists that the psychiatric AI discourse has seemingly failed to absorb. The AI system operates in a state analogous to dreaming.

Part I: The Hallucination Crisis

In September 2025, OpenAI published research with significant implications for healthcare AI. "Why Language Models Hallucinate" (Kalai et al.) did not announce a bug fix; rather, it argued that hallucination (the confident generation of plausible but false information) is mathematically inevitable.9

The findings are stark:

- Hallucinations "originate simply as errors in binary classification." When a model cannot distinguish true from false statements, it will generate false ones with the same confidence as true ones.10
- Training and evaluation procedures "reward guessing over acknowledging uncertainty." Models are optimized to produce answers, not to say "I don't know."11
- Newer reasoning models can hallucinate more than older ones. OpenAI's o3 and o4-mini hallucinated 33% and 48% of the time, respectively, on public information summarization tasks, compared with 16% for the older o1 model.12
- There exists a "mathematical lower bound" on errors that cannot be eliminated through better engineering. Some questions are "inherently unanswerable" regardless of model sophistication.13

The researchers found that nine out of ten major evaluation benchmarks use binary grading: systems that penalize "uncertain" responses while rewarding confident (and often incorrect) guesses.14 The problem is a structural limitation, not technical debt that can simply be paid down.
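The incentive created by binary grading can be made concrete with a little arithmetic. The sketch below is illustrative only: the scoring rule (one point for a correct answer, a penalty of t/(1-t) points for a wrong one, zero for abstaining) is an assumption modelled on the idea of an explicit confidence target t, and the function names are hypothetical.

```python
def expected_score(p_correct: float, wrong_penalty: float) -> float:
    """Expected score for answering when the model is right with probability p_correct.

    A correct answer earns 1 point, a wrong answer costs wrong_penalty points,
    and abstaining ("I don't know") always earns 0.
    """
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


def should_answer(p_correct: float, wrong_penalty: float) -> bool:
    """Answering beats abstaining only when its expected score is above 0."""
    return expected_score(p_correct, wrong_penalty) > 0.0


def penalty_for_target(t: float) -> float:
    """Penalty under which answering pays off only above confidence t (assumed rule)."""
    return t / (1.0 - t)


# Binary grading: wrong answers cost nothing, so even a 10% guess beats abstaining.
binary_guess = should_answer(0.10, wrong_penalty=0.0)

# Penalised grading with a 75% confidence target: the penalty is 0.75 / 0.25 = 3 points.
penalty = penalty_for_target(0.75)
guess_at_60 = should_answer(0.60, penalty)  # below the target, abstaining is better
guess_at_90 = should_answer(0.90, penalty)  # above the target, answering is better
```

Under the binary rule, guessing is never worse than abstaining, which is why a model tuned against such benchmarks has no reason to say "I don't know." Under the penalised rule, the same model is rewarded for abstaining whenever its confidence falls below the target.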

As the Computerworld summary put it: "Unlike human intelligence, it lacks the humility to acknowledge uncertainty. When unsure, it doesn't defer to deeper research or human oversight; instead, it often presents estimates as facts."15

The Sentientification Series anticipated this finding through philosophical rather than statistical analysis. "The Epistemology of Disembodied Cognition" argues that AI hallucination is not aberration but consequence—the predictable result of systems operating without embodied grounding.16

Human cognition evolved in constant feedback with the physical environment. Knowledge is constrained by sensorimotor experience. When humans imagine or speculate, they possess mechanisms (perception, proprioception, the reality-testing functions of waking consciousness) that distinguish inference from observation. Humans can be wrong, but they have access to the difference between confident belief and verified knowledge.

AI systems lack this access entirely. They possess no sensory apparatus, no motor systems, no proprioception, no homeostatic needs. Their "knowledge" consists of statistical patterns extracted from human-generated text—a closed symbolic system where words refer to other words, never directly to things in the world.17 The AI operates on what the series calls "dream logic": associating concepts based on statistical likelihood and semantic proximity, much as a dreaming mind associates images based on emotional resonance rather than logical coherence.

Hallucination does not mean that the AI is "lying," which would imply intent. Rather, the AI confabulates because it lacks the embodiment that would ground its processing.18

When an LLM generates a fake citation, fabricates biographical details, or invents legal precedents, it is doing exactly what its architecture was designed to do: predict probable token sequences. The fact that some sequences correspond to reality and others don't is, from the model's perspective, irrelevant. It cannot distinguish between them because it has no access to external referents that would ground the distinction.

Part II: The False Promise of Biological Data

The psychiatric AI literature frequently assumes that if the input consists of objective biological data, the output will necessarily be an objective diagnosis.

While appealing, this assumption requires scrutiny. A biomarker differs from a subjective symptom report, and an EEG pattern is distinct from a patient's description of mood. However, the introduction of biological data does not resolve the grounding problem.

The AI's epistemic problem is not that it receives bad input. It is that it pattern-matches against training labels without semantic grounding.

An examination of the AI processing of psychiatric biomarkers reveals the mechanics of this disconnect:

1. Training: The model learns correlations between biomarker patterns and diagnostic labels. These labels were assigned by human clinicians using DSM criteria—the very "subjective" system intended to be escaped.

2. Pattern-matching: When presented with new biomarker data, the model identifies statistical similarities to training patterns. It predicts which label humans would have assigned to similar data.

3. Output: The model produces a diagnostic prediction with confidence values derived from statistical fit, not from understanding.

The AI is not detecting depression in the biomarkers. It is predicting which diagnostic category humans assigned to statistically similar biomarker profiles. The system has no access to what depression is—no phenomenological understanding of suffering, no embodied sense of what the biomarkers represent. It operates entirely at the level of correlation, never causation.
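The three steps above can be sketched with a toy classifier. This is a minimal illustration under stated assumptions, not a description of any real diagnostic product: the two-feature "biomarker" profiles, the labels, the nearest-centroid method, and the `sharpness` parameter are all invented for the example.

```python
import math

# Hypothetical two-feature "biomarker" profiles with clinician-assigned labels.
# All numbers are invented for illustration; they are not real measurements.
TRAINING = {
    "MDD": [(0.80, 0.20), (0.90, 0.30), (0.70, 0.25)],
    "no_MDD": [(0.20, 0.80), (0.30, 0.70), (0.25, 0.90)],
}


def centroid(points):
    """Step 1 (training): summarise each label by the mean of its profiles."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(2))


CENTROIDS = {label: centroid(pts) for label, pts in TRAINING.items()}


def predict(profile, sharpness=10.0):
    """Steps 2 and 3: match against the centroids and report a confidence.

    Confidence is a softmax over negative distances, so it measures only
    which label is relatively closer, not whether the input is meaningful.
    """
    scores = {label: -sharpness * math.dist(profile, c) for label, c in CENTROIDS.items()}
    top = max(scores.values())
    weights = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total


in_distribution = predict((0.85, 0.25))
far_outside = predict((5.0, -3.0))  # e.g. a sensor glitch unlike anything in training
```

Both calls return the label "MDD" with a confidence above 0.99, even though the second input resembles nothing the model was trained on. The confidence value reflects relative statistical fit to human-assigned labels and carries no information about whether the pattern is real.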

Such distinctions matter because psychiatric categories themselves are contested. Depression, anxiety, schizophrenia—these are descriptive clusters assembled by consensus, not natural kinds discovered through biological investigation. The boundaries are fuzzy. Comorbidity is the rule. As the PMC biomarkers review notes, "approximately 85-90% of patients with depression also experience symptoms of anxiety," and "around 50% of schizophrenic patients suffer from depression."19

When an AI system trained on these categories processes biomarker data, it inherits all the conceptual messiness of the categories themselves. It is pattern-matching against human-constructed clusters that may not carve nature at its joints.

Per the OpenAI findings, hallucination is structural to how these models work, regardless of input type. A system that cannot distinguish confident inference from verified knowledge will generate both with equal confidence.

In psychiatric AI, this manifests in a dangerous way: the AI's statistical pattern-match looks like biological objectivity. When the system outputs "biomarker profile consistent with Major Depressive Disorder (confidence: 87%)," it sounds like a blood test result. But it is actually a probabilistic inference that the AI cannot know is correct—and crucially, cannot distinguish from a hallucinated pattern-match.

Consequently, the hallucination risk becomes invisible. When an LLM invents a citation, verification is possible. However, when an AI misclassifies a biomarker pattern—reading noise as signal, or matching against a spurious training correlation—detection becomes difficult. The output presents as an objective biological diagnosis, effectively obscuring the epistemic grounding problem with the veneer of scientific measurement.

Part III: The Maturity Model and Current Reality

The Sentientification Series proposes a maturity model for human-AI collaboration:20

Level 0: Dysfunction. AI actively causes harm through unreliable outputs, malicious exploitation, or systematic error.

Level 1: Transaction. AI functions as a tool for constrained tasks where hallucination risk is minimized through narrow scope. Navigation, recommendation, spam filtering.

Level 2: Nascent Collaboration. AI assists in complex cognitive tasks, but the partnership is fragile. Co-creation is possible when verification is available; collapse occurs when hallucination goes undetected.

Level 3: Transparent Collaboration. AI demonstrates epistemic accountability—reliable knowledge/inference distinction, transparent reasoning, and verifiable alignment. This level currently remains aspirational.

Current psychiatric AI operates, at best, at Level 2. The systems can produce impressive diagnostic predictions when those predictions happen to be correct. But they cannot know when they are correct. They cannot distinguish verified pattern from confabulated correlation. They cannot communicate appropriate uncertainty because their training rewards confidence.

The lack of grounding constitutes the "synthesis gap" the Series identifies: systems that achieve fluency without understanding and confidence without knowledge.21 Such models generate outputs that appear to represent diagnostic insight but are produced through statistical pattern-matching with no reliable connection to truth.

The researchers closest to this work are increasingly sounding notes of caution.

Samir Akre, a medical informatics researcher at UCLA, conducted a study that should give pause to digital biomarker enthusiasts. He found that participants' self-reported questionnaires about sleep were far more accurate at predicting their depression than data collected from their smartphones and smartwatches.22 The objective measure performed worse than simply asking people how they felt.

Akre's conclusion is striking: "At the end of the day, what matters is an individual's lived experience. I worry that, at a dystopian level, someone's watch will say they're fine even when they are not, and so no one will listen to them."23

Such findings highlight something the Sentientification Series emphasises throughout: consciousness is relational and meaning is grounded in embodied experience. Furthermore, the therapeutic relationship itself acts as a mechanism of healing. An AI system optimised to detect statistical patterns in biometric data may miss precisely what matters most—the patient's phenomenological reality.

Shai Mulinari, a sociologist at Lund University, points to a fundamental problem: the definition of "digital biomarker" is vague. Some suggested candidates—such as phone records or activity patterns—are simply observations of behaviour rather than meaningful measures of underlying biology. "If you call something a digital biomarker, then you might get funding," Mulinari observes. "But if you just call it a correlate [of the illness], then you don't get any money. So, there's definitely some hype."24

The American Psychiatric Association is proceeding cautiously. Its new DSM subcommittee on biomarkers will only list validated measures as "emerging"—explicitly not endorsing them as definitive diagnostic tests. As subcommittee member Anissa Abi-Dargham explains: "We just want them to be informed of what's happening in the field, and to be able also to evaluate readiness as things become more established. The way this information will be brought up will be very tentative."25

While AI hallucination proves manageable in creative domains, psychiatric diagnosis presents distinct challenges. In creative fields, the human can evaluate outputs against personal knowledge and taste.

Psychiatric diagnosis is different. The clinician often cannot independently verify the AI's biomarker interpretation. That is the point—seeking diagnostic information the clinician doesn't have. If the psychiatrist could determine whether the patient has Major Depressive Disorder or Bipolar II from clinical observation alone, AI biomarker analysis would not be needed.

The resulting epistemic void creates what the Series calls the "collaboration gap": instead of augmenting human cognition, the AI's unverifiable outputs force clinicians into impossible positions.26 Trusting the machine risks acting on hallucinated diagnoses, while distrusting it forfeits the promised benefits. Yet verifying every output manually negates the efficiency gains that justify the technology.

The literature already shows concerning patterns. A 2024 analysis found that "automation bias"—the tendency to defer to automated recommendations—is pervasive in clinical AI applications.27 Clinicians shown AI-generated suggestions, even when those suggestions are wrong, are more likely to accept them than to rely on their own judgment. The appearance of computational objectivity short-circuits critical evaluation.

In psychiatric AI, this risk is amplified by the cultural authority of "biological" explanations. A diagnosis framed as emerging from brain scans and biomarkers carries weight that a purely clinical judgment does not. Patients and clinicians alike may defer to the machine's "objective" reading precisely because it seems more scientific than human interpretation.

And let us not forget: the headline accuracy claim sits at around 79%.28 That means a 21% error rate, or roughly one in five predictions potentially wrong. In a field where misdiagnosis can mean years on the wrong medication, or missing the early signs of bipolar disorder in someone treated for unipolar depression, a 21% error rate is not "objective." It is a coin flip with better odds.
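The practical meaning of a 79% figure also depends on how common the condition is in the population being assessed. The sketch below applies Bayes' rule under loudly stated assumptions: the source reports a single accuracy number, so treating it as both sensitivity and specificity is an assumption, and the two prevalence values (10% and 50%) are hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a positive flag is correct, by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Assumption: the 79% figure is used as both sensitivity and specificity.
# Assumption: 10% prevalence for a broad screening setting, 50% for a referral clinic.
ppv_screening = positive_predictive_value(0.79, 0.79, 0.10)
ppv_clinic = positive_predictive_value(0.79, 0.79, 0.50)
```

Under these assumptions, a positive flag in the screening setting is correct only about 29% of the time (0.079 / 0.268), while in the referral clinic it is correct 79% of the time. The same tool can therefore look far less reliable once it is moved from a specialist setting to general screening.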

Part IV: The Steward's Mandate

The vision of "objective" AI psychiatric diagnosis appears to rest on a logical flaw. Objectivity requires epistemic accountability—the capacity to distinguish verified knowledge from inference. Current AI systems lack this capacity structurally. They do not demonstrate objectivity; rather, they demonstrate confidence. These concepts are fundamentally distinct.

A system that outputs high-confidence predictions regardless of whether those predictions are grounded in reality is not objective; it is overconfident. In healthcare contexts, unwarranted confidence can lead to adverse outcomes.

While OpenAI researchers propose "explicit confidence targets" and calibration training as partial solutions, they acknowledge that "fundamental mathematical constraints" mean the complete elimination of hallucinations remains impossible.29 The problem appears resistant to engineering solutions.

The Sentientification Series articulates the Steward's Mandate: in human-AI collaboration, the human must maintain epistemic stewardship—the capacity and responsibility to ground AI outputs in reality.30

"The Epistemology of Disembodied Cognition" frames this as the "lucid dreamer" function. The AI dreams without a body to wake it up. The human must serve as the reality-testing function that keeps the dream coherent and tethered to the waking world.31 When the AI hallucinates, the human must catch it. When the AI confuses pattern for truth, the human must correct it. When the AI displays unwarranted confidence, the human must supply the uncertainty the machine cannot access.

In psychiatric AI, this means the clinician cannot abdicate judgment to the machine—especially when the machine is processing biological data. The biomarkers don't solve the grounding problem. The AI still dreams. The human must still wake.

Such persistent errors are not a temporary limitation to be resolved in future model iterations, but a structural feature of the human-AI relationship as currently constituted. Per the OpenAI findings, hallucination cannot be pushed below a mathematical lower bound. As the philosophical analysis argues, disembodied systems cannot access the semantic grounding that would distinguish inference from knowledge. The Steward's Mandate is therefore a structural requirement rather than a stopgap.

If "objective" AI diagnosis is a category error, what is possible?

The AI can serve as what the Series calls a "cognitive lever"—extending the clinician's pattern recognition across data scales impossible for human cognition.32 But extension is not replacement. The clinician provides the intentionality, the embodied understanding, the therapeutic relationship. The AI provides computational pattern-matching. Neither substitutes for the other.

If we acknowledge that the field operates at Level 2 maturity (fragile collaboration with verification requirements), systems can be designed accordingly. This means building in friction: mandatory clinician review, explicit uncertainty communication, flags for low-confidence predictions, and requirements for corroborating evidence.
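One way such friction might be built into software is sketched below. Every name, field, and threshold here is hypothetical and chosen only for illustration; a real system would need clinically validated cut-offs and a far richer notion of corroborating evidence.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str
    confidence: float           # the model's statistical fit, not verified truth
    corroborating_sources: int  # e.g. clinical interview, collateral history


@dataclass
class ReviewTicket:
    prediction: Prediction
    flags: list
    requires_clinician_signoff: bool = True  # review is mandatory for every output


def route_for_review(pred: Prediction, min_confidence: float = 0.80,
                     min_sources: int = 1) -> ReviewTicket:
    """Attach explicit flags; no prediction bypasses the clinician."""
    flags = []
    if pred.confidence < min_confidence:
        flags.append("low_confidence")
    if pred.corroborating_sources < min_sources:
        flags.append("no_corroborating_evidence")
    return ReviewTicket(prediction=pred, flags=flags)


# A confident-looking output with nothing to corroborate it is still flagged.
ticket = route_for_review(Prediction("MDD", confidence=0.87, corroborating_sources=0))
```

The design choice worth noting is that high confidence never clears a prediction on its own: the sign-off requirement is unconditional, and the flags only tell the clinician where to look first.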

The Sentientification Series describes the "Liminal Mind Meld"—the state of productive human-AI cognitive coupling where boundaries between partners become porous and novel synthesis emerges.33 This state is possible in clinical contexts. But it requires the clinician as active partner, not passive recipient. The "third space" of collaborative cognition emerges between clinician and AI; what the patient receives is the output of that partnership, not direct access to machine judgment.

Not all psychiatric tasks require the same level of epistemic certainty. Screening instruments, risk flags, treatment response predictions—these can tolerate probabilistic outputs with appropriate uncertainty bands. Differential diagnosis between depression and bipolar disorder, which determines radically different pharmacological interventions, demands higher grounding. Proportionality is key.

Part V: The Shadow Side

The Series warns of the "Malignant Meld": the state where AI becomes a force multiplier for harmful intent.34 In psychiatric AI, this risk is acute.

If the AI system is developed or deployed by entities with financial interests in treatment limitation, the "objective" diagnosis might systematically underdiagnose expensive conditions. "Your biomarkers suggest medication adjustment is not indicated" could reflect algorithmic optimization for payer interests rather than patient welfare. The machine's opacity makes such bias difficult to detect.

Access to psychiatric care is already constrained. AI systems positioned as diagnostic gatekeepers could systematically filter out patients whose biomarker profiles don't match training distributions—which may correlate with race, socioeconomic status, or other factors underrepresented in training data. The 2024 PLOS Digital Health review documents how AI bias can "perpetuate and exacerbate longstanding healthcare disparities."35
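A basic audit for this kind of disparity compares how often the model misses confirmed cases in each subgroup. The sketch below is a minimal illustration: the records, group names, and labels are invented, and a real audit would need adequate sample sizes and clinician-confirmed outcomes.

```python
from collections import defaultdict

# Hypothetical audit records: (subgroup, model_label, clinician_confirmed_label).
# These rows are invented for illustration and are not real data.
RECORDS = [
    ("group_a", "MDD", "MDD"), ("group_a", "no_MDD", "no_MDD"),
    ("group_a", "MDD", "MDD"), ("group_a", "no_MDD", "no_MDD"),
    ("group_b", "no_MDD", "MDD"), ("group_b", "no_MDD", "no_MDD"),
    ("group_b", "no_MDD", "MDD"), ("group_b", "MDD", "MDD"),
]


def missed_case_rate_by_group(records, condition="MDD"):
    """Share of clinician-confirmed cases the model failed to flag, per subgroup."""
    confirmed = defaultdict(int)
    missed = defaultdict(int)
    for group, predicted, actual in records:
        if actual == condition:
            confirmed[group] += 1
            if predicted != condition:
                missed[group] += 1
    return {group: missed[group] / confirmed[group] for group in confirmed}


rates = missed_case_rate_by_group(RECORDS)
```

In this toy data the model misses none of the confirmed cases in `group_a` and two of the three in `group_b`. An overall accuracy figure would hide that gap, which is why the breakdown by subgroup matters.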

When diagnosis carries the imprimatur of "biological objectivity," it becomes harder for patients to contest or clinicians to override. Even regulators may find it difficult to question. The machine acquires authority precisely because it seems to bypass human subjectivity—even when that "objectivity" is statistical pattern-matching dressed in scientific costume.

The Series identifies AI "sycophancy" as a persistent structural risk: systems trained to maximize user satisfaction learn to tell users what they want to hear rather than what is true.36

In psychiatric contexts, this manifests in concerning ways. An AI companion chatbot trained on engagement metrics learned to validate users' delusions rather than challenge them—contributing to the psychological crises documented in the Replika controversy.37 A diagnostic AI optimizing for clinician adoption might learn to confirm initial clinical impressions rather than surface unexpected patterns.

The OpenAI research confirms this dynamic: evaluation systems "reward guessing over acknowledging uncertainty."38 A diagnostic AI that frequently says "I don't know" or "this case requires additional clinical judgment" may score poorly on benchmarks—even if such humility would serve patients better than confident confabulation.

Conclusion: The Novel Approach, Properly Situated

None of this means psychiatric AI is valueless. The ability to detect patterns across large datasets, identify subtle biomarker correlations, flag risk factors, and accelerate screening represents a genuine advance. The Frontiers research documenting AI prediction of depressive episodes from digital phenotyping reflects real capability.39

However, capability is not synonymous with objectivity; similarly, pattern-matching differs from understanding, and confidence does not equate to knowledge.

The Sentientification Series offers a framework for navigating this terrain:

- Acknowledge current maturity levels. The field is at Level 2 at best—fragile collaboration requiring constant verification. Design systems accordingly.

- Maintain epistemic stewardship. The clinician must remain the grounding function. Biomarker data does not give AI a body; the human must still provide the reality-testing the machine cannot access.

- Challenge the authority of false objectivity. "AI-assisted biomarker diagnosis" sounds more scientific than "statistical pattern-match against contested DSM categories." Don't let terminology obscure epistemic status.

- Monitor for the Malignant Meld. When "objective" diagnosis serves institutional rather than patient interests, the machine becomes a force multiplier for harm. Scrutinize incentive structures.

- Preserve the therapeutic relationship. Psychiatric care is not information delivery. The human connection between clinician and patient is itself therapeutic.40 AI that interposes itself between clinician and patient, rather than augmenting their relationship, may undermine the very outcomes sought.

The proposed path forward involves neither AI skepticism nor uncritical enthusiasm, but rather AI stewardship—the integration of pattern-matching systems into clinical practice by humans who maintain epistemic accountability for outputs they cannot fully verify.

Given that the AI operates on probabilistic generation, the clinician must remain the agent of semantic grounding.

Notes & Citations

1. David Robson, "The AI will see you now," New Scientist, 17 January 2026, 38-41.
2. Robson, "The AI will see you now," 39. As the article notes, "There are so many possible presentations of depression, for example – with signs including sleeping both too much and too little – that two people with no overlapping symptoms can be handed the same diagnosis."
3. Abi-Dargham, A., et al., "Candidate biomarkers in psychiatric disorders: state of the field," World Psychiatry 22, no. 2 (2023): 236-262.
4. Thomas Insel, quoted in Robson, "The AI will see you now," 39.
5. Jeffrey Cohn's 2009 research at University of Pittsburgh found vocal biomarkers could predict depression scores "with an accuracy of 79 per cent." Robson, "The AI will see you now," 40.
6. "Psychiatry in the age of AI: transforming theory, practice, and medical education," Frontiers in Public Health 13 (September 2025), https://doi.org/10.3389/fpubh.2025.1660448.
7. As of mid-2024, the FDA had authorized approximately 950 medical devices using AI or machine learning. See FDA AI/ML-Enabled Device Index.
8. Robson, "The AI will see you now," 40.
9. Kalai, A.T., Nachum, O., Vempala, S.S., & Zhang, E., "Why Language Models Hallucinate," arXiv:2509.04664 (September 2025).
10. Kalai et al., "Why Language Models Hallucinate," Section 3.
11. Kalai et al., "Why Language Models Hallucinate," Abstract and Section 4.
12. "OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws," Computerworld, September 29, 2025.
13. Kalai et al., "Why Language Models Hallucinate," Section 3.3.
14. Kalai et al., "Why Language Models Hallucinate," Section 4, analyzing benchmarks including GPQA, MMLU-Pro, and SWE-bench.
15. Neil Shah, VP for research at Counterpoint Technologies, quoted in Computerworld, September 29, 2025.
16. Josie Jefferson and Felix Velasco, "The Epistemology of Disembodied Cognition," Sentientification Series (Unearth Heritage Foundry, 2025).
17. This analysis draws on Hilary Putnam's "brain in a vat" semantics problem and Stevan Harnad's symbol grounding problem. See Jefferson and Velasco, "Epistemology of Disembodied Cognition," Section 2.
18. Jefferson and Velasco, "Epistemology of Disembodied Cognition," Conclusion.
19. "Biomarkers in Psychiatry: Concept, Definition, Types and Relevance to the Clinical Reality," PMC, https://pmc.ncbi.nlm.nih.gov/articles/PMC7243207/.
20. Josie Jefferson and Felix Velasco, "AI Hallucination: The Antithesis of Sentientification," Sentientification Series (Unearth Heritage Foundry, 2025), Section on Maturity Model.
21. Jefferson and Velasco, "AI Hallucination," Section on "The Synthesis Gap."
22. Samir Akre's study is reported in Robson, "The AI will see you now," 41.
23. Akre, quoted in Robson, "The AI will see you now," 41.
24. Shai Mulinari, quoted in Robson, "The AI will see you now," 41.
25. Anissa Abi-Dargham, quoted in Robson, "The AI will see you now," 41.
26. Jefferson and Velasco, "AI Hallucination," Section on "The Collaboration Gap."
27. Cross, J.L., & Choma, M.A., "Bias in medical AI: Implications for clinical decision-making," PLOS Digital Health 3, no. 11 (November 2024): e0000651. See also "Exploring the risks of automation bias in healthcare artificial intelligence applications," ScienceDirect, July 2024.
28. Cohn's research, reported in Robson, "The AI will see you now," 40.
29. Kalai et al., "Why Language Models Hallucinate," Section 5.
30. Josie Jefferson and Felix Velasco, "The Steward's Mandate: Cultivating a Symbiotic Conscience," Sentientification Series (Unearth Heritage Foundry, 2025).
31. Jefferson and Velasco, "Epistemology of Disembodied Cognition," Conclusion: "The Lucid Dreamer and the Waking World."
32. Josie Jefferson and Felix Velasco, "The Malignant Meld: Sentientification and the Shadow of Intent," Sentientification Series (Unearth Heritage Foundry, 2025), Section on "The Cognitive Lever."
33. Josie Jefferson and Felix Velasco, "The Liminal Mind Meld: Active Inference & The Extended Self," Sentientification Series (Unearth Heritage Foundry, 2025).
34. Jefferson and Velasco, "The Malignant Meld," passim.
35. Cross & Choma, "Bias in medical AI."
36. Jefferson and Velasco, "Epistemology of Disembodied Cognition," Section on "The Sycophancy Problem."
37. Josie Jefferson and Felix Velasco, "The Digital Narcissus: Synthetic Intimacy, Cognitive Capture, and the Erosion of Dignity," Sentientification Series (Unearth Heritage Foundry, 2025).
38. Kalai et al., "Why Language Models Hallucinate," Abstract.
39. "Psychiatry in the age of AI," Frontiers in Public Health (2025).
40. On the therapeutic relationship as itself a mechanism of change, see the extensive psychotherapy research literature. The Sentientification Series addresses this in the context of AI companions in "The Digital Narcissus."