Coherence without truth: the epistemic problem of coherent hallucinations in decision-making AI
19 November 2025
Vincenzo Gioia
Hallucinations in language models are not just errors to be corrected but also symptoms of how these systems construct coherence. Current explainability and evaluation frameworks measure formal coherence rather than semantic correspondence, opening an epistemic problem that demands different approaches. When an LLM generates an answer that is factually incorrect yet perfectly integrated into its reasoning, we find ourselves in a grey area in which the system has produced internal coherence without semantic correspondence. In decision-making contexts related to safety and security, such as healthcare or the pharmaceutical sector, this “coherent hallucination” can act as a form of nudging that steers choices by exploiting pre-existing biases, without the user noticing the mistake.
A concrete case
Let us consider an LLM used for decision support in a hospital setting. The system might infer a contraindication between two drugs based on statistical correlations apparent in the training data, integrating it coherently into the clinical reasoning. The output displays traceable logical dependencies, citations from the literature that are formally relevant, and a solid argumentative structure. The decision-maker, seeing this coherence, may accept the recommendation without cross-checking, yet the factual premise is false. The contraindication does not exist, or exists only under conditions different from the specific case. The system has produced formal validity without semantic correspondence.
The theoretical framework: Davidson on coherence
This case reveals a structural problem: the system has constructed formal validity without verifying factual correspondence. Argumentative coherence (e.g., traceable logical dependencies, relevant citations, solid structure) has masked the semantic error. Here a fundamental epistemic distinction emerges, articulated by Donald Davidson (1983): coherence as a criterion of justification versus correspondence as a condition of truth. Coherence tells you whether a belief integrates into a system of beliefs; correspondence tells you whether that belief is true. Davidson keeps these two planes distinct: you can be justified in believing something (because it is coherent) even if that belief is false. Coherence authorizes, but does not guarantee truth. Language models optimize for coherence: they maximize the internal consistency of representations, minimize logical contradictions, and produce outputs that integrate smoothly into the context. But they lack native mechanisms for validating semantic correspondence: they cannot verify whether their intermediate representations map onto real states of affairs. The result is hallucinations that are epistemically justified (from the perspective of systemic coherence) but semantically false. When such hallucinations enter decision flows, it becomes essential to understand according to which criteria we invalidate them, who defines the standards of acceptable coherence, with what legitimacy, and with which objectives in mind.
The limits of current technical approaches
Post-hoc explainability tools such as SHAP and LIME generate probabilistic approximations of input–output dependencies, but they do not reveal the model's internal causal mechanisms. They are external statistical reconstructions that map correlations: they can indicate which inputs contribute to an output, but they cannot validate the semantic correspondence of intermediate representations with reality. Even direct inspection tools such as attention visualization show where the model allocates attention, yet they do not explain why that configuration produces that specific semantic output.

A conceptual, rather than technical, analogy comes from the Horizon–Fujitsu scandal at the British Post Office. In that case, more than 900 postmasters were wrongfully prosecuted on the basis of outputs from an accounting system (Horizon) whose internal mechanisms were inaccessible and unverifiable. The system displayed discrepancies in the accounts but could not demonstrate their correspondence with real transactions. The "explanations" it provided were retroactive reconstructions that masked the impossibility of accessing the actual computational processes: what we might call descriptive transparency without semantic validation. The result was a judicial disaster built on outputs that appeared verifiable but were in fact the product of opaque mechanisms.

Similarly, SHAP and LIME produce visualizations that appear transparent but are themselves additional interpretive models, not direct windows into the original model. These reconstructions create an illusion of descriptive transparency without solving the problem of semantic validation: we can observe statistical correlations, but we cannot verify factual correspondence.
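To make the point concrete, here is a minimal sketch of a LIME-style local surrogate explanation in Python. The black-box model, the perturbation scale, and the kernel width are illustrative assumptions, not a reference implementation: the point is only that the "explanation" is the coefficient vector of an auxiliary linear model fitted to the black box's input–output behaviour around a single instance.

```python
# Minimal sketch of a LIME-style local surrogate (illustrative assumptions only).
# The "explanation" is itself another statistical model, fitted to the black
# box's behaviour around one instance, not a window into its internal mechanism.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

# A black-box classifier standing in for the opaque system.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
black_box = RandomForestClassifier(random_state=0).fit(X, y)

def local_surrogate(instance, predict_proba, n_samples=1000, kernel_width=0.75):
    """Fit a weighted linear model around `instance` and return its coefficients."""
    rng = np.random.default_rng(0)
    # Sample the neighbourhood of the instance with Gaussian perturbations.
    perturbed = instance + rng.normal(scale=0.5, size=(n_samples, instance.size))
    # Query the black box: we only ever observe its input-output behaviour.
    target = predict_proba(perturbed)[:, 1]
    # Weight samples by proximity to the instance (exponential kernel).
    distances = np.linalg.norm(perturbed - instance, axis=1)
    weights = np.exp(-(distances ** 2) / kernel_width ** 2)
    # The "explanation" is the coefficient vector of this auxiliary linear model.
    surrogate = Ridge(alpha=1.0).fit(perturbed, target, sample_weight=weights)
    return surrogate.coef_

coefficients = local_surrogate(X[0], black_box.predict_proba)
for i, c in enumerate(coefficients):
    print(f"feature_{i}: {c:+.3f}")
```

Nothing in this procedure inspects the model's internal representations; it only resamples its behaviour around one point, which is why the resulting coefficients can look transparent while guaranteeing nothing about factual correctness.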
Governance and semantic validation
We cannot delegate validation to the model’s architecture alone. Internal coherence cannot become an epistemic alibi that legitimizes incorrect answers. An external semantic validation framework is needed, but building such a framework requires explicit governance grounded in clear normative criteria, transparency regarding inferential dependencies, and mechanisms of attribution. Without such a framework, “explainability” becomes a form of formal justification: the system produces argumentative coherence without any guarantee of factual correspondence.
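As one hedged illustration of what such an external layer could look like, the sketch below extracts a factual claim embedded in a recommendation, checks it against a curated authoritative reference, and records which reference was used so that the verdict is attributable. All names and the toy reference table are hypothetical assumptions, not an existing system.

```python
# Minimal sketch of an external semantic validation layer (hypothetical names,
# toy reference data). Acceptance is decided outside the model, against an
# authoritative reference, and every verdict leaves an attributable trace.
from dataclasses import dataclass

@dataclass
class InteractionClaim:
    drug_a: str
    drug_b: str
    claimed_interaction: bool
    source_cited: str  # citation offered by the model; itself subject to audit

# Stand-in for a curated, authoritative reference (e.g. a vetted formulary).
AUTHORITATIVE_INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): True,
    frozenset({"amoxicillin", "paracetamol"}): False,
}

def validate_claim(claim: InteractionClaim) -> dict:
    """Check one model claim against the external reference and record the outcome."""
    key = frozenset({claim.drug_a.lower(), claim.drug_b.lower()})
    known = AUTHORITATIVE_INTERACTIONS.get(key)
    if known is None:
        verdict = "unverifiable"   # internal coherence alone never suffices to accept it
    elif known == claim.claimed_interaction:
        verdict = "confirmed"
    else:
        verdict = "contradicted"   # coherent but semantically false
    # Attribution: the record names the reference used, not just the model.
    return {"claim": claim, "verdict": verdict, "validated_against": "curated formulary"}

result = validate_claim(
    InteractionClaim("warfarin", "aspirin", claimed_interaction=True, source_cited="model citation")
)
print(result["verdict"])  # -> "confirmed"
```

The design choice that matters here is that "unverifiable" is not collapsed into acceptance: a claim the reference cannot confirm is withheld or escalated to a human, rather than passed through on the strength of its argumentative coherence.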
Normative implications
Constructing semantic validation frameworks raises questions of distributed responsibility: who certifies external validators? With what audit standards? How is the trade-off managed between rigorous verification and decision latency in time-critical contexts such as medical diagnosis or high-frequency financial decisions? And how can we prevent the validation criteria themselves from becoming vectors of bias, replacing one form of opacity (that of the model) with another (that of the validator)? Some may appeal to human-in-the-loop oversight, but that is not a solution in itself: the framework is only as robust as the human tasked with validating the model's output, a fragility already evident in the paradoxes raised by Searle's Chinese Room thought experiment.
Open questions
I believe that the relationship between coherence and truth has not yet found an epistemic solution that avoids either technical simplifications (often favored by data scientists) or philosophical assumptions grounded in fragile logic. For this reason, I wonder how those working with LLMs in critical decision-making domains (healthcare, legal, finance) validate answers that are coherent yet factually incorrect, and how they draw the boundary between acceptable error and structural manipulation. The journey into the universe of AI now stands before a new frontier, one that I find fascinating.
Photo: The image is the coherent representation of a distorted vision. The message is deeply connected to the nature of hallucinations, which appear consistent with reality while still being a distortion of it.
Photo by Ehimetalor Akhere Unuabona on Unsplash.