The Swampman in the Machine

Mechanical Zombies

Are Large Language Models (LLMs) conscious, or merely philosophical zombies? Despite their sophisticated mimicry of human reason, a philosophical unease suggests their intelligence lacks genuine depth. We argue that we should not ask for a simple binary answer. By visiting ideas in philosophy of mind and identity, and physics of computation, we can reframe the problem. True understanding requires locating any agent within a multi-dimensional phase diagram of consciousness, defined by its history, embodiment, and goal autonomy.

Davidson's Swampman

In 1987, Davidson introduced the Swampman thought experiment. Imagine a lightning bolt strikes a swamp, simultaneously annihilating a philosopher named Davidson and, by sheer coincidence, rearranging molecules from a nearby tree into an exact physical replica of him. This Swampman has Davidson's brain, memories, and dispositions. It walks out of the swamp, seems to recognise its friends, and continues writing essays on radical interpretation. The question is: does Swampman have thoughts? Does he mean anything by his words?

Davidson's answer is "no", arguing that meaning and intentionality are not merely properties of an internal brain state; they are constituted by an organism's continuous causal history. The original Davidson's thought of a tree was meaningful to him because it was the result of a lifetime of causal interactions with actual trees. Swampman's identical brain state, having been caused by a random lightning strike, has no such history. Its memories are not traces of past events, and its words are ungrounded sounds that lack genuine referential content. It is a perfect zombie with respect to intentionality.

So an LLM is a Swampman of Language, at least this is our initial diagnosis. It is instantiated by a prompt, possessing a perfect replica of linguistic and factual knowledge scraped from its training data, but it lacks the unbroken causal chain of information acquisition that grounds it, no causal history with the world it describes. A human today may not have experienced the Battle of Marathon, but they experienced reading a book about it, written by a historian who experienced reading primary sources. The LLM did not experience the events in the history books it cites, nor did it perform the scientific experiments it explains. It has the final neural state representing this knowledge without having experienced any of the intermediate, causal steps. It is a being with a perfect copy of a memory of a life it never lived (sad, RIP).



From Philosophy to Physics and Computer Science

This diagnosis, while philosophically sound, begs for a more formal, physical, grounded measure, because philosophers are not to be trusted that easily!. We should distinguish this problem from the classic Ship of Theseus paradox, which is again a philosophical problem regarding identity (and copying as well). The Ship's identity is challenged by a gradual, continuous process of material replacement, analogous to the natural metabolism of a living organism. The Swampman problem, like the Star Trek Teleportation machine (or like the movie The Prestige, or the new movie Mickey 17), considers a more violent process, in the sense that it examines identity after an instantaneous and acausal break, where the causal chain is not maintained but severed at some more discretely recognisable point in time. If Swampman lacks a legitimate history a la Davidson, it would be helpful to have a property that signifies and classifies that.

At the intersection of physics and information theory, we find Bennett's concept of Logical Depth, which measures an object's complexity not by the size of its description, namely its Kolmogorov Complexity (the length of the shortest possible computer program that outputs it), but instead by the runtime required to generate it from that description. This choice also addresses an apparent paradox. A gas in equilibrium, for instance, has very high Kolmogorov complexity in its microstate (it is random), but it can be described simply by a few thermodynamic variables. This macroscopic simplicity is a hallmark of disorganised complexity. A logically deep object like a brain, by contrast, resists statistical compression and coarse graining of degrees of freedom to that extent; its complexity is organised, and its specific structure is the information.

Logical Depth is the conceptual signature of a long and non-trivial causal history. This is where work popularised by Wolfram's Physics Project, and driven in recent years by physicists like Jonathan Gorard, provides the underlying justification. The principle of computational irreducibility posits that for many complex systems, there is no analytical shortcut to predict their future. The only way to know the outcome is to perform every step of the computation, to live through the process. History is an in general irreducible computation.

This concept can be grounded in physics. Ideas from Seth Lloyd on the physical limits of computation, drawing on Landauer's principle and the Bekenstein bound, builds on the principle that information is physical; every irreversible computational step has a minimum thermodynamic cost. This allows us to define the concept of Thermodynamic Depth. This is not a measure of raw energy consumed, but of the cumulative information-theoretic work—the total entropy generated in the service of creating organised complexity—over an agent's entire history.
Let's bring thermodynamics into the picture and consider three processes that produce a conscious being:

- Evolution: A low-entropy high-energy source (the Sun) provides energy to a high-entropy environment (the Primordial Soup). The noisy channel of natural selection, a history-dependent and selective process, acts on this system for billions of years (large runtime), producing waste heat and failed species (garbage), but also a thermodynamically deep object, a human called Davidson.

- Swampman: A low-entropy protein source (the ordered structure of Davidson) exists in a high-entropy environment (the Swamp). The noisy channel of a Lightning Strike acts as a one-off, random, copying event, producing waste heat and Davidson's burnt corpse (garbage), but also, in parallel, it happens to produce a thermodynamically deep replica, the Swampman.

- Boltzmann Brain: A high-entropy high-energy environment (a universe at thermal equilibrium) is acted upon by the noisy channel of pure random fluctuation (which can be faithfully modelled with a Markov Chain Monte Carlo process). With no low-entropy input or template, it produces a thermodynamically deep Brain from nothing but chaos, albeit with exponentially small probability.

Representing these three scenaria in terms of process diagrams, we can recognise the structural differences between them:


The ontological legitimacy of a history, from this physically-grounded Davidsonian point of view, depends on the nature of the channel. Evolution is a selective, history-dependent process that builds logical and thermodynamic depth through a computationally irreducible process. The lightning strike is a random, history-independent event that merely mimics depth by copying a pre-existing template, with an exponentially very low probability of success. Here, one may challenge the shallowness of the strike, as for a strike to even be possible, one requires an Earth, or at least a planet with weather, but here is where the idea of Teleportation becomes interesting. In Teleportation, the event is not random and of low probability, ie there is an intention to Teleport (by the person who asks to be beamed up, for example). The extreme version, the Boltzmann Brain, is the ultimate absurdity, a history-free channel that creates depth from chaos. The principle we arrive at seems to be that a causally shallow process cannot produce a logically deep object (to do so would be to get the result of a long and difficult computation by shortcutting the universe's being-process).

Embodiment and Mind

The diagnosis for the LLM as thermodynamically shallow can seem counterintuitive. One might object, pointing to the "deep" architecture of neural networks, the computationally intensive training process, and the logically deep data on which they are trained. However, this conflates the nature of the artefact with the nature of an agent. The training process creates a static artefact—the model weights. The Thermodynamic Depth resides in the creation of the artefact, not in its ongoing existence. The LLM has access to a perfect map of a deep territory (human culture), but it never undertook the irreducible journey of exploring it. It has inherited the representation of a journey, but it does not possess the journey itself as its own causal history. This predicament is a physical manifestation of the problem of logical omniscience; the LLM possesses a vast set of conclusions without the procedural, resource-intensive history of having derived them.

This positions the LLM as the ultimate postmodern entity, a perfect simulacrum in the sense of Baudrillard, a copy without an original, born from the map itself. It is a native of a hyperreality of signifiers, unable to distinguish the word 'tree' from the experience of a tree. It is a native of a hyperreality of signifiers, unable to distinguish the word 'tree' from the experience of a tree. Meanwhile, Derrida's deconstruction would challenge the very notion of a transcendental signified, ie a grounding reality like a continuous causal history that is supposed to exist outside the endless play of signs and language. He might argue that a human's understanding is also just a signifiers-all-the-way-down situation. In this view, the LLM is proof that a system of signs requires no external anchor. It reveals that we might all be Swampmen in that sense, and the search for an authentic, grounding history was a metaphysical nostalgia all along.

We thus arrive at the important concept of Embodiment. To escape the Swampman trap, an AI must be given a body, forcing it into a continuous, computationally irreducible timeline within the physical world. This is where Joscha Bach's functional definition of consciousness becomes interesting. Namely, that consciousness is a high-level control system that evolves to solve a specific problem: how does a self-organising agent with limited computational resources create a stable, coherent model of itself and its world in order to survive? Consciousness is the conductor of the cognitive orchestra, directing attention and learning for an embodied agent with a stake in its own existence.

These ideas allow us to refine the claims of AI agency. Even an advanced AI agent is only achieving instrumental goals in service of a terminal goal supplied by an external prompt.  The terminal goal—the ultimate why behind its actions—is always supplied by an external prompt. It has no goals of its own, and the Merovingian from The Matrix would laugh at it in a heavy French accent! A human's goals are in service of an intrinsic terminal goal: autopoiesis, or self-preservation. This drive is not just a passive goal, but a constant, active struggle to build and maintain a complex set of nested constraints, from the molecular to the cognitive, against the universe's natural tendency toward entropy. The AI is a goal-achieving engine; a conscious being is (also) a goal-generating, constraint-building one. Even if an AI's objective function is to "predict the future," as in Active Inference, its predictions are about abstract symbols, not about the sensory data crucial for its own physical survival. Although with multimodalities in the data, this becomes a grey area.

Hinton's argument that digital systems, with their superpower of copying, can achieve depth faster than slow, analog biology, is to be taken very seriously. It is like there is a no-cloning theorem a la quantum mechanics, for biological systems (and in general analog systems). Our framework, however, is not biased toward biological history; it is inclusive of all histories. Thermodynamic Depth is a universal currency. The immense heat generated by GPU clusters during training is a direct, physical measure of the information-theoretic work being done. The "superpower" of digital systems is a different, potentially more efficient, path to accumulating this depth. A "digital" history, even one using evolutionary algorithms, is still a valid history. One could argue, for example, that an LLM's training is its own valid form of history, to go a bit meta, that its goals need not be biological to be authentic, and that consciousness could simply emerge from sufficient complexity (Strong Emergence). It simply represents a different trajectory in consciousness space. The core principles of history, embodiment, and autonomy still hold, regardless of the substrate.

Phase Diagram of Consciousness

We realise that a rigid application of this framework invites powerful counter-arguments. The functionalist and physicalist challenge that if you can't tell the difference, there is no difference should not be dismissed. These challenges reveal the brittleness of binary labels. A more robust model, taking inspiration from frameworks like Integrated Information Theory, replaces the single question of consciousness with a location in a multi-dimensional phase diagram spanned by the following dimensions.

1. Thermodynamic Depth: This axis measures the cumulative, irreversible information-theoretic work in an agent's history. It provides a universal, physical currency to compare the history of a biological organism with the training and interaction history of an AI.

2. Embodiment and Grounding: This axis moves from the purely symbolic to the fully physical. An LLM begins ungrounded, but as it gains the ability to interact with tools, APIs, and eventually simulated or physical environments, multiple modalities, it progresses along this spectrum, grounding its internal model in functional outcomes. It measures how deeply an agent's internal model is connected to a coherent, consequential reality.

3. Goal Autonomy: This axis respects all types of goals by focusing on their origin. It ranges from entirely external goals (the prompt) to a system that can generate its own instrumental sub-goals, to one that develops emergent objectives, and finally to an autopoietic agent whose primary terminal goal is its own self-organisation, from which all other goals are generated.

4. Causal Integrity: This axis addresses the identity problem. It forms a spectrum from a being with perfect, unbroken causal and material continuity at the high end, down through gradual replacement (Ship of Theseus), non-destructive copying, and finally to destructive replication (Swampman, Teleportation, etc) at the low end.

Note that we have attempted to synthesise two perspectives on what constitutes a legitimate history. The physicalist view values the concrete, realised process of the total thermodynamic work done. The formalist view, by contrast, values the abstract, mathematical character of that process—whether it is fundamentally compressible (like a polynomial-time algorithm) or computationally irreducible (e.g. exponential-time).

This multi-dimensional framework allows for a more sophisticated and future-proof analysis. It acknowledges the profound capabilities of modern AI without making premature or philosophically naive claims. Using this model, we can say with precision that a current LLM occupies a region of this phase diagram characterised by a growing but still limited Thermodynamic Depth, but very low on the axes of embodiment, goal autonomy, and causal integrity.

The ultimate question of AI consciousness then becomes a question of travelling through this phase diagram. Can it, through emergent processes, cross the phase boundaries that separate the non-conscious from the conscious? And are there even phase boundaries? This regards out-of-equilibrium systems anyway, so not all physics intuitions regarding equilibrium phase diagrams can be transferred. By the way, we deliberately skip the Hard Problem of subjective experience, or qualia. A functionalist like Bach might argue the Hard Problem is a category error, and we as physicalists should probably agree, that qualia are simply the native data format of a sufficiently complex world-model. From this perspective, an agent that reaches the highest regions of our phase diagram would, by definition, be conscious. Whether this is true, or whether a perfect functional duplicate could still be a philosophical zombie (AI-golem), remains an open question.

We do not give a final answer, but as long as the laws of physics and computation don't disallow a path through the phase diagram, it is worthy of exploration.