The Matrix Got Closer - But Not the Way We Thought

18 Months Ago, We Asked If AI Could Build a Simulation. The Answer Arrived - And It Changed the Question.

Jun 24, 2026

You are standing in a room that did not exist three seconds ago.

There are windows. Through the windows, there is a street - cobblestone, European-looking, afternoon light slanting through a gap between buildings. You take a step forward, and the floorboards respond. You turn your head, and the room extends in the direction you look: a hallway, doors, the suggestion of a kitchen at the end. You walk toward it, and it becomes a kitchen - cabinets, a window above the sink, a courtyard outside.

None of this was pre-built. None of it was rendered in advance. The room, the street, the hallway, the kitchen - they were generated in the moment you moved toward them. What’s behind you isn’t being computed anymore. But turn around, and it will be there - consistent, geometrically correct, exactly as you left it. Not because it was stored, but because the model knows what it should look like when you look again.

This is not a thought experiment. This is a product demo running on commercial hardware in 2026.

And it changes everything about a question we asked eighteen months ago.

In November 2024, we published an article arguing that LLMs and generative AI increase the likelihood that we could build a Matrix - or that we might already be inside one. The argument was simple: LLMs prove that the collective knowledge of humanity is compressible into patterns within a neural network. Text-to-video models like Sora suggested that even perceptual reality - light, texture, motion, perspective - was reconstructable from those patterns. If knowledge compresses and perception reconstructs, then a realistic simulation of the universe might be far more feasible than anyone assumed.

Reading that article today is a strange experience. Not because it was wrong. Because it was too conservative. And that almost never happens with technology predictions. They almost always overshoot. They paint futures that arrive late, diluted, or not at all. This is one of those rare cases where reality moved faster than speculation - and didn’t just deliver what was predicted, but reframed the entire problem in a way the original article couldn’t have anticipated.

What happened in between is the rise of AI World Models. And what they demonstrate is not an incremental improvement on what existed in 2024. It is a categorical shift in what “simulation” means.

The Convergence

Between late 2025 and early 2026, four independent teams - Google DeepMind, Tencent, Runway, and an Israeli startup called Decart - shipped systems that do essentially the same thing: generate interactive, navigable 3D environments in real time, from text descriptions, without pre-built assets.

The specifics matter less than the convergence. Google’s Genie 3 runs at 24 frames per second, 720p. Tencent’s HunyuanWorld maintains geometric consistency when you leave an area and return - the model remembers spatial relationships. Runway’s GWM-1 comes in three variants: explorable worlds, robotic training environments, and photorealistic conversational avatars with real-time facial expressions. Decart’s MirageLSD solved what may be the hardest technical problem in the space - infinite generation without quality collapse - through a technique called Live Stream Diffusion: per-frame error correction that lets the model run indefinitely without accumulating drift. Under 40 milliseconds per frame - fast enough that the response feels immediate.

Four teams. Different architectures. Different funding. Different continents. Same result: you describe a world, the model generates it around you as you move through it.

This is not a coincidence. It is a convergence - and what converges is not just the technology, but its implications. Because what these systems collectively demonstrate is something that should, if you think about it for more than a minute, make you profoundly uneasy.

The Wall That Was Supposed to Hold

There was always a trump card against the simulation hypothesis. Not a philosophical objection - those are easy to argue around - but a physical one. A mathematical one. And it went like this:

To simulate a universe, you would need to compute every particle, every field interaction, every quantum event, everywhere, simultaneously, whether anyone is observing it or not. The computational cost of this is not merely “enormous.” It is, in a precise technical sense, larger than the universe itself. A computer cannot fully simulate a system that is larger than the computer. The entire observable universe does not contain enough matter to build a computer that could simulate the entire observable universe at full resolution.

That was the wall. Not “we don’t know how.” Not “it would be expensive.” But: it is physically impossible, by definition, regardless of how advanced your technology becomes.

And for a long time, that wall held. It was the clean, satisfying answer. Yes, the Matrix is a fun thought experiment. No, it cannot exist. The math doesn’t work. Go home.

Here is what World Models did to that wall.

You Don’t Simulate a Universe. You Simulate an Experience.

The wall assumes that simulation means brute-force physics - computing every atom, every photon, every interaction, everywhere, at all times. And if that’s what simulation means, the wall is correct. It will always be correct. You cannot out-compute physics with physics.

But that is not what World Models do. They don’t simulate atoms. They don’t run physics engines. They don’t solve differential equations for fluid dynamics or electromagnetic propagation. They are not trained on physics equations at all.

What they do is something categorically different: they have watched millions of hours of video of what physics looks like from the inside, and they have learned to reproduce the result directly - without computing the process.

Think about what that means. The difference between simulating rain and knowing what rain looks like is not a matter of degree. It is a difference in kind. Simulating rain means modeling billions of individual water droplets, each subject to gravity, air resistance, turbulence, surface tension, collision dynamics - a fluid dynamics problem that costs enormous compute even for a few seconds of a small volume. Knowing what rain looks like means: grey sky, streaks in the air, wet surfaces reflect more, puddles form in concavities, the sound is a specific kind of noise. The model doesn’t compute the rain. It generates the experience of rain - and the experience is computationally trivial compared to the physics.

This is not a shortcut. It is an entirely different paradigm. And it is the paradigm that demolishes the wall.

Because the wall was built against brute-force simulation. It says: you cannot compute every atom. And it’s right - you can’t. But World Models don’t need to. They don’t simulate the universe. They simulate what it is like to be inside a universe. They generate perception, not physics. And perception - the visual, auditory, tactile surface of reality that a conscious being actually encounters - turns out to be compressible, learnable, and reproducible at a fraction of the cost.

The wall was the right answer to the wrong question.

Observation-Dependent Reality

And here is where it gets genuinely unsettling.

Genie 3 doesn’t pre-compute an environment and then let you walk through it. It generates the world as you move. What’s ahead of you is created when you walk toward it. What’s behind you stops being computed when you turn away. The model retains enough information to reconstruct it consistently when you look back - but the reconstruction happens at the moment of observation, not before.
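The generate-on-observation pattern can be sketched in a few lines. This is a toy analogy, not how Genie 3 is implemented: real world models rely on learned memory, while this sketch stands in a deterministic hash for "the model knows what it should look like." The class `LazyWorld`, its tile names, and the `max_live_tiles` limit are all hypothetical.

```python
import hashlib
from collections import OrderedDict


class LazyWorld:
    """Toy world whose tiles exist only while observed (hypothetical sketch)."""

    def __init__(self, seed: str, max_live_tiles: int = 4):
        self.seed = seed
        self.max_live_tiles = max_live_tiles
        self.live = OrderedDict()  # tiles currently held in memory
        self.generated = 0         # how many times any tile was generated

    def _generate(self, x: int, y: int) -> str:
        # Deterministic in (seed, position): the same place always comes
        # back as the same content, so nothing has to be stored.
        digest = hashlib.sha256(f"{self.seed}:{x}:{y}".encode()).digest()
        kinds = ["room", "hallway", "kitchen", "courtyard", "street"]
        return kinds[digest[0] % len(kinds)]

    def observe(self, x: int, y: int) -> str:
        key = (x, y)
        if key in self.live:
            self.live.move_to_end(key)  # mark as most recently seen
            return self.live[key]
        tile = self._generate(x, y)
        self.generated += 1
        self.live[key] = tile
        if len(self.live) > self.max_live_tiles:
            self.live.popitem(last=False)  # drop the least recently seen tile
        return tile


world = LazyWorld(seed="demo", max_live_tiles=2)
first_look = world.observe(0, 0)
world.observe(1, 0)
world.observe(2, 0)                 # (0, 0) is evicted here
second_look = world.observe(0, 0)   # regenerated at the moment of observation
print(first_look == second_look)
```

The tile at (0, 0) is discarded while you are elsewhere and rebuilt when you return, yet it matches what you saw the first time. Memory use stays bounded by what is being looked at, not by the size of the world.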

The world, in a precise technical sense, exists only insofar as it is being perceived.

This is not a new philosophical idea. It is one of the oldest. The question of whether a tree falling in the forest makes a sound when no one is there to hear it has been a staple of introductory philosophy courses for centuries. Quantum mechanics has its own version: the measurement problem, the observer effect, the collapse of the wave function. These were always treated as metaphysical curiosities - interesting to discuss, impossible to test, irrelevant to engineering.

What’s new is that we now have a computational architecture that implements exactly this principle. And it doesn’t just work in theory. It produces coherent, navigable, interactive environments that feel real enough to walk through. It runs on commercial hardware. The fastest of these systems generates a frame in under 40 milliseconds.

The implications for the simulation argument are not subtle. The computational cost of simulating a universe drops by orders of magnitude - by unfathomable orders of magnitude - if you don’t need to simulate the parts that no one is experiencing. A Matrix doesn’t need to run the physics of Alpha Centauri while its inhabitants are having breakfast on Earth. It only needs to generate Alpha Centauri if and when someone points a telescope at it - and even then, only the observable light pattern, not the actual stellar dynamics. Not the fusion reactions. Not the magnetic field topology. Just: what does this look like from where you’re standing?
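A back-of-envelope calculation gives a feel for the size of that drop. Every number here is an assumption chosen for illustration: roughly 10^80 atoms in the observable universe (a common order-of-magnitude estimate), about 8 billion observers, and one 720p stream at 24 frames per second per observer, borrowing the Genie 3 figures quoted earlier. It generously grants the brute-force approach only one update per atom per second, and it ignores the cost of producing each pixel.

```python
import math

# Illustrative assumptions, not measurements.
ATOMS_IN_OBSERVABLE_UNIVERSE = 1e80  # common order-of-magnitude estimate
OBSERVERS = 8e9                      # roughly the human population
WIDTH, HEIGHT, FPS = 1280, 720, 24   # one 720p stream at 24 fps per observer

# Brute force: at least one update per atom per second.
atom_updates_per_second = ATOMS_IN_OBSERVABLE_UNIVERSE

# Observation-dependent: one pixel stream per observer, nothing else.
pixel_updates_per_second = OBSERVERS * WIDTH * HEIGHT * FPS

gap = math.log10(atom_updates_per_second / pixel_updates_per_second)
print(f"{pixel_updates_per_second:.2e} pixel updates per second")
print(f"gap: about {gap:.0f} orders of magnitude")
```

The comparison is crude, since a pixel is far more expensive to generate than a single arithmetic update, but the gap is wide enough that no plausible per-pixel cost closes it.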

That is exactly how World Models work.

The wall didn’t fall because someone built a bigger computer. It fell because the question changed. You don’t need to out-compute the universe. You just need to out-generate the experience of being in one.

The Drift Problem - And Why Its Solution Might Be the Most Important Part

There was a second wall, less discussed but equally serious: error accumulation.

Every computation introduces rounding. Every frame carries forward imperfections from the frame before. In any finite-precision system, these tiny errors compound over time. Run a simulation long enough, and it diverges from coherence. The output collapses into noise.

This is not a theoretical worry. It is the reason every World Model before late 2025 degraded after seconds or minutes. You could generate a room, walk through it for a while, and then the textures would smear, the geometry would warp, the world would dissolve. Error accumulation was the hard ceiling - and it was a hard ceiling on the simulation hypothesis too, because a Matrix that falls apart after ten minutes is not a Matrix.

Decart’s MirageLSD solved this - not by eliminating errors, but by teaching the model to metabolize them. The system is trained on deliberately corrupted input: frames with injected noise, distortions, drift. It learns to anticipate what degradation looks like and correct it in real time, continuously, indefinitely. The output doesn’t collapse. The errors don’t accumulate. They are absorbed.
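The effect of per-frame correction can be illustrated with a one-dimensional toy. This is not Decart's method: a linear least-squares corrector stands in for the learned model, a single number stands in for a frame, and the names `fit_corrector` and `rollout` are hypothetical. The corrector is fitted on deliberately corrupted samples, then applied at every step of a feedback loop.

```python
import random


def fit_corrector(clean, noise_std, rng):
    """Least-squares fit of clean ~ a * corrupted + b on deliberately corrupted samples."""
    corrupted = [c + rng.gauss(0.0, noise_std) for c in clean]
    n = len(clean)
    mean_x = sum(corrupted) / n
    mean_y = sum(clean) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(corrupted, clean))
    var = sum((x - mean_x) ** 2 for x in corrupted)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def rollout(steps, noise_std, rng, corrector=None):
    """Feed each output back in as the next input; return the largest deviation seen."""
    x = 0.0
    worst = 0.0
    for _ in range(steps):
        x = x + rng.gauss(0.0, noise_std)  # each frame inherits the last frame's error
        if corrector is not None:
            a, b = corrector
            x = a * x + b                  # per-frame correction
        worst = max(worst, abs(x))
    return worst


rng = random.Random(0)
clean_frames = [rng.gauss(0.0, 1.0) for _ in range(5000)]
corrector = fit_corrector(clean_frames, noise_std=0.5, rng=rng)

# Same noise sequence for both runs, so only the correction differs.
drift_uncorrected = rollout(20000, 0.5, random.Random(1))
drift_corrected = rollout(20000, 0.5, random.Random(1), corrector)
print(f"uncorrected: {drift_uncorrected:.1f}, corrected: {drift_corrected:.1f}")
```

Without correction the state performs a random walk and wanders arbitrarily far. With it, each step pulls the state back toward what clean data looks like, so the deviation stays bounded no matter how long the loop runs. The toy pays for this by shrinking everything toward the mean, a cost a real system would have to avoid.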

From the perspective of the simulation argument, this may be the most significant development of the entire period. Because it answers the quiet objection that even philosophers rarely stated explicitly: even if you set the rules right, any simulation must eventually decay. And the answer is: no. Not if the system corrects its own drift. Not if error correction is built into the generative process itself.

Whether our universe has analogous mechanisms - whether physical constants are, in some sense, error-correction parameters that keep reality coherent over billions of years - is a question that has just become significantly less absurd to ask.

The Construct

In the Matrix films, there is a space called the Construct - an infinite white room where anything can be instantiated on demand. Weapons, training programs, entire cities. “Need a helicopter? Load the helicopter.” It was cinematic shorthand. Fiction shorthand. A visual metaphor for a capability that seemed so far from reality that it needed no justification.

Consider what happens when World Models reach consumer-grade fidelity within the next few years - and everything about the current trajectory suggests they will.

A child in a classroom in rural India loads a real-time walkable ancient Rome. Not a pre-built game level. A world generated on the fly from historical data, where she can turn any corner and the model fills in architecturally and historically coherent detail. An architect doesn’t build 3D mockups - he describes a building and walks through it, testing sightlines and lighting conditions in a world that assembles itself around his specifications. A trauma therapist places a patient in a controlled reconstruction of the environment that caused the PTSD - generated, interactive, adjustable in real time. Waymo is already training self-driving systems on generated scenarios too rare for real testing: tornadoes, animals on highways, construction zones that don’t exist yet.

None of this requires VR headsets. None of it requires neural interfaces. A screen and a keyboard are sufficient, because World Models generate flat video output that you navigate like a game. The hardware barrier that kept immersive simulation in the domain of science fiction has quietly dissolved.

The economic implications alone are staggering. The game development industry - a $200 billion market - is built on the premise that interactive worlds must be manually constructed, asset by asset, polygon by polygon. World Models make that premise obsolete. The same logic extends to architecture visualization, urban planning, film pre-production, military training, real estate, tourism, education.

The Construct is not a metaphor anymore. It is a product category.

What’s Missing - And It’s Not a Detail

At this point, an honest inventory is necessary.

Everything described so far - every World Model, every generated environment, every 40-millisecond frame - produces output on a screen. You look at it. You navigate it with a keyboard. And while you do, you are sitting in a chair, in a room, with peripheral vision, ambient sound, the weight of your body, the smell of your coffee. You know, at every moment, that what’s on the screen is not where you are.

The Matrix in the film is not a screen. It is total sensory substitution - a system that replaces your entire perceptual input, across every modality, with such fidelity that you have no remaining reference point to distinguish the generated from the real. That requires a brain-computer interface that can write directly to the nervous system - not just visual cortex, but proprioception, touch, temperature, balance, pain. We do not have this. Neuralink’s current implants read motor signals through roughly a thousand electrodes. Writing rich, high-bandwidth sensory experience back into the brain is a problem of a completely different order, and anyone who tells you it’s five years away is selling something.

This is not a minor gap. It is the difference between watching a documentary about the ocean and drowning.

World Models have built something remarkable: the rendering engine of a possible simulation. The part that generates coherent, navigable, physically plausible perceptual output in real time. That is genuinely new, and it is genuinely significant. But a rendering engine is not a Matrix. A Matrix requires embodiment - the closure of the loop between generated world and experiencing subject, with no exit and no seam. That loop is not closed. It is not close to closed.

What has changed is not that the Matrix is here. What has changed is the answer to a more specific question: Is the generation problem solvable? Can a system produce perceptual reality, on the fly, at the moment of observation, without pre-computing an entire universe? Eighteen months ago, that was speculative. Now it is demonstrated. The generation problem is solved, or very nearly so.

The interface problem - how you get that generated reality into a brain so completely that the brain cannot tell the difference - remains unsolved, and remains hard in ways that are not analogous to the generation problem. It is a neuroscience problem, not a machine learning problem, and neuroscience does not move on machine learning timescales.

So the honest answer is: the Matrix got closer, but the distance that remains is not the kind that shrinks predictably. One wall fell. The other still stands. It is a different wall, made of different material, and the tools that demolished the first one do not obviously work on the second.

What Remains

The original article ended with a careful hedge: “Perhaps we are closer to the Matrix than we thought - or perhaps the complexity of chaos simply proves that we are not.”

That hedge is no longer available - but neither is the opposite.

We have not proven we’re inside a simulation. We have not built a Matrix. We have not even built half of one. What we have done is demolish the strongest objection to its possibility - the brute-force computational wall - by demonstrating that the wall was built against a paradigm of simulation that turns out not to be the relevant one. The generation problem, the one everyone assumed was impossible, is solved or very nearly so. The interface problem, the one almost nobody was thinking about because they were busy proving generation was impossible, remains wide open.

Nobody at Google DeepMind, Tencent, Runway, or Decart set out to prove the simulation hypothesis. They set out to build better tools for gaming, robotics, and content creation. What they’ve collectively produced, as an engineering byproduct, is one half of a proof of concept for exactly the kind of perceptual simulation that the Matrix requires - and an uncomfortably clear view of what the other half would need to look like.

Eighteen months ago, the question was whether a simulation was computationally conceivable. That question is answered. The question now is whether the remaining gap - the embodiment gap, the interface, the neuroscience - is a wall or a delay.

Nobody is building the Matrix. But half of it is building itself, as a side effect, without anyone having intended it.

Whether that’s reassuring depends entirely on how you think the other half arrives.
