Simulation Gap

Simulation Gap
Andreas Chatzopoulos, 2026
UNIVERSITY OF GOTHENBURG | COLLEGIUM OF COGNITIVE SCIENCE

AIF

Outline
Limitations of AIF
Simulation Gap
Way forward

Brain as a prediction machine
[Diagram: PREDICTION, SENSORY DATA, PREDICTION ERROR (DISCREPANCY)]

Generative model
Used to simulate the hidden causes of sensory data.
A probabilistic model, constantly asking: "What world must exist to explain these sensations?"

Two paths to inference
Perception: When prediction fails, we change the model. We update our beliefs to better match incoming sensory evidence.
Action: When prediction fails, we adjust the world to make our sensations match the predictions.
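The two paths can be illustrated with a toy numeric sketch. Everything here is an assumption made for illustration: a scalar belief, a scalar world state, a fixed update rate, and the function names. It is not an AIF implementation, only a picture of the same prediction error being reduced from two directions.

```python
def prediction_error(belief: float, sensation: float) -> float:
    """Discrepancy between what was sensed and what was predicted."""
    return sensation - belief


def perceive(belief: float, sensation: float, rate: float = 0.5) -> float:
    """Perception: move the belief toward the sensory evidence."""
    return belief + rate * prediction_error(belief, sensation)


def act(world: float, belief: float, rate: float = 0.5) -> float:
    """Action: move the world (and thus the sensation) toward the prediction."""
    return world - rate * prediction_error(belief, world)


belief, world = 20.0, 26.0  # hypothetical values, e.g. expected vs. actual temperature

# Path 1: perception only; the belief converges on the world.
b = belief
for _ in range(10):
    b = perceive(b, world)

# Path 2: action only; the world converges on the belief.
w = world
for _ in range(10):
    w = act(w, belief)
```

In both loops the prediction error shrinks by the same factor per step; what differs is which side of the discrepancy gets changed.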

Policies
Policy = A sequence of actions; a plan for how to manipulate the environment.
Policy selection = The selection of a specific action plan.
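A minimal sketch of policies as action sequences, and of policy selection as choosing one of them. The one-dimensional world, the action names, and the scoring rule (distance between predicted and preferred outcome) are all assumptions for illustration; AIF accounts typically score policies by expected free energy, for which the plain distance here is only a stand-in.

```python
from itertools import product

# Hypothetical action set: each action shifts a position on a line.
ACTIONS = {"left": -1, "stay": 0, "right": +1}


def rollout(start: int, policy: tuple[str, ...]) -> int:
    """Predict the final position reached by following a policy."""
    position = start
    for action in policy:
        position += ACTIONS[action]
    return position


def select_policy(start: int, preferred: int, horizon: int) -> tuple[str, ...]:
    """Pick the action sequence whose predicted outcome is closest to the preferred one."""
    candidates = product(ACTIONS, repeat=horizon)
    return min(candidates, key=lambda p: abs(rollout(start, p) - preferred))


best = select_policy(start=0, preferred=2, horizon=3)
```

Here `best` is a three-step plan that ends at the preferred position; several plans tie, and `min` simply returns the first one it encounters.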

The Active Inference loop

AIF and phenomenal experiences
Not explicitly treated in most AIF accounts.
Only briefly mentioned as tied to policy-conditioned beliefs: "What would I expect to experience if I acted this way?"
Also tied to policy selection, which determines which beliefs are afforded high precision and thus come to dominate.
The contents of consciousness are the beliefs that "win" this competition under a selected policy.

AIF and phenomenal experiences
➔ No exhaustive explanation of a first-person view or phenomenal experiences.
➔ Some accounts talk about the content of consciousness, but do not explain how the experiences of this content are generated by the brain.

With AIF as a base, would it be possible to develop an extended version that would be able to explain these hard-to-explain features?

First step: Find philosophical theories about the mind that would be compatible with AIF.

USEFUL PHILOSOPHICAL THEORIES

Simulation Gap
Arises because AIF specifies the generative model and its beliefs at the level of probabilistic state description, without specifying any representational format other than probability distributions.

Representational format?
What kind of representational and biological format would allow probabilistic, action-conditioned inference to appear as an immersive, egocentric world of experience?

Use a hypothetical model as a first step toward an explanation?

Focusing on the visual aspect of simulations: Could we find a model that has a similar output to our experienced inner simulations?

How could we construct such a model?
If we were to build an artificial system with a similar visual simulation as output, how would it be implemented?
Better yet: can we find an already existing example of how something similar to this might be achieved?


Example usage
Simulated world in which virtual AI agents can be trained. The agents can interact with the simulated world and learn without access to the real world.

What if...
...an agent used Genie to generate internal worlds?
...the Genie world generation system was an internal part of the agent?

Latent Action Model
By watching a lot of video game content, the model has learned action-like latent variables that explain transitions between frames. Given two frames, there must be some cause for whatever changed between them.
➔ It can infer what action will cause which changes in the generated world.
[Diagram: frame 1, frame 2, action "move left"]
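The idea of inferring an action from two frames can be sketched with a toy example. Assumptions for illustration: a "frame" is a one-dimensional row with a single bright pixel, and the candidate actions are a fixed, hand-labelled set. In Genie the latent actions are learned from video rather than given in advance, so this only shows the inference step, not the learning.

```python
# Hypothetical, hand-labelled action set (learned in the real system).
LATENT_ACTIONS = {"move left": -1, "stay": 0, "move right": +1}


def apply_action(frame: list[int], shift: int) -> list[int]:
    """Shift the single bright pixel by `shift` cells, clamped to the frame edges."""
    position = frame.index(1)
    new_position = max(0, min(len(frame) - 1, position + shift))
    result = [0] * len(frame)
    result[new_position] = 1
    return result


def infer_latent_action(frame_1: list[int], frame_2: list[int]) -> str:
    """Return the action label whose predicted next frame best matches frame_2."""
    def mismatch(label: str) -> int:
        predicted = apply_action(frame_1, LATENT_ACTIONS[label])
        return sum(a != b for a, b in zip(predicted, frame_2))
    return min(LATENT_ACTIONS, key=mismatch)


frame_1 = [0, 0, 1, 0, 0]
frame_2 = [0, 1, 0, 0, 0]
action = infer_latent_action(frame_1, frame_2)  # "move left"
```

The inferred action is the "cause" that best explains the change between the two frames.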

Actions
➔ In AIF, actions represent possible ways the sensory stream can be influenced.
➔ In AIF, the selected action depends on the most preferred policy, and policy selection means that the system selects among possible action sequences.
Genie
➔ Actions corresponding to changes between frames are encoded in a latent action space ("move left", "move right", etc.).
➔ A user chooses a discrete latent action, and the dynamics model generates the next frame.
➔ Capable of selecting a sequence of latent actions that conditions the future trajectory of the generated world.
Compatibility
Difference: In Genie, the sequence is provided by a human user.
Functional comparison holds: Different action sequences generate different possible worlds.

Dynamics Model
Based on all seen video frames and the learned action outcomes, the model has learned to predict what will happen next in the generated world based on what action is taken.
➔ It can predict how the world unfolds over time.
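Action-conditioned prediction over time can be sketched as a step function applied repeatedly. The grid world, the action names, and the hand-written transition rule are assumptions standing in for a learned dynamics model; only the unrolling pattern is the point.

```python
def dynamics(state: tuple[int, int], action: str) -> tuple[int, int]:
    """Hypothetical hand-written transition rule standing in for a learned model."""
    x, y = state
    moves = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}
    dx, dy = moves[action]
    return (x + dx, y + dy)


def unroll(state: tuple[int, int], actions: list[str]) -> list[tuple[int, int]]:
    """Predict the trajectory produced by a sequence of actions, one step at a time."""
    trajectory = [state]
    for action in actions:
        state = dynamics(state, action)
        trajectory.append(state)
    return trajectory


# The same starting state unfolds into different worlds under different actions.
world_a = unroll((0, 0), ["right", "right", "up"])
world_b = unroll((0, 0), ["left", "up", "up"])
```

Each predicted state is fed back in as the starting point for the next step, which is what lets the prediction extend over time.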

World
➔ In AIF, perceptions and actions are encoded in an internal model of how sensory states unfold. Answers the question: "What sensory impression will I get if I perform this action?"
Genie
➔ Action-controlled world model.
➔ Future video frames depend on action-like input. Answers the question: "Given this current visual state, what should happen next if this action is taken?"
Compatibility
Both systems involve counterfactual predictions that represent possible next states. They both have a model of "what would happen if...".

Video tokenizer
➔ The video tokenizer compresses video frames into discrete tokens.
➔ The dynamics model predicts future tokens, and the tokenizer’s decoder turns predicted tokens back into image space.

Video tokenizer
That is, the world model does not operate directly on finished images; it uses an intermediate representational format that can then be predicted over and rendered back into visible scenes.
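The encode, predict, decode pipeline can be sketched in a few lines. Assumptions for illustration: a frame is a short list of pixel intensities, the codebook has three hand-picked entries, and the "dynamics model" is a hand-written scrolling rule. None of this reflects Genie's actual architecture; it only shows prediction happening in token space rather than on finished images.

```python
CODEBOOK = [0.0, 0.5, 1.0]  # hypothetical three-entry codebook of pixel intensities


def encode(frame: list[float]) -> list[int]:
    """Tokenizer encoder: map each pixel to the index of its nearest codebook entry."""
    return [min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - px)) for px in frame]


def decode(tokens: list[int]) -> list[float]:
    """Tokenizer decoder: map tokens back into image space."""
    return [CODEBOOK[t] for t in tokens]


def predict_next_tokens(tokens: list[int], action: str) -> list[int]:
    """Stand-in dynamics model: scroll the token sequence according to the action."""
    if action == "move left":
        return tokens[1:] + tokens[:1]
    if action == "move right":
        return tokens[-1:] + tokens[:-1]
    return list(tokens)


frame = [0.1, 0.9, 0.45, 0.0]
tokens = encode(frame)                                   # [0, 2, 1, 0]
next_tokens = predict_next_tokens(tokens, "move right")  # [0, 0, 2, 1]
next_frame = decode(next_tokens)                         # [0.0, 0.0, 1.0, 0.5]
```

The prediction step never touches pixel values; only the final decode turns the intermediate format back into something image-like.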

AIF and Genie
➔ AIF tells us that perception and action involve probabilistic inference, performed by a generative model.
➔ Genie shows a concrete implementation in which a generative model can use an intermediate representational format to produce a world-like output.
➔ The move from probabilistic prediction to world-like presentation seems to require a representational format that works similarly to the one found in Genie. That would help us close the Simulation Gap.

Summary
➔ AIF does not tell us any details about how immersive simulations are generated from probabilistic models.
➔ We can find philosophical theories that are aligned with the concept of inner simulations, but these require those simulations to be immersive and fully rendered.
➔ Genie gives us a proof-of-concept that could perhaps serve as a model: An action-conditioned generative model that seems to be aligned with AIF and produces fully rendered, explorable scenes as output.

Conclusion
➔ If we want to extend AIF into a theory that approaches an explanation of the visual immersive part of phenomenal consciousness, we need to specify not only the probabilistic logic of inference but also the implemented format in which probabilistic predictions become an egocentric, immersive world.
➔ Genie gives us an artificial case in which latent action, temporal prediction, and generative rendering combine into an immersive, explorable world that can perhaps serve as a model for this.

Thank You!
[email protected] cognitivescience.se SpeakerDeck:
