CONVOLUTIONAL NEURAL NETWORKS (CNNs)
‣ Extract image features in a hierarchical way (see the sketch below)
‣ Similar to how neurons in the visual cortex respond to particular stimuli
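To make "hierarchical feature extraction" concrete, here is a minimal sketch of a small CNN in PyTorch. The layer sizes, input size, and library choice are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch (PyTorch assumed): stacked convolution + pooling layers
# extract increasingly abstract image features, loosely analogous to the
# hierarchy of receptive fields in the visual cortex.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # early layer: edges, textures
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # later layer: larger-scale patterns
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        h = self.features(x)                  # hierarchical feature maps
        return self.classifier(h.flatten(1))  # class scores

scores = TinyCNN()(torch.randn(1, 3, 32, 32))  # one fake 32x32 RGB image
print(scores.shape)  # torch.Size([1, 10])
```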
TRANSFORMERS
‣ Handle sequential data like text and speech – the attention mechanism allows the transformer to focus on specific parts of the input sequence (see the sketch below)
‣ Basis for Large Language Models (LLMs)
‣ Constructed to solve a specific problem, not to model a particular brain function
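The attention operation itself can be stated in a few lines. The sketch below shows scaled dot-product self-attention in PyTorch; the tensor shapes and library choice are illustrative assumptions.

```python
# Minimal sketch of the attention mechanism at the heart of a transformer:
# each position in the sequence computes a weighted average over all positions,
# with weights derived from query-key similarity.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (batch, seq_len, d_model)
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5   # pairwise query-key similarity
    weights = F.softmax(scores, dim=-1)           # how much each position attends to the others
    return weights @ v                            # weighted sum of value vectors

x = torch.randn(1, 5, 64)                         # toy sequence of 5 token vectors
out = scaled_dot_product_attention(x, x, x)       # self-attention over the sequence
print(out.shape)  # torch.Size([1, 5, 64])
```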
BRAIN TRANSFORMERS?
‣ The brain could theoretically implement the core computations performed by transformer networks
‣ Provides a novel perspective on the relationship between LLMs and the brain
PHILOSOPHICAL ANALYSIS
HOW-ACTUALLY MODEL
‣ Propositional model of how a phenomenon actually occurs – a model of how things actually are
HOW-POSSIBLY MODEL
‣ Propositional model of how a phenomenon might possibly occur – how things could possibly be
THEY REPRESENT IN DEGREES AND RESPECTS, NOT EITHER-OR
‣ Since the representation relationship is about similarities – about representing more or less – we cannot say:
‣ Model A represents a possibility that is actual
‣ Model B represents a possibility that is not actual
INSTEAD OF DIVIDING MODELS INTO POSSIBLY–ACTUAL:
‣ Adjust their similarity requirements
‣ If we hold a model to less strict similarity requirements, it may succeed in representing a target, if only roughly
‣ A model that succeeds in representing a target due to decreased similarity requirements should be viewed as a how-roughly model
EXAMPLE: COPERNICAN MODEL
‣ We now know it to be incorrect:
‣ An inaccurate how-possibly model according to the old view
‣ Shift to thinking about similarities:
‣ With decreased similarity requirements, the model would be correct
‣ A how-roughly model that captures important features, even if it is not 100% correct
‣ Models can be tested by running them in simulations where their performance is examined
‣ When we interact with ChatGPT, we are running the LLM in a simulation (see the sketch below)
‣ When we use image recognition with CNNs, we run simulations that employ these models
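As an illustration of "running the LLM in a simulation", the sketch below loads a small openly available model and generates text from a prompt. It assumes the Hugging Face `transformers` library and GPT-2; neither is named in the slides.

```python
# Minimal sketch, assuming the `transformers` library and a small open model.
# "Running the model" amounts to loading trained weights and generating text.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
output = generator("The brain and the transformer are", max_new_tokens=30)
print(output[0]["generated_text"])
```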
RIGHT NOW
‣ A reinforcement learning scenario where the agents' behavior is studied in various ways
‣ A simulated ecosystem where behavior is learned through reinforcement learning (a minimal sketch follows below)
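A minimal, self-contained illustration of behavior learned through reinforcement learning: a tabular Q-learning agent in a toy one-dimensional "foraging" world. The environment is a hypothetical stand-in, not the actual simulated ecosystem referred to above.

```python
# Tabular Q-learning in a toy 1-D world: positions 0..4, food at position 4.
import random

N_STATES, ACTIONS = 5, (-1, +1)          # move left / move right
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.1, 0.9, 0.1    # learning rate, discount, exploration rate

for episode in range(500):
    state = 0
    while state != N_STATES - 1:         # episode ends when the food is reached
        if random.random() < epsilon:
            action = random.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])   # exploit
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

# After training, the learned policy should be +1 (move toward the food) in every state.
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)})
```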
GOING FORWARD
‣ Equip the agents with LLMs (a rough sketch follows below)
‣ The theory behind this is that the agents' "cognitive functions" could be viewed as how-roughly models that are tested in an environment
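A rough sketch of what embedding an agent with an LLM could look like: the agent observes the environment, asks the model for an action, and acts. Both `query_llm` and the toy ecosystem below are hypothetical placeholders for illustration, not an existing implementation.

```python
# Sketch of an LLM-driven agent loop in a toy environment (all names hypothetical).

def query_llm(prompt: str) -> str:
    # Placeholder "model": always chooses to eat. Swap in a real LLM call here.
    return "eat"

class ToyEcosystem:
    # Hypothetical minimal environment: energy drops each step, eating restores it.
    def __init__(self):
        self.energy = 5

    def reset(self):
        self.energy = 5
        return {"energy": self.energy}

    def step(self, action):
        self.energy += 2 if action == "eat" else -1
        reward = 1.0 if action == "eat" else 0.0
        done = self.energy <= 0
        return {"energy": self.energy}, reward, done

def run_agent(env, steps=10):
    obs = env.reset()
    for _ in range(steps):
        prompt = (
            "You are an agent in a simulated ecosystem.\n"
            f"Observation: {obs}\n"
            "Choose one action: move, eat, or wait."
        )
        action = query_llm(prompt).strip().lower()   # the LLM acts as the "cognitive function"
        obs, reward, done = env.step(action)         # the environment scores the behavior
        if done:
            break

run_agent(ToyEcosystem())
```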
Just as the Copernican model could teach us things about the solar system without being 100% correct, transformers and LLMs may be able to teach us about the brain without being 100% accurate models of the brain, or even intended as brain models to begin with. They could be seen as how-roughly models. Simulated environments with embedded agents that employ transformer-based LLMs could help us test this, and the same methodology could be used to test other cognitive theories.