Slide 1

TRANSFORMERS AS SCIENTIFIC MODELS?
ANDREAS CHATZOPOULOS, UNIVERSITY OF GOTHENBURG, 2024

Slide 2

THE ORIGINS OF ARTIFICIAL INTELLIGENCE

Slide 3

NEURON MODEL
‣ McCulloch & Pitts, 1940s
‣ Crude model of a biological neuron
‣ Can serve as logic gates
[Diagram: units wired as or, and, and not gates]
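
As a minimal sketch of the idea (mine, not from the slide), a McCulloch–Pitts unit can be written in a few lines of Python: it fires when the weighted sum of its binary inputs reaches a threshold, and different weights and thresholds turn the same unit into the or, and, and not gates of the diagram.

# A McCulloch-Pitts unit: output 1 if the weighted sum of binary
# inputs reaches the threshold, else 0.
def mp_unit(inputs, weights, threshold):
    return 1 if sum(i * w for i, w in zip(inputs, weights)) >= threshold else 0

def gate_or(a, b):  return mp_unit([a, b], [1, 1], threshold=1)
def gate_and(a, b): return mp_unit([a, b], [1, 1], threshold=2)
def gate_not(b):    return mp_unit([b], [-1], threshold=0)   # inhibitory weight

assert gate_or(0, 1) == 1 and gate_and(0, 1) == 0 and gate_not(1) == 0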

Slide 4

PERCEPTRON
‣ Rosenblatt, 1950s
‣ More realistic – takes varying synaptic strength into account (Hebbian theory)
[Diagram: inputs a and b, weights w1 and w2, summation Σ, output y]
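
A corresponding sketch of the perceptron in the diagram, assuming a simple step activation (the bias term is my addition, not on the slide): inputs a and b are scaled by the synaptic weights w1 and w2, summed at the Σ node, and thresholded into the output y.

# Perceptron: weighted sum of inputs followed by a step activation.
def perceptron(a, b, w1, w2, bias=0.0):
    s = w1 * a + w2 * b + bias      # the Σ node
    return 1 if s >= 0 else 0       # step function -> y

# Illustrative error-driven weight update in the Hebbian spirit the
# slide mentions: the change is proportional to input and error.
def update_weight(w, x, error, lr=0.1):
    return w + lr * error * x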

Slide 5

Inspired by the functions of biological neurons in the brain
[Side-by-side illustration: NEURON MODEL and PERCEPTRON]

Slide 6

AI IN 2024
‣ Mostly Artificial Neural Networks
‣ Not intended as brain models – tools to accomplish various tasks

Slide 7

EXCEPTION
‣ Convolutional Neural Networks – in many ways developed with inspiration from the structure of the visual cortex

Slide 8

CONVOLUTIONAL NEURAL NETWORKS (CNNs)
‣ Layered system that processes information in a hierarchical way to extract image features
‣ Similar to how neurons in the visual cortex respond to particular stimuli
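
A hedged sketch of this layered, hierarchical processing, using PyTorch as one library choice among many; the layer sizes and the assumed 32×32 RGB input are illustrative, not from the slide.

import torch.nn as nn

# Stacked convolution + pooling stages extract increasingly abstract
# features: early layers respond to edges and blobs, deeper layers to
# textures and object parts.
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low-level features
    nn.ReLU(),
    nn.MaxPool2d(2),                              # downsample
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # higher-level features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # classifier head for 32x32 inputs
)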

Slide 9

NETWORKS AS TOOLS
‣ Others are developed as tools to accomplish various tasks with little regard to the functions of the brain

Slide 10

TRANSFORMERS
‣ Developed in 2017
‣ Designed to handle sequential data like text and speech – the attention mechanism allows the transformer to focus on specific parts of the input sequence
‣ Basis for Large Language Models (LLMs)
‣ Constructed to solve a specific problem, not to model a particular brain functionality
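
A minimal sketch of the attention computation this refers to – scaled dot-product attention from the 2017 transformer paper ("Attention Is All You Need"); the toy dimensions are my own.

import numpy as np

def attention(Q, K, V):
    # How strongly each query position attends to each key position.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V                               # weighted mix of values

x = np.random.randn(4, 8)      # a 4-token sequence, 8-dim embeddings
out = attention(x, x, x)       # self-attention: Q, K, V all from the input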

Slide 11

BRAIN-TRANSFORMERS?
‣ Neurons and astrocytes in the brain could theoretically implement the core computations performed by transformer networks
‣ Provides a novel perspective on the relationship between LLMs and the brain

Slide 12

Could we accidentally have stumbled upon a model of a mechanism that actually exists in the brain?

Slide 13

HOW-POSSIBLY MODEL?

Slide 14

PHILOSOPHICAL ANALYSIS
HOW-POSSIBLY MODEL
‣ Propositional model of how a phenomenon might possibly occur – how things could possibly be
HOW-ACTUAL MODEL
‣ Models a phenomenon in the way it actually occurs – a model of how things actually are

Slide 15

EPISTEMIC PLAUSIBILITY
‣ Scientific progress entails moving towards how-actual
‣ A hypothesis moves towards corroboration
[Diagram: spectrum from how-possibly via how-plausibly to how-actual]

Slide 16

Even if this could be seen as a model of a functionality that actually exists in the brain, it would only be a model of one particular feature of the brain

Slide 17

What if it is not an exact model of this feature? What if it is only somewhat accurate?

Slide 18

PHILOSOPHICAL ANALYSIS
HOW SHOULD THIS DIFFERENCE BE UNDERSTOOD? (Stuart Glennan)
[Diagram: several models, ranged from how-possibly to how-actual, each related to a target; a question mark asks where a given model sits]

Slide 19

PHILOSOPHICAL ANALYSIS
WHAT DOES THIS ACTUALLY MEAN? (Stuart Glennan)
[Diagram: model–target relation]

Slide 20

PHILOSOPHICAL ANALYSIS (Stuart Glennan)
The relationship is one of similarity in degrees and respects
[Diagram: model–target relation]

Slide 21

PHILOSOPHICAL ANALYSIS (Stuart Glennan)
Since the relationship is about similarities, about representing more or less, we cannot say:
‣ Model A represents a possibility that is actual
‣ Model B represents a possibility that is not actual
MODELS REPRESENT IN DEGREES AND RESPECTS, NOT EITHER-OR

Slide 22

PHILOSOPHICAL ANALYSIS (Stuart Glennan)
If we hold a model to less strict similarity requirements, it may succeed in representing a target, if only roughly

Slide 23

PHILOSOPHICAL ANALYSIS (Stuart Glennan)
INSTEAD OF DIVIDING MODELS INTO POSSIBLY–ACTUAL:
‣ Adjust their similarity requirements
‣ A model that succeeds in representing a target due to decreased similarity requirements should be viewed as a how-roughly model

Slide 24

EXAMPLE: THE COPERNICAN MODEL
‣ Postulates circular orbits for the planets
‣ We now know this to be incorrect:
‣ An inaccurate, how-possibly model according to the old view
‣ Shift to thinking about similarities:
‣ With decreased similarity requirements, the model would be correct
‣ A how-roughly model that captures important features, even if it is not 100% correct

Slide 25

HOW-POSSIBLY MODEL?

Slide 26

HOW-ROUGHLY MODEL

Slide 27

HOW DO WE TEST THIS?
Models can be tested by running them in simulations where their performance is examined
‣ When we interact with ChatGPT, we are running the LLM in a simulation
‣ When we use image recognition with CNNs, we run simulations that employ these models

Slide 28

BUILDING MODELS AND SIMULATIONS
The same methodology could be employed for many different cognitive theories
‣ Language processing (LLMs)
‣ Visual processing (CNNs)
‣ ...

Slide 29

TESTING BY EMBEDDING
To test the simulations, it is often advantageous to use them in agents placed in an environment

Slide 30

REINFORCEMENT LEARNING
[Diagram: agent–environment loop – the environment provides input to the agent; the agent performs actions in the environment]
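
The loop in the diagram, as a minimal runnable sketch; Gymnasium's CartPole stands in for the environment and a random policy stands in for the learning agent.

import gymnasium as gym

env = gym.make("CartPole-v1")
obs, info = env.reset()
for _ in range(200):
    action = env.action_space.sample()   # placeholder for a learned policy
    # The environment returns new input (observation) and a reward signal.
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()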

Slide 31

How to use LLMs to augment exploration in Reinforcement Learning

Slide 32

Use of LLMs to augment Reinforcement Learning in various ways

Slide 33

Embedding agents in virtual worlds to explore questions in Cognitive Science

Slide 34

RIGHT NOW
‣ Agents in an environment in a reinforcement learning scenario, where the agents' behavior is studied in various ways
‣ A simulated ecosystem where behavior is learned through reinforcement learning

Slide 35

GOING FORWARD
I want to develop this:
‣ Augment agents with LLMs
‣ The theory behind this is that the agents' "cognitive functions" could be viewed as how-roughly models that are tested in an environment (see the sketch after this list)
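
Purely as a hypothetical sketch of what augmenting an agent with an LLM might look like – query_llm is a placeholder I am assuming, not an existing API, and the prompt format is invented for illustration.

def query_llm(prompt: str) -> str:
    ...  # stand-in for whatever LLM interface the project ends up using

def choose_action(observation, actions):
    prompt = f"Observation: {observation}. Which of {actions} looks most promising?"
    suggestion = query_llm(prompt)
    # Fall back to a default action if the LLM's suggestion is unusable.
    return suggestion if suggestion in actions else actions[0]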

Slide 36

TO SUM IT UP
Just as the Copernican Model could teach us things about the solar system without being 100% correct, maybe transformers and LLMs can teach us about the brain without being 100% accurate models of the brain, or even intended as brain models to begin with. They could be seen as how-roughly models.

Simulated environments with embedded agents that employ transformer-based LLMs could help us test this, and the same methodology could be used to test other cognitive theories.

Slide 37

THANK YOU! [email protected]