(neuro)science with AI: Machine learning as scientific modeling

Slide 1

Slide 1 text

(neuro)science with AI: Machine learning as scientific modeling
Gaël Varoquaux
Predictive models avoid excessive reductionism in cognitive neuroimaging [Varoquaux and Poldrack 2019]
AI as statistical methods for imperfect theories [Varoquaux 2021]

Slide 2

Slide 2 text

My scientific wanderings
Physics: quantum physics (PhD with Alain Aspect), atom-interferometric tests of relativity
Brain image analysis for cognition: statistics, machine learning, image analysis; cognitive neuroscience, psychology
Machine learning for public health: informing policy?
From absolute quantities to qualitative subject matters

Slide 3

Slide 3 text

Questions of interest
How does scientific knowledge emerge from data?
Can we have statistical control over this process?
What role do models play?

Slide 4

Slide 4 text

This talk
1 Evidence in (cognitive) neuroscience
2 Medical neuroimaging
3 Rethinking modeling

Slide 5

Slide 5 text

1 Evidence in (cognitive) neuroscience

Slide 6

Slide 6 text

Human neuroscience
Observed at a distance, with difficult interventions
The ideal experiment would observe all neurons and intervene directly on them
Neuroscience knowledge is built as early astronomy was

Slide 7

Slide 7 text

Brain imaging
Brain images = blurry recordings of many neurons with many ongoing processes
Very complex to model
Complete statistical model hopeless
Machine learning to model the brain

Slide 8

Slide 8 text

Probing a mental process via opposition
1 Craft an experimental condition that recruits it

Slide 9

Slide 9 text

Probing a mental process via opposition
1 Craft an experimental condition that recruits it
2 Do an elementary psychological manipulation

Slide 10

Slide 10 text

Probing a mental process via opposition
Isolate mental processes
Reason on the contrast
Also for reaction times, pathologies...

Slide 11

Slide 11 text

The lens of the cognitive model
Psychological manipulations are designed and interpreted based on a cognitive model
Experimental “paradigm”
Task & stimuli used – should recruit the right mental processes
Opposition used – should cancel out “nuisances”

Slide 12

Slide 12 text

The visual system: a paradigmatic example
Successive experiments have revealed specialized regions
[Hubel and Wiesel 1959, Logothetis... 1995, Kanwisher... 1997]

Slide 13

Slide 13 text

The visual system: a paradigmatic example
Successive experiments have revealed specialized regions
But evidence is tied to a theory decomposing mental processes
Is there a car area?
[Poldrack 2010]

Slide 14

Slide 14 text

Problem: Brain signals would struggle to debunk false theories
Successive experiments have revealed specialized regions
But evidence is tied to a theory decomposing mental processes
Is there a car area?
Ingredients now considered invalid would yield significant differences: “philoprogenitiveness”, “alimentiveness”, “mirthfulness”...
[Poldrack 2010]

Slide 15

Slide 15 text

Problem: The inference is the wrong way [Poldrack 2006]
What mental process is supported by this brain structure?
Salience

Slide 16

Slide 16 text

Problem: The inference is the wrong way [Poldrack 2006]
What mental process is supported by this brain structure?
Salience, executive control
The experimental manipulation implies the observed response

Slide 17

Slide 17 text

Problem: The inference is the wrong way [Poldrack 2006]
What mental process is supported by this brain structure?
Pain, executive control, salience
The experimental manipulation implies the observed response
Empirical evidence: P(neural activity | mental process)

Slide 18

Slide 18 text

Problem: The inference is the wrong way [Poldrack 2006]
What mental process is supported by this brain structure?
[Figure labels: pain, executive control, salience]
The experimental manipulation implies the observed response
Empirical evidence: P(neural activity | mental process)
To conclude that neural activity ⇒ mental process:
High-dimensional statistics (many brain regions / neurons)
Requires data on many / all mental processes
Ideally would be a causal claim
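The gap between the evidence collected, P(neural activity | mental process), and the desired conclusion, P(mental process | neural activity), can be illustrated with Bayes' rule. The sketch below is only an illustration: every probability in it is a made-up number, not a value from the talk or from any study, and the "other" category is a hypothetical stand-in for all remaining mental processes.

```python
def reverse_inference(likelihood, prior):
    """Return P(process | activation) from P(activation | process) and P(process)."""
    # Total probability of observing the activation, over all processes.
    evidence = sum(likelihood[p] * prior[p] for p in likelihood)
    return {p: likelihood[p] * prior[p] / evidence for p in likelihood}


# Hypothetical numbers: how often each process activates the region...
likelihood = {"salience": 0.8, "pain": 0.7, "executive control": 0.6, "other": 0.2}
# ...and how often each process is engaged in the first place.
prior = {"salience": 0.1, "pain": 0.1, "executive control": 0.2, "other": 0.6}

posterior = reverse_inference(likelihood, prior)
for process, probability in posterior.items():
    print(f"P({process} | activation) = {probability:.2f}")
```

With these assumed numbers, salience activates the region 80% of the time, yet observing the activation leaves salience with a posterior of roughly 0.2, because several other processes also recruit the region. This is why the reverse inference needs data on many mental processes.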

Slide 19

Slide 19 text

New methodology: predicting the task
Machine learning to predict mental processes from activity
[Figure labels: pain, executive control, salience]
High-dimensional statistics
Machine learning: abandoning well-posed maximum likelihood
Requires data on many / all mental processes
Challenge = calibrated labeling of mental processes in tasks (not only oppositions)
Ideally would be a causal claim
Let me come back to this
[Poldrack 2011, Varoquaux... 2018, Menuet... 2022]
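A minimal sketch of what "predicting the task" means in practice: learn a mapping from activity patterns to mental-process labels, then judge it on data it has not seen. Everything here is an assumption made for illustration: the data are simulated, the 20 "regions" and the noise level are arbitrary, and the nearest-centroid classifier is a deliberately simple stand-in, not the method used in the cited studies.

```python
import random


def make_data(rng, patterns, n_per_class, noise=1.0):
    """Simulate noisy activity vectors, one class per mental process."""
    samples = []
    for label, pattern in patterns.items():
        for _ in range(n_per_class):
            samples.append(([m + rng.gauss(0, noise) for m in pattern], label))
    return samples


def fit_centroids(samples):
    """Average the training activity vectors of each process."""
    sums, counts = {}, {}
    for x, label in samples:
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, x):
    """Return the process whose centroid is closest to the activity vector."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


rng = random.Random(0)
n_regions = 20  # hypothetical number of brain regions
processes = ("pain", "salience", "executive control")
patterns = {p: [rng.gauss(0, 1) for _ in range(n_regions)] for p in processes}

train = make_data(rng, patterns, n_per_class=30)
test = make_data(rng, patterns, n_per_class=30)  # held-out data

centroids = fit_centroids(train)
accuracy = sum(predict(centroids, x) == label for x, label in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f} (chance = {1 / len(processes):.2f})")
```

The quantity of interest is the held-out accuracy compared to chance, not the fitted parameters: the claim being tested is that activity carries information about the mental process.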

Slide 20

Slide 20 text

Slide 21

Slide 21 text

New methodology: AI models for less reductionist task decomposition
Computer vision as a model for human vision
Internal representations capture all aspects of natural stimuli
Mapping them to brain responses with high-dimensional predictors
Avoids choosing a few ingredients/facets of a cognitive process (excess reductionism) [Varoquaux and Poldrack 2019]
Can generalize across experimental paradigms [Eickenberg... 2017]
[Yamins... 2014]

Slide 22

Slide 22 text

Evidence in cognitive neuroscience
Focus on significance rather than signal fit leaves the door open to wrong models
Well-posed models must be overly simple, and cannot answer the questions of interest
Machine learning / AI makes it possible to model the complexity of actual situations
But we want understanding
The answer does not lie in simplistic mechanistic models, which cannot be confronted with data

Slide 23

Slide 23 text

2 Medical neuroimaging

Slide 24

Slide 24 text

Health is an observable (not a latent factor)
We can predict it
Machine learning for the win

Slide 25

Slide 25 text

Goals of improving health
Easier, no “construct”, “understanding”...
Easier!?

Slide 26

Slide 26 text

Success! Predicting mental disorders despite heterogeneity
Autism = heterogeneous, symptom-defined disorder
Can brain imaging give universal diagnostic criteria?
[Figure: prediction accuracy against fraction of subjects used]
Prediction to new sites works as well with enough data (n = 1 000)
[Abraham... 2017]

Slide 27

Slide 27 text

Addressing the craving for data: proxy measures [Liem... 2017]
Use common health outcome ⇒ more data
Capturing aging by associating brain image to chronological age
Discrepancy with chronological age (brain-age delta) correlates with cognitive impairment
[Figure: brain aging discrepancy (years) by objective cognitive impairment group: normal -0.38, mild 0.74, major 1.72]
Brain proxy of aging
Avoids simplistic disease dichotomy
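The brain-age delta can be sketched as the difference between the age predicted from imaging and the chronological age. The example below is a toy version under loud assumptions: a single hypothetical imaging feature (an invented "grey-matter index"), invented reference data, and a one-variable least-squares fit, whereas the cited work uses multimodal images and richer models.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


# Hypothetical reference sample: imaging feature and chronological age (years).
train_feature = [0.90, 0.85, 0.80, 0.75, 0.70, 0.65]
train_age = [30, 40, 50, 60, 70, 80]
slope, intercept = fit_line(train_feature, train_age)


def brain_age_delta(feature, chronological_age):
    """Predicted (brain) age minus chronological age, in years."""
    return slope * feature + intercept - chronological_age


# A hypothetical new subject aged 60 whose feature looks like that of a 66-year-old.
print(f"brain-age delta: {brain_age_delta(0.72, 60):+.1f} years")
```

A positive delta means the brain "looks older" than the subject's age. Because chronological age is known for every subject, the predictive model can be trained on far more data than a model that needs clinical labels.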

Slide 28

Slide 28 text

Slide 29

Slide 29 text

Population imaging with proxy measures [Dadi... 2021]
Elusive constructs of mental health: intelligence, neuroticism
Make proxy measures: empirically tuned across many subjects
Aging, neuroticism, fluid intelligence: proxy measures relate more to real-life health behavior than canonical assessments
Socio-demographics + questionnaires relate more than brain images
Imaging seen as desirable to give intervention targets
A causal, not correlational question

Slide 30

Slide 30 text

Medical analysis of brain images
Machine learning promising for diagnosis, prognosis
Given sufficient labels
Machine-learning biases in data + labels, epidemiology 101 [Varoquaux and Cheplygina 2022]
For more data, defining new labels: proxy measures
Medical research: intervention targets?
Must be framed as a causal / counterfactual question
Drugs validated by randomized trials, not mechanisms

Slide 31

Slide 31 text

3 Rethinking modeling
AI as statistical methods for imperfect theories

Slide 32

Slide 32 text

Scientific progress and statistical evidence
Dominant framework of statistical reasoning:
Formulating a probabilistic model from mechanical hypotheses
Integrating empirical evidence (data) by fitting this model
Reasoning from model parameters
Rigour breaks down with wrong modeling ingredients
Science needs more reasoning from model outputs
For statistics: robustness to mis-specification
Generalization grounds scientific theories
Black-box phenomenological data models are good for science

Slide 33

Slide 33 text

Statistical evidence in science and data science
1. Model the data
Based on the knowledge and constructs of the field & the understanding of data collection
$$m \frac{d^2 \vec{x}}{dt^2} = \vec{F}, \qquad \vec{F} = q \left( \vec{E} + \frac{d \vec{x}}{dt} \times \vec{B} \right)$$
Intelligence: fluid intelligence, crystallized intelligence

Slide 34

Slide 34 text

Statistical evidence in science and data science
1. Model the data
Based on the knowledge and constructs of the field & the understanding of data collection
2. Statistical inference
Fit model to data (typically maximizing likelihood)
Reason from the model and its parameters
Relies on statistical modeling [Cox 2006]

Slide 35

Slide 35 text

Example: studying brain activity
Neural support of mental process
Model of task and mental processes ⇒ brain maps

Slide 36

Slide 36 text

Example: studying brain activity
Neural support of mental process
Model of task and mental processes ⇒ brain maps
Uncontrolled variability
In modeling across teams [Botvinik-Nezer... 2019]
Across software for same model [Bowring... 2019]
Even experts cannot choose the “right” model

Slide 37

Slide 37 text

Teachings from history of science
Current view of physics, maths, chemistry...
Building models from the right ingredients – “first principles”
The past
Refining relevant constructs from wrong models

Slide 38

Slide 38 text

The birth of mechanics
Early scientists (e.g. ancient Greece)
“Natural motion of objects”, no notion of force or acceleration
Observation of planetary motion (e.g. Kepler)
Search for regularities in planets – “harmonies”
The period squared is proportional to the cube of the major axis of the orbit
Modern laws of dynamics (Newton)
Differential calculus ⇒ laws with force and acceleration
Unite observations of celestial and earthly motions

Slide 39

Slide 39 text

The birth of mechanics
Early scientists (e.g. ancient Greece)
“Natural motion of objects”, no notion of force or acceleration
Lacking key ingredients
Observation of planetary motion (e.g. Kepler)
Search for regularities in planets – “harmonies”
The period squared is proportional to the cube of the major axis of the orbit
Phenomenological model crucial
Modern laws of dynamics (Newton)
Differential calculus ⇒ laws with force and acceleration
Unite observations of celestial and earthly motions
Validity established by strong generalizability
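Kepler's regularity is a phenomenological fit: a power law relating orbit size to period, found without any notion of force. As an illustration, the sketch below fits the exponent on the four inner planets only and then checks how well the fitted law generalizes to Saturn. The orbital values are standard rounded figures (semi-major axis in astronomical units, period in years); the function names and the log-log least-squares fit are choices made for this example.

```python
import math

# (semi-major axis in AU, orbital period in years), standard rounded values
planets = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(points):
    """Least-squares fit of log(T) = log_const + exponent * log(a)."""
    xs = [math.log(a) for a, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    exponent = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return exponent, mean_y - exponent * mean_x


# Fit on the inner planets only.
inner = [planets[name] for name in ("Mercury", "Venus", "Earth", "Mars")]
exponent, log_const = fit_power_law(inner)


def predict_period(semi_major_axis):
    """Predict the orbital period (years) from the fitted power law."""
    return math.exp(log_const + exponent * math.log(semi_major_axis))


# Generalization test on a planet far outside the fitted range.
a_saturn, t_saturn = planets["Saturn"]
relative_error = abs(predict_period(a_saturn) - t_saturn) / t_saturn
print(f"fitted exponent: {exponent:.3f} (Kepler's third law: 1.5)")
print(f"relative error on Saturn's period: {relative_error:.3%}")
```

The fitted exponent is close to 3/2, which is the third law (period squared proportional to size cubed), and the law extrapolates to an orbit almost ten times larger than any used in the fit. That kind of out-of-range generalization is what the slide credits with establishing validity.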

Slide 40

Slide 40 text

Slide 41

Slide 41 text

Modern physics does not need phenomenological models?
Vulcan: false discovery of a planet (19th century)
Anomaly in Mercury’s orbit not explained by Newtonian physics ⇒ invent and “observe” an additional planet, Vulcan
Theory-laden observations
Particle physics builds evidence with machine learning (today)
Fundamental laws of the universe = most precise theory ever
Particle detection by discriminating physics model from non-parametric background
“Pure” models insufficient for “dirty” reality

Slide 42

Slide 42 text

Phenomenological data fits have been crucial to science
Science uses false models as means for truer theory [Wimsatt 2007]
The reductionist aesthetics of “pure” simple mathematical theories is not adapted to the messy world beyond pure physics
Generalization or prediction failures make or break scientific theories

Slide 43

Slide 43 text

Statistics and scientific evidence
Validity, reasoning = more than formal problems

Slide 44

Slide 44 text

Slide 45

Slide 45 text

Slide 46

Slide 46 text

Validity of scientific findings – much more than statistical validity
External validity [Cook and Campbell 1979]
External validity asserts that findings apply beyond the study
Generalizability
Constructs and their validity [Cronbach and Meehl 1955]
Construct = abstract ingredients such as “intelligence”
Construct validity: measures and manipulations actually capture the theoretical construct
Implicit realist stances in theories
Realism = objective and mind-independent unobservable entities
Is intelligence a valid construct? How about a center of gravity?
Places implicit preferences on models beyond empirical evidence

Slide 47

Slide 47 text

Reasoning with statistical tools
Model reasoning [Cox 2006]
Carefully craft a probabilistic model of the data
Estimated model parameters are interpreted within its logic
“data descriptions that are potentially causal” [Cox 2001]
Warranted reasoning [Baiocchi and Rodu 2021]
Relies on warrants in the experiment (e.g. randomization)
Output reasoning [Breiman 2001, Baiocchi and Rodu 2021]
Relies on capacity to approximate relations

Slide 48

Slide 48 text

Benefits of reasoning on outputs rather than models
Science needs black-box output reasoning

Slide 49

Slide 49 text

For statistical validity
Even expert modeling choices explore meaningful variability
Model reasoning is conditional on the model: parameters have a meaning within a model
Imperfect science: 70 different teams of brain-imaging experts ⇒ qualitatively different neuroscience findings [Botvinik-Nezer... 2020]
Analytical variability breaks statistical control
Output reasoning: milder conditions for statistical control
Theoretical results in misspecified settings [Hsu... 2014]
Multi-collinearity no longer an issue
Higher-dimensional settings ⇒ forces less reductionist choices

Slide 50

Slide 50 text

For understanding?
“Nobody understands quantum mechanics” (Richard Feynman)
Narrative truth versus operational truth
Humans need stories, for teaching, for intuitions, for “selling”; these simplifications are not “truth”

Slide 51

Slide 51 text

For understanding: counterfactual reasoning
“Nobody understands quantum mechanics” (Richard Feynman)
Narrative truth versus operational truth
Humans need stories, for teaching, for intuitions, for “selling”; these simplifications are not “truth”
Counterfactual reasoning & causal inference
We want to reason on new situations
Causal, not correlational knowledge
Bad health is associated with hospitals, but seldom caused by them
Predictive models enable counterfactual reasoning if
- they extrapolate enough
- they build on the right variables (confounds, not colliders)
[Rose and Rizopoulos 2020, Doutreligne and Varoquaux 2023]
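The hospital example can be made concrete with a small simulation. This is a sketch under stated assumptions, not an analysis from the cited papers: the data-generating process is invented (sicker people are more likely to be hospitalized, and hospitalization lowers the bad-health score by 0.2), and the "predictive model" is the simplest possible one, a mean outcome per stratum of the confounder.

```python
import random


def simulate(n=20000, seed=0):
    """Invented world: severity drives both hospitalization and bad outcomes."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random()  # confounder, uniform in [0, 1)
        hospital = 1 if rng.random() < severity else 0  # sicker => more hospital
        # Bad-health score: hospital truly helps (effect of -0.2).
        outcome = severity - 0.2 * hospital + rng.gauss(0, 0.1)
        rows.append((severity, hospital, outcome))
    return rows


def mean(values):
    return sum(values) / len(values)


def naive_difference(rows):
    """Correlational contrast: hospitalized versus not, ignoring severity."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def adjusted_difference(rows, n_bins=10):
    """Contrast predictions with and without hospital at comparable severity."""
    total, weight = 0.0, 0
    for b in range(n_bins):
        stratum = [(t, y) for s, t, y in rows if min(int(s * n_bins), n_bins - 1) == b]
        treated = [y for t, y in stratum if t == 1]
        control = [y for t, y in stratum if t == 0]
        if treated and control:
            total += len(stratum) * (mean(treated) - mean(control))
            weight += len(stratum)
    return total / weight


rows = simulate()
naive = naive_difference(rows)
adjusted = adjusted_difference(rows)
print(f"naive difference:    {naive:+.3f}")
print(f"adjusted difference: {adjusted:+.3f} (simulated true effect: -0.200)")
```

The naive contrast has the wrong sign: hospitals look harmful because severity drives both hospitalization and outcome. Conditioning on the confounder recovers an effect close to the simulated one. Coarse strata leave a little residual confounding, so the match is approximate.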

Slide 52

Slide 52 text

Slide 53

Slide 53 text

For broader scientific validity of findings
The only strong evidence is strong generalization
Model reasoning favors internal validity
Model reasoning often needs “pure” models with little generalization
Fields without a unifying formal theory tackle empirical evidence with overly reductionist lenses
Machine learning/AI can model the full problem space and give testable generalization
Relating to more general constructs
Theories & models are written in terms of constructs (e.g. attention)
To help generalizing across vastly different situations
Must ground these directly on observations

Slide 54

Slide 54 text

AI gives statistical methods for imperfect theories
Model reasoning has no guarantees for imperfect models
Scientific roadblocks are on model ingredients, not functional forms
Proposal
Gauge models more on their predictions than their ingredients
Scientific inference from model predictions, as in [Eickenberg... 2017]: counterfactual reasoning, model comparison, feature importances
For neuroscience
Build predictive models with strong generalization rather than mechanistic explanations
@GaelVaroquaux

Slide 55

Slide 55 text

References I
A. Abraham, M. P. Milham, A. Di Martino, R. C. Craddock, D. Samaras, B. Thirion, and G. Varoquaux. Deriving reproducible biomarkers from multi-site resting-state data: An autism-based example. NeuroImage, 147:736–745, 2017.
M. Baiocchi and J. Rodu. Reasoning using data: Two old ways and one new. Observational Studies, 7(1):3–12, 2021.
R. Botvinik-Nezer, F. Holzmeister, C. F. Camerer, A. Dreber, J. Huber, M. Johannesson, M. Kirchler, R. Iwanir, J. A. Mumford, A. Adcock, ... Variability in the analysis of a single neuroimaging dataset by many teams. bioRxiv, 2019.
R. Botvinik-Nezer, F. Holzmeister, C. F. Camerer, A. Dreber, J. Huber, M. Johannesson, M. Kirchler, R. Iwanir, J. A. Mumford, R. A. Adcock, ... Variability in the analysis of a single neuroimaging dataset by many teams. Nature, 582(7810):84–88, 2020.
A. Bowring, C. Maumet, and T. E. Nichols. Exploring the impact of analysis software on task fMRI results. Human Brain Mapping, 40(11):3362–3384, 2019.
L. Breiman. Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3):199–231, 2001.

Slide 56

Slide 56 text

References II
T. Cook and D. Campbell. Quasi-experimentation: Design and analysis issues for field settings. Boston, MA: Houghton Mifflin, 1979.
D. R. Cox. [Statistical modeling: The two cultures]: Comment. Statistical Science, 16(3):216–218, 2001.
D. R. Cox. Principles of statistical inference. Cambridge University Press, 2006.
L. J. Cronbach and P. E. Meehl. Construct validity in psychological tests. Psychological Bulletin, 52:281, 1955.
K. Dadi, G. Varoquaux, J. Houenou, D. Bzdok, B. Thirion, and D. Engemann. Population modeling with machine learning can enhance measures of mental health. GigaScience, 10(10):giab071, 2021.
M. Doutreligne and G. Varoquaux. How to select predictive models for decision making or causal inference? Working paper or preprint, 2023. URL https://hal.science/hal-03946902.
M. Eickenberg, A. Gramfort, G. Varoquaux, and B. Thirion. Seeing it all: Convolutional network layers map the function of the human visual system. NeuroImage, 152:184–194, 2017.

Slide 57

Slide 57 text

References III
D. Hsu, S. Kakade, and T. Zhang. Random design analysis of ridge regression. Foundations of Computational Mathematics, 14, 2014.
D. H. Hubel and T. N. Wiesel. Receptive fields of single neurones in the cat’s striate cortex. J. Physiol., 148:574–591, 1959.
N. Kanwisher, J. McDermott, and M. M. Chun. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci., 17(11):4302–4311, 1997.
F. Liem, G. Varoquaux, J. Kynast, F. Beyer, S. K. Masouleh, J. M. Huntenburg, L. Lampe, M. Rahim, A. Abraham, R. C. Craddock, ... Predicting brain-age from multimodal imaging data captures cognitive impairment. NeuroImage, 2017.
N. K. Logothetis, J. Pauls, and T. Poggio. Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5:552, 1995.
R. Menuet, R. Meudec, J. Dockès, G. Varoquaux, and B. Thirion. Comprehensive decoding mental processes from web repositories of functional brain images. Scientific Reports, 12(1):1–14, 2022.

Slide 58

Slide 58 text

References IV
R. Poldrack. Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10:59, 2006.
R. A. Poldrack. Mapping mental function to brain structure: how can cognitive neuroimaging succeed? Perspectives on Psychological Science, 5:753, 2010.
R. A. Poldrack. Inferring mental states from neuroimaging data: from reverse inference to large-scale decoding. Neuron, 72:692, 2011. This review shows how decoding can be used on large-scale databases to ground formal reverse inference, capturing how selectively a brain area is activated by a mental process.
S. Rose and D. Rizopoulos. Machine learning for causal inference in biostatistics. Biostatistics, 21(2):336–338, 2020.
G. Varoquaux. AI as statistical methods for imperfect theories. In NeurIPS 2021 AI for Science Workshop, 2021.
G. Varoquaux and V. Cheplygina. Machine learning for medical imaging: methodological failures and recommendations for the future. NPJ Digital Medicine, 5(1):48, 2022.

Slide 59

Slide 59 text

References V
G. Varoquaux and R. A. Poldrack. Predictive models avoid excessive reductionism in cognitive neuroimaging. Current Opinion in Neurobiology, 55:1–6, 2019.
G. Varoquaux, Y. Schwartz, R. A. Poldrack, B. Gauthier, D. Bzdok, J.-B. Poline, and B. Thirion. Atlases of cognition with large-scale human brain mapping. PLOS Computational Biology, 14(11):1–18, 2018.
W. C. Wimsatt. Re-engineering philosophy for limited beings: Piecewise approximations to reality. Harvard University Press, 2007.
D. L. Yamins, H. Hong, C. F. Cadieu, E. A. Solomon, D. Seibert, and J. J. DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proc Natl Acad Sci, page 201403112, 2014. This study shows that models of neural response based on computer-vision artificial networks explain brain activity better than classic theoretical-neuroscience models of vision.