
Cognitive Plausibility of Neural Language Models

tatsuki kuribayashi

December 05, 2022

Transcript

  1. Research topics including collaborative works 1/2

    2022/12/6 MBZUAI

    - Computational psycholinguistics and NLP
      - Cognitive modeling using neural language models [Kuribayashi+, ACL2021] [Kuribayashi+, EMNLP2022]
      - Word preferences in language models [Kuribayashi+, ACL2020]
      - Language model training efficiency with respect to multi-modality or multi-linguality (to appear?)
      - Organizing committee of the CMCL (Cognitive Modeling and Computational Linguistics) workshop 2023 (if the proposal is accepted)
    - Writing assistance
      - Writing assistance system (Langsmith) [Ito+ (equal cont.), EMNLP2021 (demo)]
      - Tool for developing NLP-powered editors (language server protocol) [Hagiwara+, EMNLP2019 (demo)]
      - Translating rough drafts into academic-style texts [Ito+ (equal cont.), INLG2019]
    - Discourse processing (next slides…)
    - Interpretability (next slides…)
    - (Slide annotations: Today’s topic; Service page)
  2. Research topics including collaborative works 2/2

    - Modeling discourse phenomena
      - Topicalization preferences in humans and Japanese LMs [Fujihara+, COLING2022] (2nd author)
      - Ellipsis preferences in humans and Japanese LMs (to appear?)
      - Modeling event salience in a narrative [Otake+, COLING2020]
      - Argumentation structure parsing [Kuribayashi+, ACL2019]
    - Interpretability
      - Analyzing Transformers with vector norms [Kobayashi+, EMNLP2020, 2021] (2nd author)
      - Chain-of-thought abilities of vanilla seq2seq [Aoki+, AACL-SRW2022 (best paper!, but non-archival)]
  3. Brief summary: Which language models (LMs) simulate human reading better?

    - Setup: the same text (e.g., "I have a pen that…") is processed by an LM (e.g., surprisal −log p(word|context)) and by humans (e.g., human gaze duration), and the two are compared.
    - [Figure: models (Trans-sm, LSTM, Trans-lg, + N-gram), number of updates (100, 1000, 10000, 100000), data size (SM, MD, LG); better PPL vs. human-likeness: different biases 😲]
    - [Figure: LSTM-xs-Wiki, GPT2-xs-Wiki, GPT2-md-Wiki, GPT2-sm, GPT2-md, GPT2-lg, GPT2-xl with a context access limitation: human-like, good correlation 😲]
    - The model’s bias may be similar to the actual humans’ bias.
  4. What’s the goal of natural language processing (NLP)?

    (Why computational psycholinguistics?)
  9. What’s the goal of NLP?

    - NLP: a branch of artificial intelligence focusing on language
    - Go back to the definition of artificial intelligence [Shapiro, 2008]: … definition may be examined more closely by considering the field from three points of view:
      1. Machine intelligence: push outwards the frontier of what we know how to program on computers, especially in the direction of tasks that, although we don’t know how to program them, people can perform.
         - progressed (I believe). E.g., machine translation, information retrieval…
      2. Computational philosophy: form a computational understanding of human-level intelligent behavior, without being restricted to the algorithms and data structures that the human mind actually does (or conceivably might) use.
         - (if human-level intelligence is implementable on a computer by any means)
         - progressed (I believe). E.g., gigantic language models
      3. Computational psychology: understand human intelligent behavior by creating computer programs that behave in the same way that people do. For this goal it is important that the algorithm expressed by the program be the same algorithm that people actually use, and the data structure…
         - often unstated, but pivotal goal
  11. Case 1: Enhancing the feasibility of psycholinguistic research

    - Background: the three points of view on artificial intelligence [Shapiro, 2008] (machine intelligence, computational philosophy, computational psychology); computational psychology is the often unstated, but pivotal goal.
    - Q. What’s the key to humans’ efficient language acquisition? Are some innate biases involved?
      - Raising children without any language experience from birth, then… [Coulton, 1972]: ethical issues
      - As a computational simulation of human language acquisition, train language models in a setting as close as possible to the human language acquisition environment (e.g., multi-modal input), then identify which factors were important [Warstadt&Bowman, 2022]: feasible
  15. Case 2: Exact simulation of humans in applications

    - Need feedback about my writing
    - Need an expected (human) reader
    - Writer: "In which part of this text might readers find it difficult to follow?"
    - Human reader: "Ah…, the flow of the first paragraph is difficult to follow"
    - Suppose…
      - Super robust model magically(?) inferring the writer's intention: "Okay, first, I compute self-attention over the full text…"
      - Human-like model showing the same processing difficulty humans do: "the definition of t is written in …Section 4, and u is in Section 2, ... ah….!" (compatible with the human reader)
  17. Case 3: Measuring the texts

    - Modeling humans using modern NLP
      - Theory: people try to transmit a constant amount of information across time
      - Technical issues: "Current techniques are not very good at estimating H(word|context), because we do not have a very good model of context,…" [Genzel and Charniak, 2002]
      - Modern approach: observing the surprisal −log p(word|context) computed by a neural LM (e.g., for "I have a pen that…")
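
The surprisal −log p(word|context) on this slide can be illustrated with a much smaller model than a neural LM. The sketch below is only an assumption-laden stand-in: it uses an add-one-smoothed bigram model over a hypothetical three-sentence corpus, so the "context" is just the previous word. The corpus and the names `train_bigram`, `surprisal`, and `sentence_surprisals` are illustrative and do not come from the talk.

```python
import math
from collections import Counter

# Hypothetical toy corpus; in the talk's setting a neural LM would supply p(word|context).
CORPUS = [
    "i have a pen".split(),
    "i have a pen that writes".split(),
    "i have an apple".split(),
]


def train_bigram(sentences):
    """Count contexts and bigrams, using "<s>" as the sentence-start symbol."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sent in sentences:
        prev = "<s>"
        for word in sent:
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
            vocab.add(word)
            prev = word
    return context_counts, bigram_counts, vocab


def surprisal(word, prev, model):
    """Surprisal in bits, -log2 p(word | prev), with add-one smoothing."""
    context_counts, bigram_counts, vocab = model
    p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))
    return -math.log2(p)


def sentence_surprisals(sentence, model):
    """Return (word, surprisal) pairs for a tokenized sentence."""
    prev, out = "<s>", []
    for word in sentence:
        out.append((word, surprisal(word, prev, model)))
        prev = word
    return out


MODEL = train_bigram(CORPUS)

if __name__ == "__main__":
    for word, bits in sentence_surprisals("i have a pen that".split(), MODEL):
        print(f"{word:>5s}  {bits:.2f} bits")
```

Words that the toy model has often seen after the previous word get a low surprisal, and unseen continuations get a high one; a neural LM plays the same role with a far richer notion of context.
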
  18. What model computes human-like surprisal?

    - Surprisal −log p_θ(word|context) simulates human reading behavior well [Levy, 2008] [Smith&Levy, 2013]
    - [Figure: the same text ("I have a pen that…") is given to modern NLP models (e.g., surprisal −log p(word|context)) and to humans (e.g., human gaze duration).]
  19. What model computes human-like surprisal?

    - Surprisals −log p_θ(word|context) computed by different models θ are compared [Wilcox+, 2020]…
    - Too many LM variants 🤗
    - [Figure: the same text ("I have a pen that…") is given to modern NLP models (e.g., surprisal −log p(word|context)) and to humans (e.g., human gaze duration).]
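
One simple way to compare models θ against human data is to correlate each model's per-word surprisal with per-word gaze duration. The sketch below uses a plain Pearson correlation as a stand-in; the cited papers may use other measures (for example, regression-based ones), and all numbers and model names here are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-word gaze durations (ms) for a five-word text.
GAZE_MS = [210.0, 190.0, 180.0, 260.0, 320.0]

# Hypothetical per-word surprisals (bits) from two made-up models.
SURPRISAL_BY_MODEL = {
    "model_a": [3.1, 2.0, 1.8, 5.2, 7.9],
    "model_b": [4.0, 4.2, 3.9, 4.1, 4.3],
}


def rank_models(gaze, surprisal_by_model):
    """Sort models by how strongly their surprisal correlates with gaze duration."""
    scores = {name: pearson(s, gaze) for name, s in surprisal_by_model.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, r in rank_models(GAZE_MS, SURPRISAL_BY_MODEL):
        print(f"{name}: r = {r:.3f}")
```

Under this toy measure, the model whose surprisal tracks the gaze durations more closely ranks first; the ranking says nothing about which model has the better perplexity.
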
  20. What model computes human-like surprisal?

    - E.g., hierarchical bias vs. sequential bias [Hale+, ACL2018 (best paper)] [Yoshida+, EMNLP2021]
    - [Figure: models A, B, and C: a vanilla (sequential) language model, "I know grammar α", and "I know grammar β"; the grammar β model shows the better correlation.]
    - Grammar β is likely related to human sentence processing
  22. Does scaling solve the cognitive modeling?

    - Does the scaling law work even in modeling human reading behavior? [Kuribayashi+, ACL21]
    - Background: the scaling law for language model performance (perplexity; PPL) [Kaplan+, 2020]
    - [Figure: scaling ∝ cognitively plausible model?]
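
For reference, perplexity is the exponentiated average surprisal over a text, which is why the two quantities are so closely tied. The notation below is an assumption for illustration: $w_i$ is the $i$-th word, $w_{<i}$ its preceding context, $N$ the number of words, and the logarithm is natural.

```latex
\mathrm{PPL}(\theta)
  = \exp\!\left( \frac{1}{N} \sum_{i=1}^{N} -\log p_{\theta}(w_i \mid w_{<i}) \right)
```

A lower PPL therefore means a lower average surprisal on the evaluated text; it does not by itself say whether the word-by-word surprisal pattern matches human reading behavior.
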
  24. Does scaling solve the cognitive modeling?

    - Language-dependent results [Kuribayashi+, ACL21]
    - [Figure: PPL (better to worse) vs. gaze duration modeling (better to worse), for English and Japanese; models: Trans-sm, LSTM, Trans-lg, + N-gram; number of updates: 100, 1000, 10000, 100000; data size: SM, MD, LG.]
      - English: Spearman’s r = -0.87
      - Japanese: r = 0.53 (scaling law breaks)
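
The slide reports Spearman's rank correlation between perplexity and how well each model fits gaze durations. As a minimal sketch of how such a coefficient is computed, the code below ranks both variables (averaging tied ranks) and takes the Pearson correlation of the ranks. The PPL and fit values are hypothetical and are not the talk's data.

```python
import math


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical values for five made-up models (not the talk's data).
PPL = [120.0, 95.0, 60.0, 45.0, 30.0]
GAZE_FIT = [0.8, 1.1, 1.0, 1.5, 1.9]  # higher = better gaze duration modeling

if __name__ == "__main__":
    print(f"Spearman's r = {spearman(PPL, GAZE_FIT):.2f}")
```

A negative coefficient in this toy setup means that models with lower (better) PPL tend to fit gaze durations better, which is the English-like pattern; a positive coefficient would correspond to the pattern the slide reports for Japanese.
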
  25. Human-like slow downs (speed up) diminished

    - SOV language arguably incurs non-uniform processing cost across tokens
    - [Figure: change of gaze duration (ms) by position in sentence, for the Dundee Corpus (English) and BCCWJ-EyeTrack (Japanese); reading time stats. [Kuribayashi+, 2021]; stats. in toy corpus [Maurits+, 2010].]
    - Japanese [Kuribayashi+, ACL21]
  26. Human-like slow downs (speed up) diminished

    - Human-like variations of surprisal diminished [Kuribayashi+, ACL21]
    - [Figure: effect of syntactic category; models: Trans-sm, LSTM, Trans-lg, + N-gram; number of updates: 100, 1000, 10000, 100000; data size: SM, MD, LG.]
    - With better PPL, the by-word category variation of surprisal diminished
  28. Similar reports: human-like slow downs diminished

    - Processing difficulty in syntactic violation differs between LMs and humans [Wilcox+, ACL21]
      - LMs under-predict the difficulty
      - 😌 I know that my mother sent the present to Taylor last weekend.
      - 😵‍💫 I know who my mother sent the present to Taylor last weekend.
    - [Figure: "Predicted vs. Observed Slowdown Between Conditions"; slowdown in milliseconds (0 to 400) by test suite (Cleft, FGD-obj, FGD-pp, FGD-sbj, MVRR, NPL-any-orc, NPL-any-src, NPL-ever-orc, NPL-ever-src, RNA-f-orc, RNA-f-src, RNA-m-orc, RNA-m-src, SVNA-orc, SVNA-pp, SVNA-src) for gpt2, rnng, jrnn, grnn, and human.]
    - LMs processed the text too smoothly; humans are more surprised
  30. Cognitive plausibility of noisy language models

    - Tested LMs with limited context access [Kuribayashi+, EMNLP22]
    - Theories: human context access during syntactic processing is limited
      - This acts as a pressure to avoid long dependencies and deep nesting in natural language
    - Example: "… people wearing a red hat come …"
    - [Figure: gaze duration modeling (better) vs. noise severity (more severe), for English and Japanese; models: LSTM-xs-Wiki, GPT2-xs-Wiki, GPT2-md-Wiki, GPT2-sm, GPT2-md, GPT2-lg, GPT2-xl.]
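
Limiting context access can be sketched as hiding everything but the last few tokens before asking a model for a word's surprisal. The code below is only an illustration of that manipulation, not the procedure of [Kuribayashi+, EMNLP22]: `limited_context_surprisals`, the `window` parameter, and the `toy_surprisal` scorer are all hypothetical, and the scorer simply pretends that the verb "come" is easy to predict only when its subject "people" is still visible.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Any function mapping (visible context, word) to a surprisal value.
SurprisalFn = Callable[[Sequence[str], str], float]


def limited_context_surprisals(
    tokens: Sequence[str],
    surprisal_fn: SurprisalFn,
    window: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score each word while exposing only the last `window` context tokens.

    `window=None` means the full preceding context is visible.
    """
    results = []
    for i, word in enumerate(tokens):
        if window is None:
            context = tokens[:i]
        else:
            context = tokens[max(0, i - window):i]
        results.append((word, surprisal_fn(context, word)))
    return results


def toy_surprisal(context: Sequence[str], word: str) -> float:
    """Hypothetical scorer: "come" is easy only if its subject "people" is visible."""
    if word == "come":
        return 1.0 if "people" in context else 5.0
    return 2.0


if __name__ == "__main__":
    tokens = "people wearing a red hat come".split()
    for window in (None, 2):
        scores = limited_context_surprisals(tokens, toy_surprisal, window)
        print(window, scores[-1])
```

With the full context the toy scorer finds "come" easy, while a two-token window hides "people" and raises its surprisal, mimicking a reader whose access to distant context is limited.
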
  31. Summary

    - There are at least three mindsets in NLP: machine intelligence, computational philosophy, and computational psychology
      - They are not exclusive
      - This talk is not about which mindset is correct
    - A model that is good from an engineering standpoint is not always human-like (at least in our settings)
      - This does not mean that the current NLP directions are wrong
      - When achieving something, simply replicating nature is not always a good idea (e.g., an airplane does not flap its wings as birds do)
    - Understanding humans will continue to be a challenging goal
      - At least, scaling does not solve it
    - https://www.lesswrong.com/posts/eqxqgFxymP8hXDTt5/announcing-the-inverse-scaling-prize-usd250k-prize-pool