
Machine Learning for Materials (Lecture 8)

Aron Walsh
February 12, 2024

Transcript

  1. Course Contents: 1. Course Introduction; 2. Materials Modelling; 3. Machine Learning Basics; 4. Materials Data and Representations; 5. Classical Learning; 6. Artificial Neural Networks; 7. Building a Model from Scratch; 8. Recent Advances in AI; 9 and 10. Research Challenge
  2. “A problem in artificial intelligence is one which is so complex that it cannot be solved using any normal algorithm.” Hugh M. Cartwright, Applications of AI in Chemistry (1993)
  3. Natural Language Processing (NLP): branch of AI that focuses on the interaction between computers and human language. Tasks range from easy to hard: spell checking, text classification, information extraction, question answering, conversational agents.
  4. Natural Language Processing (NLP): branch of AI that focuses on the interaction between computers and human language. Image from https://github.com/practical-nlp
  5. Natural Language Processing (NLP): many statements are ambiguous and require context to be understood. “Let’s eat grandma?” Essen wir Oma? 我们吃奶奶的饭? おばあちゃんを食べようか? Mangeons grand-mère? 할머니랑 같이 먹어요? Does the ambiguity of the English phrase translate? (image: DALL-E 3 model)
  6. What Happens Inside a Chatbot? From text prompt to text response via a large language model (LLM). Prompt: “Write a Limerick about Imperial College London”. Response: “In Imperial College, bright minds convene, / Where knowledge and innovation gleam. / From White City to South Ken, / A hub of brilliance, amen, / Where dreams are born and discoveries esteem!” 1. The LLM interprets the user prompt (encoding: words to vector). 2. The LLM generates a response (decoding: vector to words). Using GPT-3 via https://github.com/hwchase17/langchain
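A minimal sketch of this prompt-to-response loop in Python, assuming the openai client (version 1.x) and an OPENAI_API_KEY environment variable; the lecture used GPT-3 through langchain, and the model name below is an arbitrary choice:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; any chat model works
        messages=[{"role": "user",
                   "content": "Write a limerick about Imperial College London"}],
    )

    # The LLM encodes the prompt (words to vector), transforms it through
    # its layers, and decodes back to words; only the final text is returned
    print(response.choices[0].message.content)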
  7. Language Models: predictive text (using GPT-3 via https://github.com/hwchase17/langchain). Given the prompt “I love materials because …”, the top next words are ranked by probability (e.g. of, they, their, shape, are, like). A “temperature” parameter controls how the distribution of probabilities is sampled (“creativity”). A high-temperature sample: “I love materials because they ignite a symphony of vibrant colors, tantalizing textures, and wondrous possibilities that dance in the realms of imagination, transcending boundaries and embracing the sheer beauty of creation itself.” A low-temperature sample: “I love materials because they are essential.” (top candidates: strong, essential, beautiful)
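How temperature shapes the choice can be seen in a small NumPy sketch; the candidate words echo the slide, while the logit values are invented for illustration:

    import numpy as np

    # Toy next-word scores for "I love materials because they ..."
    words = ["are", "ignite", "their", "shape", "like"]
    logits = np.array([2.0, 0.5, 0.0, -0.5, -1.0])  # invented values
    rng = np.random.default_rng(0)

    def sample_word(logits, temperature):
        """Temperature-scaled softmax sampling: T -> 0 always picks the
        top-ranked word; large T flattens the distribution ("creativity")."""
        scaled = logits / temperature
        probs = np.exp(scaled - scaled.max())  # subtract max for stability
        probs /= probs.sum()
        return rng.choice(words, p=probs)

    print(sample_word(logits, temperature=0.1))  # almost always "are"
    print(sample_word(logits, temperature=2.0))  # more varied choices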
  8. Language Models: “large” refers to the size and capacity of the model. It must sample a literary combinatorial explosion: 10⁴ common words in English, 10⁸ two-word combinations, 10¹² three-word combinations, 10¹⁶ four-word combinations. Language must be represented numerically for machine learning models. Token: discrete scalar representation of a word (or subword). Embedding: continuous vector representation of tokens.
  9. Text to Tokens. Example: “ZnO is a wide bandgap semiconductor” maps to the token IDs [57, 77, 46, 318, 257, 3094, 4097, 43554, 39290, 40990] (GPT-3: https://platform.openai.com/tokenizer). The model looks up 768-dimensional embedding vectors from the (contextual) embedding matrix.
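The token IDs on the slide can be reproduced with the tiktoken library, assuming GPT-3's r50k_base byte-pair encoding (the same encoding used by the linked online tokenizer):

    import tiktoken

    enc = tiktoken.get_encoding("r50k_base")  # GPT-3 era encoding
    ids = enc.encode("ZnO is a wide bandgap semiconductor")
    print(ids)  # [57, 77, 46, 318, 257, 3094, 4097, 43554, 39290, 40990]

    # Each ID indexes a row of the embedding matrix, mapping the discrete
    # token to a continuous 768-dimensional vector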
  10. Large Language Models: deep learning models trained to generate text, e.g. BERT (370M, 2018) and GPT-3 (175B, 2020). Recent models include Llama 2 (Meta, 2023), Bard (Google, 2023), GPT-4 (OpenAI, 2023), and PanGu-Σ (Huawei, 2023). Image from https://towardsdatascience.com
  11. Large Language Models: GPT = “Generative Pre-trained Transformer” (Generative: generates new content; Pre-trained: trained on a large dataset; Transformer: deep learning architecture). Pipeline: user prompt → encode to a vector → transformer layers analyse relationships between vector components and generate a transformed vector → decode to words → response. Key components of a transformer layer: self-attention heads (smart focus on different parts of the input) and a feed-forward neural network (captures non-linear relationships). T. B. Brown et al, arXiv:2005.14165 (2020)
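A minimal NumPy sketch of one self-attention head (scaled dot-product attention); a real transformer layer adds multiple heads, residual connections, layer normalisation, and the feed-forward network:

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        """X: (n_tokens, d_model) embeddings. Each output vector is a
        probability-weighted mix of the value vectors, letting every
        token 'focus' on the most relevant parts of the input."""
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])  # token-token relevance
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax rows
        return weights @ V

    rng = np.random.default_rng(0)
    n_tokens, d_model, d_head = 5, 8, 4
    X = rng.normal(size=(n_tokens, d_model))
    Wq, Wk, Wv = (rng.normal(size=(d_model, d_head)) for _ in range(3))
    print(self_attention(X, Wq, Wk, Wv).shape)  # (5, 4)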
  12. Large Language Models: ongoing analysis into transformer architectures, e.g. “the structure of these interacting particle systems allows one to draw concrete connections to established topics in mathematics, including nonlinear transport equations”. B. Geshkovski et al, arXiv:2312.10794 (2023); image: https://pub.aimind.so
  13. Large Language Models: the essential ingredients of GPT are diverse data, a deep learning model, and validation on tasks. T. B. Brown et al, arXiv:2005.14165 (2020)
  14. Large Language Models: what are the potential drawbacks and limitations of LLMs such as GPT?
     • Training data, e.g. not up to date, strong bias
     • Context tracking, e.g. limited short-term memory
     • Hallucination, e.g. generate false information
     • Ownership, e.g. fair use of training data
     • Ethics, e.g. appear human generated
  15. LLMs for Materials: many possibilities, e.g. read a textbook and ask technical questions about the content. “The Future of Chemistry is Language”, A. D. White, Nat. Rev. Chem. 7, 457 (2023)
  16. LLMs for Materials: language models tailored to be fact-based with clear context, applied here to one of my review papers. https://github.com/whitead/paper-qa
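Illustrative usage of the paper-qa package referenced above, following its 2023-era synchronous interface (the filename and question are hypothetical, and the API may differ in newer releases):

    from paperqa import Docs

    docs = Docs()
    docs.add("walsh_review.pdf")  # hypothetical path to a review paper

    # Answers are grounded in, and cited against, the indexed document
    answer = docs.query("What factors limit the stability of halide perovskites?")
    print(answer.formatted_answer)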
  17. LLMs for Materials: CrystaLLM learns to write valid crystallographic information files (CIFs) and generate new structures. L. M. Antunes et al, arXiv:2307.04340 (2023); https://crystallm.com
  18. LLMs for Materials: CrystaLLM learns to write valid crystallographic information files (CIFs) and generate new structures. Training set: 2.2 million CIFs; validation set: 35,000 CIFs; test set: 10,000 CIFs. Tokenisation: space group symbols, element symbols, numeric digits. 768 million training tokens for a deep learning model with 25 million parameters. L. M. Antunes et al, arXiv:2307.04340 (2023); https://crystallm.com
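A simplified sketch in the spirit of this tokenisation scheme (keep space group and element symbols whole, split everything else down to digits); the vocabularies are truncated and the code is illustrative, not the authors' implementation:

    import re

    ELEMENTS = {"Zn", "O", "Si", "Ti", "Ba"}     # truncated for illustration
    SPACE_GROUPS = {"P6_3mc", "Fm-3m", "Pm-3m"}  # truncated for illustration

    def tokenise_cif_line(line: str) -> list[str]:
        tokens = []
        for word in line.split():
            if word in SPACE_GROUPS or word in ELEMENTS:
                tokens.append(word)  # keep known symbols as single tokens
            else:
                # split the rest into individual digits, points, and text runs
                tokens.extend(re.findall(r"\d|\.|[^\d\s]+", word))
        return tokens

    print(tokenise_cif_line("_symmetry_space_group_name_H-M P6_3mc"))
    print(tokenise_cif_line("Zn 0.3333 0.6667 0.0000"))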
  19. LLMs for Materials: integrate a large language model into scientific research workflows. Daniil A. Boiko et al, Nature 624, 570 (2023)
  20. Accelerate Scientific Discovery: research can be broken down into a set of core tasks that can each benefit from acceleration (traditional research workflow). H. S. Stein and J. M. Gregoire, Chem. Sci. 10, 9640 (2019)
  21. Accelerate Scientific Discovery: research can be broken down into a set of core tasks that can each benefit from acceleration, with potential for speedup at each step. H. S. Stein and J. M. Gregoire, Chem. Sci. 10, 9640 (2019)
  22. Accelerate Scientific Discovery: workflow classification of published studies. H. S. Stein and J. M. Gregoire, Chem. Sci. 10, 9640 (2019)
  23. Automation and Robotics: execution of physical tasks to achieve a target using autonomous or collaborative robots. Industrial revolutions from https://transportgeography.org
  24. Automation and Robotics: robots can be tailored for a wide range of materials synthesis and characterisation tasks. B. P. MacLeod et al, Science Advances 6, eaaz8867 (2020)
  25. Automation and Robotics: self-driving labs (SDLs), such as the A-Lab, are now operating. N. J. Szymanski et al, Nature 624, 86 (2023)
  26. Automation and Robotics: robots can be equipped with sensors and artificial intelligence to interact with their environment, e.g. adapting computer vision models for laboratory settings (GT = ground truth, Pred = predicted). S. Eppel et al, ACS Central Science 6, 1743 (2020)
  27. Automation and Robotics: robots can be equipped with sensors and artificial intelligence to interact with their environment. https://www.youtube.com/watch?v=K7I2QJcIyBQ
  28. Automation and Robotics: automation platforms designed to deliver complex research workflows (fixed platform or mobile), usually a mix of proprietary code with a GUI and Python API for user control. Catalysis workflow from https://www.chemspeed.com. Digifab is a dedicated institute within Imperial College London: https://www.imperial.ac.uk/digital-molecular-design-and-fabrication/
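Scripted control of such a platform typically looks something like the following; every class and method name here is invented for illustration, since real vendor APIs are proprietary and platform-specific:

    class SynthesisPlatform:
        """Hypothetical stand-in for a vendor's Python API."""

        def dispense(self, reagent: str, volume_ml: float) -> None:
            print(f"Dispensing {volume_ml} mL of {reagent}")

        def anneal(self, temperature_c: float, minutes: int) -> None:
            print(f"Annealing at {temperature_c} C for {minutes} min")

        def measure_conductivity(self) -> float:
            return 42.0  # placeholder for a real measurement

    platform = SynthesisPlatform()
    platform.dispense("P3HT solution", 1.5)
    platform.dispense("CNT dispersion", 0.5)
    platform.anneal(80, 30)
    print(platform.measure_conductivity())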
  29. Optimisation: algorithms to efficiently achieve a desired research objective. Considerations: Objective function (O): materials properties or device performance criteria, e.g. battery lifetime. Parameter selection: variables that can be controlled, e.g. temperature, pressure, composition. Data acquisition: how the data is collected, e.g. instruments, measurements, automation.
  30. Optimisation Algorithms: local optimisation finds the best solution in a limited region of the parameter space (x). Gradient based: iterate in the direction of the steepest gradient (dO/dx), e.g. gradient descent. Hessian based: use information from the second derivatives (d²O/dx²), e.g. quasi-Newton. The same concepts were discussed for ML model training. (Figure: objective O against parameter x between x₁ and xₙ, marking a local minimum.)
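A minimal sketch of gradient descent on a toy one-dimensional objective, O(x) = (x - 2)², whose minimum sits at x = 2:

    def gradient_descent(x0, learning_rate=0.1, steps=50):
        """Iterate downhill along the steepest gradient dO/dx."""
        x = x0
        for _ in range(steps):
            grad = 2 * (x - 2)         # dO/dx for O(x) = (x - 2)**2
            x -= learning_rate * grad  # step against the gradient
        return x

    print(gradient_descent(x0=10.0))  # converges towards x = 2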
  31. Optimisation Algorithms: global optimisation finds the best solution from across the entire parameter space. Numerical: iterative techniques to explore parameter space, e.g. downhill simplex, simulated annealing. Probabilistic: incorporate probability distributions, e.g. Markov chain Monte Carlo, Bayesian optimisation. The same concepts were discussed for ML model training. (Figure: objective O against parameter x between x₁ and xₙ, marking the global minimum.)
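A minimal sketch of simulated annealing on a toy objective with two minima; the cooling schedule and step size are arbitrary choices:

    import math
    import random

    def simulated_annealing(objective, x0, steps=5000, t0=2.0, cooling=0.999):
        """Accept uphill moves with probability exp(-delta/T) so the
        search can escape local minima while the temperature is high."""
        rng = random.Random(0)
        x = best = x0
        t = t0
        for _ in range(steps):
            candidate = x + rng.gauss(0, 0.5)  # random local move
            delta = objective(candidate) - objective(x)
            if delta < 0 or rng.random() < math.exp(-delta / t):
                x = candidate
            if objective(x) < objective(best):
                best = x
            t *= cooling  # cool gradually
        return best

    def objective(x):
        # local minimum near x = +2; global minimum near x = -2
        return 0.1 * (x**2 - 4) ** 2 + 0.3 * x

    print(simulated_annealing(objective, x0=3.0))  # typically ends near -2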
  32. Bayesian Optimisation (BO): BO can use prior (measured or simulated) data to decide which experiment to perform next. Probabilistic (surrogate) model: approximation of the true objective function, O(x) ~ f(x), e.g. a Gaussian process GP(x,x') fitted to the known data. Acquisition function: selection of the next (new) sample point x' from the parameters to sample, e.g. upper confidence bound UCB(x') = μ(x') + κσ(x'), where μ(x') is the mean prediction and κσ(x') is the exploration term. J. Močkus, Optimisation Techniques 1, 400 (1974)
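A compact BO loop, sketched with scikit-learn's Gaussian process as the surrogate and the UCB acquisition defined above; the objective function is a hypothetical stand-in for a real experiment:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def objective(x):
        """Hypothetical stand-in for an experiment to be maximised."""
        return -(x - 0.6) ** 2 + 0.1 * np.sin(20 * x)

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(4, 1))  # prior (measured) data points
    y = objective(X).ravel()
    x_grid = np.linspace(0, 1, 200).reshape(-1, 1)
    kappa = 2.0  # exploration weight

    for _ in range(10):
        gp = GaussianProcessRegressor(kernel=RBF(0.1), normalize_y=True)
        gp.fit(X, y)  # surrogate model: O(x) ~ GP
        mu, sigma = gp.predict(x_grid, return_std=True)
        ucb = mu + kappa * sigma         # UCB(x') = mu(x') + kappa*sigma(x')
        x_next = x_grid[np.argmax(ucb)]  # next experiment to run
        X = np.vstack([X, [x_next]])
        y = np.append(y, objective(x_next))

    print(f"Best observed: x = {X[np.argmax(y)][0]:.3f}, O = {y.max():.3f}")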
  33. Bayesian Optimisation (BO): BO can use prior (measured or simulated) data to decide which experiment to perform next. Y. Wu, A. Walsh, A. M. Ganose, ChemRxiv (2023)
  34. Bayesian Optimisation (BO): application to maximise the electrical conductivity of a composite (P3HT-CNT) thin film. D. Bash et al, Adv. Funct. Mater. 31, 2102606 (2021)
  35. Bayesian Optimisation (BO): application to maximise the electrical conductivity of a composite (P3HT-CNT) thin film. D. Bash et al, Adv. Funct. Mater. 31, 2102606 (2021)
  36. Active Learning (AL): BO finds inputs that maximise the objective function; AL finds inputs that enhance model performance by targeting unknown regions with the largest epistemic uncertainty* (illustrated with posterior samples). Gaussian process: f(x) ~ GP(μ(x), k(x,x')), with mean function μ(x) and (Gaussian) kernel function k(x,x'). *Reducible uncertainty associated with lack of information.
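A minimal active-learning sketch, reusing the Gaussian process surrogate from the BO example but with an acquisition that targets the largest predictive uncertainty rather than the largest predicted objective; the ground-truth function is a hypothetical stand-in:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def ground_truth(x):
        """Hypothetical function the model is learning to approximate."""
        return np.sin(6 * x).ravel()

    rng = np.random.default_rng(1)
    X = rng.uniform(0, 1, size=(3, 1))
    y = ground_truth(X)
    x_pool = np.linspace(0, 1, 200).reshape(-1, 1)

    for _ in range(10):
        gp = GaussianProcessRegressor(kernel=RBF(0.2), normalize_y=True)
        gp.fit(X, y)
        _, sigma = gp.predict(x_pool, return_std=True)
        x_next = x_pool[np.argmax(sigma)]  # largest epistemic uncertainty
        X = np.vstack([X, [x_next]])
        y = np.append(y, ground_truth(x_next.reshape(1, -1)))

    print(f"Trained on {len(X)} points chosen to reduce model uncertainty")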
  37. Integrated Research Workflows: feedback loop between an optimisation model and automated experiments. NIMS-OS: R. Tamura, K. Tsuda, S. Matsuda, arXiv:2304.13927 (2023)
  38. Integrated Research Workflows: feedback loop between an optimisation model and automated experiments. NIMS-OS: R. Tamura, K. Tsuda, S. Matsuda, arXiv:2304.13927 (2023)
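A skeleton of such a feedback loop; both helper functions are invented placeholders (NIMS-OS and similar frameworks provide their own interfaces for the optimiser and robot sides):

    import random

    def propose_conditions(history):
        """Placeholder optimiser: random search. A real system would use,
        e.g., Bayesian optimisation over the accumulated history."""
        return {"temperature_c": random.uniform(100, 400)}

    def run_experiment(conditions):
        """Placeholder robot: a toy response peaking at 250 C stands in
        for an automated synthesis and measurement."""
        return -(conditions["temperature_c"] - 250) ** 2

    history = []
    for cycle in range(20):
        conditions = propose_conditions(history)  # optimisation model
        result = run_experiment(conditions)       # automated experiment
        history.append((conditions, result))      # close the loop

    best = max(history, key=lambda item: item[1])
    print(f"Best conditions so far: {best[0]}")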
  39. Obstacles to Closed-Loop Discovery:
     • Materials complexity (complex structures, compositions, processing sensitivity)
     • Data quality and reliability (errors and inconsistencies that waste resources)
     • Cost of automation (major investment required in infrastructure and training)
     • Adaptability (systems and workflows may be difficult to reconfigure for new problems)
  40. Class Outcomes: 1. Explain the foundations of large language models. 2. Assess the impact of AI on materials research and discovery. 3. Discuss potential biases and ethical considerations for these applications. Activity: closed-loop optimisation