Slide 1

Slide 1 text

History towards Universal Neural Network Potential for Material Discovery
Kosuke Nakago, Taku Watanabe
Preferred Computational Chemistry, Inc.

Slide 2

Slide 2 text

Motivation
• To accelerate materials discovery for a sustainable future.

Slide 9

Slide 9 text

Motivation
• To accelerate materials discovery for a sustainable future.
• Use atomistic simulation for materials discovery.
Figure sources (URLs truncated in the original): ticlehtml/2019/ee/c8ee02495b, in-li10gep2s12-sulfide-solid-electrolyte

Slide 10

Slide 10 text

Today’s Topic
• “Towards Universal Neural Network Potential for Material Discovery”
• Providing SaaS: “Matlantis” – Universal High-speed Atomistic Simulator

Slide 11

Slide 11 text

Today’s Topic
Universal Atomistic Simulator accelerates Material Discovery
• Reaction path analysis (NEB): C-O dissociation on Co+V catalyst
• Molecular Dynamics: thiol dynamics on Cu(111)
• Opt: fentanyl structure optimization

Slide 12

Slide 12 text

Today’s Topic
Understand “Towards Universal Neural Network Potential for Material Discovery”

Slide 13

Slide 13 text

Today’s Topic
Understand “Towards Universal Neural Network Potential for Material Discovery”
• 1st part introduces NNP research history

Slide 14

Slide 14 text

Today’s Topic
Understand “Towards Universal Neural Network Potential for Material Discovery”
• 2nd part explains how to create universal NNP

Slide 15

Slide 15 text

Table of Contents
• 1st part: NNP history
– What’s NNP
– Behler-Parrinello type MLP
– Graph Neural Network
• 2nd part: How to create “Universal” NNP, PFP
– PFP
  • PFP architecture
  • PFP data collection
– PFP case study (in other slides)

Slide 16

Slide 16 text

1st part: NNP history

Slide 17

Slide 17 text

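Neural Network Potential (NNP)
Goal: predict the energy $E$ of a given molecule from its atomic coordinates with a neural network.
→ Because the NN is differentiable, forces can be calculated by differentiating the energy: $\boldsymbol{F}_i = -\partial E / \partial \boldsymbol{r}_i$
[Figure: H2O with coordinates $\boldsymbol{r}_0 = (x_0, y_0, z_0)$ (O), $\boldsymbol{r}_1 = (x_1, y_1, z_1)$ (H), $\boldsymbol{r}_2 = (x_2, y_2, z_2)$ (H) → Neural Network → $E$]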
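
The relation between energy and forces can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_energy` is a hypothetical harmonic pair energy standing in for a trained network, and the derivative is taken by central finite differences, whereas a real NNP would typically use automatic differentiation.

```python
import math

def toy_energy(coords, k=1.0, r0=1.0):
    """Hypothetical stand-in for NN(coords): sum of 0.5*k*(r - r0)^2 over atom pairs."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            energy += 0.5 * k * (r - r0) ** 2
    return energy

def forces(energy_fn, coords, h=1e-5):
    """F_i = -dE/dr_i, estimated with central finite differences."""
    result = []
    for i in range(len(coords)):
        f_i = []
        for axis in range(3):
            plus = [list(c) for c in coords]
            minus = [list(c) for c in coords]
            plus[i][axis] += h
            minus[i][axis] -= h
            f_i.append(-(energy_fn(plus) - energy_fn(minus)) / (2 * h))
        result.append(f_i)
    return result

# Illustrative water-like geometry (values are assumptions, in arbitrary units).
water_like = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
f = forces(toy_energy, water_like)
```

Because the toy energy depends only on interatomic distances, the forces on all atoms sum to zero, which is a quick sanity check for any energy model of this kind.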

Slide 18

Slide 18 text

Neural Network Potential (NNP)
A. Normal supervised learning: predicts a physical property directly
B. NNP learns the internal calculation necessary for simulation
→ Once an NNP is trained, it can be used to calculate various physical properties. A separate database for each physical property is unnecessary.
[Figure: H2O atomic coordinates → (A) physical property (elastic constants, viscosity, etc.) directly; (B) Schrödinger Eq. → energy, forces → simulation → physical property]

Slide 19

Slide 19 text

NNP vs Quantum Chemistry Simulation
Pros:
• Fast: MUCH faster than quantum chemistry simulation (e.g., DFT)
Cons:
• Difficult to evaluate its accuracy
• Data collection is necessary
– A quantum chemistry simulation dataset is needed to train the NNP
– Accuracy must be evaluated when the inference data differ from the training data

Slide 20

Slide 20 text

Behler-Parrinello type: NNP Input - Descriptor
Can we input raw atomic coordinates, $E = f(x_0, y_0, \dots, z_2)$? → No. This does not satisfy basic physical laws:
・Translational invariance
・Rotational invariance
・Atom order permutation invariance
[Figure: H2O with coordinates $\boldsymbol{r}_0 = (x_0, y_0, z_0)$ (O), $\boldsymbol{r}_1 = (x_1, y_1, z_1)$ (H), $\boldsymbol{r}_2 = (x_2, y_2, z_2)$ (H) → Neural Network → $E$]

Slide 21

Slide 21 text

NNP Input - Descriptor
Instead of raw coordinate values, we input a “Descriptor” to the neural network.
What kind of descriptor can be made? E.g., the distance r between two atoms is translationally and rotationally invariant.
[Figure: H2O coordinates $\boldsymbol{r}_0, \boldsymbol{r}_1, \boldsymbol{r}_2$ → descriptors $\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2$ → Neural Network (Multi Layer Perceptron, MLP) → $E = f(\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2)$]
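
The invariance argument can be checked numerically. The sketch below uses a deliberately crude descriptor, the sorted list of pairwise distances (an assumption made for illustration, not the symmetry functions used in practice), and confirms that it is unchanged when the molecule is translated, rotated, and its atoms are reordered. All function names are hypothetical.

```python
import math

def sorted_distances(coords):
    """Toy descriptor: all pairwise distances, sorted so atom order does not matter."""
    n = len(coords)
    return sorted(math.dist(coords[i], coords[j])
                  for i in range(n) for j in range(i + 1, n))

def translate(coords, shift):
    """Shift every atom by the same vector."""
    return [tuple(c + s for c, s in zip(atom, shift)) for atom in coords]

def rotate_z(coords, angle):
    """Rotate every atom about the z axis by `angle` radians."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [(ca * x - sa * y, sa * x + ca * y, z) for x, y, z in coords]

mol = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
moved = rotate_z(translate(mol, (1.5, -2.0, 0.3)), 0.7)
shuffled = [moved[2], moved[0], moved[1]]  # permute the atom order

# Raw coordinates changed, but the descriptor did not.
same = all(math.isclose(a, b, abs_tol=1e-9)
           for a, b in zip(sorted_distances(mol), sorted_distances(shuffled)))
```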

Slide 22

Slide 22 text

NNP data collection
• The goal is to predict the energy of molecules with various coordinates
→ Calculate the energy by DFT while placing atoms randomly? → No.
• In reality, a molecule takes only low-energy coordinates
→ We want to accurately predict the energies of structures that occur in the real world.
Boltzmann distribution: probability ∝ $\exp(-E / k_B T)$
[Figure: H2O conformations; low energy: likely to occur; high energy: (almost) never occurs]
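
To make the Boltzmann argument concrete, the following sketch computes normalized weights proportional to $\exp(-E / k_B T)$ for a few conformer energies. The energies and the temperature are made-up illustrative values, and `boltzmann_weights` is a hypothetical helper name.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def boltzmann_weights(energies_ev, temperature_k):
    """Relative probabilities proportional to exp(-E / (k_B T)), normalized to 1."""
    e_min = min(energies_ev)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (K_B_EV_PER_K * temperature_k))
               for e in energies_ev]
    total = sum(factors)
    return [f / total for f in factors]

# Hypothetical conformer energies relative to the minimum, in eV.
weights = boltzmann_weights([0.0, 0.05, 0.5], temperature_k=300.0)
```

At room temperature the conformer that lies 0.5 eV above the minimum receives a negligible weight, which is why randomly placed atoms make a poor training set.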

Slide 23

Slide 23 text

ANI-1 Dataset creation
“ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules”
• A subset of the GDB-11 database (molecules containing up to 11 C, N, O, F atoms) is used
– Limited to C, N, O
– Max 8 heavy atoms
• Normal Mode Sampling (NMS): various conformations are generated from one molecule by vibration.
[Figure labels: rdkit, MMFF94, Gaussian09 default method]

Slide 24

Slide 24 text

ANI-1: Results
“ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost”
• Energy prediction on various conformations
– It reproduces DFT results well compared to DFTB and PM6 (conventional semi-empirical methods)
• Molecules larger than those in the training data can also be predicted
[Figure: one-dimensional potential surface scan]

Slide 25

Slide 25 text

Graph Neural Network (GNN)
• A neural network that accepts a “graph” as input and learns how the data is connected
• Graph: consists of vertices v and edges e
– Social network (SNS connection graph), citation network, product network
– Protein-protein association network
– Organic molecules, etc.
Various applications!
[Figure: example graph with vertices $v_0, \dots, v_4$ and edges $e_{01}, e_{12}, e_{24}, e_{34}, e_{23}$]

Slide 26

Slide 26 text

Graph Neural Network (GNN)
• Image convolution → Graph convolution
• Also called Graph Convolution Network or Message Passing Neural Network
[Figure: CNN (image convolution): image classification (cat, dog, …); GNN (graph convolution): physical property (energy = 1.2 eV, …)]

Slide 27

Slide 27 text

GNN architecture
• Similar to a CNN, graph convolution layers are stacked to create a deep neural network
[Figure: input as “graph” → Graph Conv ×4 (features are updated in graph format) → output predicted value for each atom (e.g., energy) → Sum → output prediction for the whole molecule (e.g., energy)]
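
The stacked structure can be sketched with a toy example. The "graph convolution" here is just neighbor averaging of one scalar per node, an assumption chosen to keep the code short; real layers carry learnable weights and nonlinearities. The example graph reuses the five vertices and edges from the earlier GNN figure, and the feature values are made up.

```python
def graph_conv(features, edges):
    """One toy graph convolution: each node averages itself with its neighbors."""
    neighbors = {i: [i] for i in range(len(features))}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return [sum(features[j] for j in neighbors[i]) / len(neighbors[i])
            for i in range(len(features))]

def predict_total(features, edges, n_layers=4):
    """Stack graph convolutions, then sum per-atom outputs into one value."""
    h = list(features)
    for _ in range(n_layers):
        h = graph_conv(h, edges)
    per_atom = h                      # per-atom prediction (e.g., atomic energy)
    return per_atom, sum(per_atom)    # total prediction for the molecule

features = [1.0, 0.0, 2.0, 0.5, 1.5]                # hypothetical node features
edges = [(0, 1), (1, 2), (2, 4), (3, 4), (2, 3)]     # e01, e12, e24, e34, e23
per_atom, total = predict_total(features, edges)
```

Because every step treats nodes symmetrically and the final step is a sum, relabeling the atoms leaves the total unchanged.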

Slide 28

Slide 28 text

Feature for each node (atom)
A feature is assigned to each node.
Atom | Atom type (one-hot: C, N, O) | Atomic number | Chirality
C | 1.0 0.0 0.0 | 6.0 | 1.0
N | 0.0 1.0 0.0 | 7.0 | 1.0
O | 0.0 0.0 1.0 | 8.0 | 1.0
Molecular Graph Convolutions: Moving Beyond Fingerprints. Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley. arXiv:1603.00856
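
A node feature of this kind can be assembled as follows. This is a minimal sketch: `node_feature` is a hypothetical helper, and the chirality entry is a placeholder value copied from the slide rather than something computed from the structure.

```python
ATOM_TYPES = ["C", "N", "O"]
ATOMIC_NUMBER = {"C": 6.0, "N": 7.0, "O": 8.0}

def node_feature(symbol, chirality=1.0):
    """Concatenate a one-hot atom type, the atomic number, and a chirality flag."""
    one_hot = [1.0 if symbol == t else 0.0 for t in ATOM_TYPES]
    return one_hot + [ATOMIC_NUMBER[symbol], chirality]

features = [node_feature(s) for s in ["C", "N", "O"]]
```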

Slide 29

Slide 29 text

GNN for molecules, crystals
• Applicable to molecules
→ Various GNN architectures have been proposed since the late 2010s, drawing big attention to deep learning research for molecules.
– NFP, GGNN, MPNN, GWM, etc.
• Then applied to positional data and crystal data (with periodic boundary conditions)
– SchNet, CGCNN, MEGNet, Cormorant, DimeNet, PhysNet, EGNN, TeaNet, etc.
NFP: “Convolutional Networks on Graphs for Learning Molecular Fingerprints”
GWM: “Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis”
CGCNN: “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties”

Slide 30

Slide 30 text

SchNet
• Applies continuous-filter convolution (cfconv) to the distance r between each atom pair, so it can deal with atom positions. The distance is expanded with an RBF kernel.
“SchNet: A continuous-filter convolutional neural network for modeling quantum interactions”
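
The RBF expansion of an interatomic distance can be sketched as below. The Gaussian form $\exp(-\gamma (r - \mu_k)^2)$ follows the usual description of SchNet, but the center grid and the value of `gamma` are illustrative assumptions, not the published hyperparameters, and `rbf_expand` is a hypothetical name.

```python
import math

def rbf_expand(distance, centers, gamma):
    """Expand a scalar distance into Gaussian radial basis features."""
    return [math.exp(-gamma * (distance - mu) ** 2) for mu in centers]

centers = [0.5 * k for k in range(13)]   # 0.0, 0.5, ..., 6.0 (illustrative grid)
features = rbf_expand(0.96, centers, gamma=10.0)
```

The resulting vector is a smooth function of the distance, which lets the network build filters that vary continuously with atomic positions instead of relying on a fixed grid.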

Slide 31

Slide 31 text

GNN application with periodic boundary condition (pbc)
• CGCNN proposes how to construct a “graph” for systems with pbc.
• MEGNet reports application to both isolated systems (molecules) and pbc systems (crystals).
CGCNN: “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties”
MEGNet: “Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals”
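
One way to picture graph construction under pbc is a neighbor search that also visits periodic images. The sketch below is not the CGCNN implementation; it assumes an orthorhombic cell, fractional coordinates wrapped into [0, 1), and a cutoff no larger than the shortest cell edge, so that only the 27 nearest images need to be searched. `pbc_edges` is a hypothetical name.

```python
import itertools
import math

def pbc_edges(frac_coords, cell_lengths, cutoff):
    """Edges (i, j, image_shift, distance) for an orthorhombic periodic cell.

    Only the 27 nearest images are searched, so this assumes
    cutoff <= min(cell_lengths).
    """
    edges = []
    for i, fi in enumerate(frac_coords):
        for j, fj in enumerate(frac_coords):
            for shift in itertools.product((-1, 0, 1), repeat=3):
                if i == j and shift == (0, 0, 0):
                    continue  # skip the atom paired with itself
                delta = [(fj[a] + shift[a] - fi[a]) * cell_lengths[a]
                         for a in range(3)]
                dist = math.sqrt(sum(d * d for d in delta))
                if dist <= cutoff:
                    edges.append((i, j, shift, dist))
    return edges

# One atom in a 3 Å cubic cell (simple cubic lattice).
edges = pbc_edges([(0.0, 0.0, 0.0)], (3.0, 3.0, 3.0), cutoff=3.0)
```

Even a single atom gets edges here, because it is connected to its own periodic images; this is what distinguishes a crystal graph from a molecular graph.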

Slide 32

Slide 32 text

GNN approach: Summary
Improvements in neural network architecture bring the following advantages:
• A human-tuned descriptor is not necessary
– It is learned automatically inside the GNN
• Generalization across element species
– The input dimension does not increase even when atomic species are added → avoids combinatorial explosion
– Generalization to elements with little (or even no) data
• Accuracy, training efficiency
– Increased network representation power, possibly higher accuracy
– An appropriate constraint (inductive bias) makes NN training easier

Slide 33

Slide 33 text

Deep learning ~ trending ~
• 2012: AlexNet won ILSVRC (efficient use of GPUs)
• With the progress of GPU power, NNs became deeper and bigger
Year | CNN | Depth | # of parameters
2012 | AlexNet | 8 layers | 62.0M
2014 | GoogLeNet | 22 layers | 6.4M
2015 | ResNet | 110 layers (max 1202!) | 60.3M
GoogLeNet: “Going deeper with convolutions”
ResNet: “Deep Residual Learning for Image Recognition”

Slide 34

Slide 34 text

Deep learning ~ trending ~
• Dataset size in the computer vision area
– Grows exponentially; one human cannot view this amount in a lifetime → models start to learn collective intelligence…
– The “pre-training → fine-tuning for a specific task” workflow becomes the trend
Dataset | Data size | # of classes
MNIST | 60k | 10
CIFAR-100 | 60k | 100
ImageNet | 1.3M | 1,000
ImageNet-21k | 14M | 21,000
JFT-300M | 300M (Google, not open) | 18,000

Slide 35

Slide 35 text

“Universal” Neural Network Potential?
• This history of deep learning technology leads to one challenging idea…
– NNP formulation, proof of conformation generalization: ANI family research
– Support for various elements: GNN node embedding
– Dealing with crystals (with pbc): graph construction for pbc systems
– Big data training: success in the CV/NLP fields, DL trend
→ Universal NNP R&D started!!
Goal: to support various elements, isolated/pbc systems, and various conformations. All use cases.

Slide 36

Slide 36 text

2nd part: How to create “Universal” NNP, PFP

Slide 37

Slide 37 text

PFP
• “Universal” Neural Network Potential developed by Preferred Networks and ENEOS
• Stands for “PreFerred Potential”
• Matlantis
– SaaS product that packages PFP and various physical property calculation libraries
– Sold by Preferred Computational Chemistry (PFCC)

Slide 38

Slide 38 text

PFP
• Architecture
• Dataset

Slide 39

Slide 39 text

TeaNet
• PFP is developed based on the TeaNet work
• TeaNet is a GNN that updates scalar, vector, and tensor features internally
– The formulation idea comes from a classical force field potential (EAM)

Slide 40

Slide 40 text

TeaNet
• Physical meaning of using the “tensor” feature: the tensor is related to a classical force field called the Tersoff potential
[Figure: Tersoff potential]

Slide 41

Slide 41 text

PFP
• Several improvements over TeaNet, made through more than 2 years of research (details in the paper)
• The GNN edge cutoff is set to at most 6 Å
– 5 layers with different cutoff lengths [3, 3, 4, 6, 6] (in Å)
– → In total, atoms within a 22 Å range can be connected
– The GNN part can be calculated in O(N)
• The energy surface is designed to be smooth (infinitely differentiable)
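
The 22 Å figure follows from adding up the per-layer cutoffs: in each layer, information can travel at most one cutoff length, so after five layers an atom can be influenced by atoms up to the summed distance away. A short check of that arithmetic (variable names are illustrative):

```python
layer_cutoffs_angstrom = [3, 3, 4, 6, 6]  # per-layer cutoffs quoted on the slide

# Information hops at most one cutoff per layer, so the reach is the sum.
receptive_field = sum(layer_cutoffs_angstrom)
```

The O(N) cost is consistent with this design: each layer only looks at neighbors within a fixed cutoff, so at bounded atomic density the work per atom does not grow with system size.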

Slide 42

Slide 42 text

PFP architecture
• Evaluation of PFP performance
• Experiment results: OC20 dataset
– Note: this is not a rigorous comparison, since the data are not completely the same

Slide 43

Slide 43 text

PFP Dataset
• To achieve universality, the dataset is collected from various structures
– Molecule
– Bulk
– Slab
– Cluster
– Adsorption (Slab+Molecule)
– Disordered

Slide 44

Slide 44 text

TeaNet: Disordered structure
• Dataset: disordered structures under periodic boundary conditions
• Generated using classical MD, or MD driven by the NNP during its training phase
Example structures taken from the TeaNet paper:
[Figure: cycle of Train NNP, MD on trained NNP, Dataset collection]

Slide 45

Slide 45 text

PFP Dataset
• PFN’s in-house cluster is extensively utilized
– Data collection with MN-Cluster & ABCI
– PFP v4.0.0 used 1650 GPU years of computing resources

Slide 46

Slide 46 text

PFP Dataset
• To achieve universality, the dataset is collected from various structures

Slide 47

Slide 47 text

PFP Dataset
• The latest PFP v4.0 (released in 2023) is applicable to 72 elements
– v0.0 supported 45 elements

Slide 48

Slide 48 text

Summary
• NNP can be used to calculate energy much faster than quantum calculation
• Quality of data is important for a good model
– Data versatility
– Quantum calculation quality/accuracy
• PFP is a “universal” NNP which can handle various structures/applications
• Applications
– Energy, force calculation
– Structure optimization
– Reaction pathway analysis, activation energy
– Molecular Dynamics
– IR spectrum

Slide 49

Slide 49 text

Applications

Slide 50

Slide 50 text

Applications

Slide 51

Slide 51 text

Use Case 1: Renewable energy synthetic fuel catalyst
• Search for an effective Fischer-Tropsch (FT) catalyst that accelerates C-O dissociation
• High-throughput screening of promoters → revealed that doping V into Co accelerates the dissociation process
[Figures: C-O dissociation on Co+V catalyst; reaction producing fuel (C5+) from H2, CO; effect of promoters on activation energy; comparison of activation energy with “Activation energies of methanation reactions of synthesis gas on Co(0001).”]

Slide 52

Slide 52 text

Use Case 2: Grain boundary energy of elemental metals
[Figure: Al Σ5 [100](0-21), 38 atoms]
H. Zheng et al., Acta Materialia, 186, 40 (2020)

Slide 53

Slide 53 text

Use Case 3: Li-ion battery
• Li diffusion activation energy calculation on LiFeSO4F, along each of the a, b, c directions
– Consists of various elements
– Good agreement with DFT results
[Figure: diffusion paths for the [111], [101], [100] directions]

Slide 54

Slide 54 text

Use Case 4: Metal-organic frameworks
• Water molecule binding energy on the metal-organic framework MOF-74
– Metal elements combined with organic molecules
– With Grimme’s D3 correction, the result matches existing work

Slide 55

Slide 55 text

Demonstration

Slide 56

Slide 56 text

Foundation Models, Generative AI, LLM
• Many foundation models are surprising the world: Stable Diffusion, ChatGPT…
• The model provider cannot extract all the potential of the foundation model
– We want users to explore & find “new value”
[Figure: Foundation Model → Application 1, Application 2, Application 3, … and more!?]

Slide 57

Slide 57 text

Matlantis
• The model provider does not know the full capability of PFP, the universal NNP
– Various knowledge can be obtained by utilizing the model
– We hope that someone will win a Nobel Prize for new materials discovery by utilizing the PFP system
[Figure: PFP → Structural Relaxation, Reaction Analysis, Molecular Dynamics, … and more!!]

Slide 58

Slide 58 text

MRS 2023 Fall Exhibition

Slide 59

Slide 59 text

MRS 2023 Fall Meeting
• We are presenting 6 oral talks & posters at MRS 2023 Fall
• Symposium: [DS06] Integrating Machine Learning with Simulations for Accelerated Materials Modeling
Date | Title | Presenter | Presentation
Nov 27 PM | Applicability of Universal Neural Network Potential to Organic Polymer Materials | Hiroki Iriguchi | Poster
Nov 28 AM | Investigation of Phase Stability and Ionic Conductivity of Solid Electrolytes Li10MP2S12-xOx (M = Ge, Si, or Sn) with Universal Neural Network Potential | Chikashi Shinagawa | Oral
Nov 28 PM | Neural Network Potential for Arbitrary Combination of 72 Elements Trained Against Large Scale Dataset | So Takamoto | Oral
Nov 28 PM | Absorption and Dynamics of Gas Molecules in Metal-Organic Frameworks: Application of a Universal Neural Network Potential | Taku Watanabe | Oral
Nov 29 AM | Analysis of Monolayer to Bilayer Silicene Transformation in CaSi2Fx(x<1) using Universal Neural Network Potential | Akihiro Nagoya | Oral
Nov 29 AM | Efficient Crystal Structure Prediction using Universal Neural Network Potential and Genetic Algorithm | Takuya Shibayama | Oral

Slide 60

Slide 60 text

Links
• PFP related papers
– “Towards universal neural network potential for material discovery applicable to arbitrary combination of 45 elements”
– “Towards universal neural network interatomic potential”

Slide 61

Slide 61 text

Follow us
• Twitter account
• GitHub
• YouTube channel
• Slideshare account
• Official website

Slide 62

Slide 62 text

Appendix

Slide 63

Slide 63 text

NNP Tutorial review: Neural Network intro 1
“Constructing high‐dimensional neural network potentials: A tutorial review”
A linear transform followed by a nonlinear transform is applied in each layer, to express various functions: $E = f(\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2)$

Slide 64

Slide 64 text

NNP Tutorial review: Neural Network intro 2
“Constructing high‐dimensional neural network potentials: A tutorial review”
An NN can learn a more correct functional form as the amount of data increases.
• When data are few, the predicted value has high variance and is not trustworthy
• When data are sufficient, the variance becomes small

Slide 65

Slide 65 text

NNP Tutorial review: Neural Network intro 3
“Constructing high‐dimensional neural network potentials: A tutorial review”
Careful evaluation is necessary to check whether the NN works well only on the training data.
• Underfit: NN representation power is not enough; it cannot express the true target function
• Overfit: NN representation power is too strong; it fits the training data but does not work well at other points

Slide 66

Slide 66 text

BPNN: Behler-Parrinello Symmetry function
“Constructing high‐dimensional neural network potentials: A tutorial review”
AEV (Atomic Environment Vector): describes information about a specific atom’s surrounding environment
Rc: cutoff radius
1. Radial symmetry functions represent the 2-body term (distance): how many atoms exist within radius Rc of the center atom i
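
A radial symmetry function of the Behler-Parrinello type can be sketched as a Gaussian-weighted neighbor count, $G_i = \sum_{j \neq i} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij})$, with a cosine cutoff $f_c$. The parameter values below (`eta`, `rs`, `rc`) and the geometry are illustrative assumptions, and the function names are hypothetical.

```python
import math

def cutoff_fn(r, rc):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = rc."""
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0) if r < rc else 0.0

def radial_symmetry(i, coords, eta, rs, rc):
    """G_i = sum over j != i of exp(-eta * (R_ij - rs)^2) * fc(R_ij)."""
    total = 0.0
    for j, pos in enumerate(coords):
        if j == i:
            continue
        r_ij = math.dist(coords[i], pos)
        total += math.exp(-eta * (r_ij - rs) ** 2) * cutoff_fn(r_ij, rc)
    return total

mol = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
g = radial_symmetry(0, mol, eta=4.0, rs=1.0, rc=6.0)
```

Atoms beyond Rc contribute exactly zero, so the descriptor only encodes the local environment of atom i.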

Slide 67

Slide 67 text

BPNN: Behler-Parrinello Symmetry function
“Constructing high‐dimensional neural network potentials: A tutorial review”
AEV (Atomic Environment Vector): describes information about a specific atom’s surrounding environment
Rc: cutoff radius
2. Angular symmetry functions represent the 3-body term (angle): within the ball of radius Rc around the center atom i, at what relative positions (angles) do atoms j and k sit?
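
The angular term can be sketched in the same spirit, using the commonly quoted form $G_i = 2^{1-\zeta} \sum_{j \neq i} \sum_{k \neq i, j} (1 + \lambda \cos\theta_{ijk})^{\zeta} e^{-\eta (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)} f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk})$. Parameter values and names are illustrative assumptions, and the loops are written for clarity rather than speed.

```python
import math

def cutoff_fn(r, rc):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = rc."""
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0) if r < rc else 0.0

def angular_symmetry(i, coords, eta, zeta, lam, rc):
    """Three-body descriptor of the environment of atom i (angles j-i-k)."""
    total = 0.0
    n = len(coords)
    for j in range(n):
        for k in range(n):
            if j == i or k == i or j == k:
                continue
            r_ij = math.dist(coords[i], coords[j])
            r_ik = math.dist(coords[i], coords[k])
            r_jk = math.dist(coords[j], coords[k])
            dot = sum((a - c) * (b - c)
                      for a, b, c in zip(coords[j], coords[k], coords[i]))
            # Clamp to [-1, 1] to guard against floating-point overshoot.
            cos_theta = max(-1.0, min(1.0, dot / (r_ij * r_ik)))
            total += ((1.0 + lam * cos_theta) ** zeta
                      * math.exp(-eta * (r_ij ** 2 + r_ik ** 2 + r_jk ** 2))
                      * cutoff_fn(r_ij, rc) * cutoff_fn(r_ik, rc)
                      * cutoff_fn(r_jk, rc))
    return 2.0 ** (1.0 - zeta) * total

mol = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
g_ang = angular_symmetry(0, mol, eta=0.1, zeta=2.0, lam=1.0, rc=6.0)
```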

Slide 68

Slide 68 text

BPNN: Neural Network architecture
“Constructing high‐dimensional neural network potentials: A tutorial review”
Problems of a normal MLP:
・Fixed number of atoms
 – Zero-vector padding is necessary
 – Cannot predict for more atoms than seen in training
・No invariance under atom order permutation
Proposed approach:
・Predict the atomic energy for each atom separately, and sum them up to obtain the final energy Es
・A different NN is trained for each element (O, H)
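
The proposed decomposition can be sketched as a sum of per-atom energies, where each atom is routed to the model of its own element. The lambdas below are hypothetical stand-ins for trained per-element networks, and the descriptor values are made up.

```python
def total_energy(symbols, descriptors, element_models):
    """E_s = sum_i E_i, where each atom uses the model of its own element."""
    return sum(element_models[s](g) for s, g in zip(symbols, descriptors))

# Hypothetical stand-ins for trained per-element networks (one per element).
element_models = {
    "O": lambda g: -2.0 + 0.1 * sum(g),
    "H": lambda g: -0.5 + 0.2 * sum(g),
}
symbols = ["O", "H", "H"]
descriptors = [[0.3, 0.1], [0.2, 0.4], [0.2, 0.5]]
energy = total_energy(symbols, descriptors, element_models)
```

Because the total is a plain sum over atoms, it does not depend on atom order, and the same models can be applied to a system with more atoms than any training example.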

Slide 69

Slide 69 text

ANI-1 & ANI-1 Dataset: Summary
“ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost”
• For small molecules consisting of H, C, N, O in various conformations, we can create an NNP that predicts DFT energy well
– Massive training data creation: 20 million data points
Issues
• Adding another element (F, S, etc.)
– A different NN is necessary for each element
– The input descriptor dimension increases on the order of N^2 (N: number of element species)
• The necessary training data may scale with this order too
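
The N^2 growth comes from the angular part of the descriptor, which needs one block per unordered pair of neighbor elements, while the radial part needs one block per element. A rough count under assumed block sizes (`n_radial` and `n_angular` are illustrative defaults, and `aev_length` is a hypothetical helper):

```python
def aev_length(n_elements, n_radial=32, n_angular=64):
    """Descriptor length: one radial block per element, one angular block per element pair."""
    radial = n_elements * n_radial
    pairs = n_elements * (n_elements + 1) // 2  # unordered pairs, including same-element
    return radial + pairs * n_angular

lengths = {n: aev_length(n) for n in (4, 6, 10)}
```

Going from 4 to 10 elements multiplies the descriptor length by five in this count, which illustrates why element-agnostic GNN embeddings are attractive for a universal potential.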

Slide 70

Slide 70 text

GNN architecture (general)
• Similar to a CNN, graph convolution layers are stacked to create a deep neural network
[Figure: input as “graph” → Graph Conv ×4 (features are updated in graph format) → Graph Readout (graph → vector) → Linear → Linear (update vector) → output prediction]

Slide 71

Slide 71 text

Graph Readout: feature calculation for total graph (molecule)
Collect the calculated node features to obtain a graph-wise feature.
Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)

Slide 72

Slide 72 text

PFP architecture
• PFP performance evaluation on the PFP benchmark dataset
– Confirmed that TeaNet (the PFP base model) achieves the best performance

Slide 73

Slide 73 text

PFP Dataset
• Calculation conditions for the MOLECULE and CRYSTAL datasets
• PFP is jointly trained with the 3 datasets below
Dataset name | PFP MOLECULE | PFP CRYSTAL, PFP CRYSTAL_U0 | OC20
Software | Gaussian | VASP | VASP
xc/basis | ωB97xd/6-31G(d) | GGA-PBE | GGA-RPBE
Option | Unrestricted DFT | PAW pseudopotentials; cutoff energy 520 eV; U parameter ON/OFF; spin polarization ON | PAW pseudopotentials; cutoff energy 350 eV; U parameter OFF; spin polarization OFF

Slide 74

Slide 74 text

Application: Nano Particle
• “Calculations of Real-System Nanoparticles Using Universal Neural Network Potential PFP”
• PFP can even calculate high entropy alloys (HEA), which contain various metals
• Large systems are difficult to calculate with DFT; multiple elements are difficult to support with classical potentials

Slide 75

Slide 75 text

OC20, OC22 introduction

Slide 76

Slide 76 text

Open Catalyst 2020
• Motivation: New catalyst development for renewable energy storage
• Overview Paper:
– Solar and wind energy storage is crucial to overcome global warming
– Why do hydroelectricity or batteries not suffice?
  • Energy storage does not scale

Slide 77

Slide 77 text

Open Catalyst 2020
• Motivation: New catalyst development for renewable energy storage
• Overview Paper:
– Solar and wind energy can be stored in the form of hydrogen or methane
– Improving the hydrogen and methane reaction processes is the key to renewable energy storage

Slide 78

Slide 78 text

Open Catalyst 2020
• Catalyst: a substance that promotes a specific reaction without itself being changed
• Dataset Paper: technical details of dataset collection
[Figure: bottom pink atoms: metal surface = catalyst; molecule on top = reactants]

Slide 79

Slide 79 text

Open Catalyst 2020
• Combination of various molecules on various metals
• It covers the main reactions related to renewable energy
• Data size: 130M!

Slide 80

Slide 80 text

Open Catalyst 2022
• Subsequent work focuses on Oxygen Evolution Reaction (OER) catalysts
• 9.8M dataset

Slide 81

Slide 81 text