History towards Universal Neural Network Potential for Material Discovery

Matlantis
October 04, 2023

■Abstract
The rapid advancements of Artificial Intelligence technology have brought about revolutionary changes in materials discovery.

A Neural Network Potential (NNP) represents the force field used in molecular dynamics with a neural network, so that many physical properties can be simulated with a single network. This webinar reviews the history of NNP research to show how datasets and neural network architectures have improved.

We also describe the effort to develop a universal NNP and introduce the “PreFerred Potential (PFP)” implemented in Matlantis.

■Speaker
Preferred Computational Chemistry, Inc.
Kosuke Nakago, Taku Watanabe

Transcript

  1. History towards Universal Neural Network Potential for Material Discovery

    Kosuke Nakago, Taku Watanabe (Preferred Computational Chemistry, Inc.)
  2. Motivation

    • To accelerate materials discovery for a sustainable future.
    https://pubs.rsc.org/en/content/articlehtml/2019/ee/c8ee02495b
  4. Motivation

    • To accelerate materials discovery for a sustainable future.
    https://pubs.rsc.org/en/content/articlehtml/2019/ee/c8ee02495b
    https://matlantis.com/calculation/li-diffusion-in-li10gep2s12-sulfide-solid-electrolyte
  6. Motivation

    • To accelerate materials discovery for a sustainable future.
    https://matlantis.com/calculation/silicon-tma-tel
    https://pubs.rsc.org/en/content/articlehtml/2019/ee/c8ee02495b
    https://matlantis.com/calculation/li-diffusion-in-li10gep2s12-sulfide-solid-electrolyte
  7. Motivation

    • To accelerate materials discovery for a sustainable future.
    • Use atomistic simulation for materials discovery
    https://matlantis.com/calculation/silicon-tma-tel
    https://pubs.rsc.org/en/content/articlehtml/2019/ee/c8ee02495b
    https://matlantis.com/calculation/li-diffusion-in-li10gep2s12-sulfide-solid-electrolyte
  8. Today’s Topic

    • “Towards Universal Neural Network Potential for Material Discovery”
      https://www.nature.com/articles/s41467-022-30687-9
    • Providing SaaS: “Matlantis” – Universal High-speed Atomistic Simulator
      https://matlantis.com/
  9. Today’s Topic

    Universal Atomistic Simulator accelerates Material Discovery
    • Reaction path analysis (NEB): C-O dissociation on Co+V catalyst
    • Molecular Dynamics: thiol dynamics on Cu(111)
    • Opt: fentanyl structure optimization
  10. Today’s Topic

    Understand “Towards Universal Neural Network Potential for Material Discovery”: the 1st part introduces NNP research history.
  11. Today’s Topic

    Understand “Towards Universal Neural Network Potential for Material Discovery”: the 2nd part explains how to create a universal NNP.
  12. Table of Contents

    • 1st part: NNP history
      – What’s NNP
      – Behler-Parrinello type MLP
      – Graph Neural Network
    • 2nd part: How to create a “Universal” NNP, PFP
      – PFP
        • PFP architecture
        • PFP data collection
      – PFP case study (in other slides)
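  13. Neural Network Potential (NNP)

    Goal: predict the energy $E$ of a given molecule from its atomic coordinates with a neural network.
    Example: an H2O molecule (O, H, H) with coordinates $\boldsymbol{r}_0 = (x_0, y_0, z_0)$, $\boldsymbol{r}_1 = (x_1, y_1, z_1)$, $\boldsymbol{r}_2 = (x_2, y_2, z_2)$ → Neural Network → $E$
    → Because the NN is differentiable, forces can be calculated by differentiating the energy: $\boldsymbol{F}_i = -\partial E / \partial \boldsymbol{r}_i$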
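The relation between energy and forces on this slide can be sketched numerically. The example below is a minimal illustration, not the method used in any real NNP: `toy_energy` is a hypothetical stand-in for a trained network (a harmonic pair energy), and the derivative is taken by central finite differences, whereas an actual NNP would use automatic differentiation of the network.

```python
import math

def toy_energy(coords, r0=1.0, k=1.0):
    """Hypothetical stand-in for a trained NNP: harmonic energy over all atom pairs."""
    energy = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            energy += 0.5 * k * (r - r0) ** 2
    return energy

def forces(energy_fn, coords, h=1e-5):
    """Approximate F_i = -dE/dr_i with central finite differences."""
    result = []
    for i in range(len(coords)):
        f_i = []
        for axis in range(3):
            plus = [list(c) for c in coords]
            minus = [list(c) for c in coords]
            plus[i][axis] += h
            minus[i][axis] -= h
            f_i.append(-(energy_fn(plus) - energy_fn(minus)) / (2 * h))
        result.append(f_i)
    return result

# Illustrative O, H, H coordinates (not an optimized geometry)
water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
water_forces = forces(toy_energy, water)
```

Because the toy energy depends only on interatomic distances, the forces on all atoms sum to zero, which is a quick sanity check for any energy model of this kind.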
  14. Neural Network Potential (NNP)

    A. Normal supervised learning: predicts a physical property (elastic constants, viscosity, etc.) directly
    B. NNP: learns the internal calculation necessary for simulation, i.e. the energy and forces obtained from the Schrödinger equation; physical properties are then obtained by simulation
    → After an NNP is trained, it can be used to calculate various physical properties! A database for each physical property is unnecessary.
  15. NNP vs Quantum Chemistry Simulation

    Pros: Fast
    • MUCH faster than quantum chemistry simulation (e.g. DFT)
    Cons:
    • Difficult to evaluate its accuracy
    • Data collection is necessary
      – A quantum chemistry simulation dataset is necessary for training an NNP
      – Accuracy needs to be evaluated when inference data differs from training data
    Figure from https://pubs.rsc.org/en/content/articlelanding/2017/sc/c6sc05720a#!divAbstract
  16. Behler-Parrinello type: NNP Input - Descriptor

    Input the raw atomic coordinates, $E = f(x_0, y_0, \ldots, z_2)$? → NG! It does not satisfy basic physical laws:
    ・Translational invariance
    ・Rotational invariance
    ・Atom order permutation invariance
  17. NNP Input - Descriptor

    Instead of raw coordinate values, we input a “Descriptor” to the neural network (a Multi Layer Perceptron, MLP): $E = f(\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2)$, where $\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2$ are the descriptors.
    What kind of descriptor can be made? Ex. the distance r between 2 atoms is translationally and rotationally invariant.
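The invariance argument can be checked with a few lines of code. This is a minimal sketch under simplifying assumptions: the hypothetical `pair_distances` helper uses the sorted list of all interatomic distances as the descriptor and ignores element types, which a practical descriptor would not do.

```python
import math

def pair_distances(coords):
    """Sorted list of all interatomic distances: a simple invariant descriptor."""
    n = len(coords)
    return sorted(math.dist(coords[i], coords[j])
                  for i in range(n) for j in range(i + 1, n))

def rotate_z(p, angle):
    """Rotate a point about the z axis."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def translate(p, shift):
    """Shift a point by a constant vector."""
    return tuple(a + b for a, b in zip(p, shift))

water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
# Rotate, translate, then reorder the atoms
moved = [translate(rotate_z(p, 0.7), (1.5, -2.0, 0.3)) for p in water]
permuted = [moved[2], moved[0], moved[1]]

d_original = pair_distances(water)
d_moved = pair_distances(permuted)
```

The raw coordinates in `permuted` differ from those in `water`, yet `d_original` and `d_moved` agree to floating-point precision, so a network fed with distances sees the same input for physically identical structures.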
  18. NNP data collection

    • The goal is to predict the energy of molecules with various coordinates
      → Calculate energy by DFT with randomly placed atoms? → NG
    • In reality, a molecule takes only low-energy coordinates, following the Boltzmann distribution $\exp(-E / k_B T)$
      → We want to accurately predict the energy of structures that occur in the real world.
    Figure: H2O conformations. Low energy: likely to occur. High energy: (almost) never occurs.
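To make the Boltzmann argument concrete, the sketch below computes normalized weights $\exp(-E / k_B T)$ for a few relative energies. The energy values and the temperature are made-up illustrative numbers, not data from the talk.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def boltzmann_weights(energies_ev, temperature_k):
    """Normalized exp(-E / (k_B T)) weights, shifted by the minimum energy for stability."""
    e_min = min(energies_ev)
    raw = [math.exp(-(e - e_min) / (K_B_EV_PER_K * temperature_k))
           for e in energies_ev]
    total = sum(raw)
    return [w / total for w in raw]

# Hypothetical relative energies (eV) of four conformations
relative_energies = [0.0, 0.05, 0.5, 3.0]
weights = boltzmann_weights(relative_energies, 300.0)
```

At 300 K, $k_B T$ is about 0.026 eV, so a conformation a few eV above the minimum has a vanishingly small weight. This is why randomly placed atoms make poor training data.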
  19. ANI-1 Dataset creation

    “ANI-1, A data set of 20 million calculated off-equilibrium conformations for organic molecules” https://www.nature.com/articles/sdata2017193
    • A subset of the GDB-11 database (molecules which contain up to 11 C, N, O, F atoms) is used
      – Limited to C, N, O
      – Max 8 heavy atoms
    • Normal Mode Sampling (NMS): various conformations are generated from one molecule by vibration.
    Tools shown on the slide: RDKit, MMFF94, Gaussian09 (default method)
  20. ANI-1: Results

    “ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost” https://pubs.rsc.org/en/content/articlelanding/2017/sc/c6sc05720a#!divAbstract
    • Energy prediction on various conformations
      – It reproduces DFT results well compared to DFTB and PM6 (conventional semi-empirical methods)
    • Molecules bigger than those in the training data can be predicted
    Figure: one-dimensional potential surface scan
  21. Graph Neural Network (GNN)

    • Neural network which accepts “graph” input; it learns how the data is connected
    • Graph: consists of vertices v and edges e
      – Social Network (SNS connection graph), Citation Network, Product Network
      – Protein-Protein Association Network
      – Organic molecules etc…
    Various applications!
    Figure: example graph with vertices $v_0, v_1, v_2, v_3, v_4$ and edges $e_{01}, e_{12}, e_{23}, e_{24}, e_{34}$
  22. Graph Neural Network (GNN)

    • Image convolution → Graph convolution
    • Also called Graph Convolution Network, Message Passing Neural Network
    CNN (image convolution): image classification (cat, dog…)
    GNN (graph convolution): physical property (Energy = 1.2 eV…)
  23. GNN architecture

    • Similar to CNN, Graph Convolution layers are stacked to create a Deep Neural Network
    Input as “Graph” → Graph Conv → Graph Conv → Graph Conv → Graph Conv → Sum
    • Features are updated in the graph format
    • Output a predicted value for each atom (e.g., energy)
    • Sum: output the prediction for the total molecule (e.g., energy)
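The stacked-convolution-then-sum structure can be sketched with scalar node features. This is only an illustration of the data flow: the update rule, the weights `w_self`, `w_nbr`, `w_out`, and the input features are hypothetical and untrained, and real GNNs use learned vector features.

```python
import math

def graph_conv(features, neighbors, w_self=0.5, w_nbr=0.25):
    """One toy graph convolution: mix each node's feature with the sum over its neighbors."""
    return [math.tanh(w_self * features[i]
                      + w_nbr * sum(features[j] for j in neighbors[i]))
            for i in range(len(features))]

def predict_energy(features, neighbors, n_layers=4, w_out=1.0):
    """Stack graph convolutions, read out per-atom values, then sum over atoms."""
    h = list(features)
    for _ in range(n_layers):
        h = graph_conv(h, neighbors)
    atomic = [w_out * x for x in h]
    return atomic, sum(atomic)

# Water as a graph: node 0 = O bonded to nodes 1 and 2 = H (made-up input features)
features = [0.8, 0.1, 0.1]
neighbors = {0: [1, 2], 1: [0], 2: [0]}
atomic_e, total_e = predict_energy(features, neighbors)

# Same molecule with atoms relabeled as H, O, H
features_p = [0.1, 0.8, 0.1]
neighbors_p = {0: [1], 1: [0, 2], 2: [1]}
atomic_p, total_p = predict_energy(features_p, neighbors_p)
```

The per-atom outputs follow the relabeling, while the summed total is unchanged. The sum readout is what makes the molecular prediction independent of atom ordering and of the number of atoms.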
  24. Feature for each node (atom)

    A feature is assigned to each node:
    Atom | atom type (one-hot: C, N, O) | atomic number | chirality
    C | 1.0 0.0 0.0 | 6.0 | 1.0
    N | 0.0 1.0 0.0 | 7.0 | 1.0
    O | 0.0 0.0 1.0 | 8.0 | 1.0
    “Molecular Graph Convolutions: Moving Beyond Fingerprints”, Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley, arXiv:1603.00856
  25. GNN for molecules, crystals

    • Applicable to molecules → Various GNN architectures have been proposed since the late 2010s, and deep learning research for molecules has attracted much attention.
      – NFP, GGNN, MPNN, GWM etc…
    • Then, applied to positional data and crystal data (with periodic condition)
      – SchNet, CGCNN, MEGNet, Cormorant, DimeNet, PhysNet, EGNN, TeaNet etc…
    NFP: “Convolutional Networks on Graph for Learning Molecular Fingerprints” https://arxiv.org/abs/1509.09292
    GWM: “Graph Warp Module: an Auxiliary Module for Boosting the Power of Graph Neural Networks in Molecular Graph Analysis” https://arxiv.org/pdf/1902.01020.pdf
    CGCNN: “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties” https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.120.145301
  26. SchNet

    • Applies continuous-filter convolution (cfconv) to the distance r between each atom pair, so the network can deal with atomic positions
    • The distance is expanded with an RBF kernel
    “SchNet: A continuous-filter convolutional neural network for modeling quantum interactions” https://arxiv.org/abs/1706.08566
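A Gaussian radial basis expansion of an interatomic distance, of the kind used as input to continuous-filter layers, can be sketched as follows. The grid range, spacing, and width `gamma` are illustrative assumptions chosen to keep the output small; they are not the settings of the SchNet paper.

```python
import math

def rbf_expand(distance, r_max=6.0, spacing=0.5, gamma=10.0):
    """Expand a scalar distance into Gaussian features centered on a regular grid."""
    n_centers = int(round(r_max / spacing)) + 1
    centers = [k * spacing for k in range(n_centers)]
    return [math.exp(-gamma * (distance - mu) ** 2) for mu in centers]

# Expand an illustrative O-H distance of 0.96 (same length unit as the grid)
rbf_features = rbf_expand(0.96)
```

The expansion turns one number into a smooth feature vector that peaks at the grid center nearest to the distance. A filter-generating network can then map this vector to convolution weights that vary continuously with r.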
  27. GNN application with periodic boundary condition (pbc)

    • CGCNN proposes how to construct a “graph” for systems with pbc.
    • MEGNet reports application to both isolated systems (molecules) and pbc systems (crystals)
    CGCNN: “Crystal Graph Convolutional Neural Networks for an Accurate and Interpretable Prediction of Material Properties” https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.120.145301
    MEGNet: “Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals” https://pubs.acs.org/doi/10.1021/acs.chemmater.9b01294
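One way to build graph edges under periodic boundary conditions is to connect each atom to all periodic images of other atoms (and of itself) within a cutoff. The sketch below is a simplified illustration restricted to a cubic cell with a brute-force image search; the function name `periodic_edges` and its arguments are hypothetical and do not reproduce the CGCNN implementation.

```python
import itertools
import math

def periodic_edges(frac_coords, cell_length, cutoff):
    """Edges (i, j, image_shift, distance) for a cubic cell, including periodic images."""
    n_images = math.ceil(cutoff / cell_length)
    shifts = list(itertools.product(range(-n_images, n_images + 1), repeat=3))
    edges = []
    for i, fi in enumerate(frac_coords):
        for j, fj in enumerate(frac_coords):
            for shift in shifts:
                if i == j and shift == (0, 0, 0):
                    continue  # skip the atom itself, but keep its periodic images
                delta = [(fj[a] + shift[a] - fi[a]) * cell_length for a in range(3)]
                d = math.sqrt(sum(x * x for x in delta))
                if d <= cutoff:
                    edges.append((i, j, shift, d))
    return edges

# Simple cubic lattice: one atom per cell, lattice constant 3.0, cutoff 3.5
edges = periodic_edges([(0.0, 0.0, 0.0)], cell_length=3.0, cutoff=3.5)
```

For this single-atom cell the function returns six edges, all connecting the atom to its own nearest images, so the same pair of nodes can carry several edges distinguished by the image shift.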
  28. GNN approach: Summary

    With improvements in the neural network architecture, we gain the following advantages:
    • A human-tuned descriptor is not necessary
      – It is automatically learned inside the GNN
    • Generalization to element species
      – The input dimension does not increase even when we add atomic species → It avoids combinatorial explosion
      – Generalization to elements with little (or even no) data
    • Accuracy, training efficiency
      – Increased network representation power, possibly higher accuracy
      – Appropriate constraints (inductive bias) make NN training easier
  29. Deep learning ~ trending ~

    • 2012: AlexNet won ILSVRC (efficiently used GPUs)
    • With the progress of GPU power, NNs become deeper and bigger
    Year | CNN | Depth | # of parameters
    2012 | AlexNet | 8 layers | 62.0M
    2014 | GoogleNet | 22 layers | 6.4M
    2015 | ResNet | 110 layers (max 1202!) | 60.3M
    GoogleNet “Going deeper with convolutions”: https://arxiv.org/pdf/1409.4842.pdf
    ResNet “Deep Residual Learning for Image Recognition”: https://arxiv.org/pdf/1512.03385.pdf
    https://towardsdatascience.com/the-w3h-of-alexnet-vggnet-resnet-and-inception-7baaaecccc96
  30. Deep learning ~ trending ~

    • Dataset size in the computer vision area
      – Grows exponentially; one human cannot look at this amount in a lifetime → Starts to learn collective intelligence…
      – The “Pre-training → Fine-tuning for a specific task” workflow becomes the trend
    Dataset | Data size | # of classes
    MNIST | 60k | 10
    CIFAR-100 | 60k | 100
    ImageNet | 1.3M | 1,000
    ImageNet-21k | 14M | 21,000
    JFT-300M | 300M (Google, not open) | 18,000
  31. “Universal” Neural Network Potential?

    • This history of deep learning technology leads to one challenging idea…
      – NNP formulation, proof of conformation generalization ↓ ANI family research
      – Support various elements ↓ GNN node embedding
      – Deal with crystals (with pbc) ↓ Graph construction for pbc systems
      – Big data training ↓ Success in the CV/NLP fields, DL trend
    → Universal NNP R&D started!!
    Goal: to support various elements, isolated/pbc systems, and various conformations. All use cases.
  32. PFP

    • “Universal” Neural Network Potential developed by Preferred Networks and ENEOS
    • Stands for “PreFerred Potential”
    • Matlantis
      – SaaS product which packages PFP and various physical property calculation libraries
      – Sold by Preferred Computational Chemistry (PFCC)
  33. TeaNet

    • PFP is developed based on the TeaNet work
    • TeaNet is a GNN which updates scalar, vector and tensor features internally
      – The formulation idea comes from a classical force field (EAM)
    https://arxiv.org/pdf/1912.01398.pdf
  34. TeaNet

    • Physical meaning of using a “tensor” feature: the tensor is related to the classical force field called the Tersoff potential
    https://arxiv.org/pdf/1912.01398.pdf
  35. PFP

    • Several improvements on TeaNet, through more than 2 years of research (details in the paper)
    • The maximum GNN edge cutoff is 6 Å
      – 5 layers with different cutoff lengths [3, 3, 4, 6, 6] Å
      – → In total, atoms within a 22 Å range can be connected
      – The GNN part can be calculated in O(N)
    • The energy surface is designed to be smooth (infinitely differentiable)
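The 22 Å figure follows from adding up the per-layer cutoffs: each message-passing layer lets information travel at most one cutoff length further. A short check of that arithmetic, using only the cutoff list quoted on the slide:

```python
from itertools import accumulate

# Per-layer edge cutoffs in angstrom, as listed on the slide
layer_cutoffs = [3.0, 3.0, 4.0, 6.0, 6.0]

# Maximum distance over which information can propagate after each layer
receptive_field = list(accumulate(layer_cutoffs))
max_range = receptive_field[-1]
```

Here `receptive_field` is the running total after each of the five layers and `max_range` is 22.0. Using short cutoffs in the early layers keeps the number of edges small while the stacked layers still reach a long range.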
  36. PFP architecture

    • Evaluation of PFP performance
    • Experiment results: OC20 dataset
      – ※Not a rigorous comparison, since the data is not completely the same
    https://arxiv.org/pdf/2106.14583.pdf
  37. PFP Dataset

    • To achieve universality, the dataset is collected with various structures
      – Molecule
      – Bulk
      – Slab
      – Cluster
      – Adsorption (Slab+Molecule)
      – Disordered
    https://arxiv.org/pdf/2106.14583.pdf
  38. TeaNet: Disordered structure

    • Dataset: disordered structures under periodic boundary conditions
    • Generated using classical MD, or MD run with the NNP that is still being trained
      – Loop: Train NNP → MD on trained NNP → Dataset collection
    Example structures taken from the TeaNet paper. https://arxiv.org/pdf/2106.14583.pdf
  39. PFP Dataset

    • PFN’s in-house cluster is extensively utilized
      – Data collection with MN-Cluster & ABCI
      – PFP v4.0.0 used 1650 GPU-years of computing resources
  40. PFP Dataset

    • To achieve universality, the dataset is collected with various structures
    https://arxiv.org/pdf/2106.14583.pdf
  41. PFP Dataset

    • The latest PFP v4.0 (released in 2023) is applicable to 72 elements
      – v0.0 supported 45 elements
  42. Summary

    • NNP can be used to calculate energy much faster than quantum calculation
    • Quality of data is important for a good model
      – Data versatility
      – Quantum calculation quality/accuracy
    • PFP is a “universal” NNP which can handle various structures/applications
    • Applications
      – Energy, force calculation
      – Structure optimization
      – Reaction pathway analysis, activation energy
      – Molecular Dynamics
      – IR spectrum
    https://matlantis.com/product
  43. Use Case 1: Renewable energy synthetic fuel catalyst

    • Search for an effective FT catalyst that accelerates C-O dissociation
    • High-throughput screening of promoters → Revealed that doping V into Co accelerates the dissociation process
    Figures: C-O dissociation on Co+V catalyst; reaction of fuel (C5+) from H2, CO; effect of promoters on activation energy; activation energies of methanation reactions of synthesis gas on Co(0001); comparison of activation energy
  44. Use Case 2: Grain boundary energy of elemental metals

    Example: Al Σ5 [100](0-21), 38 atoms
    H. Zheng et al., Acta Materialia, 186, 40 (2020) https://materialsvirtuallab.org/2020/01/grain-boundary-database/
  45. Use Case 3: Li-ion battery

    • Li diffusion activation energy calculation on LiFeSO4F, along each of the a, b, c directions
      – Consists of various elements
      – Good agreement with DFT results
    Figure: diffusion paths for the [111], [101], [100] directions
  46. Use Case 4: Metal-organic frameworks

    • Water molecule binding energy on the metal-organic framework MOF-74
      – Metal elements combined with organic molecules
      – Results match existing work when Grimme’s D3 correction is applied
  47. Foundation Models, Generative AI, LLM

    Foundation Model → Application 1, Application 2, Application 3, … and more!?
    • Many foundation models are surprising the world: Stable Diffusion, ChatGPT…
    • The model provider cannot extract all the potential of the foundation model
      – We want users to explore and find “new value”
  48. Matlantis

    PFP → Structural Relaxation, Reaction Analysis, Molecular Dynamics, … and more!!
    • The model provider does not know the full capability of PFP, the universal NNP
      – Various knowledge can be obtained by utilizing the model
      – We hope that someone wins a Nobel Prize for new materials discovery made with the PFP system
  49. MRS 2023 Fall Meeting

    • We are presenting 6 oral talks & posters at the MRS 2023 Fall Meeting
    • Symposium: [DS06] Integrating Machine Learning with Simulations for Accelerated Materials Modeling
    Date | Title | Presenter | Presentation
    Nov 27 PM | Applicability of Universal Neural Network Potential to Organic Polymer Materials | Hiroki Iriguchi | Poster
    Nov 28 AM | Investigation of Phase Stability and Ionic Conductivity of Solid Electrolytes Li10MP2S12-xOx (M = Ge, Si, or Sn) with Universal Neural Network Potential | Chikashi Shinagawa | Oral
    Nov 28 PM | Neural Network Potential for Arbitrary Combination of 72 Elements Trained Against Large Scale Dataset | So Takamoto | Oral
    Nov 28 PM | Absorption and Dynamics of Gas Molecules in Metal-Organic Frameworks: Application of a Universal Neural Network Potential | Taku Watanabe | Oral
    Nov 29 AM | Analysis of Monolayer to Bilayer Silicene Transformation in CaSi2Fx(x<1) using Universal Neural Network Potential | Akihiro Nagoya | Oral
    Nov 29 AM | Efficient Crystal Structure Prediction using Universal Neural Network Potential and Genetic Algorithm | Takuya Shibayama | Oral
  50. Links

    • PFP related papers
      – “Towards universal neural network potential for material discovery applicable to arbitrary combination of 45 elements” https://www.nature.com/articles/s41467-022-30687-9
      – “Towards universal neural network interatomic potential” https://doi.org/10.1016/j.jmat.2022.12.007
  51. Follow us

    • Twitter account: https://twitter.com/matlantis_en
    • GitHub: https://github.com/matlantis-pfcc
    • YouTube channel: https://www.youtube.com/c/Matlantis
    • Slideshare account: https://www.slideshare.net/matlantis
    • Official website: https://matlantis.com/
  52. NNP Tutorial review: Neural Network intro 1

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    A linear transform followed by a nonlinear transform is applied in each layer, which lets the network express various functions: $E = f(\boldsymbol{G}_0, \boldsymbol{G}_1, \boldsymbol{G}_2)$
  53. NNP Tutorial review: Neural Network intro 2

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    An NN can learn a more correct functional form as the data increases.
    • With little data, the predicted values have a large variance and are not trustworthy
    • With enough data, the variance can be small
  54. NNP Tutorial review: Neural Network intro 3

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    Careful evaluation is necessary to check whether the NN works well only on the training data.
    • Underfit: the NN’s representation power is not enough, so it cannot express the true target function
    • Overfit: the NN’s representation power is too strong, so it fits the training data but does not work well at other points
  55. BPNN: Behler-Parrinello Symmetry function

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    AEV: the Atomic Environment Vector describes information about a specific atom’s surrounding environment. Rc: cutoff radius
    1. Radial symmetry functions represent the 2-body term (distance): how many atoms exist within the radius Rc of the center atom i
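A radial symmetry function of the Behler-Parrinello type can be written as $G_i^{\mathrm{rad}} = \sum_{j \neq i} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij})$ with the cosine cutoff $f_c(R) = \tfrac{1}{2}\left[\cos(\pi R / R_c) + 1\right]$ for $R \le R_c$ and 0 otherwise. The sketch below implements this form; the parameter values and the water-like coordinates are illustrative assumptions.

```python
import math

def cutoff_fn(r, r_c):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_g(coords, i, eta, r_s, r_c):
    """Radial symmetry function of atom i: Gaussian-weighted neighbor count within r_c."""
    total = 0.0
    for j, pos in enumerate(coords):
        if j == i:
            continue
        r_ij = math.dist(coords[i], pos)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff_fn(r_ij, r_c)
    return total

water = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
g_o = radial_g(water, 0, eta=4.0, r_s=1.0, r_c=6.0)

# An extra atom placed beyond the cutoff radius does not change the descriptor
g_o_far = radial_g(water + [(10.0, 0.0, 0.0)], 0, eta=4.0, r_s=1.0, r_c=6.0)
```

Evaluating the function for several values of `eta` and `r_s` gives one descriptor component per parameter set, which together form the radial part of the atomic environment vector.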
  56. BPNN: Behler-Parrinello Symmetry function

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    AEV: the Atomic Environment Vector describes information about a specific atom’s surrounding environment. Rc: cutoff radius
    2. Angular symmetry functions represent the 3-body term (angle): within the ball of radius Rc around the center atom i, in what positional relation (angle) do atoms j and k exist?
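One common form of the angular symmetry function is $G_i^{\mathrm{ang}} = 2^{1-\zeta} \sum_{j \neq i} \sum_{k \neq i, j} (1 + \lambda \cos\theta_{ijk})^{\zeta}\, e^{-\eta (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)}\, f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk})$, where $\theta_{ijk}$ is the angle at atom i. The sketch below follows that form with illustrative parameters; other variants exist, so treat it as an example rather than a reference implementation.

```python
import math

def cutoff_fn(r, r_c):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r > r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def angular_g(coords, i, eta, zeta, lam, r_c):
    """Angular symmetry function of atom i, summed over neighbor pairs (j, k)."""
    total = 0.0
    n = len(coords)
    for j in range(n):
        for k in range(n):
            if j == i or k == i or j == k:
                continue
            r_ij = math.dist(coords[i], coords[j])
            r_ik = math.dist(coords[i], coords[k])
            r_jk = math.dist(coords[j], coords[k])
            v_ij = [b - a for a, b in zip(coords[i], coords[j])]
            v_ik = [b - a for a, b in zip(coords[i], coords[k])]
            cos_theta = sum(a * b for a, b in zip(v_ij, v_ik)) / (r_ij * r_ik)
            cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
            total += ((1.0 + lam * cos_theta) ** zeta
                      * math.exp(-eta * (r_ij ** 2 + r_ik ** 2 + r_jk ** 2))
                      * cutoff_fn(r_ij, r_c) * cutoff_fn(r_ik, r_c)
                      * cutoff_fn(r_jk, r_c))
    return 2.0 ** (1.0 - zeta) * total

bent = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
linear = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
g_bent = angular_g(bent, 0, eta=0.1, zeta=1.0, lam=1.0, r_c=6.0)
g_linear = angular_g(linear, 0, eta=0.1, zeta=1.0, lam=1.0, r_c=6.0)
```

With `lam=1.0` the angular factor vanishes at 180°, so the linear arrangement gives zero while the bent one does not. This shows how the function distinguishes structures that share the same center-to-neighbor distances but differ in angle.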
  57. BPNN: Neural Network architecture

    “Constructing high‐dimensional neural network potentials: A tutorial review” https://onlinelibrary.wiley.com/doi/full/10.1002/qua.24890
    Problems of a normal MLP:
    ・Fixed number of atoms
      ー Zero-vector padding is necessary
      ー Cannot predict systems with more atoms than in training
    ・No invariance to atom order permutation
    Proposed approach:
    ・Predict the atomic energy of each atom separately, and sum them up to obtain the final energy Es
    ・A different NN is trained for each element (O, H)
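The proposed approach can be sketched as a sum of per-atom energies, each produced by a model selected by element. The one-parameter-pair "networks" in `ELEMENT_PARAMS` and the descriptor values below are hypothetical placeholders for trained per-element MLPs and real symmetry-function vectors.

```python
import math

# Hypothetical tiny per-element models: one (weight, bias) pair per element
ELEMENT_PARAMS = {"O": (0.7, -1.2), "H": (0.3, -0.4)}

def atomic_energy(element, descriptor):
    """Energy contribution of one atom from its descriptor, using its element's model."""
    w, b = ELEMENT_PARAMS[element]
    return math.tanh(w * sum(descriptor) + b)

def total_energy(elements, descriptors):
    """Total energy Es as the sum of atomic energies; works for any number of atoms."""
    return sum(atomic_energy(e, g) for e, g in zip(elements, descriptors))

# Made-up descriptor vectors for O, H, H
g_o, g_h1, g_h2 = [0.9, 0.4], [0.5, 0.1], [0.5, 0.2]
e_ohh = total_energy(["O", "H", "H"], [g_o, g_h1, g_h2])
e_hoh = total_energy(["H", "O", "H"], [g_h2, g_o, g_h1])

# The same per-element models apply to a larger system without retraining the architecture
e_dimer = total_energy(["O", "H", "H", "O", "H", "H"],
                       [g_o, g_h1, g_h2, g_o, g_h1, g_h2])
```

Reordering the atoms leaves the total unchanged, and the atom count is not fixed by the architecture, which addresses both problems listed for the plain MLP.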
  58. ANI-1 & ANI-1 Dataset: Summary

    “ANI-1: an extensible neural network potential with DFT accuracy at force field computational cost” https://pubs.rsc.org/en/content/articlelanding/2017/sc/c6sc05720a#!divAbstract
    • For small molecules which consist of H, C, N, O in various conformations, we can create an NNP that predicts the DFT energy well
      – Massive training data creation: 20 million data points
    Issues
    • Adding another element (F, S etc.)
      – A different NN is necessary for each element
      – The input descriptor dimension increases on the order of N^2
        • The necessary training data may scale with this order too
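The N^2 growth can be illustrated by counting descriptor blocks: a radial block is needed per neighbor element, and an angular block per unordered pair of neighbor elements, N(N+1)/2 in total. The block sizes `n_radial` and `n_angular` below are illustrative assumptions, and `aev_length` is a hypothetical helper, not a function of any library.

```python
def aev_length(n_elements, n_radial=32, n_angular=64):
    """Descriptor length: one radial block per element, one angular block per element pair."""
    n_pairs = n_elements * (n_elements + 1) // 2
    return n_elements * n_radial + n_pairs * n_angular

# Descriptor length for 4, 6, 10 and 72 supported elements
lengths = {n: aev_length(n) for n in (4, 6, 10, 72)}
```

Going from 4 to 72 elements multiplies the number of angular blocks from 10 to 2628, which is the combinatorial explosion that element embeddings in GNNs avoid.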
  59. GNN architecture (general)

    • Similar to CNN, Graph Convolution layers are stacked to create a Deep Neural Network
    Input as “Graph” → Graph Conv → Graph Conv → Graph Conv → Graph Conv → Graph Readout → Linear → Linear → Output prediction
    • Graph Conv: features are updated in the graph format
    • Graph Readout: graph → vector
    • Linear: update the vector
  60. Graph Readout: feature calculation for the total graph (molecule)

    Collect the calculated node features and obtain a graph-wise feature
    Han Altae-Tran, Bharath Ramsundar, Aneesh S. Pappu, & Vijay Pande (2017). Low Data Drug Discovery with One-Shot Learning. ACS Cent. Sci., 3 (4)
  61. PFP architecture

    • PFP performance evaluation on the PFP benchmark dataset
      – Confirmed that TeaNet (PFP base model) achieves the best performance
    https://arxiv.org/pdf/2106.14583.pdf
  62. PFP Dataset

    • Calculation conditions for the MOLECULE and CRYSTAL datasets
    • PFP is jointly trained with the 3 datasets below
    Dataset name | PFP MOLECULE | PFP CRYSTAL, PFP CRYSTAL_U0 | OC20
    Software | Gaussian | VASP | VASP
    xc/basis | ωB97xd/6-31G(d) | GGA-PBE | GGA-RPBE
    Option | Unrestricted DFT | PAW pseudopotentials, cutoff energy 520 eV, U parameter ON/OFF, spin polarization ON | PAW pseudopotentials, cutoff energy 350 eV, U parameter OFF, spin polarization OFF
  63. Application: Nano Particle

    • “Calculations of Real-System Nanoparticles Using Universal Neural Network Potential PFP” https://arxiv.org/abs/2107.00963
    • PFP can even calculate high entropy alloys (HEA), which contain various metals
    • Large systems are difficult to calculate with DFT, and multiple elements are difficult to support with classical potentials
  64. Open Catalyst 2020

    • Motivation: new catalyst development for renewable energy storage
    • Overview Paper:
      – Storage of solar and wind power is crucial to overcome global warming
      – Why do hydroelectricity or batteries not suffice?
        • Energy storage does not scale
    https://arxiv.org/pdf/2010.09435.pdf
  65. Open Catalyst 2020

    • Motivation: new catalyst development for renewable energy storage
    • Overview Paper:
      – Solar and wind energy can be stored in the form of hydrogen or methane
      – Improving the hydrogen and methane reaction processes is the key to renewable energy storage
    https://arxiv.org/pdf/2010.09435.pdf
  66. Open Catalyst 2020

    • Catalyst: a substance that promotes a specific reaction. The catalyst itself does not change.
    • Dataset Paper: technical details of the dataset collection
    Figure: the pink atoms at the bottom are the metal surface (catalyst); the molecule on top is the reactant
    https://arxiv.org/pdf/2010.09435.pdf https://opencatalystproject.org/
  67. Open Catalyst 2020

    • Combination of various molecules on various metals
    • It covers the main reactions related to renewable energy
    • Data size: 130M!
    https://arxiv.org/pdf/2010.09435.pdf
  68. Open Catalyst 2022

    • Subsequent work focuses on Oxygen Evolution Reaction (OER) catalysts
    • Dataset size: 9.8M
    https://arxiv.org/abs/2206.08917