Machine Learning for Materials (Lecture 1)

Aron Walsh
January 22, 2024


Transcript

  1. New Era of Materials Research

    The research toolkit for materials science includes powerful data-driven statistical models. Source: Agrawal and Choudhary, APL Materials 4, 053208 (2016)
  2. Computer Revolution

    Analytical Engine: automated calculations, Charles Babbage (1837). “The science of operations has its own truth and value”, Ada Lovelace (1840). Multiply two 20-digit numbers in ~3 minutes. Chris Hendon (now: University of Oregon), Keith Butler (now: STFC/SciML)
  3. Computer Revolution

    “System on a chip” microprocessor, from https://www.apple.com
  4. Computer Revolution

    “System on a chip” microprocessor, from https://www.apple.com
  5. Exascale Supercomputing

    Exascale computing refers to 10^18 floating point operations per second; https://top500.org
  6. Powerful Statistical Techniques

    Using GPT-3 via https://github.com/hwchase17/langchain. Answers provided included transition metal oxides (V2O5), Chevrel phases (Mo6S8), Prussian blues (Fe4[Fe(CN)6]3)
  7. Efficient Research Workflows

    Integration of computational techniques to accelerate discovery & development cycles. Source: J. P. Correa-Baena et al., Joule 2, 1410 (2018)
  8. Course Contents

    1. Course Introduction
    2. Materials Modelling
    3. Machine Learning Basics
    4. Materials Data and Representations
    5. Classical Learning
    6. Artificial Neural Networks
    7. Building a Model from Scratch
    8. Recent Advances in AI
    9. and 10. Research Challenge

    A dense course, with time for self-study to explore concepts further
  9. What is Machine Learning (ML)?

    Statistical algorithms that learn from training data and build a model to make predictions

    Data types: materials features can be binary (e.g. stability), categorical (e.g. symmetry), integer (e.g. stoichiometry), continuous (e.g. rate)

    Learning types: unsupervised (identify patterns), supervised (use patterns), reinforcement (maximise reward)
  10. What is Machine Learning (ML)?

    Statistical algorithms that identify and use patterns in multi-dimensional datasets

    Image from “How Machines Learn” by Helen Edwards
  11. What is Machine Learning (ML)?

    Statistical algorithms that operate on multi-dimensional arrays of numerical data

    [Figure: arrays of increasing dimension, labelled x, x_i, x_ij, x_ijk. Image from http://karlstratos.com; note the physical definitions are more nuanced]
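  12. What is Machine Learning (ML)?

    Statistical algorithms that operate on multi-dimensional arrays of numerical data

    $$\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}}_{3 \times 1\ \text{matrix}} = \underbrace{\begin{pmatrix} x_{11} & x_{12} & x_{13} & x_{14} & x_{15} \\ x_{21} & x_{22} & x_{23} & x_{24} & x_{25} \\ x_{31} & x_{32} & x_{33} & x_{34} & x_{35} \end{pmatrix}}_{3 \times 5\ \text{matrix}} \underbrace{\begin{pmatrix} g_1 \\ g_2 \\ g_3 \\ g_4 \\ g_5 \end{pmatrix}}_{5 \times 1\ \text{matrix}}$$

    Image from “How Machines Learn” by Helen Edwards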
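
The matrix shapes on slide 12 (a 3×1 result from a 3×5 matrix acting on a 5×1 vector) can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `matvec` and all numbers are made up for illustration and do not come from the lecture. In practice, NumPy's `@` operator performs the same product on arrays.

```python
def matvec(X, g):
    """Multiply an m x n matrix (a list of rows) by a length-n vector."""
    if any(len(row) != len(g) for row in X):
        raise ValueError("each row of X must have the same length as g")
    # y_i = sum_j x_ij * g_j
    return [sum(x_ij * g_j for x_ij, g_j in zip(row, g)) for row in X]


# Illustrative values only: 3 samples (rows) described by 5 features (columns)
X = [
    [1, 0, 2, 0, 1],
    [0, 1, 0, 3, 0],
    [2, 2, 1, 0, 1],
]
g = [1, 2, 3, 4, 5]  # 5 x 1 vector

y = matvec(X, g)  # 3 x 1 result, one value per row of X
print(y)
```

Each entry of `y` is the weighted sum of one row of `X`, which is why the inner dimensions (5 and 5) must match while the outer ones (3 and 1) set the shape of the result.
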
  13. Why Machine Learning (ML)?

    Many problems are difficult to solve using standard techniques, e.g. combinatorial explosions

    Non-deterministic polynomial hard (NP-hard): a challenging class of computational problems, where finding an efficient solution remains an open and difficult task

    Travelling salesman: find the shortest route that visits each city once and returns home

    Fast Marching Method: J. Andrews and J. A. Sethian, PNAS 104, 1118 (2007)
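
To make the travelling salesman example on slide 13 concrete, the sketch below solves a tiny instance by exhaustive search and then prints how fast the number of candidate orderings grows. It is not the Fast Marching Method cited on the slide and not an ML approach; the city coordinates and the function name `shortest_tour` are hypothetical, chosen only so the answer is easy to verify by hand.

```python
from itertools import permutations
from math import dist, factorial


def shortest_tour(cities):
    """Exhaustively search closed tours that start and end at cities[0]."""
    home, *rest = cities
    best_route, best_length = None, float("inf")
    for order in permutations(rest):
        route = (home, *order, home)
        # Total Euclidean length of the closed route
        length = sum(dist(a, b) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length


# Hypothetical cities on the corners of a unit square
cities = [(0, 0), (0, 1), (1, 1), (1, 0)]
route, length = shortest_tour(cities)
print(route, length)

# With the start city fixed, there are (n - 1)! orderings to test
for n in (5, 10, 20):
    print(n, factorial(n - 1))
```

For the unit square the best tour is the perimeter, with length 4. The loop at the end shows why this approach fails quickly: 20 cities already require about 1.2×10^17 orderings.
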
  14. Why Machine Learning (ML)?

    Many problems are difficult to solve using standard techniques, e.g. combinatorial explosions

    Some relevant challenges in materials science:

    • Reaction engineering: navigate configurational space of reactants & products
    • Crystal structure prediction: find the optimal 3D structure(s) for a given composition
    • Materials design: achieve target functionality within chemical constraints
  15. Why Machine Learning (ML)?

    Solid-solutions are used to control structure and properties, e.g. (1-x)ZnO + (x)ZnS → ZnO1-xSx. ML techniques can be used to sample and model this massive configurational space

    A wurtzite crystal with a partially occupied anion site. Number of configurations for ZnO0.5S0.5, with N mixed sites in a supercell model:

    $$\frac{N!}{\left(\frac{N}{2}\right)!\,\left(\frac{N}{2}\right)!}$$

    N = 16: 12,870
    N = 32: 6×10^8
    N = 64: 1.8×10^18
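
The configuration counts quoted on slide 15 follow from N!/((N/2)!(N/2)!), the number of ways to place N/2 sulfur atoms on N anion sites. The short sketch below reproduces them; the function name `n_configurations` is an illustrative choice, and the count includes every arrangement without removing symmetry-equivalent ones.

```python
from math import comb, factorial


def n_configurations(n_sites):
    """Ways to fill half of n_sites with one species: N! / ((N/2)! (N/2)!)."""
    if n_sites % 2:
        raise ValueError("n_sites must be even for a 50:50 composition")
    half = n_sites // 2
    return factorial(n_sites) // (factorial(half) ** 2)


for n in (16, 32, 64):
    count = n_configurations(n)
    # The same number is the binomial coefficient "N choose N/2"
    assert count == comb(n, n // 2)
    print(f"N = {n}: {count:,} (about {count:.1e})")
```

Doubling the supercell size roughly squares the count, which is why enumerating every configuration becomes impractical beyond small cells.
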
  16. Brief History of ML

    The term was coined by Arthur Samuel in 1959: “It is now possible to devise learning schemes which will greatly outperform an average person and that such learning schemes may eventually be economically feasible”. Source: A. L. Samuel, IBM Journal, 211 (1959)
  17. Brief History of ML

    An artificial neuron had been proposed in 1943: “Every net, if furnished with a tape, scanners connected to afferents to perform the necessary motor-operations, can compute only such numbers as can a Turing machine”. Source: W. S. McCulloch and W. Pitts, Bull. Math. Biophys. 5, 115 (1943)
  18. Brief History of ML

    In 1950, Alan Turing proposed a “Learning Machine” that could become intelligent: “I PROPOSE to consider the question, Can machines think?”. Source: A. M. Turing, Mind 236, 433 (1950)
  19. Source Material for Course

    ML content is available from many sources, including blogs, research papers, repositories, and textbooks. These slides are a skeleton, fleshed out with lectures, activities, and reading

    [Figure labels: General, Specialist]

    Special thanks to Anthony Onwuli, Zhenzhu Li, and Calysta Tesiman for assistance
  20. Active Participation

    Your engagement is essential. This is a dense course with new concepts, advanced coding, and self-study

    • Attend all lectures
    • Attend all practical sessions
    • Attempt to solve problems yourself and ask course assistants if you need help
  21. Creative Solutions

    There is great flexibility in programming, with no unique solution for a given problem. You may be interested in speed or clarity, but ultimately want robust code

    • Check package manuals, e.g. https://matplotlib.org & https://scikit-learn.org
    • Search https://stackexchange.com & https://github.com for ideas
  22. Creative Solutions

    Many AI assistants for coding exist, such as GitHub Copilot, CodeWhisperer, Codeium, GPT-4

    • Most helpful when you know the basics first
    • Assistants often lack domain expertise and give poor suggestions with buggy code based on old versions of Python libraries
    • Not a substitute for hands-on coding experience and knowledge of materials
  23. 2024 Course Assistants

    • Dr Zhenzhu Li (Schmidt AI in Science Fellow)
    • Irea Mosquera-Lois
    • Xia Liang
    • Anthony Onwuli
    • Yifan Wu
  24. Module Assessment

    Aim for working knowledge of ML with practical sessions and coursework

    • Computational exercises (40%): submitted on MyDepartment (due 1 hour after each computer lab)
    • Research challenge (60%): assignment to complete (details in Lecture 9)

    Registration of absence or mitigation goes via the student office