Machine Learning for Materials (Lecture 1)

Aron Walsh
January 22, 2024

Slides linked to https://github.com/aronwalsh/MLforMaterials. Updated for 2026.

Transcript

  1. y = f(x) Learn f(x) from data; trust what generalises

    Key Concept #1 Machine learning is function approximation (y: output, x: input features, f: model)
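As a minimal sketch of "learn f(x) from data", the snippet below fits a straight line by ordinary least squares and then evaluates it on an input that was not used for fitting. The training points, the helper name `fit_line`, and the choice of a linear model are assumptions made for illustration; they are not taken from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b using the closed-form solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical, noise-free training data generated from y = 2x + 1
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [1.0, 3.0, 5.0, 7.0]

a, b = fit_line(x_train, y_train)


def f(x):
    """The learned model: an approximation to the unknown function."""
    return a * x + b


# Generalisation check: predict for an input outside the training set
y_pred = f(10.0)
print(f"a = {a:.2f}, b = {b:.2f}, f(10) = {y_pred:.2f}")
```

Real materials data are noisy and rarely linear, so the same idea is usually applied with more flexible models and a held-out test set.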
  2. New Era of Materials Research A. Agrawal and A. Choudhary,

    APL Materials 4, 053208 (2016) The research toolkit for materials science now includes data-driven statistical models
  3. Computer Revolution Keith Butler (now: STFC/SciML) Analytical Engine Automated calculations

    Charles Babbage (1837) “The science of operations has its own truth and value” Ada Lovelace (1840) Multiply two 20-digit numbers in ~3 minutes
  4. Powerful Statistical Techniques Chris Hendon (now: University of Oregon) Keith

    Butler (now: STFC/SciML) Using GPT-5 via https://github.com/hwchase17/langchain Answers provided included transition metal oxides (V2O5), Chevrel phases (Mo6S8), Prussian blues (Fe4[Fe(CN)6]3)
  5. Efficient Research Workflows J. P. Correa-Baena et al., Joule 2,

    1410 (2018) Integration of computational techniques to accelerate discovery & development cycles
  6. Module Contents 1. Introduction 2. Machine Learning Basics 3. Materials

    Data 4. Crystal Representations 5. Classical Learning 6. Deep Learning 7. Building a Model from Scratch 8. Accelerated Discovery 9. Generative Artificial Intelligence 10. Future Directions Dense module with time to self-study to explore concepts further
  7. What is Machine Learning (ML)? Statistical algorithms that learn from

    training data and build a model to make predictions. Data types: materials features can be binary (e.g. stability), categorical (e.g. symmetry), integer (e.g. stoichiometry), or continuous (e.g. rate). Learning types: unsupervised (identify patterns), supervised (use patterns), reinforcement (maximise reward).
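The four feature types can be turned into one numerical vector before being passed to a model. The sketch below is one possible encoding, not a prescribed one: the record fields (`stable`, `symmetry`, `n_oxygen`, `rate`), the three-entry crystal system list, and the one-hot scheme are all hypothetical choices for illustration.

```python
# Hypothetical subset of crystal systems used for one-hot encoding
CRYSTAL_SYSTEMS = ["cubic", "tetragonal", "hexagonal"]


def encode(record):
    """Map a mixed-type materials record to a flat list of floats."""
    # Categorical feature -> one-hot vector (one slot per known category)
    one_hot = [1.0 if record["symmetry"] == s else 0.0 for s in CRYSTAL_SYSTEMS]
    return [
        float(record["stable"]),    # binary feature -> 0.0 or 1.0
        *one_hot,                   # categorical feature
        float(record["n_oxygen"]),  # integer feature
        record["rate"],             # continuous feature
    ]


# Hypothetical example record
record = {"stable": True, "symmetry": "tetragonal", "n_oxygen": 5, "rate": 0.12}
features = encode(record)
print(features)
```

One-hot encoding avoids implying an order between categories, which a single integer label would do.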
  8. What is Machine Learning (ML)? Statistical algorithms that identify and

    use patterns in multi-dimensional datasets Image from “How Machines Learn” by Helen Edwards
  9. What is Machine Learning (ML)? Statistical algorithms that identify and

    use patterns in multi-dimensional datasets Images from https://vas3k.com/blog/machine_learning Predict a category, e.g. decision trees to predict reaction outcome; predict a value, e.g. regression to extract a reaction rate; group by similarity, e.g. high-throughput crystallography; maximise reward, e.g. reaction conditions to optimise yield
  10. What is Machine Learning (ML)? Statistical algorithms that operate on

    multi-dimensional arrays of numerical data Image from http://karlstratos.com; note the physical definitions are more nuanced [Figure: arrays of increasing dimension, from a single number $x$ to indexed arrays $x_i$, $x_{ij}$ and $x_{ijk}$]
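The arrays with zero, one, two and three indices on this slide map directly onto NumPy arrays of increasing `ndim`. This is a small sketch assuming NumPy is installed; the values and shapes are arbitrary placeholders.

```python
import numpy as np

scalar = np.array(7.0)                 # x      (no index)
vector = np.array([8.0, 3.0, 1.0])     # x_i    (one index)
matrix = np.arange(6.0).reshape(2, 3)  # x_ij   (two indices)
tensor = np.zeros((2, 3, 4))           # x_ijk  (three indices)

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    # ndim is the number of indices; shape gives the length along each one
    print(f"{name}: ndim={arr.ndim}, shape={arr.shape}")
```

Here "tensor" is used in the ML sense of an array with several indices, in line with the slide's note that the physical definition is more nuanced.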
  11. What is Machine Learning (ML)? Statistical algorithms that operate on

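    multi-dimensional arrays of numerical data Image from “How Machines Learn” by Helen Edwards
    $$\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & x_{13} & x_{14} & x_{15} \\ x_{21} & x_{22} & x_{23} & x_{24} & x_{25} \\ x_{31} & x_{32} & x_{33} & x_{34} & x_{35} \end{pmatrix} \begin{pmatrix} g_1 \\ g_2 \\ g_3 \\ g_4 \\ g_5 \end{pmatrix}$$
    (3 × 1 matrix) = (3 × 5 matrix) (5 × 1 matrix)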
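The shape bookkeeping on this slide, a 3 × 5 matrix times a 5 × 1 matrix giving a 3 × 1 matrix, can be checked with a few lines of NumPy. The numerical values below are placeholders chosen for illustration; only the shapes follow the slide.

```python
import numpy as np

# X holds 3 samples (rows) with 5 features each (columns); values are placeholders
X = np.arange(15.0).reshape(3, 5)

# g holds one weight per feature, stored as a 5 x 1 column
g = np.ones((5, 1))

# Matrix product: (3 x 5) @ (5 x 1) -> (3 x 1), one output per sample
y = X @ g

print(X.shape, g.shape, y.shape)
print(y.ravel())
```

With all weights set to one, each output is the sum of that sample's five features, which makes the result easy to verify by hand.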
  12. ML ~ Function Approximation Image from https://github.com/jermwatt/machine_learning_refined Model selection, training,

    and testing tune a “complexity dial” for your problem of interest, from a linear model (underfit regime) to a highly non-linear model (overfit regime)
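One way to see the "complexity dial" is to fit polynomials of increasing degree to the same noisy data and compare training and test errors. The sketch below assumes NumPy; the target function, noise level, random seed and degrees are arbitrary choices for illustration, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable


def target(x):
    """Hypothetical ground-truth function that the models try to approximate."""
    return np.sin(np.pi * x)


# Ten noisy training points and a denser, noise-free test grid
x_train = np.linspace(-1.0, 1.0, 10)
y_train = target(x_train) + rng.normal(scale=0.1, size=x_train.size)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = target(x_test)


def mse(coeffs, x, y):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


train_mse, test_mse = {}, {}
for degree in (1, 3, 9):  # turning the dial from simple to flexible
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = mse(coeffs, x_train, y_train)
    test_mse[degree] = mse(coeffs, x_test, y_test)
    print(f"degree {degree}: train MSE = {train_mse[degree]:.2e}, "
          f"test MSE = {test_mse[degree]:.2e}")
```

The training error falls as the degree increases, and the degree-9 polynomial passes through all ten points. Whether the test error follows is what separates a good fit from an overfit one, so model selection should be judged on held-out data.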
  13. A. L. Samuel, IBM Journal, 211 (1959) Brief History of

    ML Term coined by Arthur Samuel in 1959 “It is now possible to devise learning schemes which will greatly outperform an average person and that such learning schemes may eventually be economically feasible”
  14. W. S. McCulloch and W. Pitts, Bull. Math. Biophys. 5,

    115 (1943) Brief History of ML An artificial neuron had been proposed in 1943 “Every net, if furnished with a tape, scanners connected to afferents to perform the necessary motor-operations, can compute only such numbers as can a Turing machine”
  15. A. M. Turing, Mind 236, 433 (1950) Brief History of

    ML In 1950, Alan Turing proposed a “Learning Machine” that could become intelligent “I PROPOSE to consider the question, Can machines think?”
  16. Source Material for Module ML content available from many sources,

    including blogs, research papers, repositories, and textbooks These slides are a skeleton, fleshed out with lectures, activities, and reading General Specialist
  17. Active Participation Your engagement is essential. This is a dense

    course with new concepts, Python coding, and self-study • Attend all lectures to hear the core content • Attend all practical sessions for hands-on coding • Attempt to solve problems yourself and ask course assistants if you need help
  18. Creative Solutions There is great flexibility in programming with no

    unique solution for any given problem You may be interested in speed or clarity, but ultimately you want working code • Check package manuals, e.g. https://matplotlib.org & https://scikit-learn.org • Search https://stackexchange.com & https://github.com for ideas
  19. Creative Solutions Many AI assistants for coding exist such as

    GitHub Copilot, GPT, Gemini • Most helpful when you know the basics first • Assistants can give poor suggestions with buggy code based on out-of-date libraries/functions • Not a substitute for hands-on coding experience and knowledge of materials
  20. Module Assessment Aim for working knowledge of ML with practical

    sessions and coursework Computer labs (8 × 2%) Notebook submitted on Blackboard (due by the end of each session, 15:45) Research assignment (84%) Assignment to complete (details after Lecture 9) Registration of absence or mitigation goes via the student office