Deep Materials Informatics: Illustrative Applications of Deep Learning in Materials Science

Ankit Agrawal
Research Associate Professor
Department of Electrical Engineering and Computer Science, Northwestern University
April 2020

Collaborators: Surya Kalidindi (GaTech), Greg Olson (NU, QuesTek), Chris Wolverton (NU), Peter Voorhees (NU), Veera Sundararaghavan (UMich), Marc De Graef (CMU), Wei Chen (NU), Cate Brinson (Duke), Logan Ward (UC), Carelyn Campbell (NIST), Kamal Choudhary (NIST), Francesca Tavazza (NIST), Andrew Reid (NIST), Stefanos Papanikolaou (WVU)

Team Members: Alok Choudhary, Wei-keng Liao, Kasthurirangan Gopalakrishnan, Dipendra Jha, Zijiang Yang, Arindam Paul

Research Thrusts

Current and Past Projects
• NIST Center of Excellence: Center for Hierarchical Materials Design (CHiMaD)
• AFOSR MURI: Managing the Mosaic of Microstructure
• DARPA SIMPLEX: Data-Driven Discovery for Designed Thermoelectric Materials
• NSF BigData Spoke: SPOKE: MIDWEST: Collaborative: Integrative Materials Design (IMaD): Leverage, Innovate, & Disseminate
• NU Data Science Initiative: Data-driven analytics for understanding processing-structure-property-performance relationships in steel alloys
• DLA: Digital Innovation Design (DID)
• Toyota Motor Corporation: The investigation of machine learning for material development

Overview
• Introduction
  ★ Paradigms of Science
  ★ Deep Learning: Advantages, Challenges, Network Types
• Illustrative Materials Informatics
  ★ Forward PSPP models
  ★ Inverse PSPP models
  ★ Structure characterization
  ★ Deep materials informatics
• Materials Informatics Tools

Paradigms of Science
• 1st paradigm: Empirical science (experiments)
• 2nd paradigm: Model-based theoretical science (e.g. the laws of thermodynamics: ∆U = Q – W, where ∆U is the change in internal energy, Q is the heat added to the system, and W is the work done by the system)
• 3rd paradigm: Computational science (simulations, e.g. Density Functional Theory, Molecular Dynamics)
• 4th paradigm: (Big) data driven science (predictive analytics, clustering, relationship mining, anomaly detection)
[Figure: timeline of the four paradigms with markers at 1600, 1950, and 2000]
A. Agrawal and A. Choudhary, “Perspective: Materials informatics and big data: Realization of the ‘fourth paradigm’ of science in materials science,” APL Materials, 4, 053208 (2016), doi:10.1063/1.4946894

Artificial Intelligence
“Ability of machines to perform tasks that normally require human intelligence” [2018 DOD AI Strategy]
• Weak AI
• Strong AI
• Super AI
[Figure: Artificial Intelligence, Machine Learning, and Deep Learning shown as nested fields]

Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed” [Arthur Samuel, 1959]
• Algorithms whose performance improves as they are exposed to more data over time
• AI/ML has been around for decades, but it always has been hungry for big data and big compute → Deep Learning
[Figure: Artificial Intelligence, Machine Learning, and Deep Learning shown as nested fields]

Deep Learning “A rediscovery of neural networks fueled by the
availability of big data and big compute”

Deep Learning Success Stories

Deep Learning
[Figure: performance versus amount of data, with one curve for deep learning and one for most learning algorithms]

Types of Deep Learning Networks
• Fully connected network (MLP)
• Convolutional neural network (CNN)
• Recurrent neural network (RNN)
• Residual learning network (ResNet)
• Generative adversarial network (GAN)

https://thenextweb.com/artificial-intelligence/2019/02/13/thispersondoesnotexist-com-is-face-generating-ai-at-its-creepiest/

Illustrative Materials Informatics
• Forward PSPP models (property prediction)
  o Steels [IMMI 2014, CIKM 2016, IJF 2018, DSAA 2019]
  o Crystalline stability [PRB 2014, npjCM 2016, ICDM 2016, DL-KDD 2016, PRB 2017, SciRep 2018, KDD 2019, NatureComm 2019]
  o Band gap and glass forming ability prediction [npjCM 2016]
  o Bulk modulus prediction [RSC Adv 2016]
  o Seebeck coefficient prediction [JCompChem 2018]
  o Multi-scale localization/homogenization [IMMI 2015, IMMI 2017, CMS 2018, ActaMat 2019, IJCNN 2019]
  o Chemical properties prediction [NIPS MLMM 2018, IJCNN 2019, Molecular Informatics 2019]
• Inverse PSPP models (optimization/discovery)
  o Stable compounds [PRB 2014]
  o Magnetostrictive materials [Scientific Reports 2015, AIAA 2018]
  o Semiconductors and metallic glasses [npjCM 2016]
  o Microstructure design (GAN) [JMD 2018]
  o Titanium aircraft panels [CMS 2019]
• Structure characterization
  o EBSD Indexing [BigData-ASH 2016, M&M 2018]
  o Crack detection in macroscale images [CBM 2017, IJTTE 2018]
  o XRD analysis for phase detection [IJCNN 2019]
  o Plastic deformation identification [IJCNN 2019]

DFT Data Mining
Collaboration between Agrawal, Choudhary, Wolverton, Ward, NIST

Density Functional Theory
• Very slow simulations
• Require crystal structure as input
Training Data
• Hundreds of thousands of DFT calculations from OQMD
• JARVIS-DFT (NIST)
Composition-based models
• 145 attributes (stoichiometric/elemental/electronic/ionic)
Structure-aware models
• Voronoi tessellations to capture local environment of atoms
Deep learning models (ElemNet)
• Use only element fractions
• 20% more accurate and two orders of magnitude faster
• Learn chemistry of materials
Inverse models
• Stable compounds, metallic glasses, semiconductors, quaternary Heuslers
Software
• FEpredictor, Magpie
• Online Tool: http://info.eecs.northwestern.edu/FEpredictor

Agrawal et al., ICDM 2016; Ward et al., npj Comp Mat 2016; Ward et al., PRB 2017; Liu et al., DL-KDD 2016; Jha et al., SciRep 2018
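To make "use only element fractions" concrete, the following is a minimal sketch of how a chemical formula could be turned into a fixed-length vector of element fractions. It is an illustration under stated assumptions, not the ElemNet code: the function name `element_fractions` and the short `ELEMENTS` list are hypothetical, and only flat formulas without parentheses are handled.

```python
import re

# Hypothetical, deliberately short element ordering for illustration only;
# a real model would fix one ordering covering every element in its data.
ELEMENTS = ["H", "Li", "O", "Na", "Cl", "Ti", "Fe"]


def element_fractions(formula):
    """Return a vector of element fractions, ordered as in ELEMENTS.

    Handles flat formulas such as "Fe2O3" or "NaCl"; parentheses and
    nested groups are not supported in this sketch.
    """
    counts = {}
    # Each match is (element symbol, optional amount), e.g. ("Fe", "2").
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d*\.?\d*)", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)

    unknown = set(counts) - set(ELEMENTS)
    if unknown:
        raise ValueError(f"elements not in ELEMENTS: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("formula contains no elements")

    # Normalize counts so the vector sums to 1.
    return [counts.get(el, 0.0) / total for el in ELEMENTS]


if __name__ == "__main__":
    print(element_fractions("Fe2O3"))
```

A vector like this carries no crystal structure and no hand-built attributes, which is the contrast the slide draws with the 145-attribute composition-based models.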

ElemNet: Deep Learning the Chemistry of Materials
Jha et al., Scientific Reports 2018

Deeper Learning: Individual Residual Network (IRNet)

Motivation
• Deep neural networks suffer from the vanishing gradient problem as depth increases
Proposed Solution
• Individual residual learning with skip connections across each layer
Datasets
• OQMD-SC (435,582 x 271)
• OQMD-C (341,443 x 145)
• MP-C (83,989 x 145)
Results
• IRNet > SRNet > PlainNetwork
• Up to 65% reduction in MAE
• IRNet beats best of 10 traditional ML approaches (e.g. Random Forest) on 9 out of 10 dataset-property combinations

Formation enthalpy prediction MAE (eV/atom) on OQMD-SC dataset
Model Type   Plain Network   SRNet    IRNet
17-layer     0.0653          0.0551   0.0411
24-layer     0.0719          0.0546   0.0403
48-layer     0.1085          0.0471   0.0382

D. Jha, L. Ward, Z. Yang, C. Wolverton, I. Foster, W.-k. Liao, A. Choudhary, and A. Agrawal, “IRNet: A General Purpose Deep Residual Regression Framework for Materials Discovery,” 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2019, pp. 2385–2393.
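The idea of a skip connection across each layer can be sketched in a few lines. This is a minimal illustration in plain Python, assuming fully connected layers that keep the width unchanged and a ReLU activation. It is not the published IRNet architecture, and the helper names (`dense`, `residual_layer`, `residual_stack`) are hypothetical.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def residual_layer(x, weights, bias):
    """One layer with its own skip connection: relu(W x + b) + x.

    Assumes the layer preserves the input width, so the element-wise
    addition of the input is well defined.
    """
    hidden = relu(dense(x, weights, bias))
    return [h + xi for h, xi in zip(hidden, x)]


def residual_stack(x, layers):
    """Apply a sequence of (weights, bias) residual layers in order."""
    for weights, bias in layers:
        x = residual_layer(x, weights, bias)
    return x


if __name__ == "__main__":
    zero_layer = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
    # With all-zero weights every layer reduces to the identity,
    # so the input passes through a deep stack unchanged.
    print(residual_stack([1.0, -2.0], [zero_layer] * 48))
```

Because each layer adds its input back to its output, a layer that has learned nothing still passes the signal through, which is one common explanation for why residual networks remain trainable as depth increases.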

EBSD Indexing Using Deep Learning
Collaboration between Agrawal, Choudhary, De Graef

Objective: Fast and accurate indexing of electron backscatter diffraction (EBSD) patterns
Solution: Deep convolutional neural networks with customized loss function
Data: 375K simulated EBSD patterns
Results: On average 16% more accurate and 86% faster predictions compared to state-of-the-art dictionary-based indexing (1-nearest-neighbor with cosine similarity)

[Figure, panels (a) and (b): EBSD setup showing the electron beam, tilted specimen, diffraction plane, and screen detector, with axes X, Y, Z and X’, Y’, Z’ and angles φ1, ϕ, φ2]

Predictor       MAE (degrees)   Training time   Run time
1-NN            5.7, 5.7, 7.7   0               375s
Deep Learning   2.5, 1.8, 4.8   7 days          50s

Model                       Loss Function               Mean Disorientation   Mean Disorientation   Mean Symmetrically Equivalent Orientation
                                                        (Simulation Data)     (Experimental Data)   Absolute Error (MSEAE)
Dictionary based Indexing   -                           -                     0.652                 [0.6592, 0.3534, 0.6484]
Deep Learning               Mean Absolute Error         0.064                 0.596                 [0.4039, 0.1776, 0.4426]
Deep Learning               Mean Squared Error          0.292                 1.285                 -
Deep Learning               Mean Disorientation         0.272                 1.224                 -
Deep Learning               MAE + Mean Disorientation   0.132                 0.548                 [0.7155, 0.2194, 0.7066]
Deep Learning               MSE + Mean Disorientation   0.171                 0.658                 -

Liu et al., BigData ASH 2016; Jha et al., M&M 2018
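The dictionary-based baseline named above (1-nearest-neighbor with cosine similarity) can be sketched as follows. This is a minimal illustration, assuming each pattern is a flattened list of pixel intensities and each dictionary entry pairs a known orientation with its simulated pattern. The names `cosine_similarity` and `index_pattern` and the toy data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def index_pattern(pattern, dictionary):
    """Return the orientation of the most similar dictionary pattern.

    `dictionary` is a list of (orientation, flattened_pattern) pairs.
    """
    best_orientation, _ = max(
        dictionary,
        key=lambda entry: cosine_similarity(pattern, entry[1]),
    )
    return best_orientation


if __name__ == "__main__":
    # Toy dictionary: orientations as angle triples, patterns as 3-pixel vectors.
    toy_dictionary = [
        ((0.0, 0.0, 0.0), [1.0, 0.0, 0.0]),
        ((10.0, 20.0, 30.0), [0.0, 1.0, 1.0]),
    ]
    print(index_pattern([0.1, 0.9, 1.1], toy_dictionary))
```

Every query is compared against every dictionary entry, so this baseline needs no training but its run time grows with the dictionary size, which matches the training-time and run-time trade-off in the first table.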

FEM Data Mining: Deep Learning for Localization Relationships
Collaboration between Agrawal, Choudhary, Kalidindi

Challenge: Predict real-valued outputs from binary (0/1) inputs
Solution: 3-D CNNs
Results: A fast approximation to FEM that is much more accurate than existing data-driven methods
• Contrast 10: 3.07% average MASE
• Contrast 50: 5.71% average MASE

[Figure: side-by-side comparison of FE, DL, and MKS results for each contrast]

Yang et al., Acta Mat 2019

Deep Adversarial Learning for Microstructure Design
Collaboration between Agrawal, Choudhary, Chen, Brinson

Challenge
• Identifying a low dimensional microstructure representation
• Use it for materials design
Proposed Solution
• Deep learning
• Generative adversarial networks
• Bayesian optimization with RCWA
Data
• 5000 128x128 images synthesized using GRF method
Results
• 4x4 matrix (design variables)
• Statistically similar microstructures
• 17% better optical absorption
• Scalable generator
• Transferable discriminator

Yang and Li et al., JMD 2018

Pavement Crack Detection Using Deep Transfer Learning

Objective: Fast and accurate crack detection from Hot-Mix Asphalt (HMA) and Portland Cement Concrete (PCC) surfaced pavement images
Challenges: Inhomogeneity of crack, diversity of surface texture, background complexity, presence of non-crack features such as joints, etc.
Data: Pavement distress images from the Federal Highway Administration’s (FHWA’s) Long-Term Pavement Performance (LTPP) program
Solution: A binary classifier trained on ImageNet pre-trained VGG-16 CNN features for pavement images
Results: Up to 90% classification accuracy and 0.87 AUC

Gopalakrishnan et al., CBM 2017
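For readers less familiar with the two reported metrics, here is a minimal sketch of how classification accuracy and AUC can be computed for a binary crack / no-crack classifier. The labels and scores in the demo are made up for illustration and are not data from the study. The function names are hypothetical, and the 0.5 threshold is an assumption.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up example: 1 = crack, 0 = no crack.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    print(accuracy(labels, scores), roc_auc(labels, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two numbers are reported separately.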

Industrial Materials Design

Motivation
• Magnet properties prediction in a complex processing workflow
Challenges
• Experimental data
• Small, heterogeneous, noisy
Methodology
• Gradient boosting
• Deep transfer learning from VGG16 for SEM image featurization
Results and Impact
• <5% prediction error for Hcj and Br
• Cost implications: run only relevant experiments, avoid higher-end processing for unpromising candidates, reduce SEM man-hours → savings of millions of dollars
• Faster magnet design: identify most promising regions and routes

[Figures: a typical processing workflow; model workflow; panels P1 Hcj, P1 Br, P2 Hcj, P2 Br; combination (numerical + image) model]

Yang et al., ICDM LMID 2019

https://doi.org/10.1557/mrc.2019.73

Illustrative Materials Informatics Tools http://info.eecs.northwestern.edu

Thank you!

