(CHiMaD)
• AFOSR MURI: Managing the Mosaic of Microstructure
• DARPA SIMPLEX: Data-Driven Discovery for Designed Thermoelectric Materials
• NSF BigData Spoke: SPOKE: MIDWEST: Collaborative: Integrative Materials Design (IMaD): Leverage, Innovate, & Disseminate
• NU Data Science Initiative: Data-driven analytics for understanding processing-structure-property-performance relationships in steel alloys
• DLA: Digital Innovation Design (DID)
• Toyota Motor Corporation: The investigation of machine learning for material development
Current and Past Projects
informatics and big data: Realization of the “fourth paradigm” of science in materials science, APL Materials, 4, 053208 (2016), doi:10.1063/1.4946894
• 1st paradigm: Empirical science (e.g. Experiments)
• 2nd paradigm: Model-based theoretical science (e.g. Laws of Thermodynamics)
• 3rd paradigm: Computational science (simulations) (e.g. Density Functional Theory, Molecular Dynamics)
• 4th paradigm: (Big) data driven science (e.g. Predictive analytics, Clustering, Relationship mining, Anomaly detection)
Timeline markers: 1600, 1950, 2000
∆U = Q − W
• ∆U: change in internal energy
• Q: heat added to system
• W: work done by system
without being explicitly programmed” [Arthur Samuel, 1959]
Machine Learning
• Algorithms whose performance improves as they are exposed to more data over time
• AI/ML has been around for decades, but it always has been hungry for big data and big compute → Deep Learning
[Figure labels: Artificial Intelligence, Machine Learning, Deep Learning]
• Require crystal structure as input
Training Data
• Hundreds of thousands of DFT calculations from the Open Quantum Materials Database (OQMD)
• JARVIS-DFT (NIST)
Composition-based models
• 145 attributes (stoichiometric/elemental/electronic/ionic)
Structure-aware models
• Voronoi tessellations to capture local environment of atoms
Deep learning models (ElemNet)
• Use only element fractions
• 20% more accurate and two orders of magnitude faster
• Learn chemistry of materials
Inverse models
• Stable compounds, metallic glasses, semiconductors, quaternary Heuslers
Software
• FEpredictor, Magpie
Agrawal et al., ICDM 2016; Ward et al., npj Comp Mat 2016; Ward et al., PRB 2017; Liu et al., DL-KDD 2016; Jha et al., SciRep 2018
Online Tool: http://info.eecs.northwestern.edu/FEpredictor
Collaboration between Agrawal, Choudhary, Wolverton, Ward, NIST
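The "use only element fractions" input can be illustrated with a minimal sketch. It assumes flat formulas such as Fe2O3 (no parentheses), and the names `element_fractions`, `fraction_vector`, and the short `ELEMENTS` ordering are hypothetical choices for illustration, not the interface of ElemNet, FEpredictor, or Magpie.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately short element ordering for the input vector.
# A real model would fix one ordering over all elements in its training data.
ELEMENTS = ["H", "Li", "O", "Na", "Cl", "Ti", "Fe"]

# An element symbol followed by an optional integer or decimal amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def element_fractions(formula):
    """Return {element: atomic fraction} for a flat formula such as 'Fe2O3'.

    Assumption: no parentheses, hydrates, or charges; this is only a sketch.
    """
    counts = defaultdict(float)
    for symbol, amount in _TOKEN.findall(formula):
        counts[symbol] += float(amount) if amount else 1.0
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no elements found in {formula!r}")
    return {element: n / total for element, n in counts.items()}


def fraction_vector(formula, elements=ELEMENTS):
    """Map a formula to a fixed-length vector of element fractions."""
    fractions = element_fractions(formula)
    unknown = set(fractions) - set(elements)
    if unknown:
        raise ValueError(f"elements not in the ordering: {sorted(unknown)}")
    return [fractions.get(element, 0.0) for element in elements]
```

Calling `fraction_vector("Fe2O3")` yields a vector that is zero everywhere except the O and Fe positions; a vector of this kind is the only input such a composition-only model would see.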
24-layer 0.0719 0.0546 0.0403
48-layer 0.1085 0.0471 0.0382
Formation enthalpy prediction MAE (eV/atom) on OQMD-SC dataset
Deeper Learning: Individual Residual Network (IRNet)
Motivation
• Deep neural networks suffer from the vanishing gradient problem as depth increases
Proposed Solution
• Individual residual learning with skip connections across each layer
Datasets
• OQMD-SC (435,582 x 271)
• OQMD-C (341,443 x 145)
• MP-C (83,989 x 145)
Results
• IRNet > SRNet > PlainNetwork
• Up to 65% reduction in MAE
• IRNet beats best of 10 traditional ML approaches (e.g. Random Forest) on 9 out of 10 dataset-property combinations
D. Jha, L. Ward, Z. Yang, C. Wolverton, I. Foster, W.-keng Liao, A. Choudhary, and A. Agrawal, “IRNet: A General Purpose Deep Residual Regression Framework for Materials Discovery,” 25th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), 2019, pp. 2385–2393.
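The idea of a skip connection across each layer can be sketched as a plain forward pass. This is a simplified illustration, not the IRNet architecture itself: the function names are hypothetical, the layers are bare fully connected + ReLU units of constant width, and no training is shown.

```python
def dense_relu(x, weights, bias):
    """Fully connected layer followed by ReLU; `weights` is a list of rows."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def residual_layer(x, weights, bias):
    """Same layer with a skip connection: output = layer(x) + x.

    Assumption: the layer keeps the vector width, so the sum is defined.
    """
    out = dense_relu(x, weights, bias)
    if len(out) != len(x):
        raise ValueError("skip connection needs matching input/output width")
    return [o + xi for o, xi in zip(out, x)]


def forward(x, layers, residual):
    """Apply a stack of (weights, bias) layers, with or without skips."""
    step = residual_layer if residual else dense_relu
    for weights, bias in layers:
        x = step(x, weights, bias)
    return x
```

With all-zero weights, the plain stack maps every input to zeros, while the residual stack returns the input unchanged at any depth. That identity path is what keeps a signal (and, during training, a gradient) flowing through a deep stack.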
Graef
[Figure: EBSD geometry with electron beam, tilted specimen, diffraction plane, and screen detector; panels (a), (b); axes X, Y, Z and X’, Y’, Z’; angles φ1, ϕ, φ2]
Objective: Fast and accurate indexing of electron backscatter diffraction (EBSD) patterns
Solution: Deep convolutional neural networks with customized loss function
Data: 375K simulated EBSD patterns
Predictor | MAE (degrees) | Training time | Run time
1-NN | 5.7, 5.7, 7.7 | 0 | 375s
Deep Learning | 2.5, 1.8, 4.8 | 7 days | 50s
Results: On average 16% more accurate and 86% faster predictions compared to state-of-the-art dictionary-based indexing (1-nearest-neighbor with cosine similarity)
Liu et al., BigData ASH 2016; Jha et al., M&M 2018
Model | Loss Function | Simulation Data: Mean Disorientation | Experimental Data: Mean Disorientation | Mean Symmetrically Equivalent Orientation Absolute Error (MSEAE)
Dictionary based Indexing | - | - | 0.652 | [0.6592, 0.3534, 0.6484]
Deep Learning | Mean Absolute Error | 0.064 | 0.596 | [0.4039, 0.1776, 0.4426]
Deep Learning | Mean Squared Error | 0.292 | 1.285 | -
Deep Learning | Mean Disorientation | 0.272 | 1.224 | -
Deep Learning | MAE + Mean Disorientation | 0.132 | 0.548 | [0.7155, 0.2194, 0.7066]
Deep Learning | MSE + Mean Disorientation | 0.171 | 0.658 | -
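The dictionary-based baseline named above (1-nearest-neighbor with cosine similarity) can be sketched as follows. The function names and the data layout (a list of `(pattern_vector, orientation)` pairs) are assumptions for illustration; real patterns are images flattened to long vectors, and a real dictionary is far larger.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero pattern as matching nothing
    return dot / (norm_a * norm_b)


def index_pattern(pattern, dictionary):
    """Return (orientation, similarity) of the most similar dictionary entry.

    `dictionary` is a list of (pattern_vector, orientation) pairs, where each
    orientation is a tuple of three Euler angles (hypothetical layout).
    """
    if not dictionary:
        raise ValueError("dictionary is empty")
    best_pattern, best_orientation = max(
        dictionary, key=lambda entry: cosine_similarity(pattern, entry[0])
    )
    return best_orientation, cosine_similarity(pattern, best_pattern)
```

Every query is compared against every dictionary entry, so the run time grows with dictionary size. This baseline needs no training, which matches the training time of 0 in the table above.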
[Figure panel labels: MKS, FE, DL, MKS]
Challenge: Predict from 0/1 to real numbers!
Solution: 3-D CNNs
Results: Fast approximation to FEM and much more accurate than existing data-driven methods.
• Contrast 10 Results: 3.07% Average MASE
• Contrast 50 Results: 5.71% Average MASE
Collaboration between Agrawal, Choudhary, Kalidindi
Yang et al., Acta Mat 2019
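The core operation of a 3-D CNN layer, sliding a small kernel over a voxel grid, can be sketched in a few lines. This is an illustrative assumption-laden example: the function name is hypothetical, the kernel is a fixed averaging filter rather than learned weights, and padding, strides, channels, and nonlinearities are omitted.

```python
def conv3d_valid(volume, kernel):
    """'Valid' 3-D cross-correlation on nested lists (no padding, stride 1).

    `volume` is indexed [z][y][x] and holds 0/1 phase labels; `kernel` is a
    smaller nested list of real-valued weights.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    return [
        [
            [
                sum(
                    volume[z + i][y + j][x + k] * kernel[i][j][k]
                    for i in range(kd)
                    for j in range(kh)
                    for k in range(kw)
                )
                for x in range(width - kw + 1)
            ]
            for y in range(height - kh + 1)
        ]
        for z in range(depth - kd + 1)
    ]
```

With a 2x2x2 kernel whose weights are all 0.125, each output value is the local volume fraction of phase 1, which shows in miniature how a binary 0/1 input becomes a real-valued field. A trained network would learn many such kernels instead of using a fixed one.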
it for materials design
Proposed Solution
• Deep learning
• Generative adversarial networks
• Bayesian optimization with RCWA
Data
• 5000 128x128 images synthesized using GRF method
Results
• 4x4 matrix (design variables)
• Statistically similar microstructures
• 17% better optical absorption
• Scalable generator
• Transferable discriminator
Yang and Li et al., JMD 2018
Collaboration between Agrawal, Choudhary, Chen, Brinson
Deep Adversarial Learning for Microstructure Design
accurate crack detection from Hot-Mix Asphalt (HMA) and Portland Cement Concrete (PCC) surfaced pavement images
Challenges: Inhomogeneity of cracks, diversity of surface texture, background complexity, presence of non-crack features such as joints, etc.
Data: Pavement distress images from the Federal Highway Administration’s (FHWA’s) Long-Term Pavement Performance (LTPP) program
Solution: A binary classifier trained on ImageNet pre-trained VGG-16 CNN features for pavement images
Results: Up to 90% classification accuracy and 0.87 AUC
Gopalakrishnan et al., CBM 2017
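The two reported metrics, classification accuracy and AUC, can be computed from a binary classifier's scores as sketched below. The function names are hypothetical, the 0.5 threshold is an assumption, and the AUC uses the pairwise (rank-based) definition rather than any particular library routine.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive (crack) sample
    scores higher than a randomly chosen negative one; ties count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason to report both.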
Challenges
• Experimental data
• Small, heterogeneous, noisy
Methodology
• Gradient boosting
• Deep transfer learning from VGG16 for SEM image featurization
Results and Impact
• <5% prediction error for Hcj and Br
• Cost implications: only relevant experiments, avoid higher-end processing for unpromising candidates, reduce SEM man-hours → savings of millions of $$$
• Faster magnet design: identify most promising regions and routes
Yang et al., ICDM LMID 2019
Industrial Materials Design
[Figure panels: P1 Hcj, P1 Br, P2 Hcj, P2 Br; Combination (Numerical + Image) Model Workflow; A typical processing workflow]
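Gradient boosting, listed under Methodology, can be sketched for regression with squared loss. This is a minimal illustration under stated assumptions, not the model from the cited work: it uses a single numeric feature, one-split trees (stumps), hypothetical function names, and default values for `n_rounds` and `learning_rate` chosen only for the example.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split on a 1-D feature under squared error."""
    values = sorted(set(xs))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Fit an additive model of stumps to the residuals, round by round."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [
            p + learning_rate * (left_mean if x <= threshold else right_mean)
            for x, p in zip(xs, predictions)
        ]
    return base, stumps, learning_rate


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, stumps, learning_rate = model
    return base + sum(
        learning_rate * (left if x <= threshold else right)
        for threshold, left, right in stumps
    )
```

Each round fits a weak learner to what the current ensemble still gets wrong, and the small learning rate shrinks every correction. That shrinkage is a common guard against overfitting on small, noisy datasets like the one described above.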