Model performance
• Performance tested on 29 unseen defects in 12 unseen compositions
• It extrapolates the learned defect motifs to new chemistries
• Correct relaxation of defect environments → structures similar to DFT ones
• Ground state structure (GS) identified for 95% of test cases (29 unseen
defects)
• Number of DFT calculations reduced by 70% (from 14 to ~3 per defect) ✅
The solution: sampling the defect PES
• We generate 14 distorted structures through targeted bond distortions and
rattling, which are then relaxed with DFT
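The two perturbations named above (bond distortion and rattling) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the ShakeNBreak implementation: the helper names `distort_neighbours` and `rattle`, the neglect of periodic boundary conditions, the 14 evenly spaced distortion factors between 0.4 and 1.6, and the 0.25 Å rattle width are all assumptions made for the example.

```python
import numpy as np


def distort_neighbours(coords, defect_pos, n_neighbours, factor):
    """Scale the distances between the defect site and its nearest neighbours.

    factor < 1 pulls the neighbours towards the defect, factor > 1 pushes
    them away. Periodic boundary conditions are ignored (assumption).
    """
    coords = np.array(coords, dtype=float)  # work on a copy
    vectors = coords - defect_pos
    dists = np.linalg.norm(vectors, axis=1)
    nearest = np.argsort(dists)[:n_neighbours]
    coords[nearest] = defect_pos + factor * vectors[nearest]
    return coords


def rattle(coords, stdev=0.25, seed=0):
    """Add Gaussian noise (standard deviation in Å, assumed) to every atom."""
    rng = np.random.default_rng(seed)
    return np.asarray(coords, dtype=float) + rng.normal(0.0, stdev, size=np.shape(coords))


# Toy Cartesian coordinates (Å) around a vacancy at the origin; hypothetical values.
defect_pos = np.zeros(3)
coords = np.array([
    [2.0, 0.0, 0.0],
    [0.0, 2.1, 0.0],
    [0.0, 0.0, 2.2],
    [5.0, 5.0, 5.0],
])

# 14 trial structures: one bond-distortion factor each, followed by rattling.
factors = np.linspace(0.4, 1.6, 14)
structures = [
    rattle(distort_neighbours(coords, defect_pos, 2, f), seed=i)
    for i, f in enumerate(factors)
]
print(len(structures))
```

Each entry of `structures` would then be the starting point of one relaxation.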
How prevalent are defect reconstructions?
• Reconstructions found in all tested materials (CdTe, GaAs, Sb₂S₃, Sb₂Se₃,
CeO₂, In₂O₃, ZnO, TiO₂)
• Large energetic and structural changes
• Drastically affect predicted properties
• Follow common motifs (dimers, off-centring, Jahn-Teller distortions) based
on the physico-chemical factors driving the energy lowering
Can we increase sampling efficiency?
Automated in the Python package ShakeNBreak
Dataset
• Chemical space of sulphides and selenides, chosen for their relevance to
photovoltaic applications and their complex defect PES
• 126 neutral cation vacancies in 56 low-symmetry sulphides and selenides
• 30% of defects undergo symmetry-breaking reconstructions missed by the
standard modelling approach, driven by anion-anion bond formation
Model
Fine-tune a universal GNN-based force field (M3GNet), pre-trained on bulk
relaxations from the Materials Project database
Conclusions
• Defect reconstructions are very prevalent and often missed by the standard
modelling approach → incorrect predicted properties
• Current defect structure searching methods require many DFT relaxations
• A fine-tuned MLFF can be used to qualitatively explore the PES & select
promising candidate structures
• Transfer learning: bulk chemistry → defects
• The model identifies the ground state structure for 95% of tested defects
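The surrogate-model step summarised in these conclusions can be illustrated as a selection problem: relax all distorted structures with the MLFF, then pass only distinct low-energy candidates on to DFT. The following is a minimal sketch under stated assumptions: `select_candidates`, the 0.05 eV tolerance for merging near-degenerate results, the cap of three candidates, and all energies are hypothetical and not taken from this work. Comparing energies alone is a crude stand-in for a structural similarity check.

```python
def select_candidates(mlff_results, energy_tol=0.05, max_candidates=3):
    """Pick distinct low-energy MLFF-relaxed structures for DFT follow-up.

    mlff_results: list of (label, energy_in_eV) pairs.
    Results within energy_tol of an already selected structure are treated
    as the same minimum (assumption; a structural comparison would be safer).
    """
    selected = []
    for label, energy in sorted(mlff_results, key=lambda item: item[1]):
        if all(abs(energy - e) >= energy_tol for _, e in selected):
            selected.append((label, energy))
        if len(selected) == max_candidates:
            break
    return selected


# Hypothetical MLFF energies (eV) for six relaxed trial structures.
mlff_results = [
    ("distortion_-0.4", -101.32),
    ("distortion_-0.2", -101.31),
    ("distortion_0.0", -100.90),
    ("distortion_0.2", -100.91),
    ("rattled", -100.55),
    ("distortion_0.4", -100.20),
]

candidates = select_candidates(mlff_results)
print([label for label, _ in candidates])
```

Only the structures in `candidates` would be relaxed with DFT, which is where the saving in DFT calculations comes from.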
Figure: relaxation from the ideal structure ends in a local minimum (❌) instead of the global minimum (✅)
Machine-learning point defect reconstructions
References
• I. Mosquera-Lois & S.R. Kavanagh, Matter 4, 2602 (2021)
• I. Mosquera-Lois, S.R. Kavanagh, A. Walsh & D.O. Scanlon, J. Open Source Softw. 7, 4817 (2022)
• I. Mosquera-Lois, S.R. Kavanagh, A. Walsh & D.O. Scanlon, npj Comp Mater 9, 25 (2023)
• M. Arrigoni & G.K.H. Madsen, npj Comp Mater 7 (2021)
• C. Chen & S.P. Ong, Nat Comput Sci 2, 718–728 (2022)
The problem
• Point defects control the properties of most functional materials. However,
the standard modelling approach can result in incorrect predicted properties.
Here we develop a method to tackle this issue
Standard defect modelling approach:
An initial defect structure is built by placing a defect on a site of a supercell,
followed by a geometry optimisation
Problem:
Irea Mosquera-Lois, Seán R. Kavanagh, David O. Scanlon, Alex Ganose, Aron Walsh
Figure labels: bond distortion (defect-neighbour distances distorted by varying amounts); rattling (random perturbations to all supercell atoms); DFT relaxation (Γ-point only)
Defect    Host       Num. local minima in DFT PES   Num. DFT calculations   Symmetry-broken GS? (Y/-)   GS identified?
V_Bi      BiSBr      4                              5                       Y                           ✅
V_Bi      BiSCl      3                              4                       Y                           ✅
V_Sb,1    Sb₂S₃      7                              3                       Y                           ✅
V_Sb,2    Sb₂S₃      10                             5                       Y                           ✅
V_Cu      CuAsS      2                              1                       -                           ✅
V_As      CuAsS      2                              4                       -                           ✅
V_Cu,1    CuS        1                              1                       -                           ✅
V_Cu,2    CuS        3                              3                       -                           ✅
V_Cu,1    CuSe       1                              1                       Y                           ✅
V_Cu,2    CuSe       2                              1                       Y                           ✅
V_Li,1    Li₄SnS₄    4                              3                       -                           ✅
V_Li,2    Li₄SnS₄    3                              3                       -                           ✅
V_Li,3    Li₄SnS₄    3                              3                       -                           ✅
V_Sn      Li₄SnS₄    5                              8                       Y
V_Na,1    Na₂S₅      2                              2                       -                           ✅
V_Na,2    Na₂S₅      2                              3                       -                           ✅
V_P       Tl₃PS₄     7                              5                       Y                           ✅
V_Tl,1    Tl₃PS₄     2                              3                       -                           ✅
V_Tl,2    Tl₃PS₄     5                              3                       -                           ✅
Example compositions: NaTiCuS₃, BiSBr, AgBi₃S₅, CuGaS, SnS, CuAsSe, InGaS₃
Example reconstructions:
Figure: metastable (standard relaxation) vs ground state (this work) for V_Sb,2 (Sb₂S₃), V_Bi (BiSBr) and V_As (CuAsS)
Figure: DFT vs MLFF relaxation from an initial distorted structure used to sample the PES, V_Sb,1 (Sb₂S₃); D_C(SOAP) = 0.0
• Transfer learning bulk chemistry to defects
• Validated and tested on unseen
compositions → generalizability
MAE      Energy (meV/atom)   Force (meV/Å)   Stress (GPa)
Train    18.8                56.5            0.10
Val.     27.0                93.4            0.13
Test     27.3                86.8            0.19
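For readers unfamiliar with the units, the energy column can be read as a mean absolute error per atom. The sketch below assumes the usual definition (absolute energy error of each structure divided by its number of atoms, then averaged); `energy_mae_per_atom` is a hypothetical helper and the numbers are made up, not data from this work.

```python
import numpy as np


def energy_mae_per_atom(e_pred, e_ref, n_atoms):
    """Mean absolute energy error per atom, in meV/atom.

    e_pred, e_ref: total energies in eV (model and reference).
    n_atoms: number of atoms in each structure.
    """
    e_pred, e_ref, n_atoms = (np.asarray(x, dtype=float) for x in (e_pred, e_ref, n_atoms))
    return 1000.0 * np.mean(np.abs(e_pred - e_ref) / n_atoms)


# Two made-up structures: errors of 1.0 eV over 50 atoms and 0.5 eV over 100 atoms.
mae = energy_mae_per_atom([-100.0, -200.5], [-101.0, -200.0], [50, 100])
print(f"{mae:.1f} meV/atom")
```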
• Ideal structure often lies in a
local minimum of the PES
• Optimisations trapped in
metastable configurations →
incorrect predicted properties
• Limitation: high number of DFT
relaxations limits its application in
high-throughput studies
• Solution: Surrogate model to
qualitatively explore the PES and
select candidate structures
• PES of each defect explored with 14 relaxations
• Dataset built with 10 frames from each relaxation
• Split into train/validation/test composition-wise
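A composition-wise split keeps every frame from a given host composition in the same subset, so validation and test compositions are never seen during training. The following is a minimal standard-library sketch; `split_by_composition`, the 80/10/10 fractions, the seed and the toy dataset are assumptions for illustration and not the proportions used in this work.

```python
import random
from collections import defaultdict


def split_by_composition(frames, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split frames into train/val/test so that no composition is shared.

    frames: list of dicts, each with a 'composition' key.
    """
    groups = defaultdict(list)
    for frame in frames:
        groups[frame["composition"]].append(frame)

    compositions = sorted(groups)
    random.Random(seed).shuffle(compositions)

    n_train = int(round(fractions[0] * len(compositions)))
    n_val = int(round(fractions[1] * len(compositions)))
    parts = {
        "train": compositions[:n_train],
        "val": compositions[n_train:n_train + n_val],
        "test": compositions[n_train + n_val:],
    }
    return {
        name: [frame for comp in comps for frame in groups[comp]]
        for name, comps in parts.items()
    }


# Toy dataset: 10 hypothetical compositions x 14 relaxations x 10 frames each.
frames = [
    {"composition": f"host_{i}", "relaxation": r, "frame": k}
    for i in range(10)
    for r in range(14)
    for k in range(10)
]
splits = split_by_composition(frames)
print({name: len(subset) for name, subset in splits.items()})
```

Splitting by composition rather than by frame avoids leaking near-identical frames from one relaxation into the test set.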