Molecular Conformation Generation
Published as a conference paper at the International Conference on Learning Representations (ICLR) 2022
Authors: Minkai Xu1,2, Lantao Yu3, Yang Song3, Chence Shi1,2, Stefano Ermon3∗, Jian Tang1,4,5∗
1. Mila-Québec AI Institute, Canada
2. Université de Montréal, Canada
3. Stanford University, USA
4. HEC Montréal, Canada
5. CIFAR AI Research Chair
Paper link: https://openreview.net/pdf?id=PzcvxEMzvQC
Review by Daiki Koge (PhD Student at CSB Lab)
For real-world molecules, computing 3D structures (conformers) is expensive.
• This work studies how to predict valid and stable conformations from a molecular graph.
  • Input: molecular graph $\mathcal{G}$ (2D atom-bond graph).
  • Output: conformation $C$ (atomic 3D coordinates).
  • Model: deep generative model $p_\theta(C \mid \mathcal{G})$ (as a Boltzmann generator).
Figure: molecular graph $\mathcal{G}$ → sampling from $p_\theta(C \mid \mathcal{G})$ → conformer $C$.
Boltzmann distribution: $C \sim p(C) \propto \exp\left(-\frac{E(C)}{k_B T}\right)$
$\mathcal{G} = \langle \mathcal{V}, \mathcal{E} \rangle$, where $\mathcal{V} = \{v_i\}_{i=1}^{n}$ is the set of vertices representing atoms and $\mathcal{E} = \{e_{ij} \mid (i, j) \subseteq |\mathcal{V}| \times |\mathcal{V}|\}$ is the set of edges representing inter-atomic bonds.
• Conformer $C$: the full set of 3D positions $C = [c_1, c_2, \ldots, c_n] \in \mathbb{R}^{n \times 3}$, where $c_i \in \mathbb{R}^3$ is the 3D coordinate $[x, y, z]$ of the $i$-th atom. A molecule has multiple conformers.
Example (H2O):
  Graph $\mathcal{G}$
    $\mathcal{V}$: $v_1$: Hydrogen, $v_2$: Oxygen, $v_3$: Hydrogen
    $\mathcal{E}$: $e_{12}$: Single, $e_{23}$: Single
  Conformer $C$ (unit: Å)
            x       y       z
    c_1    0.20    -0.1    0.00
    c_2    0.00    0.20    0.00
    c_3    -0.2    -0.1    0.00
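The H2O example above can be written down as plain arrays. This is a minimal sketch, assuming 0-based atom indices and a simple tuple format for bonds (both are illustrative choices, not something the slides prescribe); the coordinates are the values from the slide.

```python
import numpy as np

# Vertices (atoms) and edges (bonds) of the H2O graph; indices are 0-based here,
# so v1, v2, v3 on the slide become 0, 1, 2.
atoms = ["H", "O", "H"]
bonds = [(0, 1, "single"), (1, 2, "single")]  # e12, e23

# Conformer C: one [x, y, z] row per atom, in angstroms (values from the slide).
C = np.array([
    [0.20, -0.10, 0.00],   # c1 (H)
    [0.00, 0.20, 0.00],    # c2 (O)
    [-0.20, -0.10, 0.00],  # c3 (H)
])

# Bond lengths are one quantity that can be read off a conformer directly.
bond_lengths = [float(np.linalg.norm(C[i] - C[j])) for i, j, _ in bonds]
```

The graph fixes which atoms are bonded; the conformer adds the geometry, which is why one graph can correspond to many arrays `C`.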
GeoDiff
• The authors define a denoising diffusion model*1 that directly operates on the conformer coordinates.
• In the diffusion process, noise from fixed posterior distributions $q(C^t \mid C^{t-1})$ is gradually added until the conformation is destroyed. Symmetrically, in the generative process (reverse diffusion process), an initial noise state $C^T$ is sampled from a normal distribution and progressively refined via the Markov chain $p_\theta(C^{t-1} \mid \mathcal{G}, C^t)$.
Figure: GeoDiff. The diffusion process maps the observed conformer $C^0$ to noise $C^T \sim \mathcal{N}(0, I)$; the generative process (reverse diffusion process) starts from the noise $C^T$ and the molecular graph $\mathcal{G}$.
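$C^t = \sqrt{1 - \beta_t}\, C^{t-1} + \sqrt{\beta_t}\, \epsilon_t$, $\epsilon_t \sim \mathcal{N}(0, I)$ ($\beta_t$: noise magnitude at time $t$, pre-defined)
• Probability distribution at time $t$: $q(C^t \mid C^{t-1}) = \mathcal{N}(C^t;\ \sqrt{1 - \beta_t}\, C^{t-1},\ \beta_t I)$
• Joint distribution: $q(C^{1:T} \mid C^0) = \prod_{t=1}^{T} q(C^t \mid C^{t-1})$
Figure: observed conformer $C^0$; noise $C^T \sim \mathcal{N}(0, I)$ and molecular graph $\mathcal{G}$.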
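One step of the diffusion process can be sketched in a few lines. This is an illustrative sketch only: the linear `betas` schedule and the step count are assumptions made here for the example, not values taken from the slides.

```python
import numpy as np

def forward_step(C_prev, beta_t, rng):
    """One noising step: C^t = sqrt(1 - beta_t) * C^{t-1} + sqrt(beta_t) * eps."""
    eps = rng.standard_normal(C_prev.shape)  # eps_t ~ N(0, I)
    return np.sqrt(1.0 - beta_t) * C_prev + np.sqrt(beta_t) * eps

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 1000)  # hypothetical pre-defined schedule

# Start from a toy conformer (3 atoms) and apply every step of the chain.
C0 = np.array([[0.20, -0.10, 0.0], [0.00, 0.20, 0.0], [-0.20, -0.10, 0.0]])
C_t = C0.copy()
for beta_t in betas:
    C_t = forward_step(C_t, beta_t, rng)
```

With `beta_t = 0` the step returns its input unchanged, and with `beta_t = 1` it returns pure noise, which matches the reading of $\beta_t$ as the noise magnitude.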
Generative process (reverse diffusion process)
• Probability distribution at time $t$: $p_\theta(C^{t-1} \mid \mathcal{G}, C^t)$, which contains a noise prediction model (graph field network*2).
• The likelihood $p_\theta(C^0 \mid \mathcal{G})$ of the generative process must be SE(3)-invariant.
  ➢ The authors propose an SE(3)-invariant likelihood function.
Diffusion process: $C^t = \sqrt{1 - \beta_t}\, C^{t-1} + \sqrt{\beta_t}\, \epsilon_t$, $\epsilon_t \sim \mathcal{N}(0, I)$ ($\beta_t$: noise magnitude at time $t$, pre-defined)
Figure: observed conformer $C^0$; noise $C^T \sim \mathcal{N}(0, I)$ and molecular graph $\mathcal{G}$.
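To make the role of the noise prediction model concrete, here is a sketch of a reverse sampling loop. It uses the standard DDPM ancestral update with $\sigma_t^2 = \beta_t$, which is an assumption made for this example rather than something the slides state; `dummy_noise_model` is a hypothetical stand-in for the graph field network and ignores the molecular graph entirely.

```python
import numpy as np

def dummy_noise_model(C_t, t):
    """Hypothetical placeholder for the graph field network: predicts zero noise."""
    return np.zeros_like(C_t)

def sample(n_atoms, betas, noise_model, rng):
    """Standard DDPM-style ancestral sampling (assumed update rule)."""
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    C = rng.standard_normal((n_atoms, 3))  # initial noise C^T ~ N(0, I)
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = noise_model(C, t)
        # Mean of p(C^{t-1} | C^t) expressed through the predicted noise.
        mean = (C - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No fresh noise is added on the final step.
        z = rng.standard_normal(C.shape) if t > 0 else np.zeros_like(C)
        C = mean + np.sqrt(betas[t]) * z
    return C

betas = np.linspace(1e-4, 0.02, 50)  # hypothetical schedule
C_sampled = sample(3, betas, dummy_noise_model, np.random.default_rng(0))
```

A real model would replace `dummy_noise_model` with a network that also receives the molecular graph $\mathcal{G}$; everything else in the loop is generic.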
Figure: conformer $C^0$ (H2O) and its rotated copy $C^{0\prime} = T_g C^0$ (H2O), where $T_g$ is a rotation matrix.
• SE(3)-invariant likelihood function: $p_\theta(C^0 \mid \mathcal{G}) = p_\theta(T_g C^0 \mid \mathcal{G})$
  ➢ $C^0$ and $C^{0\prime}$ have the same energy $E$ and the same physical properties.
  ➢ Conventional SE(3)-invariant conformer generators: (Köhler et al., 2020; Satorras et al., 2021; Shi et al., 2021; Zhu et al., 2021).
  ➢ An SE(3)-invariant likelihood is important for conformer design.
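A quick numerical check illustrates why $C^0$ and $T_g C^0$ should be treated as the same conformer: every interatomic distance is unchanged by a rotation plus translation. This is a sketch with an arbitrary rotation about the z-axis and an arbitrary shift chosen for the example.

```python
import numpy as np

def pairwise_distances(C):
    """Matrix of Euclidean distances between all pairs of atoms."""
    diff = C[:, None, :] - C[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

C0 = np.array([[0.20, -0.10, 0.0], [0.00, 0.20, 0.0], [-0.20, -0.10, 0.0]])
R = rotation_z(np.pi / 3)
shift = np.array([1.0, -2.0, 0.5])

# Apply the rigid transformation row-wise: c_i -> R c_i + shift.
C0_prime = C0 @ R.T + shift

same_geometry = np.allclose(pairwise_distances(C0), pairwise_distances(C0_prime))
```

The coordinates change while `same_geometry` is true, so any energy that depends only on interatomic distances assigns both arrays the same value.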
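Generative process (reverse diffusion process)
$p_\theta(C^0 \mid \mathcal{G}) = \int p(C^T)\, p_\theta(C^0, \ldots, C^{T-1} \mid \mathcal{G}, C^T)\, dC^{1:T} = \int p(C^T) \prod_{t=1}^{T} p_\theta(C^{t-1} \mid \mathcal{G}, C^t)\, dC^{1:T}$
Here $p_\theta(C^{t-1} \mid \mathcal{G}, C^t)$ is the generative process at time $t$.
• If $p_\theta(C^{t-1} \mid \mathcal{G}, C^t)$ is SE(3)-equivariant, i.e. $p_\theta(C^{t-1} \mid \mathcal{G}, C^t) = p_\theta(T_g C^{t-1} \mid \mathcal{G}, T_g C^t)$, and the prior $p(C^T)$ is SE(3)-invariant, then the likelihood $p_\theta(C^0 \mid \mathcal{G})$ is SE(3)-invariant ($p_\theta(T_g C^0 \mid \mathcal{G}) = p_\theta(C^0 \mid \mathcal{G})$).
  ➢ Detailed proofs are in the appendix of the paper.
Figure: observed conformer $C^0$; noise $C^T \sim \mathcal{N}(0, I)$ and molecular graph $\mathcal{G}$.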
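The argument can be sketched in a few lines. This is an outline under the stated conditions, not the paper's proof: it assumes $T_g$ is a rigid transformation, so the change of variables $C^t = T_g \tilde{C}^t$ has unit Jacobian, and it takes the invariance of the prior as given. The symbol $\tilde{C}^t$ is introduced here only for the substitution.

```latex
\begin{aligned}
p_\theta(T_g C^0 \mid \mathcal{G})
 &= \int p(C^T) \prod_{t=2}^{T} p_\theta(C^{t-1} \mid \mathcal{G}, C^t)\;
    p_\theta(T_g C^0 \mid \mathcal{G}, C^1)\, dC^{1:T} \\
 &= \int p(T_g \tilde{C}^T) \prod_{t=2}^{T}
    p_\theta(T_g \tilde{C}^{t-1} \mid \mathcal{G}, T_g \tilde{C}^t)\;
    p_\theta(T_g C^0 \mid \mathcal{G}, T_g \tilde{C}^1)\, d\tilde{C}^{1:T}
    && \text{(substitute } C^t = T_g \tilde{C}^t,\ |\det J| = 1\text{)} \\
 &= \int p(\tilde{C}^T) \prod_{t=1}^{T}
    p_\theta(\tilde{C}^{t-1} \mid \mathcal{G}, \tilde{C}^t)\, d\tilde{C}^{1:T},
    \qquad \tilde{C}^0 := C^0
    && \text{(invariant prior, equivariant transitions)} \\
 &= p_\theta(C^0 \mid \mathcal{G}).
\end{aligned}
```

The second line uses only the change of variables; the third line is where both conditions from the slide are needed.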
𝑡) ≡ Graph Field Network.
① Shift the center of gravity of the sampled conformation to zero*3.
  ➢ This makes the density $p_\theta(C^0 \mid \mathcal{G})$ invariant to translation.
  ➢ This operation is also used when training the model.
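The centering operation is a one-liner. The sketch below assumes the center of gravity is the unweighted mean of the atom positions (all atoms weighted equally), which is an assumption of this example.

```python
import numpy as np

def center(C):
    """Shift coordinates so that their (unweighted) mean is at the origin."""
    return C - C.mean(axis=0, keepdims=True)

C = np.array([[1.20, 0.90, 3.0], [1.00, 1.20, 3.0], [0.80, 0.90, 3.0]])
shift = np.array([10.0, -4.0, 2.5])

C_centered = center(C)
C_shifted_centered = center(C + shift)
```

Any translated copy of a conformer is mapped to the same centered array, so a model that only ever sees centered coordinates cannot distinguish translations.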
diversity of generated conformers following the conventional measurement.
• $S_r$: set of reference conformers from a dataset.
• $S_g$: set of generated conformers, twice the size of $S_r$.
• $\delta$: distance threshold; 0.5 Å for QM9 and 1.25 Å for Drugs.
• The other two metrics, COV-P and MAT-P, are defined similarly, but with the generated and reference sets exchanged.
  ➢ These metrics depend more on quality.
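The four scores can be computed from a single matrix of pairwise RMSD values. This is a sketch under two assumptions: the RMSD matrix is already computed (after alignment) by some external tool, and coverage and matching follow their usual definitions, namely the fraction of conformers in one set that have a partner in the other set within $\delta$, and the mean distance to the closest partner. The recall-type scores are labeled COV-R and MAT-R here; the toy RMSD values are made up.

```python
import numpy as np

def cov_mat_scores(rmsd, delta):
    """rmsd[i, j]: RMSD between reference conformer i and generated conformer j."""
    best_for_ref = rmsd.min(axis=1)  # closest generated conformer per reference
    best_for_gen = rmsd.min(axis=0)  # closest reference conformer per generated
    return {
        "COV-R": float((best_for_ref <= delta).mean()),
        "MAT-R": float(best_for_ref.mean()),
        "COV-P": float((best_for_gen <= delta).mean()),
        "MAT-P": float(best_for_gen.mean()),
    }

# Toy example: 2 reference conformers, 4 generated conformers (twice as many).
rmsd = np.array([
    [0.2, 0.9, 1.4, 0.6],
    [0.7, 0.3, 1.1, 0.8],
])
scores = cov_mat_scores(rmsd, delta=0.5)
```

In this toy case every reference conformer is matched within the threshold, but only half of the generated ones are close to a reference, which is the sense in which the exchanged metrics reflect quality rather than diversity.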
properties from molecular graph.
• First, 50 conformers are generated for each molecular graph, and the following ensemble properties are calculated with Psi4 (Smith et al., 2020) to create the dataset.
  $\bar{E}$: average energy.
  $E_{\min}$: lowest energy.
  $\overline{\Delta\epsilon}$: average HOMO-LUMO gap.
  $\Delta\epsilon_{\min}$, $\Delta\epsilon_{\max}$: minimum and maximum values of the HOMO-LUMO gap.
• A GNN model is then built to predict the above properties from a molecular graph.
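Once per-conformer energies and HOMO-LUMO gaps are available, the ensemble properties are simple aggregates. The sketch below assumes those per-conformer values were already computed by a quantum chemistry package; the function name, dictionary keys, and toy numbers are illustrative only.

```python
import numpy as np

def ensemble_properties(energies, gaps):
    """Aggregate per-conformer energies and HOMO-LUMO gaps over one molecule."""
    energies = np.asarray(energies, dtype=float)
    gaps = np.asarray(gaps, dtype=float)
    return {
        "E_mean": float(energies.mean()),  # average energy
        "E_min": float(energies.min()),    # lowest energy
        "gap_mean": float(gaps.mean()),    # average HOMO-LUMO gap
        "gap_min": float(gaps.min()),      # minimum HOMO-LUMO gap
        "gap_max": float(gaps.max()),      # maximum HOMO-LUMO gap
    }

# Toy values for three conformers of one molecule (not real data).
props = ensemble_properties(energies=[1.0, 2.0, 3.0], gaps=[0.5, 0.7, 0.6])
```

These five numbers per molecule are the targets that a property prediction model would be trained to output from the molecular graph.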
molecular conformations using denoising diffusion probabilistic model.
• By constructing an SE(3)-invariant likelihood, the authors were able to create a versatile generative model.
• Comprehensive experiments over multiple tasks demonstrate that GeoDiff outperforms the existing state-of-the-art models.