[Figure: first page of "Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules" (Gómez-Bombarelli+, ACS Cent. Sci. 2018, 4, 268−276).]

to generate drug-like molecules. [Gómez-Bombarelli et al., 2016b] employed a variational autoencoder to build a latent, continuous space where property optimization can be made through surrogate optimization. Finally, [Kadurin et al., 2017] presented a GAN model for drug generation. Additionally, the approach presented in this paper has recently been applied to molecular design [Sanchez-Lengeling et al., 2017]. In the field of music generation, [Lee et al., 2017] built a SeqGAN model employing an efficient representation of multi-channel MIDI to generate polyphonic music. [Chen et al., 2017] presented Fusion GAN, a dual-learning GAN model that can fuse two data distributions.
[Jaques et al., 2017] employ deep Q-learning with a cross-entropy reward to optimize the quality of melodies generated from an RNN. In adversarial training, [Pfau and Vinyals, 2016] recontextualize GANs in the actor-critic setting. This connection is also explored with the Wasserstein-1 distance in WGANs [Arjovsky et al., 2017]. Minibatch discrimination and feature mapping were used to promote diversity in GANs [Salimans et al., 2016]. Another approach to avoiding mode collapse was shown with Unrolled GANs [Metz et al., 2016]. Issues and convergence of GANs have been studied in [Mescheder et al., 2017].

3 Background

In this section, we elaborate on the GAN and RL setting, based on SeqGAN [Yu et al., 2017]. $G_\theta$ is a generator, parametrized by $\theta$, that is trained to produce high-quality sequences $Y_{1:T} = (y_1, \ldots, y_T)$ of length $T$, and $D_\phi$ is a discriminator model, parametrized by $\phi$, trained to classify real and generated sequences. $G_\theta$ is trained to deceive $D_\phi$, and $D_\phi$ to classify correctly. Both models are trained in alternation, following a minimax game:

$$\min_\theta \max_\phi \;\mathbb{E}_{Y \sim p_{\text{data}}}\left[\log D_\phi(Y)\right] + \mathbb{E}_{Y \sim G_\theta}\left[\log\left(1 - D_\phi(Y)\right)\right]$$

Because $G_\theta$ generates a sequence token by token, the reward for a partially generated sequence is estimated by rolling it out with $G_\theta$ until the sequence is completed. In order to do so, we perform an $N$-time Monte Carlo search with the canonical rollout policy $G_\theta$, represented as

$$\mathrm{MC}^{G_\theta}(Y_{1:t}; N) = \{Y^1_{1:T}, \ldots, Y^N_{1:T}\} \qquad (3)$$

where $Y^n_{1:t} = Y_{1:t}$ and $Y^n_{t+1:T}$ is stochastically sampled via the policy $G_\theta$. Now $Q(s, a)$ becomes

$$Q(Y_{1:t-1}, y_t) = \begin{cases} \frac{1}{N}\sum_{n=1}^{N} R(Y^n_{1:T}), \text{ with } Y^n_{1:T} \in \mathrm{MC}^{G_\theta}(Y_{1:t}; N), & \text{if } t < T, \\ R(Y_{1:T}), & \text{if } t = T. \end{cases} \qquad (4)$$

An unbiased estimation of the gradient of $J(\theta)$ can be derived as

$$\nabla_\theta J(\theta) \simeq \frac{1}{T} \sum_{t=1}^{T} \mathbb{E}_{y_t \sim G_\theta(y_t \mid Y_{1:t-1})}\left[\nabla_\theta \log G_\theta(y_t \mid Y_{1:t-1}) \cdot Q(Y_{1:t-1}, y_t)\right] \qquad (5)$$

Finally, in SeqGAN the reward function is provided by $D_\phi$.

4 ORGAN

Figure 1: Schema for ORGAN. Left: $D_\phi$ is trained as a classifier receiving as input a mix of real data and data generated by $G_\theta$. Right:

Guimaraes+, 2017
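The Monte Carlo rollout estimate of $Q(Y_{1:t-1}, y_t)$ in Eqs. (3)–(4) can be sketched in a few lines of Python. Everything here is a hypothetical toy stand-in, not the ORGAN implementation: `toy_policy` replaces $G_\theta$ with a uniform next-token distribution, and `reward` replaces the discriminator-based $R(Y_{1:T})$ with a simple sequence score.

```python
import random

# Toy sketch of SeqGAN/ORGAN-style Monte Carlo rollout (Eqs. 3-4).
# VOCAB, toy_policy, and reward are hypothetical stand-ins.
VOCAB = ["A", "B", "C"]
T = 6  # full sequence length


def toy_policy(prefix):
    """Stand-in for G_theta: a uniform distribution over next tokens."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}


def sample_token(dist):
    """Draw one token from a {token: probability} distribution."""
    r, acc = random.random(), 0.0
    for tok, p in dist.items():
        acc += p
        if r <= acc:
            return tok
    return tok  # guard against floating-point round-off


def rollout(prefix):
    """Complete a partial sequence Y_{1:t} to length T with the rollout policy."""
    seq = list(prefix)
    while len(seq) < T:
        seq.append(sample_token(toy_policy(seq)))
    return seq


def reward(seq):
    """Stand-in for R(Y_{1:T}): fraction of 'A' tokens in the sequence."""
    return seq.count("A") / T


def q_estimate(prefix, n_rollouts=50):
    """Eq. 4: for t < T, average reward over N Monte Carlo completions;
    for t = T, the reward of the finished sequence itself."""
    if len(prefix) == T:
        return reward(prefix)
    return sum(reward(rollout(prefix)) for _ in range(n_rollouts)) / n_rollouts
```

A prefix already rich in rewarded tokens receives a higher Q estimate than one without them, which is exactly the intermediate signal the policy gradient in Eq. (5) needs at each step.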