Win/Lose: […] million
• It took […] week with […] GPU
• Training also took […] week with […] GPU
• The network provides an evaluation function for Go, which was previously considered hard to build
Fig. […] of Silver
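[…] and CPUs

MCTS in AlphaGo

\[ Q(s, a) = (1 - \lambda)\,\frac{W_v(s, a)}{N_v(s, a)} + \lambda\,\frac{W_r(s, a)}{N_r(s, a)} \]

\[ u(s, a) = c_{\mathrm{puct}}\,P(s, a)\,\frac{\sqrt{\sum_b N_r(s, b)}}{1 + N_r(s, a)} \]

[Figure: diagram labels: Value Network, MCTS, P(s, a)]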
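As a rough illustration of how the Q(s, a) and u(s, a) terms above combine during tree descent, here is a minimal sketch that picks the action maximizing Q + u at a single node. The function name `select_action`, the dictionary layout, the default values of `lam` and `c_puct`, and the choice to treat an unvisited action's mean value as 0 are all assumptions made for this example, not details taken from the slide.

```python
import math


def select_action(stats, priors, lam=0.5, c_puct=5.0):
    """Return the action with the largest Q(s, a) + u(s, a).

    stats:  {action: (W_v, N_v, W_r, N_r)}  value-network and rollout totals/counts
    priors: {action: P(s, a)}               prior probability of each action
    lam, c_puct: assumed defaults for this sketch
    """
    # Sum of rollout visit counts over all actions b at this node.
    total_rollout_visits = sum(n_r for (_, _, _, n_r) in stats.values())

    def score(action):
        w_v, n_v, w_r, n_r = stats[action]
        # Assumption: an action with no visits contributes a mean value of 0.
        mean_value_net = w_v / n_v if n_v else 0.0
        mean_rollout = w_r / n_r if n_r else 0.0
        q = (1 - lam) * mean_value_net + lam * mean_rollout
        # Exploration bonus: large for high-prior, rarely visited actions.
        u = c_puct * priors[action] * math.sqrt(total_rollout_visits) / (1 + n_r)
        return q + u

    return max(stats, key=score)
```

With `c_puct` set to 0 the rule reduces to picking the highest mixed value Q, while a large `c_puct` favors actions with a high prior and few visits.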
[Figure: network diagram. Labels: 19 × 19 board planes, layer labels 5 and 3, Output Layer. Output1: Prediction of the next move. Output2: Win Rate.]
• […] Layers Convolutional Neural Network
• Each layer: […]x[…] convolution layer, Batch normalization, ReLU
• Layers […] to […] are ResNet
• Trained by self play (details are described later)
• […] Channels (Features) are prepared (the next slide shows details)
• The learning method of this network is discussed later. For now, let's assume we have trained it nicely.
p, v
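To make the two outputs concrete, the sketch below turns raw output-layer activations into a move distribution p and a win rate v. It only illustrates the shape of the outputs: the name `network_heads`, the softmax over the 19 × 19 points, and the sigmoid used to squash the win rate into (0, 1) are assumptions for this example, and the convolutional layers that would produce the activations are not modeled.

```python
import math

BOARD_SIZE = 19  # board side length, as in the figure


def network_heads(move_logits, value_logit):
    """Map raw activations to (p, v).

    move_logits: one raw score per board point (19 * 19 values)
    value_logit: one raw scalar for the position
    Softmax and sigmoid are assumed activations for this sketch.
    """
    if len(move_logits) != BOARD_SIZE * BOARD_SIZE:
        raise ValueError("expected one logit per board point")

    # Output1: softmax over board points, shifted by the max for stability.
    peak = max(move_logits)
    exps = [math.exp(x - peak) for x in move_logits]
    total = sum(exps)
    p = [e / total for e in exps]

    # Output2: sigmoid written so that math.exp never sees a large positive argument.
    if value_logit >= 0:
        v = 1.0 / (1.0 + math.exp(-value_logit))
    else:
        e = math.exp(value_logit)
        v = e / (1.0 + e)
    return p, v
```

Here p plays the role of the prior P(s, a) over moves and v is a single number summarizing how good the position is for the player to move.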