Let's Conquer Sammok (Three-in-a-Row)

Let's Conquer Sammok [email protected]

The target is tic-tac-toe.

Tic-tac-toe (also known as Noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.

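But that's not sammok, is it?

The name tic-tac-toe is unfamiliar to most people, so…

But then, why tic-tac-toe?

Monte Carlo

Let's conquer tic-tac-toe and Monte Carlo

And minimax, too.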

Not covered
• Convolutional Neural Networks (CNN)
• AlphaGo
• Machine Learning
• Other in-depth AI topics
• Getting a job at Google
• Makeup
• Dance

Covered
• Minimax
• Monte-Carlo Tree Search

First, minimax

First, Wikipedia
• Minimax (sometimes MinMax or MM[1]) is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario.

A picture explains it better, after all

Opponent's turn: I want to inflict a loss

Our turn: I want to take the gain

Opponent's turn: Minimize

Our turn: Maximize

Minimax

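Looking at the whole tree, the right side looks promising.

When do we get to tic-tac-toe?

When do we get to tic-tac-toe? 9 choices, 8 choices, 7 choices. And this is only part of the tree…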
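To get a feel for how large the full tree is, the 9, 8, 7 branching above can be walked by brute force. The sketch below counts every complete game, stopping a branch as soon as someone wins or the board is full. The names `LINES`, `winner`, and `count_games` are illustrative and do not come from the slides.

```python
from typing import List, Optional

# The eight winning lines of a 3x3 board, cells indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

Board = List[Optional[str]]  # each cell is "X", "O", or None


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player completed a line, else None."""
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def count_games(board: Board, player: str) -> int:
    """Count the leaves of the game tree below this position."""
    if winner(board) is not None or None not in board:
        return 1  # the game is over: this is one finished game
    total = 0
    for i in range(9):
        if board[i] is None:
            board[i] = player                      # try the move
            total += count_games(board, "O" if player == "X" else "X")
            board[i] = None                        # undo the move
    return total
```

Calling `count_games([None] * 9, "X")` returns 255168, comfortably below the 9! = 362880 move orderings because many games end before the board fills up.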

How do we score?

Score
• Win: +10
• Loss: -10

Why is the scoring this simple?

Score
• Win: +10
• Loss: -10
• After phase 3, deduct 1 point per phase.

And the programming?

Let's make the calls by following the tree. Maximize()

Let's make the calls by following the tree. Maximize() Minimize() Minimize()

Let's make the calls by following the tree. Maximize() Minimize() Minimize() Maximize() Maximize() Maximize() Maximize()

Let's make the calls by following the tree. Maximize() Minimize() Minimize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize()
Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Minimize() Minimize() Minimize() Minimize() Minimize() Minimize()
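The alternating calls can be sketched as two mutually recursive functions. This is a minimal illustration, not the speaker's code: the board layout and the names `maximize`, `minimize`, `winner`, and `other` are assumptions, and the depth penalty (`10 - depth` for a win, `depth - 10` for a loss) is a common simplification rather than the exact "after phase 3" rule from the scoring slide.

```python
from typing import List, Optional, Tuple

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

Board = List[Optional[str]]  # cells 0..8 hold "X", "O", or None


def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(player: str) -> str:
    return "O" if player == "X" else "X"


def terminal_score(board: Board, me: str, depth: int) -> Optional[int]:
    """Score a finished game from `me`'s point of view, or None if unfinished."""
    w = winner(board)
    if w is not None:
        return 10 - depth if w == me else depth - 10  # assumed depth penalty
    return 0 if None not in board else None            # full board is a draw


def maximize(board: Board, me: str, depth: int = 0) -> Tuple[int, Optional[int]]:
    """Our turn: return (best score, best move) for `me`."""
    score = terminal_score(board, me, depth)
    if score is not None:
        return score, None
    best_score, best_move = -100, None
    for m in [i for i, cell in enumerate(board) if cell is None]:
        board[m] = me
        s, _ = minimize(board, me, depth + 1)
        board[m] = None
        if s > best_score:
            best_score, best_move = s, m
    return best_score, best_move


def minimize(board: Board, me: str, depth: int = 0) -> Tuple[int, Optional[int]]:
    """Opponent's turn: return (lowest score for `me`, opponent's move)."""
    score = terminal_score(board, me, depth)
    if score is not None:
        return score, None
    best_score, best_move = 100, None
    for m in [i for i, cell in enumerate(board) if cell is None]:
        board[m] = other(me)
        s, _ = maximize(board, me, depth + 1)
        board[m] = None
        if s < best_score:
            best_score, best_move = s, m
    return best_score, best_move
```

For example, with `["X", "X", None, "O", "O", None, None, None, None]` and X to move, `maximize(board, "X")` picks cell 2, the immediate win.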

Go starts from (19*19)!…

• Do we have to draw the whole tree?
• Yes.
• Then minimax can't handle Go.
• Right.
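A quick calculation shows the gap between the two games. This sketch compares the factorial bounds only; it treats (19*19)! as the slide does, as a rough count of move orderings, and ignores captures and other Go rules. The variable names are illustrative.

```python
import math

# Upper bound on move orderings for a 3x3 board: 9 * 8 * 7 * ... * 1.
tic_tac_toe_bound = math.factorial(9)

# The same bound for a 19x19 Go board is far too large to enumerate,
# so only its number of decimal digits is computed here.
go_bound_digits = len(str(math.factorial(19 * 19)))

print(tic_tac_toe_bound)  # 362880
print(go_bound_digits)    # number of digits in (19*19)!
```

A few hundred thousand leaves can be searched exhaustively; a number with hundreds of digits cannot, which is why the rest of the deck turns to sampling.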

Monte-Carlo

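Let's compute pi from random points

With enough points, we can get pi from how the points are distributed

Area of the circle : area of the square = number of points inside the circle : number of points inside the square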

π*(r**2) / (4*(r**2)) = points inside the circle / points inside the square

π = (points inside the circle / points inside the square) * 4
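The ratio above translates directly into a few lines of code. The sketch below assumes a circle of radius 1 inscribed in the square [-1, 1] × [-1, 1]; `estimate_pi` and its `seed` parameter are illustrative names, and the seed is only there to make runs repeatable.

```python
import random


def estimate_pi(num_points: int, seed: int = 0) -> float:
    """Estimate pi from the share of random points that land in the circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_points):
        # Draw a point uniformly from the square [-1, 1] x [-1, 1].
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # the point is inside the unit circle
            inside += 1
    # pi = (points inside the circle / points inside the square) * 4
    return 4.0 * inside / num_points


print(estimate_pi(200_000))
```

The estimate tightens slowly: the error shrinks roughly with the square root of the number of points, so ten times the accuracy costs about a hundred times the samples.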

Repeating random sampling many times yields an approximate solution. von Neumann & Ulam

Monte-Carlo  Tree Search

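Selection: Use the tree policy to decide which branch of the tree to follow. e.g. Add all the nodes breadth-first.

Expansion: Add one node.

Simulation: Play alone until a terminal node (end of the game). Positions visited during the playout are not added as nodes. e.g. Just play 1,000 random games.

Backpropagation: Update the try count and score of the parent nodes with the simulation result.

Keep repeating, raising the try counts and scores to update the weights.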
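The four phases can be put together in a compact sketch for tic-tac-toe. Everything here is an assumption made for illustration: the `Node` fields, the reward of 1 for a win and 0.5 for a draw, and a plain tree policy that tries every untried move at a node first and then picks a child uniformly at random. It is not the code from the linked repository.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(player):
    return "O" if player == "X" else "X"


class Node:
    def __init__(self, board, player, parent=None, move=None):
        self.board, self.player = board, player  # player = side to move here
        self.parent, self.move = parent, move
        self.children = []
        done = winner(board) is not None
        self.untried = [] if done else [i for i, c in enumerate(board) if c is None]
        self.visits, self.score = 0, 0.0  # try count and accumulated reward


def select(node, rng):
    # Selection: walk down while the node is fully expanded and not terminal.
    while not node.untried and node.children:
        node = rng.choice(node.children)  # plain policy: all children equal
    return node


def expand(node, rng):
    # Expansion: add exactly one new child, if any move is still untried.
    if not node.untried:
        return node
    move = node.untried.pop(rng.randrange(len(node.untried)))
    board = list(node.board)
    board[move] = node.player
    child = Node(tuple(board), other(node.player), node, move)
    node.children.append(child)
    return child


def simulate(board, player, rng):
    # Simulation: random playout to the end; no nodes are added on the way.
    board = list(board)
    while winner(board) is None and None in board:
        board[rng.choice([i for i, c in enumerate(board) if c is None])] = player
        player = other(player)
    return winner(board)  # "X", "O", or None for a draw


def backpropagate(node, result):
    # Backpropagation: update try count and score on every ancestor.
    while node is not None:
        node.visits += 1
        mover = other(node.player)  # the player whose move led to this node
        node.score += 1.0 if result == mover else 0.5 if result is None else 0.0
        node = node.parent


def mcts(board, player, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(tuple(board), player)
    for _ in range(iterations):
        leaf = expand(select(root, rng), rng)
        backpropagate(leaf, simulate(leaf.board, leaf.player, rng))
    # Pick the move whose child has the best average score.
    return max(root.children, key=lambda c: c.score / c.visits).move
```

With `["X", "X", None, "O", "O", None, None, None, None]` and X to move, `mcts(board, "X")` settles on cell 2, since every playout through that child is a win.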

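Problems

Problems
• No strategy
• Needs time to run simulations
• The answer is not optimal.
• Probabilistically, the more simulations are run, the closer the approximation gets.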

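Let's look at different kinds of MCTS

Plain MCTS
• Treats every choice equally during selection. (breadth-first)
• Spends time on branches that are probabilistically unpromising.

Epsilon greedy
• Assume some value ε.
• With probability 1-ε, exploit. (check the place with the highest weight)
• With probability ε, explore. (widen the tree)
• Pays a fixed cost for exploration.
• Explores little early on.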
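The epsilon-greedy rule fits in a single function. In this sketch `weights` stands for the per-child weight mentioned above; the function name and signature are illustrative.

```python
import random
from typing import Sequence


def epsilon_greedy(weights: Sequence[float], epsilon: float,
                   rng: random.Random) -> int:
    """Return the index of the child to visit next."""
    if rng.random() < epsilon:
        # Explore: with probability epsilon, pick any child at random.
        return rng.randrange(len(weights))
    # Exploit: with probability 1 - epsilon, pick the highest weight.
    return max(range(len(weights)), key=lambda i: weights[i])


rng = random.Random(0)
picks = [epsilon_greedy([0.2, 0.7, 0.4], 0.1, rng) for _ in range(1000)]
print(picks.count(1) / len(picks))  # share of visits to the best child
```

Because ε stays fixed, the same share of visits goes to exploration no matter how much has already been learned, which is the fixed cost noted in the list.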

• Balances exploration and exploitation.
• Explores early on and corrects itself afterwards.
• Visits the place with the highest UCB.
• UCB = Upper Confidence Bound
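The slides do not give a formula, so the sketch below assumes the widely used UCB1 form: average score plus `c * sqrt(ln(parent visits) / child visits)`. The functions `ucb1` and `select_child`, and the default `c = sqrt(2)`, are illustrative choices.

```python
import math
from typing import List, Tuple


def ucb1(total_score: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """Upper confidence bound for one child (UCB1 form, assumed here)."""
    if visits == 0:
        return math.inf  # an untried child is always visited first
    exploit = total_score / visits                            # average score
    explore = c * math.sqrt(math.log(parent_visits) / visits)  # uncertainty
    return exploit + explore


def select_child(stats: List[Tuple[float, int]],
                 c: float = math.sqrt(2)) -> int:
    """Pick the child with the highest UCB from (total_score, visits) pairs."""
    parent_visits = sum(visits for _, visits in stats)
    return max(range(len(stats)),
               key=lambda i: ucb1(stats[i][0], stats[i][1], parent_visits, c))
```

The exploration term is large while a child has few visits and fades as visits accumulate, which gives the "explore early, correct later" behaviour without a fixed ε.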

Not available yet. https://github.com/dalinaum/Alpha-Kunny


More Decks by Leonardo YongUk Kim
