Let's Conquer Three-in-a-Row

I called it three-in-a-row (sammok), but it is really tic-tac-toe. Let's conquer it with minimax and Monte-Carlo tree search. MCTS (Monte-Carlo Tree Search), Minimax

Leonardo YongUk Kim

March 31, 2016

Transcript

  1. Let's Conquer Three-in-a-Row dalinaum@gmail.com

  2. None
  3. The target is tic-tac-toe.

  4. Tic-tac-toe (also known as Noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3×3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.
  5. But that isn't three-in-a-row, is it?

  6. It's just that everyone finds the name tic-tac-toe unfamiliar…

  7. So why tic-tac-toe?

  8. None
  9. Monte Carlo

  10. Let's conquer tic-tac-toe and Monte Carlo

  11. And minimax, too.

  12. Not covered • Convolutional neural networks (CNN) • AlphaGo • Machine learning • Other in-depth AI topics • Getting a job at Google • Makeup • Dance
  13. Covered • Minimax • Monte-Carlo Tree Search
  14. First, minimax

  15. First, Wikipedia • Minimax (sometimes MinMax or MM[1]) is a decision rule used in decision theory, game theory, statistics and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario.
  16. None
  17. A picture explains it better, after all

  18. Opponent's turn: "I'll make you lose"

  19. Our turn: "I'll take the gain"

  20. Opponent's turn: Minimize

  21. Our turn: Maximize

  22. Minimax

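  23. Looking at the whole thing, the right side looks promising.

  24. When do we get to tic-tac-toe?

  25. When do we get to tic-tac-toe? 9 options, 8 options, 7 options. This is only part of the tree…

  26. How do we score?

  27. Score • Win: +10 • Loss: -10

  28. Why is the score so simple?

  29. Score • Win: +10 • Loss: -10 • After phase 3, deduct 1 point for every phase.
  30. And the programming?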

  31. Let's make calls along the tree. Maximize()

  32. Let's make calls along the tree. Maximize() Minimize() Minimize()

  33. Let's make calls along the tree. Maximize() Minimize() Minimize() Maximize() Maximize() Maximize() Maximize()

  34. Let's make calls along the tree. Maximize() Minimize() Minimize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Maximize() Minimize() Minimize() Minimize() Minimize() Minimize() Minimize()
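The alternating Maximize()/Minimize() calls above can be sketched as one recursive function. This is only a minimal illustration, not the speaker's code: the names `LINES`, `winner`, and `minimax` are hypothetical, and the depth penalty is simplified to one point per move from the start, whereas the slide deducts points only after phase 3.

```python
from typing import List, Optional, Tuple

# All eight winning lines on a 3x3 board stored as a flat list of 9 cells.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board: List[str], player: str, me: str,
            depth: int = 0) -> Tuple[int, Optional[int]]:
    """Return (score, best_move) with `player` to move, scored for `me`."""
    won = winner(board)
    if won is not None:
        # Win: +10, loss: -10, each reduced by the number of moves played
        # (assumed simplification of the slide's "after phase 3" rule).
        return (10 - depth if won == me else depth - 10), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # draw
    maximizing = player == me  # our turn maximizes, opponent's minimizes
    best_score, best_move = (-100 if maximizing else 100), None
    for move in moves:
        board[move] = player
        score, _ = minimax(board, "O" if player == "X" else "X", me, depth + 1)
        board[move] = " "  # undo the move
        if (maximizing and score > best_score) or \
           (not maximizing and score < best_score):
            best_score, best_move = score, move
    return best_score, best_move
```

For example, `minimax(["X", "X", " ", "O", "O", " ", " ", " ", " "], "X", "X")` picks cell 2, the immediate win.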
  35. For Go, it starts at (19*19)!…

  36. • Do we have to draw the whole tree? • Yes • Then minimax won't work for Go. • Right
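To get a feel for the gap between the two games, the move-order counts can be compared directly. This is a rough sketch: factorials count orderings of moves and ignore early wins, captures, and symmetries, and the helper name `ordering_count` is hypothetical.

```python
import math

def ordering_count(cells: int) -> int:
    """Number of orders in which `cells` empty cells can be filled."""
    return math.factorial(cells)

tic_tac_toe = ordering_count(3 * 3)    # 9!
go = ordering_count(19 * 19)           # (19*19)!

print(tic_tac_toe)                     # small enough to enumerate
print(len(str(go)), "decimal digits")  # far too large to enumerate
```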
  37. Monte-Carlo

  38. Let's compute π using random points

  39. None
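  40. With enough points, we can compute π from the distribution of the points

  41. Area of the circle : area of the square = number of points inside the circle : number of points inside the square
  42. π*(r**2) / (4 * (r**2)) = points inside the circle / points inside the square
  43. π = points inside the circle / points inside the square * 4

  44. Repeating random sampling many times yields an approximate solution. von Neumann & Ulam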
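The ratio above can be tried out in a few lines. A minimal sketch, assuming a unit circle inscribed in the square [-1, 1] × [-1, 1]; the function name `estimate_pi` and the fixed seed are illustrative choices, not part of the talk.

```python
import random

def estimate_pi(samples: int, seed: int = 0) -> float:
    """Estimate pi from the fraction of random points inside the circle."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    inside = 0
    for _ in range(samples):
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # the point lies inside the unit circle
            inside += 1
    # pi = points inside the circle / points inside the square * 4
    return 4 * inside / samples

print(estimate_pi(100_000))
```

More samples tighten the estimate, but only slowly: the error shrinks roughly with the square root of the sample count.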
  45. Monte-Carlo Tree Search

  46. None
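  47. Selection: Use the tree policy to decide which part of the tree to descend into. e.g. add all nodes breadth-first.
  48. Expansion: Add one node.

  49. Simulation: Play alone until a terminal node (end of the game). Positions visited during the playout are not added as nodes. e.g. just play 1000 random games.
  50. Backpropagation: Use the simulation result to update the number of tries and the score on the ancestor nodes.

  51. Keep repeating, raising the number of tries and the score, to update the weight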
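The four steps above (selection, expansion, simulation, backpropagation) can be sketched for tic-tac-toe as follows. This is an assumption-laden illustration rather than the speaker's implementation: `Node`, `rollout`, and `mcts` are hypothetical names, selection treats all children equally, and the reward (1 for a win, 0.5 for a draw, 0 for a loss) is an assumed choice.

```python
import random
from typing import List, Optional

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board: List[str]) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def other(player: str) -> str:
    return "O" if player == "X" else "X"

class Node:
    def __init__(self, board, player, parent=None, move=None):
        self.board, self.player = board, player  # player = side to move here
        self.parent, self.move = parent, move
        self.children = []
        # Finished games have nothing left to expand.
        self.untried = [] if winner(board) else \
            [i for i, cell in enumerate(board) if cell == " "]
        self.visits, self.score = 0, 0.0

def rollout(board, player, rng):
    """Simulation: play random moves to the end; return the winner or None."""
    board = board[:]
    while True:
        won = winner(board)
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if won or not moves:
            return won
        board[rng.choice(moves)] = player
        player = other(player)

def mcts(board, player, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(board[:], player)
    for _ in range(iterations):
        node = root
        # Selection: descend through fully expanded nodes, all children equal.
        while not node.untried and node.children:
            node = rng.choice(node.children)
        # Expansion: add one new node.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child_board = node.board[:]
            child_board[move] = node.player
            child = Node(child_board, other(node.player), node, move)
            node.children.append(child)
            node = child
        result = rollout(node.board, node.player, rng)
        # Backpropagation: update tries and score up to the root.
        while node is not None:
            node.visits += 1
            mover = other(node.player)  # the side that moved into this node
            node.score += 1.0 if result == mover else 0.5 if result is None else 0.0
            node = node.parent
    return max(root.children, key=lambda c: c.score / c.visits).move
```

Calling `mcts(["X", "X", " ", "O", "O", " ", " ", " ", " "], "X")` returns the move whose average score is highest after the given number of iterations.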

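  52. Problems

  53. Problems • No strategy • Needs simulation time • Not the optimal answer. • Probabilistically, the more simulations are run, the closer the approximation gets.
  54. Let's look at MCTS variants

  55. Plain MCTS • Selection treats every node equally. (breadth-first) • Spends time in places that are probabilistically unpromising.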
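  56. Epsilon greedy • Assume some value ε. • Exploit with probability 1-ε. (check the place with the highest weight) • Explore with probability ε. (widen the tree) • Pays a fixed cost for exploration. • Explores too little early on.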
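The epsilon-greedy rule above fits in one small function. A minimal sketch, assuming each candidate move already has a weight (for example its average score); the name `epsilon_greedy` and the dictionary layout are hypothetical.

```python
import random
from typing import Dict

def epsilon_greedy(weights: Dict[str, float], epsilon: float,
                   rng: random.Random) -> str:
    """Pick a key: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(list(weights))  # explore: any candidate, uniformly
    return max(weights, key=weights.get)  # exploit: the highest weight

rng = random.Random(0)
weights = {"a": 0.2, "b": 0.9, "c": 0.5}
picks = [epsilon_greedy(weights, 0.1, rng) for _ in range(1000)]
print(picks.count("b") / len(picks))  # mostly "b", with occasional exploration
```

Because ε stays fixed, the same share of tries goes to exploration no matter how much is already known, which is the fixed cost the slide mentions.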
  57. • Balances exploration and exploitation. • Explores early on and corrects afterwards. • Visits the place with the highest UCB. • UCB = Upper Confidence Bound
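The slide does not say which UCB formula is meant. A common choice in MCTS is UCB1 (average score plus an exploration bonus that shrinks as a node is tried more often), so the sketch below assumes that form; the name `ucb1` and the constant `c = sqrt(2)` are assumptions.

```python
import math

def ucb1(total_score: float, visits: int, parent_visits: int,
         c: float = math.sqrt(2)) -> float:
    """UCB1 value of a child: average score plus an exploration bonus."""
    if visits == 0:
        return math.inf  # untried children are visited first
    average = total_score / visits
    bonus = c * math.sqrt(math.log(parent_visits) / visits)
    return average + bonus

# Two children with the same average score: the less-tried one ranks higher.
print(ucb1(1.0, 2, 100) > ucb1(50.0, 100, 100))
```

Early on the bonus dominates and spreads visits across children; as tries accumulate it fades and the average score takes over, matching the "explore early, correct afterwards" behavior described above.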
  58. Not available yet. https://github.com/dalinaum/Alpha-Kunny