rinkou01_Cooperation Emergence under Resource-Constrained Peer Punishment

tom--bo
February 01, 2017

Reading-group presentation of the paper "Cooperation Emergence under Resource-Constrained Peer Punishment"

Transcript

  1. Cooperation Emergence under Resource-Constrained Peer Punishment
    Samhar Mahmoud, Simon Miles, Michael Luck
    King’s College London, London, UK
  2. ABOUT THIS PAPER
    • Publisher: AAMAS ’16, Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, pp. 900–908
    • Keywords: Metanorm, Emergence, Limited Enforcement Cost
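  3. ABSTRACT
    • In distributed systems, peer punishment is an important means of getting social norms to emerge.
    • Models of this include Axelrod’s metanorm game model [1] and its extension by Mahmoud et al. [22, 23].
    • However, the existing metanorm game does not account for the cost of punishing. This paper proposes a technique under which norms can still emerge when that cost is taken into account.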
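  4. INTRODUCTION
    • In computer systems, an enormous number of interactions take place within short periods, which makes centralized control difficult.
    • In systems without centralized control, social norms are an important means of regulating self-interested agents.
    • Punishment has been studied as a mechanism for regulating agents and getting social norms to emerge.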
  5. INTRODUCTION
    • Example: P2P file-sharing software
    (diagram: peers, each holding Data)
  6. INTRODUCTION
    • Example: P2P file-sharing software
    (diagram: a free rider appears: "I only take! I don’t share!")
  7. INTRODUCTION
    • Example: P2P file-sharing software
    (diagram: the other peers punish the free rider)
  8. INTRODUCTION
    • Example: P2P file-sharing software
    (diagram: the punished free rider: "I had better share!")
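  9. INTRODUCTION
    • However, no existing work accounts for the cost of the act of "punishing" itself.
      – In resource-poor settings such as sensor networks, merely initiating an interaction can be a cost.
    • Assuming such a setting, this paper shows that the following two models cannot establish a norm once the cost of punishment is considered, and presents a technique that solves this problem.
      1. Axelrod’s original work [1]
      2. The work of Mahmoud et al. [22, 23], which extends [1] so that the punishment can vary dynamically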
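  10. RELATED WORK
    • In research related to the metanorm game, Fehr et al. showed that free riders are suppressed when punishment is made heavier [10, 11].
    • On the other hand, Nikiforakis showed that this effect weakens when free riders can retaliate against punishment (counter-punishment) [28].
    • To counter this, Helbing et al. added a mechanism that rewards cooperation and succeeded in promoting cooperation [17].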
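  11. RELATED WORK
    • These studies fix the amount of punishment or reward. Mahmoud et al. got norms to emerge by varying the amount of punishment dynamically [21].
    • Meanwhile, Miller et al. [26, 27] and Jurca and Faltings [18] built in payment mechanisms for providing information, so that participants are led to provide correct information.
    • These studies try to cope with malicious users, but none of them accounts for the cost of the act of punishing itself.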
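  12. PEER PUNISHMENT & LIMITED RESOURCES
    • In a P2P file-sharing system, sharing files can be treated as cooperation and not sharing as defection.
    • A defecting agent can be punished, for example by not sharing files with it for a fixed period.
    • Since an agent cannot refuse to share files with a defector forever, it considers the defector’s behaviour, decides on an appropriate period, and withholds files for that period.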
  13. Metanorm Model (Interaction Model)
    • Each agent decides whether to cooperate (cooperation) or defect (defection).
    (diagram: five agents, each labelled "C or D")
  14. Metanorm Model (Interaction Model)
    • Cooperation changes no one’s score.
    (diagram: agent C cooperates; every agent is labelled +0)
  15. Metanorm Model (Interaction Model)
    • Defection (also called temptation) brings a gain to the defector and a loss (hurt value) to the others.
    (diagram: labels C and D; values +3, +1, +1, +1, +1)
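  16. Metanorm Model (Agent Model)
    • Agents learn boldness and vengefulness by Q-learning from the results of their actions.
    • These two values determine the agent’s policy.
      – boldness: the probability of defecting (temptation)
      – vengefulness: the probability of punishing a defecting agent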
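A minimal Python sketch of how boldness and vengefulness could drive an agent's choices, assuming both are probabilities in [0, 1]. The `Agent` class and its method names are hypothetical illustrations, not taken from the paper, and the Q-learning update that adjusts the two values is deliberately left out.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical agent holding the two policy values named on the slide."""
    boldness: float      # assumed: probability of defecting
    vengefulness: float  # assumed: probability of punishing an observed defector

    def defects(self, rng: random.Random) -> bool:
        # rng.random() is uniform on [0, 1), so boldness=0 never defects
        # and boldness=1 always defects.
        return rng.random() < self.boldness

    def punishes(self, rng: random.Random) -> bool:
        # Same sampling rule, applied when a neighbour has been seen defecting.
        return rng.random() < self.vengefulness


rng = random.Random(0)
cautious = Agent(boldness=0.0, vengefulness=1.0)
reckless = Agent(boldness=1.0, vengefulness=0.0)
```

Here `cautious` never defects and always punishes, while `reckless` does the opposite; intermediate values give mixed behaviour.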
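  17. Metanorm Model (Punishment Mechanism)
    • There are two ways of applying punishment.
      1. Axelrod’s original model (static model)
        – The amount of punishment is fixed.
      2. The extended model by Mahmoud et al. (adaptive model)
        – The amount of punishment is decided dynamically from the agent’s past behaviour.
        – Each agent keeps a record (image) of the behaviour of each of its neighbouring agents.
        – The amount of punishment is decided according to the image.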
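  18. EXPERIMENTAL EVALUATION
    • With limited resources, agents carry out whichever feasible action their policy selects.
    • Resources are reset every round (one round = every agent has performed its interactions).
    • The experimental evaluation assesses the two models described earlier:
      – Static model
      – Adaptive model
    • Networks: lattice networks and scale-free networks.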
  19. PARAMETERS SETUP

  20. STATIC PUNISHMENT EXPERIMENTAL RESULTS
    Fig. 1: Impact of limited resources with punishment on final B and V

  21. ADAPTIVE PUNISHMENT EXPERIMENTAL RESULTS
    Fig. 2: Impact of limited resources with punishment on final B and V

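  22. RESOURCE-AWARE PUNISHMENT MODEL
    • In the methods so far, agents acted without considering their own resources.
    • The model is therefore extended so that an agent decides how much to punish from its own resources, the number of its neighbouring agents, and the past behaviour (image) of those neighbours.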

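  23. RESOURCE-AWARE PUNISHMENT MODEL
    • Each agent is assumed to hold a memory of limited size.
    • An agent (ag_i) computes, for each neighbouring agent (ag_j), the proportion of defections (defection proportion): dp_ij.
    • This dp_ij is called LocalDefImage.
    • The mean of dp_ij over the neighbouring agents is called AvgDefImage.
      – If LocalDefImage exceeds AvgDefImage, the agent should punish more strongly; if it does not, more weakly.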
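The two image values named on this slide can be sketched in Python as follows. This is an illustration under assumptions: the function names, the use of simple defection and interaction counts, and the treatment of a neighbour with no observations as having proportion 0 are all hypothetical choices, not details confirmed by the slides.

```python
def local_def_image(defections: int, interactions: int) -> float:
    """Defection proportion for one neighbour (LocalDefImage)."""
    if interactions == 0:
        return 0.0  # assumption: no observations yet counts as no defection
    return defections / interactions


def avg_def_image(local_images: dict[str, float]) -> float:
    """Mean LocalDefImage over all neighbours (AvgDefImage)."""
    if not local_images:
        return 0.0  # assumption: an agent without neighbours has nothing to average
    return sum(local_images.values()) / len(local_images)


# Hypothetical observations: (defections, interactions) per neighbour.
observed = {"ag_a": (3, 4), "ag_b": (1, 4)}
images = {name: local_def_image(d, n) for name, (d, n) in observed.items()}
average = avg_def_image(images)

# Neighbours above the average are candidates for stronger punishment.
above_average = [name for name, dp in images.items() if dp > average]
```

With these counts, `ag_a` has defected in three of four interactions and sits above the average, so it would be punished more strongly than `ag_b`.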
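  24. RESOURCE-AWARE PUNISHMENT MODEL
    • Decide the proportion of punishment (Deviation).
    • The agent’s resources are split equally among its neighbouring agents; the per-neighbour amount is UniformRes(ag_i).
    • Since this is an amount of cost, it is converted into an amount of punishment (EquivPunish) using the ratio between cost and punishment (enforcement cost percentage: ECP).
    • From these, the final amount of punishment is EquivPunish x Deviation.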
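The chain UniformRes, EquivPunish and EquivPunish x Deviation can be sketched in Python as below. Several points are assumptions rather than the paper's definitions: ECP is taken to mean that enforcing a punishment of size p costs ECP x p, so a cost budget is divided by ECP to obtain the equivalent punishment, and `deviation` is a hypothetical stand-in (the ratio of LocalDefImage to AvgDefImage), since the slides do not give its formula.

```python
def uniform_res(resources: float, n_neighbours: int) -> float:
    """Equal share of the agent's resources per neighbour (UniformRes)."""
    if n_neighbours <= 0:
        raise ValueError("agent needs at least one neighbour")
    return resources / n_neighbours


def equiv_punish(uniform: float, ecp: float) -> float:
    """Punishment affordable with a cost budget (EquivPunish).

    Assumption: cost = ecp * punishment, so punishment = cost / ecp.
    """
    if ecp <= 0:
        raise ValueError("ecp must be positive")
    return uniform / ecp


def deviation(local_image: float, avg_image: float) -> float:
    """HYPOTHETICAL Deviation: above 1 for worse-than-average neighbours."""
    if avg_image == 0:
        return 1.0  # assumption: no defections anywhere means no adjustment
    return local_image / avg_image


def punishment_amount(resources: float, n_neighbours: int,
                      ecp: float, dev: float) -> float:
    """Final punishment: EquivPunish x Deviation, as stated on the slide."""
    return equiv_punish(uniform_res(resources, n_neighbours), ecp) * dev


# 8 resource units, 4 neighbours, cost is half the punishment applied.
strong = punishment_amount(8.0, 4, 0.5, deviation(0.75, 0.5))
weak = punishment_amount(8.0, 4, 0.5, deviation(0.25, 0.5))
```

Under these assumptions a neighbour whose defection proportion is above the average receives a larger punishment (`strong`) than one below it (`weak`), which matches the qualitative rule on the previous slide.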
  25. Evaluation
    Fig. 3: Impact of limited resources with punishment on final B and V

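  26. CONCLUSION
    • Earlier work using the metanorm model did not take the cost of punishment into account.
    • Here, the cost of punishment was introduced into both Axelrod’s original model with fixed punishment and the model that varies it dynamically, and the limits of norm emergence were shown.
    • A model in which agents punish with their own resources in mind was also proposed and shown to bring about the norm of cooperation more effectively.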
  27. FUTURE WORK
    • Future work includes examining whether this technique can be applied to other approaches (such as other games).

  28. REFERENCE
    [1] R. Axelrod. An evolutionary approach to norms. American Political Science Review, 80(4):1095–1111, 1986.
    [10] E. Fehr and S. Gächter. Altruistic punishment in humans. Nature, 415(6868):137–140, Jan. 2002.
    [11] E. Fehr and S. Gächter. Cooperation and punishment in public goods experiments. The American Economic Review, 90(4):980–994, 2000.
    [17] D. Helbing, A. Szolnoki, M. Perc, and G. Szabó. Punish, but not too hard: how costly punishment spreads in the spatial public goods game. New Journal of Physics, 12(8):083005, 2010.
    [18] R. Jurca and B. Faltings. An incentive compatible reputation mechanism. In Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS ’03, pages 1026–1027. ACM, 2003.
    [21] S. Mahmoud, J. Keppens, N. Griffiths, and M. Luck. Efficient norm emergence through experiential dynamic punishment. In Proceedings of the 20th European Conference on Artificial Intelligence, pages 576–581. IOS Press, 2012.
    [22] S. Mahmoud, J. Keppens, M. Luck, and N. Griffiths. Norm establishment via metanorms in network topologies. In Proceedings of the 2011
    [23] S. Mahmoud, J. Keppens, M. Luck, and N. Griffiths. Overcoming omniscience in Axelrod’s model. In Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 03, WI-IAT ’11, pages 29–32. IEEE Computer Society, 2011.
    [26] N. Miller, P. Resnick, and R. Zeckhauser. Eliciting honest feedback in electronic markets. KSG Working Paper Series RWP02-039, 2002.
    [27] N. Miller, P. Resnick, and R. Zeckhauser. Eliciting informative feedback: The peer-prediction method. Management Science, 51:2005, 2005.
    [28] N. Nikiforakis. Punishment and counter-punishment in public good games: Can we really govern ourselves? Journal of Public Economics, 92:91–112, 2008.