An attempt to apply statistical network analysis to street snap data

TokyoR #78 LT

saltcooky

May 25, 2019

Transcript

  1. An attempt to apply statistical network analysis to street snap data

    TokyoR #78

    @saltcooky

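  2. Who am I?
    • @saltcooky
    • R experience: about … years, I think
    • Workplace: an IT company in Harajuku
    • Job: using R in an R&D-type department
      ・Data analysis: …0%
      ・Machine learning: …0%
      ・Preprocessing: …0%
    • Hobbies: clothes, fashion, visiting art museums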

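  3. What is network analysis?
    An analysis method based on graph theory that examines relationships and structures such as
    human relationships, relationships between companies, relationships between organisms, and computer networks.
    (Source: https://www.slideshare.net/kashitan/tidygraphggraph)
    I studied it with this book: https://www.amazon.co.jp/exec/obidos/ASIN/4320019288
    At recent TokyoR meetups, @kashitan has also given talks on the topic.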

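  4. Network analysis
    The most common tasks are computing network metrics and extracting structure:
    - Centrality: which people are the central figures?
    - Community detection: which groups does the network split into?
    - Correlation coefficient: are two networks similar or not?
    - Core extraction: the densely connected central part of the network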
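
As a minimal sketch of the first kind of metric, degree centrality and density can be computed in base R directly from an adjacency matrix. The four-person network and all names below are hypothetical, not the talk's data.

```r
# Hypothetical undirected network of four people (1 = edge present)
adj <- matrix(c(0, 1, 1, 1,
                1, 0, 1, 0,
                1, 1, 0, 0,
                1, 0, 0, 0),
              nrow = 4, byrow = TRUE,
              dimnames = list(c("A", "B", "C", "D"), c("A", "B", "C", "D")))

# Degree centrality: number of edges attached to each node
degree_centrality <- rowSums(adj)

# Density: share of possible undirected edges that are present
n <- nrow(adj)
edge_density <- sum(adj[upper.tri(adj)]) / choose(n, 2)

# The node with the highest degree is the most central person by this metric
most_central <- names(which.max(degree_centrality))
```

Degree is only the simplest centrality measure; packages such as igraph or sna offer others (betweenness, closeness) on top of the same adjacency structure.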

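  5. Statistical network analysis
    Assume that the edge between any two vertices (i, j) of a network arises stochastically with probability p_ij.
    p_ij can be expressed by a logistic model with parameter θ.
    The probability that edges form on both vertex pair (i, j) and vertex pair (j, k) can be expressed as p_ij × p_jk.
    (Figure: three vertices i, j and k)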
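
The idea on this slide can be sketched in base R: a logistic function turns a parameter θ into an edge probability, and the probabilities of independent edges multiply. The value of `theta` is made up for illustration, and using one shared θ for every vertex pair is an assumption.

```r
# Hypothetical log-odds parameter shared by all vertex pairs (assumption)
theta <- -1.5

# Logistic model: p = 1 / (1 + exp(-theta)); plogis() is base R's logistic function
p_ij <- plogis(theta)
p_jk <- plogis(theta)

# If edges form independently, the probability that both (i, j) and (j, k) exist
p_both <- p_ij * p_jk
```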

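  6. Statistical network analysis
    Exponential random graph model
    A model in which the probability of obtaining a particular graph structure y in a random graph Y is expressed as a power of the probability that each edge forms.
    (Formula labels: number of edges in y; parameter; normalizing constant; occurrence probability of edges across the whole network)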
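
The formula on this slide was an image and did not survive in the transcript. The labels that remain are consistent with the standard edges-only form of the exponential random graph model, sketched below. The symbols L(y), θ, κ and N are notation chosen here and may differ from the slide's.

```latex
% Edges-only exponential random graph model (assumed reconstruction)
P(Y = y) = \frac{\exp\{\theta\, L(y)\}}{\kappa(\theta)}
% L(y): number of edges in y, \theta: parameter, \kappa(\theta): normalizing constant

% Link to "a power of each edge's probability": with p = e^{\theta} / (1 + e^{\theta})
% and N possible vertex pairs,
P(Y = y) = p^{L(y)} (1 - p)^{N - L(y)}
         = \frac{e^{\theta L(y)}}{(1 + e^{\theta})^{N}},
\qquad \kappa(\theta) = (1 + e^{\theta})^{N}
```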

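  7. Statistical network analysis
    Exponential random graph model: the p* model
    A model in which the probability that an edge occurs in a random graph Y is determined stochastically by a variety of elements.
    Elements:
    - Node features: age, weight, department, ...
    - Edge features: length of relationship, preferences, ...
    - Features of the relationship between nodes: age difference, difference in years of service, ...
    - Structural features: number of k-star structures, ...
    (Formula label: the count of each network component)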
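
Written out, a model of this kind usually takes the general form below, in which each element contributes one statistic and one coefficient. This is the textbook form, given as a sketch; the symbols z_k and θ_k are assumed notation, since the slide's own formula is not in the transcript.

```latex
P(Y = y) = \frac{\exp\Bigl\{ \sum_{k} \theta_k\, z_k(y) \Bigr\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{y'} \exp\Bigl\{ \sum_{k} \theta_k\, z_k(y') \Bigr\}
% z_k(y): count of the k-th element in y (edges, k-stars, matching attributes, ...)
% \theta_k: its coefficient; \kappa(\theta): sum over all possible graphs y'
```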

  8. Data used

  9. Data used
    Age
    Occupation
    Shooting location
    Brands worn

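  10. Motivation
    An edge represents how many worn brands two snaps have in common.
    Can the model capture the nature of brand choice?
    (Admittedly quite a stretch)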
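
One way such edges could be built is sketched below in base R: start from a snap-by-brand incidence matrix and connect two snaps when they share a brand. The data, the names, and the "at least one shared brand" threshold are all assumptions for illustration; the talk does not state its exact rule.

```r
# Hypothetical snap-by-brand incidence matrix (1 = the person wears that brand)
wears <- matrix(c(1, 1, 0,
                  1, 0, 1,
                  0, 0, 1),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("snap1", "snap2", "snap3"),
                                c("brandA", "brandB", "brandC")))

# Number of brands shared by each pair of snaps
shared <- tcrossprod(wears)   # same as wears %*% t(wears)
diag(shared) <- 0             # no self-loops

# Binary adjacency matrix: an edge when at least one brand is shared (assumed rule)
graph_matrix <- (shared > 0) * 1
```

A matrix of this shape is what `as.network()` expects as its `x` argument for an undirected network.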

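  11. Data collection
    • Built an RStudio and RSelenium environment with Docker on GCP
    • Scraped the pages with the rvest package
    • Fetched pages at intervals drawn from a Poisson distribution (no particular reason)
    • Collected about one year of snap data (… people)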
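
The Poisson-spaced fetching can be sketched as follows. The function name `fetch_politely`, its arguments, and the mean wait of 5 seconds are hypothetical; the talk does not show its scraping code. The fetcher and the sleep function are passed in so the loop can be tried without network access.

```r
# Fetch pages one by one, pausing for a Poisson-distributed number of seconds.
# `fetch` and `sleep` are injectable so the loop can run without a network.
fetch_politely <- function(urls, fetch, mean_wait_sec = 5, sleep = Sys.sleep) {
  pages <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    pages[[i]] <- fetch(urls[[i]])
    if (i < length(urls)) {
      sleep(rpois(1, lambda = mean_wait_sec))  # random wait between requests
    }
  }
  pages
}

# Dry run with a stand-in fetcher that records the waits instead of sleeping
set.seed(1)
waits <- numeric(0)
pages <- fetch_politely(
  urls  = c("page-1", "page-2", "page-3"),   # placeholder identifiers, not real URLs
  fetch = function(u) paste("contents of", u),
  sleep = function(sec) waits <<- c(waits, sec)
)
```

For real scraping, `fetch = rvest::read_html` with actual page URLs would slot into the same loop.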

  12. Checking the data
    Ranking of brands worn
    Network of brands worn

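  13. Building the model (example)
    The exponential random graph model can be implemented with the statnet package.
    # Load statnet (attaches the network and ergm packages)
    library(statnet)

    # Create the network object from the adjacency matrix
    snap_net <- as.network(x = graph_matrix, directed = FALSE, loops = FALSE)
    # Attach the explanatory variable (age) to each node as a vertex attribute
    snap_net %v% "Age" <- snap_info$Age
    # Matrix of absolute age differences for every pair of nodes
    diff.age <- abs(sweep(matrix(snap_info$Age, nrow = 638, ncol = 638),
                          2, snap_info$Age))
    # Fit the model
    model <- ergm(snap_net ~ edges + edgecov(diff.age) + nodecov("Age"))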
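
The `sweep()` call that builds `diff.age` can be checked on a small example. The five ages below are hypothetical; the sketch shows that the slide's construction gives the matrix of absolute pairwise differences, and that `outer()` produces the same result without hard-coding the matrix size.

```r
# Hypothetical ages for five people (the talk uses snap_info$Age with 638 entries)
age <- c(21, 25, 30, 19, 25)
n <- length(age)

# Construction used on the slide: column-wise fill gives m[i, j] = age[i],
# then sweep() subtracts age[j] from column j
diff_sweep <- abs(sweep(matrix(age, nrow = n, ncol = n), 2, age))

# Equivalent and size-independent: all pairwise differences via outer()
diff_outer <- abs(outer(age, age, "-"))

same_result <- isTRUE(all.equal(diff_sweep, diff_outer))
```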


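  14. Building the model
    The exponential random graph model can be implemented with the statnet package.
    # Fit the p* model for the street snaps
    snap_net_model <- ergm(snap_net ~
        edges +                      # number of edges
        nodecov("Age") +             # age
        edgecov(diff.age) +          # age difference
        nodematch("Occupation") +    # same occupation
        nodematch("Point"))          # same shooting location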


  15. Looking at the results
    > summary(snap_net_model)
    < omitted >
    Monte Carlo MLE Results:
                           Estimate Std. Error MCMC % z value Pr(>|z|)
    edges                -5.2066393  0.2692526      0 -19.337   <1e-04 ***
    edgecov.diff.age     -0.0015763  0.0094767      0  -0.166   0.8679
    nodecov.Age          -0.0003136  0.0061215      0  -0.051   0.9591
    nodematch.Occupation -0.0453192  0.0842853      0  -0.538   0.5908
    nodematch.Point       0.1491330  0.0628610      0   2.372   0.0177 *

    < omitted >
    AIC: 13485 BIC: 13536 (Smaller is better.)

    The shooting location appears to influence whether an edge forms.
    AIC/BIC can be used for variable selection.
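
To read these numbers, note that every term in this model (edges, edgecov, nodecov, nodematch) depends on one pair of nodes at a time, so the coefficients can be interpreted like log-odds in a logistic regression. The sketch below uses the estimates from the summary output above; the variable names are illustrative, and the baseline probability ignores the small, non-significant age and occupation terms.

```r
# Estimates copied from the summary output above
edges_coef <- -5.2066393
point_coef <- 0.1491330
point_se   <- 0.0628610

# Baseline edge probability when every other term contributes zero
baseline_prob <- plogis(edges_coef)

# Odds ratio of an edge for two snaps taken at the same location
point_odds_ratio <- exp(point_coef)

# Wald z value and two-sided p-value, matching the table's columns
point_z <- point_coef / point_se
point_p <- 2 * pnorm(-abs(point_z))
```

Under these assumptions the network is very sparse (baseline edge probability well under 1%), and sharing a shooting location raises the odds of an edge by roughly 16%.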

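  16. Looking at the results
    Simulating networks from the fitted model
    Solid line: values from the observed data
    Box plots: values from the simulated networks
    The fit is not good...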

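  17. Summary
    • With this snap data, the exponential random graph model could not
    capture the relationships among worn brands well.
    • Statistical network analysis is quite interesting, so give it a try.
    • I intend to keep studying statistical network analysis myself.
    • So if you know the subject well, I would appreciate your guidance.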

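  18. References
    • Tsutomu Suzuki, "Network Analysis, 2nd edition" (ネットワーク分析 第2版), Kyoritsu Shuppan
      https://www.amazon.co.jp/exec/obidos/ASIN/4320019288
    • "Modern network analysis with {tidygraph} and {ggraph}"
      https://www.slideshare.net/kashitan/tidygraphggraph
    • "A summary of network analysis with R: network metrics"
      https://qiita.com/saltcooky/items/…
    • "A summary of network analysis with R: statistical network analysis"
      https://qiita.com/saltcooky/items/…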
