Hypothesis Testing in Hierarchical Clustering

saltcooky

May 23, 2020

Transcript

  1. Hypothesis Testing in Hierarchical Clustering
    @saltcooky
    TokyoR


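  2. @saltcooky
    • R experience: about … years, I think
    • Workplace: an IT company in Harajuku
    • Work: in an R&D-type department,
      a data analyst who uses R;
      preparing an AI course for new-hire training
    • Hobbies: clothes, fashion, visiting art museums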

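  3. The content in one sentence
    An introduction to statistical hypothesis testing for clustering,
    a technique that is also used in the phylogenetic analysis of the coronavirus.
    "The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2"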

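  4. Concerns about clustering
    A small change in the data can change how the clusters form.
    Are the clusters found in the data we have the true clusters?
    [Figure: observed dataset A next to unobserved dataset B]
    If a single data point changes, the cluster formation may change.
    Do the data points assigned to an observed cluster really belong to the same cluster?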

  5. Testing with the bootstrap method
    For … objects (the genes of …), there are … possible dendrograms (phylogenetic trees).
    Image source (partially modified): http://stat.sys.i.kyoto-u.ac.jp/titech/multiboot-j.html

  6. Testing with the bootstrap method
    • From data with N explanatory variables (gene sequence sites), resample
      N explanatory variables with replacement to generate a bootstrap sample.
    • Build a dendrogram from the resampled data.
    • Repeating this B times yields B dendrograms.
    [Figure: a matrix of gene sequences (letters such as C, T, G) shown three times to illustrate the resampled data]
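The three steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not the method used for the slides: the toy matrix `x`, the helper name `boot_dendrogram`, the seed, and the choice of Euclidean distance with Ward linkage are all made up for the example.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible

# Toy data (an assumption): N = 50 explanatory variables in rows,
# 6 objects in columns, drawn around two different means.
N <- 50
x <- cbind(matrix(rnorm(N * 3, mean = 0), N, 3),
           matrix(rnorm(N * 3, mean = 2), N, 3))
colnames(x) <- c("a1", "a2", "a3", "b1", "b2", "b3")

# One bootstrap replicate: resample the N variables (rows) with replacement,
# then cluster the objects (columns) of the resampled data.
boot_dendrogram <- function(x) {
  rows <- sample(nrow(x), size = nrow(x), replace = TRUE)
  hclust(dist(t(x[rows, , drop = FALSE])), method = "ward.D2")
}

# Repeat B times to obtain B dendrograms.
B <- 100
boot_trees <- replicate(B, boot_dendrogram(x), simplify = FALSE)
length(boot_trees)
```

Each element of `boot_trees` is an ordinary `hclust` object, so it can be plotted or cut like any other dendrogram.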

  7. Testing with the bootstrap method
    By generating B dendrograms with the bootstrap method, the probability of
    obtaining each dendrogram can be estimated.
    Image source (partially modified): http://stat.sys.i.kyoto-u.ac.jp/titech/multiboot-j.html
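A rough sketch of that estimate, here per cluster rather than per whole dendrogram: count how often each cluster of the original tree reappears among the bootstrap trees. The toy data and the helper `clusters_of` are assumptions introduced only for illustration.

```r
set.seed(1)  # arbitrary seed

# Toy data (an assumption): 50 variables in rows, 6 objects in columns.
N <- 50
x <- cbind(matrix(rnorm(N * 3, mean = 0), N, 3),
           matrix(rnorm(N * 3, mean = 2), N, 3))
colnames(x) <- c("a1", "a2", "a3", "b1", "b2", "b3")

# Hypothetical helper: list every cluster (merge) of an hclust tree as a
# string of its sorted member labels, e.g. "a1,a2".
clusters_of <- function(h) {
  n <- length(h$order)
  members <- vector("list", n - 1)
  for (i in seq_len(n - 1)) {
    parts <- lapply(h$merge[i, ], function(j) {
      if (j < 0) h$labels[-j] else members[[j]]  # negative = single object
    })
    members[[i]] <- sort(unlist(parts))
  }
  vapply(members, paste, character(1), collapse = ",")
}

observed <- clusters_of(hclust(dist(t(x)), method = "ward.D2"))

# For each bootstrap tree, record which observed clusters reappear.
B <- 200
hits <- replicate(B, {
  rows <- sample(nrow(x), replace = TRUE)
  h <- hclust(dist(t(x[rows, , drop = FALSE])), method = "ward.D2")
  observed %in% clusters_of(h)
})

# Bootstrap probability (BP) of each observed cluster.
bp <- rowMeans(hits)
names(bp) <- observed
bp
```

The last entry is the root cluster containing every object, so its bootstrap probability is 1 by construction.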

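  8. Testing with the bootstrap method
    Shimodaira-Hasegawa (SH) test
    Testing dendrograms with the plain bootstrap (BS) method produces many
    false negatives, so the SH test corrects for the effect of multiple comparisons.
    Image source (partially modified): http://stat.sys.i.kyoto-u.ac.jp/titech/multiboot-j.html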

  9. Multiscale bootstrap
    Approximately Unbiased (AU) test
    A test based on the multiscale bootstrap, in which the resample size is
    changed from N to N', where N' takes a range of values.
    It gives a less biased estimate than the SH test and similar tests.
    [Figure: three gene-sequence matrices resampled at different sizes N']
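The idea of varying the resample size can be sketched as follows. Everything here is an assumption made for illustration: the toy data, the tested hypothesis (the top-level split separates the a's from the b's), the scale grid `r`, and the unweighted least-squares fit. The pvclust package has its own fitting routine, which this sketch does not reproduce.

```r
set.seed(2)  # arbitrary seed

# Toy data (an assumption): 60 variables in rows, 6 objects in columns.
N <- 60
x <- cbind(matrix(rnorm(N * 3, mean = 0), N, 3),
           matrix(rnorm(N * 3, mean = 1), N, 3))
colnames(x) <- c("a1", "a2", "a3", "b1", "b2", "b3")

# Hypothesis checked on each bootstrap tree: cutting the tree into two
# groups separates the first three objects from the last three.
target <- c(1, 1, 1, 2, 2, 2)
has_split <- function(rows) {
  h <- hclust(dist(t(x[rows, , drop = FALSE])), method = "ward.D2")
  g <- cutree(h, k = 2)
  all(g == target) || all(g == 3 - target)
}

# Bootstrap probability at several scales r = N'/N.
r <- c(0.5, 0.75, 1, 1.25, 1.5)
B <- 300
bp_r <- sapply(r, function(ri) {
  n_prime <- round(ri * N)
  mean(replicate(B, has_split(sample(N, size = n_prime, replace = TRUE))))
})

# Clamp away from 0 and 1 so the normal quantile stays finite, then fit
# z(r) = v * sqrt(r) + c / sqrt(r) by ordinary least squares.
bp_c <- pmin(pmax(bp_r, 0.5 / B), 1 - 0.5 / B)
z <- -qnorm(bp_c)
fit <- lm(z ~ 0 + sqrt(r) + I(1 / sqrt(r)))
v  <- unname(coef(fit)[1])
cc <- unname(coef(fit)[2])

au <- 1 - pnorm(v - cc)  # approximately unbiased p-value
data.frame(r = r, bp = bp_r)
au
```

The per-scale bootstrap probabilities are the raw material; the AU value comes only from the fitted `v` and `cc`, so a poor fit gives a poor AU value.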

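  10. Multiscale bootstrap
    Why does going multiscale make an unbiased estimate possible?
    Viewed geometrically, using multiple scales lets the shape of the
    hypothesis region be brought close to that of an ordinary test
    (see the references for details).

    [Figure: hypothesis regions for the t-test and similar tests vs. for tests on phylogenetic trees]
    Image source: https://www.ism.ac.jp/editsec/toukei/pdf/50-1-033.pdf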
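As a hedged summary of the model behind this picture (the symbols follow the `v` and `c` columns that pvclust prints; the formulas themselves are not on the slides): with scale $r = N'/N$, signed distance $v$ and curvature $c$, the bootstrap probability at each scale is modelled through its normal quantile, and the AU p-value flips the sign of the curvature term.

```latex
\begin{aligned}
r &= N'/N, \\
z(r) &= -\Phi^{-1}\bigl(\mathrm{BP}(r)\bigr) \approx v\sqrt{r} + \frac{c}{\sqrt{r}}, \\
\mathrm{BP} &= \mathrm{BP}(1) \approx 1 - \Phi(v + c), \\
\mathrm{AU} &= 1 - \Phi(v - c).
\end{aligned}
```

When the hypothesis region has a flat boundary, $c = 0$ and the two values coincide. The gap between BP and AU grows with the curvature, and that gap is the bias the multiscale procedure is meant to remove.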

  11. Trying it in R
    From the MNIST data, randomly take … samples for each of the digits ….
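One way to draw a fixed number of rows per label in base R is sketched below. The data frame `mnist_all` is a made-up stand-in with a `label` column and a few numeric columns, not the real MNIST data. The helper `sample_per_label` and the count of 5 per label are also assumptions, since the slide's actual numbers are not recoverable.

```r
set.seed(3)  # arbitrary seed

# Hypothetical stand-in for the MNIST data frame.
mnist_all <- data.frame(label = rep(0:2, each = 20),
                        matrix(runif(60 * 4), nrow = 60))

n_per_label <- 5  # illustrative count only

# Split by label, draw n rows without replacement from each group,
# and stack the pieces back into one data frame.
sample_per_label <- function(df, n) {
  picked <- lapply(split(df, df$label), function(d) {
    d[sample(nrow(d), n), , drop = FALSE]
  })
  do.call(rbind, picked)
}

mnist_df <- sample_per_label(mnist_all, n_per_label)
table(mnist_df$label)
```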

  12. Trying it in R
    To run the AU test in R, use the pvclust package.
      library(pvclust)
      library(parallel)
      library(dplyr)  # provides %>% and select()
      cl <- makeCluster(detectCores())  # boilerplate for parallel execution

      # Transpose so that explanatory variables are rows and objects are columns
      mnist_df_t <-
        mnist_df %>%
        dplyr::select(-label) %>%
        t()
      colnames(mnist_df_t) <- mnist_df$label
      sa <- 9^seq(-1, 1, length = 13)  # scale array for the multiscale bootstrap
      mnist_boot <- pvclust(data = mnist_df_t,
                            r = 1/sa,
                            nboot = 2000,
                            method.hclust = "ward.D2",
                            method.dist = "euclidean",  # the distance reported in the output
                            parallel = cl)
      stopCluster(cl)  # release the worker processes

  13. Trying it in R
    Check the results
      > mnist_boot
      Cluster method: ward.D2
      Distance      : euclidean
      Estimates on edges:
            si    au    bp se.si se.au se.bp      v     c  pchi
      1  0.999 1.000 0.999 0.000 0.000 0.000 -3.175 0.156 0.000
      2  0.979 0.990 0.971 0.001 0.001 0.001 -2.117 0.221 0.000
      3  0.579 0.801 0.759 0.004 0.002 0.001 -0.775 0.073 0.000
      4  0.392 0.718 0.655 0.003 0.002 0.001 -0.487 0.089 0.000
      5  0.771 0.900 0.852 0.004 0.002 0.002 -1.151 0.105 0.000
      6  0.112 0.634 0.455 0.003 0.002 0.001 -0.119 0.231 0.000
      7  0.213 0.795 0.323 0.004 0.002 0.002 -0.182 0.641 0.000
      8  0.532 0.770 0.755 0.004 0.002 0.001 -0.716 0.025 0.000
      9  0.266 0.779 0.394 0.004 0.002 0.002 -0.250 0.520 0.000
      10 0.915 0.969 0.878 0.003 0.001 0.002 -1.518 0.354 0.183
      11 0.223 0.757 0.397 0.004 0.002 0.001 -0.211 0.471 0.000
      12 0.229 0.704 0.475 0.003 0.002 0.001 -0.237 0.299 0.000
      13 0.246 0.752 0.409 0.004 0.002 0.001 -0.235 0.464 0.017
      14 0.229 0.691 0.492 0.003 0.002 0.001 -0.239 0.258 0.000
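The `au` and `bp` columns can be cross-checked against `v` and `c`, assuming the usual multiscale bootstrap relations AU = 1 - Φ(v - c) and BP = 1 - Φ(v + c). The sketch below copies `v` and `c` for three edges from the printed output. Because those inputs are already rounded to three decimals, the recomputed values agree with the table only up to rounding.

```r
# v and c copied from edges 1, 7 and 10 of the printed output.
edges <- data.frame(edge = c(1, 7, 10),
                    v = c(-3.175, -0.182, -1.518),
                    c = c(0.156, 0.641, 0.354))

# Recompute the p-values from the fitted coefficients.
edges$au <- 1 - pnorm(edges$v - edges$c)
edges$bp <- 1 - pnorm(edges$v + edges$c)

round(edges, 3)
```

Edge 7 shows the effect of a large `c`: its bootstrap probability is low while its AU value is much higher.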

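  14. Trying it in R
    Visualize the results (treating p >= 0.9 as significant)

      plot(mnist_boot, cex = 0.5, cex.pv = 0.5)    # plot the dendrogram
      pvrect(mnist_boot, alpha = 0.9, pv = "au")   # box clusters with p-value >= 0.9
    [Figure: dendrogram with clusters labeled "significant" or "other"]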

  15. Trying it in R
    Visualize the results
      plot(mnist_boot, cex = 0.5, cex.pv = 0.5)    # plot the dendrogram
      pvrect(mnist_boot, alpha = 0.9, pv = "au")   # box clusters with p-value >= 0.9
    [Figure: dendrogram detail annotated "close in distance but not significant"]

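  16. Brief summary
    • Introduced hypothesis testing for hierarchical clustering based on the
      bootstrap method.
    • Making the resample size multiscale (letting it take a range of values)
      makes an unbiased test possible.
    • In R, this can be run with the pvclust function of the pvclust package.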

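  17. References
    Assessing the variability of cluster analysis with the bootstrap method
      https://www.ism.ac.jp/editsec/toukei/pdf/50-1-033.pdf
    Asymptotic theory of the multiscale bootstrap
      https://www.terrapub.co.jp/journals/jjssj/pdf/….pdf
    Multistep-multiscale bootstrap method (Hidetoshi Shimodaira)
      http://stat.sys.i.kyoto-u.ac.jp/titech/multiboot-j.html
    Significance testing of clustering with the multiscale bootstrap method
      [email protected]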

  18. END
    Enjoy