Selective Inference for Data-Driven Hypothesis Testing

saltcooky
October 26, 2019

Lightning talk presented at TokyoR #82

Transcript

  1. Selective Inference for Data-Driven Hypothesis Testing
    @saltcooky
    TokyoR #82

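  2. • @saltcooky
    • R experience: about … years, I think
    • Workplace: an IT company in Harajuku
    • Work: using R in an R&D-type department as a
      ・data analyst
      ・data aggregator
    • Hobbies: clothes, fashion, visiting art museums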

  3. In a nutshell
    An introduction to Selective Inference, which I have been studying recently

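  4. Statistical analysis
    Knowledge-driven statistical analysis:
    Hypothesis formation from knowledge and experience ("Drinking is good for …") → Data → Hypothesis test (can the null hypothesis be rejected?) → Evaluation of the result (a significant difference is found)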

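  5. Statistical analysis
    Data-driven statistical analysis:
    Data → Hypothesis formation based on the data ("Drinking is good for …") → Hypothesis test (can the null hypothesis be rejected?) → Evaluation of the result (a significant difference is found)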

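  6. Statistical analysis
    Data-driven statistical analysis:
    Data → Hypothesis formation based on the data ("Drinking is good for …") → Hypothesis test (can the null hypothesis be rejected?) → Evaluation of the result (a significant difference is found)
    Note on hypothesis formation: the data obtained may have shown that trend purely by chance.
    Note on the hypothesis test: does testing on data that already shows such a trend make a significant result more likely?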

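  7. Hypothesis selection bias
    Data-driven statistical analysis:
    Data → Hypothesis formation based on the data ("Drinking is good for …") → Hypothesis test (can the null hypothesis be rejected?) → Evaluation of the result (a significant difference is found)
    Hypothesis selection bias: the bias that arises in a hypothesis test when the hypothesis itself was formed from the data.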

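  8. Checking for hypothesis selection bias
    Consider a true regression model of the following form:
    [equation image, not captured in the transcript]
    In addition, assume that … variables with no effect on the response variable are observed, for … variables in total.
    Repeat the following … times:
    • Select variables with stepwise selection and obtain a regression model
    • Test whether each coefficient in the resulting model is significant (α = …)
    • Rejection rate = number of rejections / number of tests performed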

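  9. Checking for hypothesis selection bias
    <Without variable selection>
    [Table: ten columns, apparently one per coefficient β (indices lost); rows: number rejected, number not rejected, rejection rate. Values not captured in the transcript.]
    When there is no bias, the rejection rate equals the significance level.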

  10. Checking for hypothesis selection bias
    Under the null hypothesis the p-values follow a uniform distribution, so there is no bias.
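
The no-selection baseline described on these two slides can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the author's code: the sample size, number of predictors, number of replications and the pure-noise response are illustrative choices, because the original settings are not recoverable from the transcript.

```r
# Baseline without variable selection: fit the full model and test every coefficient.
# All settings below are assumptions made for illustration.
set.seed(2)
n <- 50        # observations per data set (assumed)
p <- 10        # predictors, none of which affects y (assumed)
reps <- 500    # number of simulated data sets (assumed)
alpha <- 0.05  # significance level

# p x reps matrix of p-values, one column per simulated data set
pvals_full <- replicate(reps, {
  x <- matrix(rnorm(n * p), n, p)
  y <- rnorm(n)                                      # independent of x
  summary(lm(y ~ x))$coefficients[-1, "Pr(>|t|)"]    # drop the intercept row
})

rejection_rate_full <- mean(pvals_full < alpha)
print(rejection_rate_full)   # should be close to alpha
hist(pvals_full, breaks = 20, main = "p-values without selection")
```

Because every null hypothesis is true and nothing was selected, the histogram should look roughly flat and the rejection rate should sit near `alpha`.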

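  11. Checking for hypothesis selection bias
    <With variable selection>
    [Table: ten columns, apparently one per coefficient β (indices lost); rows: number rejected, number not rejected, rejection rate. Values not captured in the transcript.]
    The rejection rate is far larger than the significance level.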

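  12. Checking for hypothesis selection bias
    The p-values do not follow a uniform distribution; they are biased.
    → Variables that are not significant become more likely to be declared significant.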
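
The inflated rejection rate after variable selection can be illustrated with base R alone. The sketch below is an assumption, not the simulation from the talk: it uses a pure-noise response, forward selection by AIC through `step()`, and made-up values for the sample size, number of predictors and number of replications.

```r
# Naive tests after stepwise selection: only the selected coefficients are tested,
# using the ordinary lm() p-values that ignore the selection step.
# All settings below are assumptions made for illustration.
set.seed(1)
n <- 50; p <- 10; reps <- 200; alpha <- 0.05
rejected <- 0
tested <- 0

for (r in seq_len(reps)) {
  x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
  dat <- data.frame(y = rnorm(n), x)              # y is independent of every predictor
  null_fit <- lm(y ~ 1, data = dat)
  upper <- reformulate(colnames(x), response = "y")
  sel <- step(null_fit, scope = upper, direction = "forward", trace = 0)

  coefs <- summary(sel)$coefficients
  pv <- coefs[rownames(coefs) != "(Intercept)", "Pr(>|t|)"]
  tested <- tested + length(pv)
  rejected <- rejected + sum(pv < alpha)
}

rejection_rate_selected <- rejected / tested
print(rejection_rate_selected)   # well above alpha, although no predictor has an effect
```

A predictor only enters the model when it already looks related to `y` in this particular sample, so the p-values that get computed are concentrated near zero.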

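  13. Selective Inference
    By conditioning on D, the inverse image of the hypothesis-formation event, inference free of hypothesis selection bias becomes possible.
    Figure legend (the symbols are lost in the transcript):
    … : set of hypotheses (features)
    … : hypothesis selection event (feature selection algorithm)
    … : selected hypotheses (features)
    Image source: www.ieice.org, SITA forum article (PDF)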

  14. Selective Inference
    Q: How should the feature selection event in regression analysis be represented?
    A: It can be represented in a linear form (Lee et al.).
    Image source: www.ieice.org, SITA forum article (PDF)

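  15. Selective Inference
    Feature selection events in regression analysis:
    • Marginal screening
      Variable selection based on a rule definition
    • lasso
      Variable selection based on optimality conditions
    • Stepwise
      Variable selection based on an algorithm
    All of these are linear feature selection events.
    Image source: www.ieice.org, SITA forum article (PDF)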
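
For reference, "linear" can be read in the sense of Lee et al.: the set of response vectors y that lead to a given selection is described by linear inequalities in y. The notation below (A, b, the selected set S, signs s, feature columns x_j) is introduced here for illustration and does not appear on the slides.

```latex
% Selection event written as a polyhedron in the response vector y
\[
  \{\, y : \hat{S}(y) = S,\ \hat{s}(y) = s \,\} \;=\; \{\, y : A y \le b \,\}
\]
% Example: marginal screening that keeps the k features with the largest |x_j^T y|.
% For every selected feature j, with sign s_j = sign(x_j^T y), and every unselected feature l:
\[
  -\, s_j\, x_j^{\top} y \;\le\; x_l^{\top} y \;\le\; s_j\, x_j^{\top} y ,
\]
% a finite set of linear inequalities in y (here with b = 0).
```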

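  16. Selective Inference
    By conditioning on a linear feature selection event, the distribution under the null hypothesis becomes a truncated normal distribution F. This yields:
    • Selective p-values
    • Selective confidence intervals
    Image source: www.ieice.org, SITA forum article (PDF)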
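
A one-dimensional toy case shows why the truncated normal distribution removes the bias. The sketch below is an illustration under assumptions, not the general method: the selection rule ("report z only if it exceeds a cutoff"), the cutoff value and the helper name `truncnorm_cdf` are all hypothetical.

```r
# CDF of a normal distribution truncated to [lower, upper] (hypothetical helper)
truncnorm_cdf <- function(x, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  (pnorm(x, mean, sd) - pnorm(lower, mean, sd)) /
    (pnorm(upper, mean, sd) - pnorm(lower, mean, sd))
}

set.seed(3)
z <- rnorm(1e5)            # test statistics, all generated under the null
cutoff <- 1                # assumed selection rule: only z > cutoff gets reported
selected <- z[z > cutoff]

# Naive one-sided p-value: ignores that z was selected for being large
naive_p <- 1 - pnorm(selected)
# Selective p-value: conditions on the selection event {z > cutoff}
selective_p <- 1 - truncnorm_cdf(selected, lower = cutoff)

naive_rate <- mean(naive_p < 0.05)
selective_rate <- mean(selective_p < 0.05)
print(c(naive = naive_rate, selective = selective_rate))
```

The event `z > cutoff` is a single linear inequality, so this is the simplest instance of a linear selection event. The naive rejection rate is far above 0.05, while the selective one stays near it.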

  17. Selective Inference in R
    Selective inference for regression analysis in R can be carried out with the selectiveInference package.

    > # Example: selective inference after stepwise (forward) selection
    > library(selectiveInference)
    > gfit = fs(x, y)  # x = explanatory variables (matrix), y = response variable
    > out = fsInf(gfit, type = "aic", alpha = 0.05)
    > out  # inspect the results

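  18. Selective Inference in R
    Selective inference for regression analysis in R can be carried out with the selectiveInference package.

    > # Example: selective inference for the lasso
    > library(glmnet)
    > library(selectiveInference)
    > n = nrow(x)
    > gfit = glmnet(x, y, standardize = FALSE)
    > lambda = .3
    > # glmnet scales its penalty by 1/n, hence s = lambda/n; exact = TRUE refits at that value
    > beta_ls = coef(gfit, x = x, y = y, s = lambda/n, exact = TRUE)[-1]
    > out = fixedLassoInf(x, y, beta_ls, lambda, sigma = sigma)  # sigma: noise SD, assumed already defined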

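  19. Selective Inference in R
    Selective inference results for stepwise selection
    [Table: ten columns, apparently one per coefficient β (indices lost); rows: number rejected, number not rejected, rejection rate. Values not captured in the transcript.]
    The rejection rate is roughly … (in line with the significance level).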

  20. Selective Inference in R
    The distribution of the p-values also roughly follows a uniform distribution, so the inference can be considered free of bias.

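  21. Selective inference in clustering
    Selective inference is also required when testing whether the clusters obtained from a clustering differ from one another,
    because the clusters themselves were formed from the same data that is being tested.
    Image source: www.ieice.org, SITA forum article (PDF)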
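
The same problem is easy to reproduce with base R. The sketch below is an illustration under assumptions, not material from the talk: it draws one-dimensional data from a single normal population, splits it with `kmeans()`, and then applies an ordinary t-test between the two clusters. The sample size and the number of replications are arbitrary.

```r
# Naive two-sample test between clusters that were formed from the same data.
# All settings below are assumptions made for illustration.
set.seed(4)
reps <- 200
alpha <- 0.05

naive_cluster_p <- replicate(reps, {
  x <- rnorm(40)                                  # one homogeneous population, no true groups
  cl <- kmeans(x, centers = 2, nstart = 5)$cluster
  if (min(table(cl)) < 2) return(NA_real_)        # t.test needs at least 2 points per group
  t.test(x[cl == 1], x[cl == 2])$p.value
})

cluster_rejection_rate <- mean(naive_cluster_p < alpha, na.rm = TRUE)
print(cluster_rejection_rate)   # close to 1, although there is only one population
```

The clustering step places low values in one group and high values in the other, so a test that ignores how the groups were formed rejects almost every time.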

  22. Selective inference in clustering
    Selective inference for hierarchical clustering can be carried out with the pvclust (≥ v…) and scaleboot (≥ v…) packages (Terada & Shimodaira).

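  23. Summary
    • Tests of the coefficients of a regression model obtained through variable selection carry hypothesis selection bias
    • Selective inference makes inference free of that bias possible
    • Use the selectiveInference package
    • Selective inference is also available for clustering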

  24. References
    • Exact post-selection inference, with application to the lasso. Jason D. Lee et al.
      https://arxiv.org/abs/…
    • Selective inference after variable selection via multiscale bootstrap. Terada & Shimodaira.
      https://arxiv.org/abs/…
    • A conditional approach to inference after model selection.
      http://joshualoftus.com/post/conditional-approach-to-inference-after-model-selection
    • Computing selective inference p-values of clusters using pvclust and scaleboot.
      http://stat.sys.i.kyoto-u.ac.jp/prog/scaleboot/pvclust.pdf
    • Selective Inference for Data-Driven Science (in Japanese).
      www.ieice.org, SITA forum article (PDF)
    • Introduction to Selective Inference with R: considering the effect of variable selection on hypothesis testing (in Japanese).
      https://qiita.com/saltcooky/items/…