Slide 1

Slide 1 text

tidymodels tidy 2024/12/07 Japan.R 2024 #JapanR @dropout009

Slide 2

Slide 2 text

REVISIO CDO X: @dropout009 Speaker Deck: dropout009 Blog: https://dropout009.hatenablog.com/

Slide 3

Slide 3 text

• tidymodels tidy

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

• REVISIO
• tidymodels

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

• survival analysis

Slide 8

Slide 8 text

$$F(t) = \Pr(T \le t) = \int_0^t f(u)\,du$$

$$S(t) = \Pr(T > t) = 1 - F(t)$$

Figure: density $f(t)$ of $T$, with the areas $F(t)$ and $S(t)$ on either side of $t$ (total area 100%).

Slide 9

Slide 9 text

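$$
\begin{aligned}
h(t) &= \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t + \Delta t \mid T > t)}{\Delta t} \\
&= \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t + \Delta t,\ T > t)}{\Delta t} \times \frac{1}{\Pr(T > t)} \\
&= \lim_{\Delta t \to 0} \frac{\Pr(t < T \le t + \Delta t)}{\Delta t} \times \frac{1}{\Pr(T > t)} \\
&= \lim_{\Delta t \to 0} \frac{F(t + \Delta t) - F(t)}{\Delta t} \times \frac{1}{\Pr(T > t)} \\
&= \frac{f(t)}{S(t)}
\end{aligned}
$$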

Slide 10

Slide 10 text

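$$S'(t) = \frac{d}{dt}\bigl(1 - F(t)\bigr) = -f(t)$$

$$h(t) = \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)} = -\frac{d}{dt} \log S(t)$$

$$S(t) = \exp\left(-\int_0^t h(u)\,du\right)$$

$$f(t) = h(t)S(t) = h(t) \exp\left(-\int_0^t h(u)\,du\right)$$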
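The relations between $h(t)$, $S(t)$, and $f(t)$ can be checked numerically. The following is a minimal base R sketch, assuming a hypothetical hazard $h(t) = 0.5 + 0.1t$ chosen only for illustration (it does not appear in the slides).

```r
# Hypothetical hazard, chosen only for illustration
h <- function(t) 0.5 + 0.1 * t

# S(t) = exp(-integral_0^t h(u) du), with the integral evaluated numerically
S <- function(t) exp(-integrate(h, lower = 0, upper = t)$value)

# For this hazard the integral is 0.5 t + 0.05 t^2, so S(t) has a closed form
S_closed <- function(t) exp(-(0.5 * t + 0.05 * t^2))

t0 <- 2
c(numeric = S(t0), closed = S_closed(t0))

# f(t) = h(t) S(t) should agree with -S'(t); approximate S' by a central difference
eps <- 1e-5
f_from_hazard <- h(t0) * S(t0)
f_from_slope <- -(S(t0 + eps) - S(t0 - eps)) / (2 * eps)
c(from_hazard = f_from_hazard, from_slope = f_from_slope)
```

Both pairs of values should agree up to numerical error, which is the content of $S(t) = \exp(-\int_0^t h(u)\,du)$ and $f(t) = -S'(t)$.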

Slide 11

Slide 11 text

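Constant hazard $\lambda$ (exponential distribution):
• $h(t) = \lambda$
• $S(t) = \exp\left(-\int_0^t \lambda\,du\right) = e^{-\lambda t}$
• $f(t) = h(t)S(t) = \lambda e^{-\lambda t}$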

Slide 12

Slide 12 text

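Time-varying hazard with shape parameter $\phi$ (Weibull distribution):
• $h(t) = \phi\lambda t^{\phi - 1}$
• $S(t) = \exp\left(-\int_0^t \phi\lambda u^{\phi - 1}\,du\right) = e^{-\lambda t^{\phi}}$
• $f(t) = h(t)S(t) = \phi\lambda t^{\phi - 1} e^{-\lambda t^{\phi}}$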

Slide 13

Slide 13 text

Figure: $h(t) = \phi\lambda t^{\phi - 1}$ and $S(t) = e^{-\lambda t^{\phi}}$, plotted for:
• $\phi < 1$
• $\phi = 1$
• $1 < \phi < 2$
• $\phi = 2$
• $\phi > 2$
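The five cases differ in how the hazard changes over time: it decreases for $\phi < 1$, is constant for $\phi = 1$, increases at a slowing rate for $1 < \phi < 2$, increases linearly for $\phi = 2$, and increases at an accelerating rate for $\phi > 2$. A minimal base R sketch tabulates this, assuming $\lambda = 1$ and a few illustrative values of $\phi$ (both are choices made here, not values from the slides).

```r
# Weibull hazard and survival functions in the (lambda, phi) parameterization
weibull_hazard <- function(t, lambda, phi) phi * lambda * t^(phi - 1)
weibull_surv <- function(t, lambda, phi) exp(-lambda * t^phi)

t <- c(0.5, 1, 2)
lambda <- 1                    # illustrative value
phis <- c(0.5, 1, 1.5, 2, 3)   # one value per case in the list above

# Rows are time points, columns are shape parameters
hz <- sapply(phis, function(phi) weibull_hazard(t, lambda, phi))
dimnames(hz) <- list(paste0("t=", t), paste0("phi=", phis))
hz

# The same survival function via base R's pweibull(), which uses
# S(t) = exp(-(t / scale)^shape), so shape = phi and scale = lambda^(-1 / phi)
s_manual <- weibull_surv(t, lambda = 2, phi = 1.5)
s_base <- pweibull(t, shape = 1.5, scale = 2^(-1 / 1.5), lower.tail = FALSE)
all.equal(s_manual, s_base)
```

The conversion to `shape` and `scale` matters when comparing these formulas with output from R functions that use the base parameterization.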

Slide 14

Slide 14 text

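Let the hazard depend on covariates $\boldsymbol{x}$ through $\lambda = \alpha e^{\boldsymbol{x}^\top \boldsymbol{\beta}}$:

$$h(\boldsymbol{x}) = \alpha e^{\boldsymbol{x}^\top \boldsymbol{\beta}} = \alpha e^{\beta_1 x_1 + \cdots + \beta_J x_J}$$

When $x_j$ increases by 1:

$$
\begin{aligned}
h(x_1, \dots, x_j + 1, \dots, x_J) &= \alpha e^{\beta_1 x_1 + \cdots + \beta_j (x_j + 1) + \cdots + \beta_J x_J} \\
&= \alpha e^{\boldsymbol{x}^\top \boldsymbol{\beta}} e^{\beta_j} \\
&= h(\boldsymbol{x})\, e^{\beta_j}
\end{aligned}
$$

A one-unit increase in $x_j$ multiplies the hazard by $e^{\beta_j}$.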
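The multiplicative interpretation of $e^{\beta_j}$ can be confirmed with a few lines of base R. This is a small sketch with made-up values for $\alpha$, $\boldsymbol{\beta}$, and $\boldsymbol{x}$, all of which are assumptions for illustration.

```r
# Made-up parameter and covariate values for illustration
alpha <- 0.1
beta <- c(0.3, -0.5)
x <- c(1, 2)

# h(x) = alpha * exp(x' beta)
hazard <- function(x) alpha * exp(sum(x * beta))

# Increase the first covariate by one unit
x_plus <- x
x_plus[1] <- x_plus[1] + 1

hazard_ratio <- hazard(x_plus) / hazard(x)
c(ratio = hazard_ratio, exp_beta1 = exp(beta[1]))
```

The ratio does not depend on $\alpha$ or on the other covariates, which is why $e^{\beta_j}$ can be read as a hazard ratio.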

Slide 15

Slide 15 text

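Observed data for units $i = 1, \dots, n$:
• $d_i \in \{0, 1\}$: event indicator (1 if the event was observed at $t_i$, 0 if the unit was censored at $t_i$)
• $t_i$: observed time

Figure: timelines for Device 1 to Device 4 with observed times $t_1, \dots, t_4$ and indicators $d_1 = 1$, $d_2 = 0$, $d_3 = 1$, $d_4 = 0$.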

Slide 16

Slide 16 text

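$$
\begin{aligned}
L &= \prod_{i=1}^{n} f(t_i)^{d_i} \Pr(T > t_i)^{1 - d_i} \\
&= \prod_{i=1}^{n} f(t_i)^{d_i} S(t_i)^{1 - d_i} \\
&= \prod_{i=1}^{n} \left(\frac{f(t_i)}{S(t_i)}\right)^{d_i} S(t_i) \\
&= \prod_{i=1}^{n} h(t_i)^{d_i} S(t_i)
\end{aligned}
$$

$$\log L = \sum_{i=1}^{n} \bigl[ d_i \log h(t_i) + \log S(t_i) \bigr]$$

For the constant hazard $h(t) = \lambda$:

$$\log L = \sum_{i=1}^{n} \bigl[ d_i \log \lambda - \lambda t_i \bigr]$$

With covariates, $\lambda = \alpha e^{\boldsymbol{x}_i^\top \boldsymbol{\beta}}$:

$$\log L = \sum_{i=1}^{n} \Bigl[ d_i \log\bigl(\alpha e^{\boldsymbol{x}_i^\top \boldsymbol{\beta}}\bigr) - \alpha e^{\boldsymbol{x}_i^\top \boldsymbol{\beta}}\, t_i \Bigr]$$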
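For the constant-hazard case, setting the derivative of $\log L$ with respect to $\lambda$ to zero gives $\hat{\lambda} = \sum_i d_i / \sum_i t_i$, the number of observed events divided by the total observed time. The base R sketch below compares that closed form with a numerical maximizer on simulated censored data. The sample size, the true rate, and the uniform censoring scheme are assumptions made for this illustration.

```r
set.seed(1)

# Simulated data (illustrative): exponential event times with uniform censoring
n <- 500
true_lambda <- 0.2
event_time <- rexp(n, rate = true_lambda)
censor_time <- runif(n, min = 0, max = 10)

t_obs <- pmin(event_time, censor_time)      # observed time t_i
d <- as.integer(event_time <= censor_time)  # d_i = 1 if the event was observed

# Censored exponential log-likelihood: sum_i [ d_i log(lambda) - lambda t_i ]
loglik <- function(lambda) sum(d * log(lambda) - lambda * t_obs)

# Closed-form maximizer: observed events divided by total observed time
lambda_closed <- sum(d) / sum(t_obs)

# Numerical maximizer of the same log-likelihood
lambda_numeric <- optimize(loglik, interval = c(1e-6, 5),
                           maximum = TRUE, tol = 1e-10)$maximum

c(closed = lambda_closed, numeric = lambda_numeric)
```

Treating censored times as event times, or dropping censored units, would maximize a different objective and bias the estimate of $\lambda$.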

Slide 17

Slide 17 text

tidymodels tidy

Slide 18

Slide 18 text

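• tidymodels
• modeldata: IBM Watson customer churn data
• churn: whether the customer churned (the event indicator)
• tenure: how long the customer has stayed (the time variable)
• IBM Watson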

Slide 19

Slide 19 text

tidymodels censored Surv(t, d) 0 1 rsample total_charge rsample
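The data preparation step can be sketched as follows. This assumes the dataset is `wa_churn` from modeldata (the Watson churn data), that `churn == "Yes"` marks the event, and that `total_charges` is dropped from the predictors; the slide text does not confirm these details, so treat them as assumptions.

```r
library(tidymodels)
library(censored)  # attaches survival, which provides Surv()

data(wa_churn, package = "modeldata")

churn_df <- wa_churn |>
  drop_na() |>  # removes the rows with missing total_charges
  # Surv(t, d): t = tenure, d = 1 if the customer churned, 0 if censored
  mutate(
    churn_surv = Surv(tenure, if_else(churn == "Yes", 1, 0)),
    .keep = "unused"
  ) |>
  select(-total_charges)  # assumption: excluded from the predictors

# Train/test split with rsample
set.seed(42)
churn_split <- initial_split(churn_df)
churn_train <- training(churn_split)
churn_test <- testing(churn_split)
```

Keeping time and status together in one `Surv` column lets the rest of the tidymodels pipeline treat it as a single outcome.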

Slide 20

Slide 20 text

survival_reg(dist = "weibull") factor step_corr() workflows parsnip recipes

Slide 21

Slide 21 text

tidy() parsnip

Slide 22

Slide 22 text

augment() ROC-AUC parsnip yardstick
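The fit, `tidy()`, and `augment()` steps can be put together in one sketch. It assumes `wa_churn` from modeldata, a hand-picked subset of predictors (to keep the design matrix free of redundant factor levels), and evaluation times of 12, 24, and 36; none of these specifics come from the slides.

```r
library(tidymodels)
library(censored)

data(wa_churn, package = "modeldata")

# Assumption: a small, hand-picked set of predictors for illustration
churn_df <- wa_churn |>
  drop_na() |>
  mutate(
    churn_surv = Surv(tenure, if_else(churn == "Yes", 1, 0)),
    .keep = "unused"
  ) |>
  select(churn_surv, contract, payment_method, monthly_charges, senior_citizen)

set.seed(42)
churn_split <- initial_split(churn_df)
churn_train <- training(churn_split)
churn_test <- testing(churn_split)

# parsnip: Weibull survival regression
weibull_spec <- survival_reg(dist = "weibull") |>
  set_engine("survival") |>
  set_mode("censored regression")

# recipes: drop highly correlated numeric predictors
churn_rec <- recipe(churn_surv ~ ., data = churn_train) |>
  step_corr(all_numeric_predictors())

# workflows: bundle the recipe and the model, then fit
weibull_fit <- workflow() |>
  add_recipe(churn_rec) |>
  add_model(weibull_spec) |>
  fit(data = churn_train)

# Coefficient table
tidy(weibull_fit)

# Predicted survival probabilities at the chosen evaluation times
eval_times <- c(12, 24, 36)  # illustrative values
churn_pred <- augment(weibull_fit, new_data = churn_test, eval_time = eval_times)

# yardstick: time-dependent and overall metrics
brier_survival(churn_pred, truth = churn_surv, .pred)
roc_auc_survival(churn_pred, truth = churn_surv, .pred)
concordance_survival(churn_pred, truth = churn_surv, estimate = .pred_time)
```

The survival engine fits the Weibull model in accelerated failure time form, so the coefficients from `tidy()` are on the log-time scale rather than being the hazard coefficients $\beta_j$ directly.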

Slide 23

Slide 23 text

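• Brier score: brier_survival()
• ROC-AUC: roc_auc_survival()
• Concordance index: concordance_survival()

Brier score at evaluation time $s$, with weight $w_{is}$, indicator $d_{is}$, and predicted survival probability $\hat{S}_i(s)$ for observation $i$:

$$\mathrm{BrierScore}_s = \frac{1}{\sum_{i=1}^{n} w_{is}} \sum_{i=1}^{n} w_{is} \bigl( d_{is} - \hat{S}_i(s) \bigr)^2$$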
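The weighted Brier score at a single evaluation time is a short computation. The base R sketch below uses three made-up observations; the helper name `brier_at_s` and all values are hypothetical, and in practice `brier_survival()` derives the weights from the censoring distribution.

```r
# Weighted Brier score at one evaluation time s
# d: indicators d_is, s_hat: predicted survival probabilities, w: weights w_is
brier_at_s <- function(d, s_hat, w) sum(w * (d - s_hat)^2) / sum(w)

# Made-up values for three observations
d <- c(1, 0, 1)
s_hat <- c(0.8, 0.3, 0.6)
w <- c(1, 2, 1)

score <- brier_at_s(d, s_hat, w)
score
```

With equal weights the score reduces to the ordinary mean squared difference between the indicators and the predicted probabilities.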

Slide 24

Slide 24 text

mtry min_n "aorsf" Accelerated Oblique Random Survival Forests https://docs.ropensci.org/aorsf/index.html tune dials parsnip

Slide 25

Slide 25 text

workflows yardstick recipes

Slide 26

Slide 26 text

tune_bayes() tune_grid() 10 tune rsample
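A tuning run for the random survival forest might look like the sketch below. It assumes `wa_churn` from modeldata with a hand-picked predictor subset, 10-fold cross-validation, a grid of 10 candidates, and evaluation times of 12, 24, and 36. The slide's "10" may refer to folds or to iterations, so these settings are guesses, and the `eval_time` arguments require a recent tune release.

```r
library(tidymodels)
library(censored)  # the "aorsf" engine also needs the aorsf package installed

data(wa_churn, package = "modeldata")

# Assumption: a small, hand-picked set of predictors for illustration
churn_df <- wa_churn |>
  drop_na() |>
  mutate(
    churn_surv = Surv(tenure, if_else(churn == "Yes", 1, 0)),
    .keep = "unused"
  ) |>
  select(churn_surv, contract, payment_method, monthly_charges, senior_citizen)

set.seed(42)
churn_split <- initial_split(churn_df)
churn_train <- training(churn_split)

# parsnip + tune: mark mtry and min_n for tuning
rf_spec <- rand_forest(mtry = tune(), min_n = tune()) |>
  set_engine("aorsf") |>
  set_mode("censored regression")

rf_wf <- workflow() |>
  add_recipe(recipe(churn_surv ~ ., data = churn_train)) |>
  add_model(rf_spec)

# rsample: 10-fold cross-validation (assumed setting)
folds <- vfold_cv(churn_train, v = 10)

surv_metrics <- metric_set(brier_survival, roc_auc_survival, concordance_survival)
eval_times <- c(12, 24, 36)  # illustrative values

# Grid search over 10 candidate parameter combinations
set.seed(42)
rf_res <- tune_grid(
  rf_wf,
  resamples = folds,
  grid = 10,
  metrics = surv_metrics,
  eval_time = eval_times
)

collect_metrics(rf_res)

# Pick the candidate with the best Brier score at one evaluation time
best_params <- select_best(rf_res, metric = "brier_survival", eval_time = 24)
final_wf <- finalize_workflow(rf_wf, best_params)
```

`tune_bayes()` accepts the same workflow, resamples, metrics, and `eval_time` arguments, and searches iteratively (controlled by `iter`) instead of evaluating a fixed grid.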

Slide 27

Slide 27 text

tune

Slide 28

Slide 28 text

tune

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

• tidymodels R

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

• Dobson, Annette J., and Adrian G. Barnett. An introduction to generalized linear models. Chapman and Hall/CRC, 2018. • Allison, Paul D( ), ( ). . . 2021. • , . . . 2022-2023. 52 . 2 . p.69-112. • , , . R tidymodels[ ] . . 2023. • Fitting and Predicting with censored. https://censored.tidymodels.org/index.html • Dynamic Performance Metrics for Event Time Data. https://www.tidymodels.org/learn/statistics/survival-metrics/ • Accounting for Censoring in Performance Metrics for Event Time Data. https://www.tidymodels.org/learn/statistics/survival-metrics-details/ • How long until building complaints are dispositioned? A survival analysis case study. https://www.tidymodels.org/learn/statistics/survival-case-study/