Time Series Forecasting with Prophet

Kan Nishida
February 26, 2020

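- Basics of the Prophet algorithm
- How to validate a forecast
- When to use additive versus multiplicative seasonality
- Variable importance and the effect of each variable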

Transcript

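  1. Speaker: Kan Nishida, CEO, Exploratory

    Bio: In spring 2016, he founded Exploratory, Inc. to democratize data science. While serving as CEO of Exploratory, Inc., he works to spread and teach data science techniques and methods through programs such as the Data Science Boot Camp training. At Oracle's headquarters in the US, he led data science development teams for 16 years and brought many products to market in machine learning, big data, business intelligence, and databases. @KanAugust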
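  2. [Figure: "The democratization of data science", shown as three waves. First wave (1976): proprietary algorithms (expensive, old), UI & programming, statisticians; theme: monetization. Second wave (2000): open-source algorithms (free, cutting edge), programming, data scientists; theme: commoditization. Third wave (2016): open-source algorithms (free, cutting edge), UI & automation, business users, Exploratory; theme: democratization.]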
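  3. Prophet

    • A time series forecasting algorithm created by data scientists who were at Facebook (Sean J. Taylor & co.) and released as open source (https://facebook.github.io/prophet).
    • Designed to be usable without expert knowledge of statistics or time series forecasting.

    Sean J. Taylor, @seanjtaylor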
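  4. Benefits of Prophet

    • It gives up on modeling how the series evolves over time.
    • Instead, it treats forecasting simply as a problem of finding a curve, which brings the following benefits:
        • The time intervals between data points do not need to be regular.
        • Days with NA (missing) values are not a problem.
        • Multiple seasonalities (weekly and yearly) are taken into account by default.
        • The default settings give reasonably good forecasts, and most of the configurable parameters can be understood without expert knowledge.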
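The curve-fitting idea above can be illustrated with a deliberately simplified sketch. It is not Prophet's actual model (Prophet fits a piecewise trend plus Fourier-series seasonality); it only fits a straight line by least squares, to show why irregular spacing and NA values are harmless once each observation is treated as a plain (time, value) point. The dates, values, and the `fit_line` and `predict` helpers are hypothetical.

```python
from datetime import date

# Hypothetical daily observations with irregular gaps and one NA (None).
observations = [
    (date(2020, 1, 1), 10.0),
    (date(2020, 1, 2), 12.0),
    (date(2020, 1, 5), None),   # NA: simply skipped when fitting
    (date(2020, 1, 9), 26.0),   # irregular gap: no resampling needed
    (date(2020, 1, 10), 28.0),
]

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * t, with t in days."""
    origin = points[0][0]
    usable = [((d - origin).days, y) for d, y in points if y is not None]
    n = len(usable)
    mean_t = sum(t for t, _ in usable) / n
    mean_y = sum(y for _, y in usable) / n
    slope = (sum((t - mean_t) * (y - mean_y) for t, y in usable)
             / sum((t - mean_t) ** 2 for t, _ in usable))
    intercept = mean_y - slope * mean_t
    return origin, intercept, slope

def predict(model, day):
    """Evaluate the fitted line on any date, observed or not."""
    origin, intercept, slope = model
    return intercept + slope * (day - origin).days

model = fit_line(observations)
print(round(predict(model, date(2020, 1, 5)), 2))  # value on the NA day
```

Because the fit only needs (time, value) pairs, the missing day and the four-day gap require no special handling.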
  6. The Weekly tab appears only when the data is daily or more granular (hourly, by the minute, etc.).
  7. Every week, sales are low on Sunday and Monday and high for the rest of the week.
  8. Some dates have NA values. You can impute them as part of Data Preprocessing.
  9. Under the Importance tab, you can see which seasonality has more effect on the forecasting outcome.
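  10. How to read the data with forecasts

    • forecasted_value - the forecasted value
    • forecasted_value_high / forecasted_value_low - the uncertainty interval
    • trend - the overall growth trend
    • yearly - the yearly seasonal trend
    • weekly - the weekly seasonal trend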
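As a rough illustration of how these columns relate, the sketch below combines the components for a single date. It assumes Prophet's additive form (forecast = trend + seasonal terms) and its multiplicative form (forecast = trend × (1 + seasonal terms)). The numbers and the `combine` helper are hypothetical, and in the multiplicative case the seasonal values are treated as fractions of the trend.

```python
def combine(trend, yearly, weekly, mode="additive"):
    """Combine components into a point forecast (hypothetical helper)."""
    if mode == "additive":
        # Seasonal terms are in the same units as the trend.
        return trend + yearly + weekly
    if mode == "multiplicative":
        # Seasonal terms are interpreted as fractions of the trend.
        return trend * (1 + yearly + weekly)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical values for a single date.
additive = combine(trend=1000.0, yearly=50.0, weekly=-20.0)
multiplicative = combine(trend=1000.0, yearly=0.05, weekly=-0.02,
                         mode="multiplicative")
print(additive, multiplicative)
```

In the additive case a seasonal swing keeps the same absolute size as the trend grows; in the multiplicative case it scales with the trend.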
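  11. Evaluation metrics for time series forecasting

    • RMSE (Root Mean Square Error): the square root of the mean of the squared deviations from the forecast.
    • MAE (Mean Absolute Error): the mean of the absolute deviations from the forecast.
    • MAPE (Mean Absolute Percentage Error): the mean of the absolute deviations from the forecast, expressed as percentages.
    • MASE (Mean Absolute Scaled Error): the MAE divided by the MAE of a naive forecast on the training data (the simple forecast that each period will have the same value as the previous one).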
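MASE is the only metric above that needs the training data, so a small sketch may help. The training series, actual values, and forecasts below are made-up numbers, and `mase` is a hypothetical helper that uses the one-step naive forecast described above as its baseline.

```python
def mae(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)

def mase(train, actual, forecast):
    """MAE of the forecast divided by the in-sample naive-forecast MAE."""
    # Naive forecast on training data: each value is predicted by the previous one.
    naive_errors = [curr - prev for prev, curr in zip(train, train[1:])]
    forecast_errors = [a - f for a, f in zip(actual, forecast)]
    return mae(forecast_errors) / mae(naive_errors)

# Made-up numbers for illustration only.
train = [10, 12, 11, 15, 14]
actual = [16, 15]
forecast = [15, 17]
score = mase(train, actual, forecast)
print(round(score, 2))  # 0.75
```

A value below 1 means the forecast's average error is smaller than that of the naive baseline on the training data.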
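  12. RMSE (Root Mean Square Error)

    For example, if the errors between the actual and forecasted values are 2, 2, 2, and 4, the calculation is as follows (the denominator 4 is the number of points):

    $$\mathrm{RMSE} = \sqrt{\frac{2^2 + 2^2 + 2^2 + 4^2}{4}} = \sqrt{\frac{4 + 4 + 4 + 16}{4}} = \sqrt{7} \approx 2.65$$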
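  13. MAE (Mean Absolute Error)

    For example, if the errors between the actual and forecasted values are 2, 2, 2, and 4, the calculation is as follows (the denominator 4 is the number of points):

    $$\mathrm{MAE} = \frac{2 + 2 + 2 + 4}{4} = 2.5$$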
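The RMSE and MAE examples above can be checked with a few lines of Python. The error values are the ones from the slides; the function names are only illustrative.

```python
import math

# Errors between actual and forecasted values, as given in the slides.
errors = [2, 2, 2, 4]

def rmse(errs):
    """Square root of the mean squared error."""
    return math.sqrt(sum(e ** 2 for e in errs) / len(errs))

def mae(errs):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errs) / len(errs)

print(round(rmse(errors), 2))  # 2.65
print(mae(errors))             # 2.5
```

RMSE comes out larger than MAE here because squaring gives the single error of 4 more weight than the three errors of 2.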
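  14. MAPE (Mean Absolute Percentage Error)

    Next, find the error between each actual value and its forecast. In this example the actual values are 12, 13, 16, and 11, and the errors are 2, 2, 4, and 2.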
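  15. MAPE (Mean Absolute Percentage Error)

    Divide each error by the actual value and multiply by 100 to get each percentage. If a number is negative, drop the minus sign (take the absolute value).

    $$\frac{2}{12} \times 100 \approx 16.7\%, \quad \frac{2}{13} \times 100 \approx 15.4\%, \quad \frac{4}{16} \times 100 = 25\%, \quad \frac{2}{11} \times 100 \approx 18.2\%$$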
  16. MAPE (Mean Absolute Percentage Error)

    Finally, take the mean of these percentages (the denominator 4 is the number of points):

    $$\mathrm{MAPE} = \frac{16.7 + 15.4 + 18.2 + 25}{4} \approx 18.8\%$$
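The three MAPE steps above can be reproduced as a short sketch. The actual values and errors are taken from the slides; the `mape` helper is a hypothetical name.

```python
actuals = [12, 13, 16, 11]  # actual values, from the slides
errors = [2, 2, 4, 2]       # errors between actual and forecast, from the slides

def mape(actual_values, errs):
    """Mean of |error / actual|, expressed in percent."""
    percentages = [abs(e / a) * 100 for a, e in zip(actual_values, errs)]
    return sum(percentages) / len(percentages)

print(round(mape(actuals, errors), 1))  # 18.8
```

Because each error is divided by the actual value, MAPE is undefined whenever an actual value is 0.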
  17. The gap between the actual line and the forecasted line widens as time progresses.
  19. The forecasting model quality has improved a little.

    • Base model plus sales compensation
    • Base model plus sales compensation, marketing expense, and discount rate