Slide 1

Slide 1 text

Single Model for Influenza Forecasting of Multiple Countries by Multi-task Learning
Taichi Murayama, Shoko Wakamiya, Eiji Aramaki
Nara Institute of Science and Technology (NAIST)
ECML-PKDD 2021

Slide 2

Slide 2 text

Background: Importance of forecasting infectious diseases
• Social impact: influenza and COVID-19
• Influenza has resulted in between 12,000 and 61,000 deaths annually (U.S. estimate)
• A rapid increase in the number of infected people causes a medical crisis
• Public health needs preventive measures against infectious diseases
Source: https://www.bcbswny.com/content/wny/provider/news/blue-bulletin/an-early-start-to-flu-season.html

Slide 3

Slide 3 text

Background: Research on forecasting infectious diseases (mainly influenza)
• For more accurate forecasting, studies use ML and NN-based models rather than compartmental models such as SIR
• Recently, forecasting with user-generated content (UGC) has become the mainstream of research
  • Google Flu [Ginsberg et al., 2009]
  • Twitter [Paul et al., 2014]
• Most research targets the epidemic situation of a single country

Slide 4

Slide 4 text

Background: Research on forecasting infectious diseases (mainly influenza)
• Most research targets the epidemic situation of a single country
⇒ We propose a single model that forecasts flu in multiple countries

Slide 5

Slide 5 text

Background: Can we build a forecasting model for multiple countries?
The ILI (influenza-like illness) rates of multiple countries
• Same behavior: epidemics in winter that calm down in summer
[Figure: ILI rates of each country over time, color scale from min to max; x-axis ticks 2017/01/05, 2018/01/04, 2019/01/03]
The tendency of ILI rates is similar in each country ⇒ Are the features that are useful for forecasting future ILI rates also similar?

Slide 6

Slide 6 text

Purpose of our research: Overview
Building a single model for forecasting influenza in multiple countries
• Method
  • We treat the forecasting task in multiple countries as a multi-task problem
  • We use an Encoder-Decoder model to forecast ILI rates
    • Input: time series of search queries (Google) + past ILI rates
    • Output: future ILI rates (1 to 5 weeks ahead)
• Advantage
  • Builds a forecasting model with high accuracy
  • Treating the task as a multi-task problem compensates for scarce data (in many countries, little data on infectious diseases is available)
[Figure labels: ILI rates, Search queries]

Slide 7

Slide 7 text

Purpose of our research: Overview
Building a single model for forecasting influenza in multiple countries
• Problem
  • How do we find suitable search queries in multiple languages?
    • Input: time series of search queries (Google) + past ILI rates
  • How do we effectively use the two types of data (ILI rates and search queries) for forecasting?
    • Search queries may not be useful for forecasting [Emily L. Aiken, 2019]

Slide 8

Slide 8 text

Purpose of our research: Overview
Building a single model for forecasting influenza in multiple countries
• Problem
  • How do we find suitable search queries in multiple languages?
    • Input: time series of search queries (Google) + past ILI rates
    ⇒ Find useful search queries with a word-alignment method
  • How do we effectively use the two types of data (ILI rates and search queries) for forecasting?
    • Search queries may not be useful for forecasting [Emily L. Aiken, 2019]
    ⇒ Propose a novel forecasting model that leverages an Attention mechanism

Slide 10

Slide 10 text

Preparation: Dataset
• ILI (influenza-like illness) rates data
  • Target: 5 countries: United States (US), Japan (JP), England (UK), France (FR), Australia (AU)
  • Term: 26th week of 2013 to 29th week of 2020
  • ILI rates are reported weekly (52 points per year)
• Search queries
  • We use Google Trends (the term is the same as for the ILI rates data)
  • Method for selecting search queries
    • English: list from previous research [Zou B, 2018]
    • Other languages (French and Japanese): WT-based

Slide 11

Slide 11 text

Preparation: Dataset
Finding search queries in languages other than English: word-alignment + time-series correlation-based (WT-based) selection
• Word alignment
  • A method for building multilingual embeddings
  • Extract candidate search queries by calculating cosine similarities between English queries and words in other languages
  • Obtains words that are difficult to obtain by translation, e.g. the orthographic variants 「インフル」 and 「インフルエンザ」
• Time-series correlation
  • Obtains words whose Google Trends time series correlates highly with ILI rates
Select the search queries with the highest values of a score that combines the word-alignment similarity and the time-series correlation
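
The WT-based selection above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slide does not say how the two scores are combined, so the plain sum in `wt_score` is an assumption, and the function names, the candidate dictionary layout, and the placeholder words are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson_correlation(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def wt_score(english_vec, candidate_vec, trend_series, ili_series):
    """Combined score. ASSUMPTION: the two parts are simply added."""
    similarity = cosine_similarity(english_vec, candidate_vec)
    correlation = pearson_correlation(trend_series, ili_series)
    return similarity + correlation


def select_queries(candidates, english_vec, ili_series, top_k):
    """Return the top_k candidate words ranked by the combined score."""
    scored = [
        (wt_score(english_vec, c["vec"], c["trend"], ili_series), c["word"])
        for c in candidates
    ]
    scored.sort(reverse=True)
    return [word for _, word in scored[:top_k]]


# Toy data with hypothetical candidate words and 2-dimensional embeddings.
ili = [1.0, 2.0, 4.0, 3.0, 1.0]
candidates = [
    {"word": "candidate_a", "vec": [1.0, 0.0], "trend": [1.0, 2.0, 4.0, 3.0, 1.0]},
    {"word": "candidate_b", "vec": [0.0, 1.0], "trend": [4.0, 3.0, 1.0, 2.0, 4.0]},
    {"word": "candidate_c", "vec": [1.0, 1.0], "trend": [1.0, 2.0, 3.0, 3.0, 1.0]},
]
selected = select_queries(candidates, [1.0, 0.0], ili, top_k=2)
```

A weighted sum or a product would be equally plausible ways to combine the two scores; the ranking step stays the same.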

Slide 12

Slide 12 text

Preparation: Dataset
Examples of the selected search queries

Slide 13

Slide 13 text

Proposed Model (One country)
• Input: two types of data (time series of search queries, historical ILI rates)
• Output: future ILI rates (1 to 5 weeks ahead)
• Encoder-Decoder based model
• Problem: how do we effectively input the two types of data (ILI rates and search queries)?
[Figure: past ILI rates and time series of search queries → proposed model → future ILI rates]
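
The input and output described on this slide can be illustrated by cutting the weekly series into training samples. This is only a sketch: the length of the history window is not given in the slides, so `history` is a free, hypothetical parameter, and `make_windows` is an illustrative name.

```python
def make_windows(ili, queries, history, horizon=5):
    """Build (past ILI, past queries, future ILI) samples.

    ili:     list of weekly ILI rates
    queries: list with one entry per week, each a list of query volumes
    history: number of past weeks fed to the model (ASSUMPTION: not
             specified in the slides)
    horizon: number of weeks to forecast (1 to 5 weeks ahead)
    """
    samples = []
    for t in range(history, len(ili) - horizon + 1):
        past_ili = ili[t - history:t]
        past_queries = queries[t - history:t]
        future_ili = ili[t:t + horizon]
        samples.append((past_ili, past_queries, future_ili))
    return samples


# Toy data: 10 weeks of ILI rates and two search queries per week.
weekly_ili = [float(i) for i in range(10)]
weekly_queries = [[float(i), float(2 * i)] for i in range(10)]
windows = make_windows(weekly_ili, weekly_queries, history=3)
```

Each sample pairs the two input types over the same past weeks with the five future ILI values the decoder is asked to produce.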

Slide 14

Slide 14 text

Proposed Model (One country)

Slide 15

Slide 15 text

Proposed Model (One country)
[Figure label: Input]

Slide 16

Slide 16 text

Proposed Model (One country)
[Figure labels: Input, Encoder-Decoder Model]

Slide 17

Slide 17 text

Proposed Model (One country)
[Figure: our proposed Encoder-Decoder architecture to forecast ILI rates; labels: Input, Encoder-Decoder Model, Output]

Slide 18

Slide 18 text

Proposed Model (One country)
• Point 1: Split the historical ILI rates into seasonal and deseasonalized components
• Point 2: Utilize the Attention mechanism

Slide 19

Slide 19 text

Proposed Model (One country)
Point 1: Split the historical ILI rates into seasonal and deseasonalized components
• Search queries are useful features for forecasting the non-seasonal component
• Our model forecasts only the non-seasonal component of the time series, assuming that the seasonal component stays constant from year to year
• The historical ILI data are split by seasonal-trend decomposition using LOESS (STL)
[Figure: ILI rates over time, color scale from min to max; x-axis ticks 2017/01/05, 2018/01/04, 2019/01/03]
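
To make the split concrete, here is a deliberately simplified stand-in. It is not STL: instead of LOESS smoothing it estimates the seasonal component as the mean of each week-of-year position and treats everything else as the non-seasonal part. The function name and the toy period of 4 are illustrative only; a real STL implementation is available, for example, as `STL` in `statsmodels.tsa.seasonal`.

```python
def split_seasonal(series, period=52):
    """Split a series into a repeating seasonal part and a remainder.

    Simplified stand-in for STL: the seasonal profile is the mean of all
    values that share the same position within the period. Requires
    len(series) >= period.
    """
    profile = []
    for phase in range(period):
        values = series[phase::period]
        profile.append(sum(values) / len(values))
    # The seasonal component repeats the same profile every period,
    # matching the assumption that it does not change between years.
    seasonal = [profile[i % period] for i in range(len(series))]
    non_seasonal = [x - s for x, s in zip(series, seasonal)]
    return seasonal, non_seasonal


# Toy series with period 4 (weekly ILI data would use period=52).
rates = [1.0, 2.0, 3.0, 5.0, 3.0, 2.0, 3.0, 5.0]
seasonal_part, non_seasonal_part = split_seasonal(rates, period=4)
```

Only `non_seasonal_part` would be passed on as a forecasting target; the seasonal part is added back to obtain the final ILI forecast.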

Slide 20

Slide 20 text

Proposed Model (One country)
Point 2: Utilize the Attention mechanism
• Combine the search queries and the non-seasonal part of the ILI rates with an Attention mechanism
• The model can take the useful search queries into account by weighting them heavily
[Figure: image of Attention in our model; Query 1, Query 2, ..., Query L attended with the non-seasonal ILI rates]
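
The weighting idea can be sketched with plain dot-product attention. The slides do not specify the scoring function, so the dot product used here is an assumption, as are the function names and the toy vectors: the non-seasonal ILI representation acts as the query, and each search-query representation receives a softmax weight.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(ili_state, query_states):
    """Weight search-query representations by relevance to the ILI state.

    ASSUMPTION: relevance is a dot product; the actual scoring function
    of the proposed model is not given in the slides.
    """
    scores = [sum(a * b for a, b in zip(ili_state, q)) for q in query_states]
    weights = softmax(scores)
    dim = len(ili_state)
    # Context vector: weighted sum of the search-query representations.
    context = [
        sum(w * q[d] for w, q in zip(weights, query_states))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three search queries with 2-dimensional representations.
ili_repr = [1.0, 0.0]
query_reprs = [[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0]]
attn_weights, context_vec = attend(ili_repr, query_reprs)
```

The weights sum to one, so a query whose representation aligns with the ILI state dominates the context vector, which is the "weighting heavily" behavior the slide describes.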

Slide 21

Slide 21 text

Proposed Model (One country)
• Point 1: Split the historical ILI rates into seasonal and deseasonalized components
• Point 2: Utilize the Attention mechanism

Slide 22

Slide 22 text

Proposed Model (Targeting Multiple Countries)
[Figure panels: One country, Multiple countries]

Slide 23

Slide 23 text

Proposed Model (Targeting Multiple Countries)
[Figure panels: One country, Multiple countries]

Slide 24

Slide 24 text

Proposed Model (Targeting Multiple Countries)
• Treat ILI forecasting in each country as a separate task
• Basic components share parameters to capture general features (circled in blue)
• Components particular to each country, such as Attention, have different parameters (circled in red)

Slide 25

Slide 25 text

Proposed Model (Targeting Multiple Countries)
• Set the initial representation of the Encoder (GRU) to a learnable embedding for each country: the Country Embedding
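
The split between shared and country-specific parameters can be sketched as below. This is a structural illustration under strong simplifications: a plain tanh recurrence stands in for the GRU, nothing is trained, and the class and attribute names are hypothetical. It only shows that the recurrent weights are shared while each country starts from its own initial state.

```python
import math
import random


class MultiCountryEncoder:
    """Shared recurrent weights plus one initial state per country."""

    def __init__(self, countries, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # Shared by all countries (general features).
        self.w_in = rand_matrix(hidden_dim, input_dim)
        self.w_rec = rand_matrix(hidden_dim, hidden_dim)
        # Country-specific: the initial hidden state ("Country Embedding").
        # In a real model these vectors would be learned during training.
        self.country_embedding = {
            c: [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
            for c in countries
        }

    def encode(self, country, inputs):
        """Run the simplified recurrence, starting from the country state."""
        hidden = list(self.country_embedding[country])
        for x in inputs:
            new_hidden = []
            for j in range(len(hidden)):
                pre = sum(w * xi for w, xi in zip(self.w_in[j], x))
                pre += sum(w * hi for w, hi in zip(self.w_rec[j], hidden))
                new_hidden.append(math.tanh(pre))
            hidden = new_hidden
        return hidden


encoder = MultiCountryEncoder(["US", "JP"], input_dim=2, hidden_dim=3)
state_us = encoder.encode("US", [[0.1, 0.2], [0.3, 0.4]])
state_jp = encoder.encode("JP", [[0.1, 0.2], [0.3, 0.4]])
```

Because only the starting state differs, the same input sequence yields a country-dependent encoding while every country still contributes gradient signal to the shared weights.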

Slide 26

Slide 26 text

Experiments: Experimental Settings
• No. of search queries: 5 countries (JP, US, FR, UK, AU) × 10 queries
• Test term: 30th week of 2017 to 29th week of 2018 (today's presentation)
• Evaluate the forecasting score 1 to 5 weeks ahead
• Evaluation metrics
  • RMSE: a smaller value indicates better performance
  • R²: a higher value indicates better performance
• Compared methods: GRU / Transformer / Multi-task Elastic Net
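
For reference, the two metrics can be computed as follows. RMSE is standard; for R² the slides give no formula, so the coefficient of determination used here is an assumption (a squared correlation coefficient would be another common reading).

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: smaller is better."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: higher is better.

    ASSUMPTION: R^2 = 1 - SS_res / SS_tot; the slides do not state
    which definition was used.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values: every forecast is off by exactly one unit.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [2.0, 3.0, 4.0, 5.0]
error = rmse(observed, forecast)
fit = r_squared(observed, forecast)
```

In the experiments each metric would be evaluated separately for every forecasting horizon from 1 to 5 weeks ahead.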

Slide 27

Slide 27 text

Experiments: Experimental Results (US)
[Table: results grouped into "Learning from U.S." and "Learning from multiple countries"]
Variations of the proposed model
• Proposed w/o sq: 1 country without search queries
• Proposed_single: 1 country
• Proposed_multi2: learning from two countries (JP and US)
• Proposed_multi5: learning from five countries

Slide 28

Slide 28 text

Experiments: Experimental Results (US)
[Table: results grouped into "Learning from U.S." and "Learning from multiple countries"]
• The proposed model achieves the highest accuracy among the models targeting one country

Slide 29

Slide 29 text

Experiments: Experimental Results (US)
[Table: results grouped into "Learning from U.S." and "Learning from multiple countries"]
• The proposed model achieves the highest accuracy among the models targeting multiple countries

Slide 30

Slide 30 text

Experiments: Experimental Results (Four countries)

Slide 31

Slide 31 text

Discussion: How the proposed model utilizes search queries
• Degree of improvement with versus without search queries (GRU vs Proposed)
  • GRU w/o sq ⇒ GRU: average improvement of 0.007 in RMSE and 0.001 in R²
  • Proposed w/o sq ⇒ Proposed_single: average improvement of 0.091 in RMSE and 0.017 in R²
• Simply inputting search queries contributes little to accuracy, while introducing the Attention mechanism improves accuracy significantly

Slide 32

Slide 32 text

Discussion: Visualization of the weighting of Attention
[Figure: attention weight of each search query over time; x-axis ticks as extracted: 2017/30th, 2018/28th, 2018/50th, 2018/8th, 2018/18th, 2018/40th; query labels as extracted (run together in the source): fever flu flu and fever the flu flu cough cough fever symptoms of flu flu headache the flu virus flu and cold contagious flu]
• Attention weight of each search query: red indicates a larger value and blue a smaller value
• Attention changes the weights of the search queries at the beginning of the year, when flu is epidemic

Slide 33

Slide 33 text

Conclusion
• We built a single model to forecast influenza epidemics in multiple countries
  • Treats the forecasting task in multiple countries as a multi-task problem
  • Utilizes search queries through an Attention mechanism
• The proposed model achieves higher accuracy than the other models in the influenza forecasting task
• Our experimental results suggest that the Attention mechanism may be useful for finding suitable search queries