Analysis and Estimation of News Article Reading Time with Multimodal Machine Learning
Shotaro Ishihara, Yasufumi Nakama (IEEE BigData 2022, Industrial & Government Track)
Does Text Length Matter? Analysis and Estimation of News Article Reading Time with Multimodal Machine Learning
Shotaro Ishihara (Nikkei Inc.), and Yasufumi Nakama
IEEE BigData 2022, Industry and Government Program
Summary 1: Dataset
● text length
● headline / body text
● thumbnail image
● others like genre
● past reading history
Target: reading time
✅ Real-world content and access log of Nikkei

Summary 2: Text length
● text length
● headline / body text
● thumbnail image
● others like genre
● past reading history
Target: reading time
✅ Doesn't strongly correlate with reading time

Summary 3: Multimodal
● text length
● headline / body text
● thumbnail image
● others like genre
● past reading history
Target: reading time
✅ Boosted performance
Research questions
1. How much does text length correlate with reading time?
2. How much do features other than text length improve the performance of reading time estimation?
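The first question can be illustrated with a plain correlation coefficient. The following is a minimal sketch, assuming Pearson correlation as the measure (the slides do not say which coefficient was used); the `text_length` and `reading_time` values are made-up placeholders, not data from Nikkei.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical per-article values: characters and seconds (illustration only).
text_length = [400, 800, 1200, 1600, 2000]
reading_time = [35, 30, 70, 40, 65]

r = pearson(text_length, reading_time)
print(f"correlation between text length and reading time: {r:.3f}")
```

A value of `r` near 1 would mean text length alone explains most of the variation in reading time; a weaker value leaves room for the other features to help.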
Reading time dataset
● A large dataset that includes reading time for Japanese financial news from Nikkei.
○ About 1,000 articles a day, 800,000 paid subscribers (and data infrastructure)
○ Larger and more scalable than some existing data from recording eye movements [8] [9] and brain activity [10]
Experiments: Features & Models
1. The model was fixed to LightGBM [16] and the features were explored.
2. The features were fixed and the differences between models were observed.
a. Ridge regression
b. MLP
c. Proposed method (with and without end-to-end fine-tuning)
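As a rough illustration of the simplest model in the comparison, the sketch below fits a ridge regression on a single feature (text length) and scores it with RMSE. Everything here is an assumption made for illustration: the helper names `fit_ridge_1d` and `rmse`, the `alpha` value, the choice of RMSE as the metric, and the toy numbers. It is not the authors' implementation, which used many more features.

```python
from math import sqrt


def fit_ridge_1d(x, y, alpha=1.0):
    """Fit y ~ w * x + b, penalizing w with alpha * w**2 (intercept unpenalized)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    w = sxy / (sxx + alpha)  # closed-form ridge solution for one centered feature
    b = y_mean - w * x_mean
    return w, b


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical training data: characters and seconds (illustration only).
train_length = [400, 800, 1200, 1600, 2000]
train_time = [35, 30, 70, 40, 65]

w, b = fit_ridge_1d(train_length, train_time, alpha=1.0)
predictions = [w * length + b for length in train_length]
print(f"slope={w:.4f}, intercept={b:.2f}, train RMSE={rmse(train_time, predictions):.2f}")
```

Holding the feature set fixed and swapping only the model, as in step 2, would mean replacing `fit_ridge_1d` with another estimator while keeping the same inputs and the same metric.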
Important features by LightGBM
1. mean reading time
2. text length
3. minimum reading time
4. embedding of body text (dimension 193)
5. embedding of thumbnail image (dimension 88)
Multimodal training tips
● Different learning rates: 2e-5 for BERT, 1e-4 for Swin Transformer, and 1e-2 for the others
● CosineAnnealingLR: for training stability
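The two tips above can be sketched without any framework. The block below applies the closed-form cosine annealing curve to three parameter groups with the learning rates from the slide. The group names, the number of steps `t_max`, and the floor `eta_min` are assumptions for illustration; in PyTorch the same idea is usually expressed by passing per-group `lr` values to the optimizer and wrapping it with `torch.optim.lr_scheduler.CosineAnnealingLR`.

```python
from math import cos, pi

# Base learning rates from the slide; the group names are hypothetical labels.
BASE_LRS = {"bert": 2e-5, "swin": 1e-4, "others": 1e-2}


def cosine_annealing_lr(base_lr, step, t_max, eta_min=0.0):
    """Closed-form cosine annealing: decays from base_lr to eta_min over t_max steps."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * step / t_max))


def schedule(base_lrs, t_max, eta_min=0.0):
    """Return one {group: learning rate} dict per step, for steps 0..t_max."""
    return [
        {name: cosine_annealing_lr(lr, step, t_max, eta_min) for name, lr in base_lrs.items()}
        for step in range(t_max + 1)
    ]


# t_max=10 is an arbitrary choice for the demonstration.
for step, lrs in enumerate(schedule(BASE_LRS, t_max=10)):
    print(step, {name: f"{lr:.2e}" for name, lr in lrs.items()})
```

Each group follows the same cosine shape scaled to its own starting rate, so the pretrained text and image encoders always move with much smaller steps than the newly initialized layers.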
Conclusion
● We highlighted the importance of reading time and evaluated the implementation.
● Our analysis revealed that reading time does not strongly correlate with text length.
● Our experiments showed that a multimodal machine learning approach led to a more accurate estimation than simply using text length.