Generating a Pairwise Dataset for Click-through Rate Prediction of News Articles Considering Positions and Contents

Presentation slides used at the 2022 Computation + Journalism Conference
https://cj2022.brown.columbia.edu/

Shotaro Ishihara

June 09, 2022

Transcript

  1. Generating a Pairwise Dataset for Click-through Rate Prediction of News Articles Considering Positions and Contents

    Shotaro Ishihara (Nikkei, Inc.), Yasufumi Nakama
    shotaro.ishihara@nex.nikkei.com
    2022 Computation + Journalism Conference, June 9-11, 2022
  2. Research Overview

    • Click-through rate (CTR) prediction is a common task and is useful for evaluating the quality of headlines and thumbnail images.
    • However, a CTR prediction model trained on user log data is heavily affected by the display position.
    • Therefore, this research proposes a method for generating a pairwise dataset to train a CTR prediction model within a pairwise learning-to-rank framework.
    • We verified its usefulness through experiments and discussed its potential for editing support.
  3. Nikkei Overview

    • Nikkei's core business is newspaper publishing. Total print and digital subscribers of the Nikkei reach around 3 million.
    • The Nikkei is known as the must-read paper for Japanese professionals, with extensive coverage of Japan's economy, industry and markets.
    • With more than 40 affiliated companies, the group's business extends to publishing, broadcasting, events, database services and index business.
    • The Financial Times is also part of the Nikkei.
  4. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  5. Headlines and thumbnail images matter

    • Many news services display a list of articles, and individual article pages often point readers to related articles.
    • Readers also decide whether to move on to the article page based on the information displayed at external referral sources, such as social networking services and browser searches.
  6. One of the solutions to measure the quality

    Nikkei, Inc. uses pattern tests, which present multiple options to readers.

    [Figure: randomized controlled trial compared with a pattern test using a multi-armed bandit; labels: Published, Distribution rate]
  7. Practical difficulties in online evaluation

    • For news of high importance, it is sometimes desirable to present uniform information to all readers.
    • The possibility that low-quality options negatively affect the user experience during experiments must be taken into account.
    • The workload of editors increases, because they need to produce several candidates of sufficiently high quality to present to readers.
  8. Editing support of the CTR prediction model

    [Diagram labels: headline, thumbnail image, model, create/update, feedback, predicted CTR, publish]
  9. Position bias: Difficulty in machine learning

    • The higher the display position, the higher the CTR.
    • If raw CTR data is used directly as a training dataset, the resulting prediction model may place more importance on the display position than on the information of the article itself.
  10. Pairwise learning-to-rank

    • We construct a pairwise dataset using the similarity of display positions and contents.
    • We build a model with a learning-to-rank framework, which focuses more on content information by learning to compare the two pairs of articles.

    [Figure labels: model; CTR: 0.05, 0.01]
  11. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  12. Related works in three perspectives

    • CTR prediction: Deep learning [25, 26], Multi-modal [13]
    • Position bias: Pairwise learning-to-rank [9, 23]
    • Editing support: CTR prediction [17], Headline generation [16, 24]

    Case study on Yahoo! News, which is similar in problem setting.
  13. Position of this research

    • CTR prediction: Deep learning [25, 26], Multi-modal [13]
    • Position bias: Pairwise learning-to-rank [9, 23]
    • Editing support: CTR prediction [17], Headline generation [16, 24]

    1. Consideration of position bias derived from service UI.
    2. Not only headlines but also thumbnail images.
    3. Discussion on use case of headline generation.
  14. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  15. Overview of the proposed method

    • Input: CTR of individual articles.
    • Generating a pairwise dataset: candidate sets by display position and cluster number (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000).
    • Extracting two pairs of articles from a set that satisfies the set size condition.
    • Building a model for predicting CTR using pairwise learning-to-rank.

    [Figure labels: model; CTR: 0.05, 0.01]
  16. Clustering and creating candidate sets

    [Figure: CTR of individual articles; generating a pairwise dataset (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000)]

    📝 Notes:
    • Clustering: k-means++ [1]
    • Vectorizing: TF-IDF [19]
    • Hyperparameters: The number of clusters
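
The clustering step named in the notes above can be sketched as follows. This is only an illustration under stated assumptions: it uses scikit-learn's `TfidfVectorizer` and `KMeans` with `init="k-means++"`, and the toy English headlines, the dictionary keys, the helper name `build_candidate_sets`, and the choice of two clusters are all hypothetical rather than taken from the slides. Real Japanese headlines would also need a suitable tokenizer, which is omitted here.

```python
from collections import defaultdict

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def build_candidate_sets(articles, n_clusters, seed=0):
    """Group articles by (display position, content cluster).

    `articles` is a list of dicts with the hypothetical keys
    "headline", "position", and "ctr".
    """
    # Vectorize the headline text with TF-IDF.
    headlines = [a["headline"] for a in articles]
    vectors = TfidfVectorizer().fit_transform(headlines)

    # Cluster the vectors; k-means++ is the centroid initialization scheme.
    kmeans = KMeans(n_clusters=n_clusters, init="k-means++",
                    n_init=10, random_state=seed)
    labels = kmeans.fit_predict(vectors)

    # One candidate set per (display position, cluster number) key.
    candidate_sets = defaultdict(list)
    for article, label in zip(articles, labels):
        candidate_sets[(article["position"], int(label))].append(article)
    return dict(candidate_sets)


# Toy data (hypothetical, for illustration only).
articles = [
    {"headline": "stocks rise as markets rally", "position": 1, "ctr": 0.05},
    {"headline": "markets rally on strong earnings", "position": 1, "ctr": 0.03},
    {"headline": "new stadium opens for football season", "position": 1, "ctr": 0.02},
    {"headline": "football season starts with big win", "position": 2, "ctr": 0.01},
    {"headline": "stocks fall as markets slump", "position": 2, "ctr": 0.02},
    {"headline": "earnings lift markets and stocks", "position": 2, "ctr": 0.04},
]
candidate_sets = build_candidate_sets(articles, n_clusters=2)
```

The number of clusters is the hyperparameter mentioned in the notes; larger values give smaller, more homogeneous candidate sets.
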
  17. Extracting two pairs of articles

    [Figure: CTR of individual articles; generating a pairwise dataset (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000); extracting two pairs of articles from a set that satisfies the set size condition]

    📝 Notes:
    • Hyperparameters: Maximum set size
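
A minimal sketch of the extraction step, using only the Python standard library. The slides do not spell out the set size condition, so this assumes a set qualifies when it holds at least two articles and no more than the maximum set size; the helper name `extract_pairs`, the dictionary keys, the toy CTR values, and the choice to skip ties are likewise assumptions.

```python
from itertools import combinations


def extract_pairs(candidate_sets, max_set_size):
    """Return (article_a, article_b, label) tuples from qualifying sets.

    label is 1 if article_a has the higher CTR, otherwise -1.
    """
    pairs = []
    for members in candidate_sets.values():
        # Assumed reading of the "set size condition": at least two
        # articles and no more than max_set_size.
        if not 2 <= len(members) <= max_set_size:
            continue
        for a, b in combinations(members, 2):
            if a["ctr"] == b["ctr"]:
                continue  # assumption: ties carry no ranking signal
            pairs.append((a, b, 1 if a["ctr"] > b["ctr"] else -1))
    return pairs


# Toy candidate sets keyed by (display position, cluster number).
toy_sets = {
    (1, 0): [{"id": "a", "ctr": 0.05}, {"id": "b", "ctr": 0.01}],
    (1, 1): [{"id": "c", "ctr": 0.03}],
    (2, 0): [{"id": "d", "ctr": 0.02}, {"id": "e", "ctr": 0.04},
             {"id": "f", "ctr": 0.02}, {"id": "g", "ctr": 0.01}],
}
pairs_small = extract_pairs(toy_sets, max_set_size=3)
pairs_large = extract_pairs(toy_sets, max_set_size=4)
```

Because both articles in a pair share a display position and a content cluster, the label mostly reflects differences other than position, which is the motivation stated earlier in the deck.
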
  18. Building a model by pairwise learning-to-rank

    [Figure: CTR of individual articles; generating a pairwise dataset (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000); extracting two pairs of articles from a set that satisfies the set size condition; building a model for predicting CTR using pairwise learning-to-rank. Labels: model; CTR: 0.05, 0.01]
  19. Margin Ranking Loss

    The loss function we use for pairwise learning-to-rank:
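
The formula itself is not reproduced in this transcript. As a point of reference, the margin ranking loss as commonly defined (for example in PyTorch's `MarginRankingLoss`) is max(0, -y (s1 - s2) + margin), where s1 and s2 are the model scores for the two articles and y is 1 if the first should rank higher and -1 otherwise. The sketch below implements that common form in plain Python; whether the authors used exactly this form, and what margin they chose, is an assumption.

```python
def margin_ranking_loss(score_a, score_b, label, margin=0.0):
    """Margin ranking loss for one pair: max(0, -label * (a - b) + margin).

    label is 1 if article A should rank above article B, else -1.
    The margin value is a hypothetical choice, not taken from the slides.
    """
    return max(0.0, -label * (score_a - score_b) + margin)


# A correctly ordered pair with a gap larger than the margin gives zero loss.
loss_correct = margin_ranking_loss(0.8, 0.3, label=1, margin=0.1)
# A wrongly ordered pair is penalized in proportion to the score gap.
loss_wrong = margin_ranking_loss(0.3, 0.8, label=1, margin=0.1)
```

The loss depends only on the difference between the two scores, so the model is trained to order articles within a pair rather than to reproduce raw CTR values.
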
  20. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  21. Dataset from the Nikkei Online Edition

    • SingleCTR: Raw CTR data
    • PatternCTR:
      ◦ Pattern test results.
      ◦ We use accuracy on it as the evaluation metric.
    • PairwiseCTR:
      ◦ Generated from SingleCTR.
      ◦ We use it for training and validation.
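
One way to read "accuracy" on pattern test results is as pairwise accuracy: the share of option pairs for which the model scores the actual winner higher. The slides do not detail the computation, so the sketch below is an assumption, and the tuple layout, the helper name `pairwise_accuracy`, and the length-based stand-in scorer are hypothetical.

```python
def pairwise_accuracy(score_fn, pattern_pairs):
    """Fraction of pairs where the model ranks the actual winner higher.

    pattern_pairs: list of (option_a, option_b, a_won) tuples, where
    a_won is True if option A had the higher CTR in the pattern test.
    """
    correct = 0
    for option_a, option_b, a_won in pattern_pairs:
        predicted_a_wins = score_fn(option_a) > score_fn(option_b)
        correct += predicted_a_wins == a_won
    return correct / len(pattern_pairs)


# Stand-in scorer for illustration only: longer headlines score higher.
toy_pairs = [
    ("long headline text", "short", True),
    ("abc", "abcdef", True),
    ("ab", "abcd", False),
]
accuracy = pairwise_accuracy(len, toy_pairs)
```

In practice `score_fn` would be the trained CTR prediction model applied to a headline or thumbnail image.
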
  22. Models

    Four types of models are prepared:
    • Baseline: with headline and thumbnail image.
    • Baseline + display position + published date time: including this information as additional input.
    • Baseline + fixed CTR: correcting the CTR of the training dataset.
    • Proposed method: trained with PairwiseCTR.

    [Diagram labels: headline (BERT), thumbnail image (EfficientNet), display position, published date time, fully connected layer]
  23. Result tables

  24. Result summary

    • Baseline: suggested the existence of position bias.
    • Baseline + display position + published date time: showed improvement for headlines, while no clear performance improvement could be confirmed for thumbnail images.
    • Baseline + fixed CTR: did not contribute to the performance.
    • Proposed method: showed particularly high performance for thumbnail images. There was also some improvement for headlines compared to the baseline, in some cases obtaining results as good as 0.720.
  25. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  26. Workflow in automatic headline generation

    • Editors' decision-making can be assisted with the predicted CTR.
    • The predicted CTR can also serve as one perspective for summarization.
    • We can also present a visualization of the weights.
  27. Be careful not to create clickbait

    • It is necessary to be aware of clickbait issues.
    • Even if the CTR is high, headlines and thumbnail images that do not match the body text would damage the user experience.
    • We are also tackling this issue, for example by creating a recognizing textual entailment model.
  28. Outline

    • Introduction
    • Related Works
    • Proposed Method
    • Experiments
    • Use Case for Editing Support
    • Conclusion and Future Work
  29. Conclusion and Future Work

    • This research proposed a method to generate a pairwise dataset for creating a CTR prediction model in the framework of pairwise learning-to-rank, considering position bias.
    • The experiments indicated the potential for better performance, and practical use as editing support was explained.
    • Future work is to expand the evaluation dataset for larger-scale performance evaluation.

    📧 shotaro.ishihara@nex.nikkei.com
    📘 https://speakerdeck.com/upura/