Generating a Pairwise Dataset for Click-through Rate Prediction of News Articles Considering Positions and Contents

Shotaro Ishihara (Nikkei, Inc.), Yasufumi Nakama
2022 Computation + Journalism Conference, June 9-11, 2022

Research Overview
• Click-through rate (CTR) prediction is a common task and is useful for evaluating the quality of headlines and thumbnail images.
• However, a CTR prediction model trained on user log data is heavily affected by the display position.
• Therefore, this research proposes a method for generating a pairwise dataset for training a CTR prediction model within a pairwise learning-to-rank framework.
• We verified its usefulness through experiments and discussed its potential for editing support.

Nikkei Overview
• Nikkei's core business is newspaper publishing. Total print and digital subscribers of the Nikkei reach around 3 million.
• The Nikkei is known as the must-read paper for Japanese professionals, with extensive coverage of Japan's economy, industry and markets.
• With more than 40 affiliated companies, the group's business extends to publishing, broadcasting, events, database services and index business.
• The Financial Times is also part of the Nikkei.

Outline
• Introduction
• Related Works
• Proposed Method
• Experiments
• Use Case for Editing Support
• Conclusion and Future Work

Headlines and thumbnail images matter
• Many news services display a list of articles, and individual article pages often link to related articles.
• Readers also decide whether to move on to the article page based on the information shown by external traffic sources, such as social networking services and browser searches.

One solution for measuring quality
Nikkei, Inc. uses pattern tests that present multiple options.
[Figure: distribution rate after publication, for a randomized controlled trial and for a pattern test with multi-armed bandit]

Practical difficulties in online evaluation
• For news of high importance, it is sometimes desirable to present uniform information to all readers.
• Low-quality options may negatively affect the user experience during experiments, and this must be taken into account.
• The workload of editors increases, because they need to produce several candidates of sufficiently high quality to present to readers.

Editing support of the CTR prediction model
[Figure: workflow with the labels headline, thumbnail image, model, create/update, feedback, predicted CTR, and publish]

Position bias: Difficulty in machine learning
• The higher the display position, the higher the CTR.
• If raw CTR data is used directly as a training dataset, the resulting prediction model may give more weight to the display position than to the information in the article itself.

Pairwise learning-to-rank
• We construct a pairwise dataset using the similarity of display positions and contents.
• We build a model within a learning-to-rank framework, which focuses more on content information by learning to compare the two articles in a pair.
[Figure: model; CTR: 0.05, 0.01]

Related works in three perspectives
• CTR prediction: Deep learning [25, 26], Multi-modal [13]
• Position bias: Pairwise learning-to-rank [9, 23]
• Editing support: CTR prediction [17], Headline generation [16, 24]
Case study on Yahoo! News, which is similar in problem setting.

Position of this research
• CTR prediction: Deep learning [25, 26], Multi-modal [13]
• Position bias: Pairwise learning-to-rank [9, 23]
• Editing support: CTR prediction [17], Headline generation [16, 24]
1. Consideration of position bias derived from the service UI.
2. Not only headlines but also thumbnail images.
3. Discussion of a use case in headline generation.

Overview of the proposed method
[Figure: flow of the proposed method]
• Input: CTR of individual articles
• Generating a pairwise dataset: articles are grouped into sets by display position and cluster number (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000)
• Extracting pairs of two articles from sets that satisfy the set size condition
• Building a model for predicting CTR using pairwise learning-to-rank (model; CTR: 0.05, 0.01)

Clustering and creating candidate sets
[Figure: CTR of individual articles grouped into sets by display position and cluster number (display position = 1, cluster number = 1; display position = 1, cluster number = 2; …; display position = 10, cluster number = 1000)]
📝 Notes:
• Clustering: k-means++ [1]
• Vectorizing: TF-IDF [19]
• Hyperparameters: the number of clusters
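The grouping step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout, the article IDs, and the CTR values are hypothetical, and the cluster numbers are assumed to have already been produced by k-means++ over TF-IDF vectors.

```python
from collections import defaultdict

# Hypothetical records: (article_id, display_position, cluster_id, ctr).
# cluster_id is assumed to come from k-means++ over TF-IDF vectors of the
# article contents; that clustering step is not shown here.
articles = [
    ("a1", 1, 0, 0.050),
    ("a2", 1, 0, 0.010),
    ("a3", 1, 1, 0.030),
    ("a4", 2, 0, 0.020),
    ("a5", 2, 0, 0.025),
]


def build_candidate_sets(records):
    """Group articles that share both display position and cluster number."""
    sets = defaultdict(list)
    for article_id, position, cluster_id, ctr in records:
        sets[(position, cluster_id)].append((article_id, ctr))
    return dict(sets)


candidate_sets = build_candidate_sets(articles)
```

Each key is a (display position, cluster number) pair, so articles inside one set were shown at the same position and have similar contents.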

Extracting pairs of two articles
[Figure: pairs of two articles are extracted from the sets (keyed by display position and cluster number) that satisfy the set size condition]
📝 Notes:
• Hyperparameters: maximum set size
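One way to read the set size condition is sketched below. It assumes that a set is used only when it holds at least two articles and no more than the maximum set size; the slide names the hyperparameter but not the exact rule, so this rule, the function name, and the toy data are all assumptions.

```python
from itertools import combinations


def extract_pairs(candidate_sets, max_set_size):
    """Return every two-article pair from sets with 2..max_set_size members.

    Assumption: larger sets are skipped entirely. The source only states
    that a maximum set size is a hyperparameter.
    """
    pairs = []
    for _key, members in sorted(candidate_sets.items()):
        if 2 <= len(members) <= max_set_size:
            pairs.extend(combinations(members, 2))
    return pairs


# Hypothetical candidate sets keyed by (display position, cluster number).
candidate_sets = {
    (1, 0): [("a1", 0.050), ("a2", 0.010)],
    (1, 1): [("a3", 0.030)],
    (2, 0): [("a4", 0.020), ("a5", 0.025), ("a6", 0.015), ("a7", 0.040)],
}

pairs = extract_pairs(candidate_sets, max_set_size=3)
```

With `max_set_size=3`, only the two-article set contributes a pair: the single-article set has nothing to compare, and the four-article set exceeds the limit.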

Building a model by pairwise learning-to-rank
[Figure: the extracted pairs are used to build a model for predicting CTR with pairwise learning-to-rank (model; CTR: 0.05, 0.01)]
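Before training, each extracted pair needs a target that says which article should rank higher. A minimal sketch, assuming the label is +1 when the first article has the higher CTR and -1 otherwise, and that ties are dropped (both choices are assumptions, as are the names and values):

```python
def label_pairs(pairs):
    """Turn ((id_a, ctr_a), (id_b, ctr_b)) pairs into (id_a, id_b, y)."""
    examples = []
    for (id_a, ctr_a), (id_b, ctr_b) in pairs:
        if ctr_a == ctr_b:
            continue  # a tie carries no ranking signal in this sketch
        y = 1 if ctr_a > ctr_b else -1
        examples.append((id_a, id_b, y))
    return examples


# Hypothetical pairs taken from sets with the same position and cluster.
pairs = [
    (("a1", 0.050), ("a2", 0.010)),
    (("a4", 0.020), ("a5", 0.025)),
    (("a6", 0.030), ("a7", 0.030)),
]

examples = label_pairs(pairs)
```

Because both articles in a pair share a display position, the label reflects a difference in content rather than in position.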

Margin Ranking Loss
The loss function we use for pairwise learning-to-rank is the margin ranking loss.
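As a rough illustration, the standard margin ranking loss (the definition used by PyTorch's `MarginRankingLoss`) can be written in plain Python. The margin value and the scores below are hypothetical, and it is an assumption that the slide uses exactly this form.

```python
def margin_ranking_loss(score_a, score_b, y, margin=0.0):
    """Standard margin ranking loss for one pair.

    loss = max(0, -y * (score_a - score_b) + margin)
    y = +1 means article a should rank higher, y = -1 means article b should.
    """
    return max(0.0, -y * (score_a - score_b) + margin)


# Hypothetical model scores for the two articles in a pair.
loss_when_order_is_right = margin_ranking_loss(2.0, 1.0, y=1, margin=0.5)
loss_when_order_is_wrong = margin_ranking_loss(2.0, 1.0, y=-1, margin=0.5)
```

The loss is zero when the pair is ordered correctly by at least the margin, and it grows linearly with the score gap when the order is wrong.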

Dataset from the Nikkei Online Edition
• SingleCTR: Raw CTR data.
• PatternCTR:
  ◦ Pattern test results.
  ◦ We use accuracy on it as the evaluation metric.
• PairwiseCTR:
  ◦ Generated from SingleCTR.
  ◦ We use it for training and validation.
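Accuracy on pattern test results could be computed along these lines. This sketch assumes each pattern test compares two options and records which one won; the data layout, the toy scores, and the function name are hypothetical.

```python
def pairwise_accuracy(pattern_tests, predict):
    """Share of pattern tests whose predicted winner matches the observed one.

    pattern_tests: list of (option_a, option_b, a_won) tuples.
    predict: callable returning a score for one option.
    """
    correct = 0
    for option_a, option_b, a_won in pattern_tests:
        predicted_a_wins = predict(option_a) > predict(option_b)
        correct += int(predicted_a_wins == a_won)
    return correct / len(pattern_tests)


# Hypothetical scores standing in for a trained model's output.
toy_scores = {"h1": 0.9, "h2": 0.4, "h3": 0.2, "h4": 0.7}
pattern_tests = [
    ("h1", "h2", True),
    ("h3", "h4", True),
    ("h2", "h3", True),
    ("h4", "h1", False),
]

accuracy = pairwise_accuracy(pattern_tests, toy_scores.get)
```

Here the toy scores agree with three of the four observed outcomes.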

Models
Four types of models are prepared:
• Baseline: takes the headline and thumbnail image as input.
• Baseline + display position + published date time: adds this information as input.
• Baseline + fixed CTR: corrects the CTR of the training dataset.
• Proposed method: trained with PairwiseCTR.
[Figure: model architecture with the labels headline, BERT, thumbnail image, EfficientNet, display position, published date time, and fully connected layer]

Result tables

Result summary
• Baseline: suggested the existence of position bias.
• Baseline + display position + published date time: showed improvement for headlines, while no clear performance improvement could be confirmed for thumbnail images.
• Baseline + fixed CTR: did not contribute to the performance.
• Proposed method: showed particularly high performance for thumbnail images. There was also some improvement for headlines compared to the baseline, with results as high as 0.720 in some cases.

Workflow in automatic headline generation
• The editor's decision-making can be assisted with the predicted CTR.
• The predicted CTR could also serve as one perspective for summarization.
• We can also present a visualization of the weights.

Be careful not to create clickbait
• It is necessary to be aware of clickbait issues.
• Even if the CTR is high, headlines and thumbnail images that do not match the body text would damage the user experience.
• We are also tackling this issue, for example by building a recognizing textual entailment model.

Conclusion and Future Work
• This research proposed a method to generate a pairwise dataset for building a CTR prediction model within a pairwise learning-to-rank framework that takes position bias into account.
• The experiments indicated potential for better performance, and we explained the practical use as editing support.
• Future work is to expand the evaluation dataset for larger-scale performance evaluation.
📘 https://speakerdeck.com/upura/
