

Yusuke Sasai
June 20, 2025

[Paper Introduction] LLM-generated Explanations for Recommender Systems

2025/06/16

Paper introduction @ TanichuLab

https://sites.google.com/view/tanichu-lab-ku/


Transcript

  1. Paper Information
     • Title: LLM-generated Explanations for Recommender Systems
     • Authors: Sebastian Lubos, Thi Ngoc Trang Tran, Alexander Felfernig, Seda Polat Erdeniz, Viet-Man Le
     • Venue: ACM UMAP 2024, 28 June 2024
     • https://dl.acm.org/doi/abs/10.1145/3631700.3665185
  2. Background
     • A recommender system proposes relevant items to a user, taking their individual preferences into account.
     • Providing explanations that clearly communicate the reasons for a recommendation improves the overall user experience.
  3. Problem
     • Recommender systems often lack transparency and understandability.
       • Transparency: how clearly the reasons for a recommendation are disclosed.
       • Understandability: how well the user can comprehend the explanation.
     • Generating "robust and sound natural language explanations" is an ongoing research topic.
     • Large Language Models (LLMs) are considered a very effective approach to this problem, thanks to their advanced NLP capabilities.
  4. Research Goal
     To validate the effectiveness of personalized explanations generated by Large Language Models (LLMs) for items recommended by a recommender system. The authors define three research questions (RQs):
     1. Do users prefer LLM-generated explanations compared to those from existing methods?
     2. How do users rate the quality of LLM-generated explanations compared to those from existing methods?
     3. What features of LLM-generated explanations are valued by users?
  5. Explanation Types for Recommendations (Baselines)
     Explanation types can be divided into three categories based on the recommendation algorithm [Tran 21]:
     • Feature-based Explanations
     • Item-based Explanations
     • Knowledge-based Explanations

     [Tran 21] Thi Ngoc Trang Tran, Viet Man Le, Muesluem Atas, Alexander Felfernig, Martin Stettinger, and Andrei Popescu. 2021. Do Users Appreciate Explanations of Recommendations? An Analysis in the Movie Domain. In Fifteenth ACM Conference on Recommender Systems. 645-650.
  6. Feature-based Explanations (FBExp)
     • Feature-based recommendation: a method that recommends items based on the similarity between user preferences and item features (e.g., a user who says "I like dinosaurs" is recommended a dinosaur movie). A minimal scoring sketch follows below.
     • Example explanation: "The movie 'Legends of the Fall' is recommended because you like the Romance, Drama, War, and Western genres. Additionally, 'Legends of the Fall' is similar to movies you have liked in the past."
  7. Item-based Explanations (IBExp)
     • Item-based (collaborative filtering) recommendation: a method that uses past user ratings to recommend similar items. A minimal similarity sketch follows below.
     • Example explanation: "'The Shawshank Redemption' is recommended to you based on your ratings of 'Forrest Gump', 'The Hateful Eight', and 'Up', because other users with similar preferences rated this movie positively."
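One common way to realize item-based collaborative filtering is cosine similarity between item rating vectors, with users as the dimensions. A minimal sketch under that assumption; the ratings below are illustrative, not from the paper:

```python
import math
from typing import Dict

# item -> {user: rating}; values are illustrative.
ratings: Dict[str, Dict[str, float]] = {
    "Forrest Gump":             {"u1": 5, "u2": 4, "u3": 5},
    "The Hateful Eight":        {"u1": 4, "u2": 5},
    "Up":                       {"u1": 4, "u3": 3},
    "The Shawshank Redemption": {"u1": 5, "u2": 5, "u3": 4},
}

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two item rating vectors (users as axes)."""
    common = a.keys() & b.keys()
    dot = sum(a[u] * b[u] for u in common)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# How similar is the candidate to the items the user already rated?
for seen in ("Forrest Gump", "The Hateful Eight", "Up"):
    print(seen, cosine(ratings["The Shawshank Redemption"], ratings[seen]))
```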
  8. Knowledge-based Explanations (KBExp)
     • Knowledge-based recommendation: a method that recommends items based on preferences the user specifies explicitly (e.g., "I want to watch an action movie"). A minimal constraint-filtering sketch follows below.
     • Example explanation: "The movie 'Indiana Jones and the Kingdom of the Crystal Skull' is recommended because you want to watch an Action movie directed by Steven Spielberg and starring Harrison Ford."
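At its simplest, knowledge-based recommendation filters the catalog by the constraints the user states. A minimal sketch; the catalog entries and field names are illustrative:

```python
# Hypothetical catalog entries; fields are illustrative.
catalog = [
    {"title": "Indiana Jones and the Kingdom of the Crystal Skull",
     "genre": "Action", "director": "Steven Spielberg", "star": "Harrison Ford"},
    {"title": "Legends of the Fall",
     "genre": "Drama", "director": "Edward Zwick", "star": "Brad Pitt"},
]

# Preferences the user stated explicitly.
constraints = {"genre": "Action", "director": "Steven Spielberg",
               "star": "Harrison Ford"}

# Keep only the items that satisfy every stated constraint.
matches = [m["title"] for m in catalog
           if all(m.get(field) == value for field, value in constraints.items())]
print(matches)  # -> ['Indiana Jones and the Kingdom of the Crystal Skull']
```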
  9. Comparing the Three Types of Explanations
     • Feature-based Explanations (FBExp)
       • Features: recommendations are computed from the similarity between user preferences and item characteristics.
       • Evaluation criteria: Transparency (explains how the system works), Trust (increases user trust in the system).
     • Item-based Explanations (IBExp)
       • Features: recommends similar items based on ratings the user gave to movies watched in the past.
       • Evaluation criterion: Efficiency (helps users make decisions faster).
     • Knowledge-based Explanations (KBExp)
       • Features: recommends items using explicitly specified user preferences.
       • Evaluation criteria: Persuasiveness (how well the recommendation matches the user's specified preferences), Satisfaction (increases satisfaction with recommended items).
  10. Experiment Setup
      • Objective: compare explanations based on each recommendation method with LLM-generated explanations to answer the research questions.
      • Research Questions:
        1. Do users prefer LLM-generated explanations compared to those from existing methods?
        2. How do users rate the quality of LLM-generated explanations compared to those from existing methods?
        3. What features of LLM-generated explanations are valued by users?
  11. Experiment Setup
      • Dataset
        • Movie information and reviews: MovieLens Latest Small
        • Detailed movie information: TMDB API
      • Generation of recommendations
        • Recommended items are determined by a simple algorithm based on each baseline method; a data-loading sketch follows below.
        • The LLM is not involved in selecting the recommended items.
      (Figure: screenshots of the user study.)
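For context, a minimal sketch of loading the MovieLens "latest-small" release with pandas. The file names and columns follow the official GroupLens distribution; the local path is an assumption:

```python
import pandas as pd

# MovieLens "latest-small" ships as CSV files.
movies = pd.read_csv("ml-latest-small/movies.csv")    # movieId, title, genres
ratings = pd.read_csv("ml-latest-small/ratings.csv")  # userId, movieId, rating, timestamp

# Genres are pipe-separated strings, e.g. "Adventure|Animation|Children".
movies["genres"] = movies["genres"].str.split("|")
print(movies.head())
```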
  12. Experiment Setup
      • LLM used in the experiment: Llama2-13B
        • An open-source LLM with high benchmark scores at the time the research was conducted.
        • Temperature = 0.01, max tokens = 2000
      • Generation of explanations
        • Baseline methods: explanations reflecting the features of each recommendation method are generated from a template.
        • LLM: explanations are generated using an individual prompt template for each recommendation method (shown in a figure on the slide); a hedged generation sketch follows below.
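A hedged sketch of how such an explanation could be generated with the stated settings (temperature=0.01, max tokens=2000). The prompt wording here is illustrative, not the paper's template, and the model id is the public Hugging Face release of Llama-2-13B chat:

```python
from transformers import pipeline

# Load the generation pipeline (requires access to the gated Llama-2 weights).
generator = pipeline("text-generation", model="meta-llama/Llama-2-13b-chat-hf")

# Illustrative prompt standing in for the paper's per-method templates.
prompt = (
    "The movie 'Legends of the Fall' was recommended to a user who likes the "
    "genres Romance, Drama, War, and Western. Write a short, personalized "
    "explanation of why this movie fits their preferences."
)

# Settings from the slide: near-deterministic sampling, generous token budget.
out = generator(prompt, do_sample=True, temperature=0.01, max_new_tokens=2000)
print(out[0]["generated_text"])
```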
  13. Experiment Setup
      • Participants: 97, divided into 3 groups (FBExp, IBExp, KBExp) to compare explanations.
      • Each participant compares the following two explanations:
        • Baseline: explanation generated from the template
        • LLM-Exp: explanation generated by the LLM from the template input
  14. Evaluation Metrics
      All questions were answered on a 5-point Likert scale.
      • Metrics common to all explanation types:
        • Understandability of the explanation
        • Overall satisfaction with the recommendation
        • Effectiveness of the explanation in helping to evaluate the movie
      • Metrics specific to each explanation type:
        • Feature-based (FBExp): Transparency of the recommendation, Trust in the system
        • Item-based (IBExp): Efficiency in supporting decision-making
        • Knowledge-based (KBExp): Persuasiveness of the recommendation, Satisfaction with the item
  15. Evaluation Metrics
      • Explanation preference
        • Which of the two displayed explanations do you prefer (or do you prefer neither)?
        • If you have a preferred explanation, rate its characteristics on a 5-point scale: Clarity, Creativity, Level of detail, Time, General quality, Consideration of preferences, Length of the presented explanation.
      • Example FBExp explanations:
        • Baseline: "The movie 'Legends of the Fall' is recommended to you because you like Romance, Drama, War, and Western genres, and the movie 'Legends of the Fall' is similar to the ones you liked before."
        • LLM: "We recommend 'Legends of the Fall' as it aligns with your preferred genres of Drama, Romance, and War. This epic tale set in the early 20th century follows the lives of two brothers and their families, exploring themes of love, loss, and loyalty amidst the backdrop of World War I. With its sweeping landscapes and emotional depth, this film is sure to captivate you with its timeless storytelling."
  16. Contents
      1. Background / Research Goal
      2. Explanation Types
      3. Experiment
      4. Results
      5. Conclusion
  17. Results - RQ1
      • Q: Do users prefer LLM-generated explanations compared to those from existing methods?
      • A: Users clearly tend to prefer the LLM-generated explanations.
      (Figure: selection of preference between the baseline explanation and the LLM-generated explanation.)
  18. Results - RQ2
      • Q: How do users rate the quality of LLM-generated explanations compared to those from existing methods?
      • A: LLM-generated explanations received higher ratings on almost all metrics.
      (Figure: mean ratings for each evaluation metric, on a 5-point scale.)
  19. Results - RQ3
      • Q: What characteristics of LLM-generated explanations are valued by users?
      • A: Significantly higher ratings were observed for many characteristics; only the "Length" characteristic received a low rating.
      (Figure: mean ratings for each characteristic, with p-values from a binomial test on the likelihood of each characteristic receiving a high rating. A sketch of such a test follows below.)
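To illustrate the kind of binomial test described: did a characteristic receive "high" ratings (say, 4 or 5 on the 5-point scale) more often than chance? A minimal sketch with scipy; the counts below are illustrative, not the paper's data:

```python
from scipy.stats import binomtest

# Illustrative counts: of 97 participants, suppose 70 gave a characteristic
# a "high" rating (4 or 5 on the 5-point scale).
n_high, n_total = 70, 97

# One-sided binomial test: is a high rating more likely than chance (p = 0.5)?
result = binomtest(n_high, n_total, p=0.5, alternative="greater")
print(result.pvalue)
```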
  20. Discussion
      • The explanations generated by the LLM were rated highly, particularly for creativity, level of detail, and personalization.
      • The LLM's vast background knowledge may explain why the explanations include information not found in the baselines.
      • Limitations
        • A single domain (movies)
        • A fixed, single LLM model and prompt
        • No use of external knowledge, e.g., via RAG
  21. Conclusion
      • The authors investigated the effectiveness of post-hoc explanations generated by LLMs in recommender systems.
      • The experimental results showed that, compared to baseline explanations, LLM explanations were preferred more often and their quality was rated significantly higher.
      • In particular, LLM explanations tended to be rated highly for creativity and detail, suggesting that an LLM's inherent knowledge and natural language capabilities can enhance the user experience.