

Yusuke Sasai
June 20, 2025

[Paper Introduction] LLM-generated Explanations for Recommender Systems

2025/06/16

Paper introduction @ TanichuLab

https://sites.google.com/view/tanichu-lab-ku/


Transcript

  1. Paper Information
     • Title: LLM-generated Explanations for Recommender Systems
     • Authors: Sebastian Lubos, Thi Ngoc Trang Tran, Alexander Felfernig, Seda Polat Erdeniz, Viet-Man Le
     • Venue: ACM UMAP 2024, 28 June 2024
     • https://dl.acm.org/doi/abs/10.1145/3631700.3665185
  2. Background
     • A recommender system proposes relevant items to a user, taking their individual preferences into account.
     • Providing explanations that clearly communicate the reasons for a recommendation improves the overall user experience.
  3. Problem
     • Recommender systems often lack transparency and understandability.
       • Transparency: how clearly the reasons for a recommendation are disclosed.
       • Understandability: how well the user can comprehend the explanation.
     • Generating "robust and sound natural language explanations" is an ongoing research topic.
     • Large Language Models (LLMs) are considered a very effective approach to this problem, thanks to their advanced NLP capabilities.
  4. Research Goal
     To validate the effectiveness of personalized explanations generated by Large Language Models (LLMs) for items recommended by a recommender system. The authors define three research questions (RQs):
     1. Do users prefer LLM-generated explanations compared to those from existing methods?
     2. How do users rate the quality of LLM-generated explanations compared to those from existing methods?
     3. What features of LLM-generated explanations are valued by users?
  5. Explanation Types for Recommendations (Baselines)
     Explanation types can be divided into three categories based on the recommendation algorithm [Tran 21]:
     • Feature-based Explanations
     • Item-based Explanations
     • Knowledge-based Explanations

     [Tran 21] Thi Ngoc Trang Tran, Viet Man Le, Muesluem Atas, Alexander Felfernig, Martin Stettinger, and Andrei Popescu. 2021. Do Users Appreciate Explanations of Recommendations? An Analysis in the Movie Domain. In Fifteenth ACM Conference on Recommender Systems. 645-650.
  6. Feature-based Explanations (FBExp)
     • Feature-based recommendation: a method that recommends items based on the similarity between user preferences and item features (e.g., a user who says "I like dinosaurs" is recommended a dinosaur movie). A minimal scoring sketch follows below.
     • Example explanation: "The movie 'Legends of the Fall' is recommended because you like the Romance, Drama, War, and Western genres. Additionally, 'Legends of the Fall' is similar to movies you have liked in the past."
  7. Item-based Explanations (IBExp)
     • Item-based (collaborative filtering) recommendation: a method that uses past user ratings to recommend similar items. A minimal similarity sketch follows below.
     • Example explanation: "'The Shawshank Redemption' is recommended to you based on your ratings of 'Forrest Gump', 'The Hateful Eight', and 'Up', because other users with similar preferences rated this movie positively."
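One common way to realize item-based collaborative filtering is cosine similarity between item rating vectors, with users as the dimensions. A minimal sketch under that assumption; the ratings below are illustrative, not from the paper:

```python
import math
from typing import Dict

# item -> {user: rating}; values are illustrative.
ratings: Dict[str, Dict[str, float]] = {
    "Forrest Gump":             {"u1": 5, "u2": 4, "u3": 5},
    "The Hateful Eight":        {"u1": 4, "u2": 5},
    "Up":                       {"u1": 4, "u3": 3},
    "The Shawshank Redemption": {"u1": 5, "u2": 5, "u3": 4},
}

def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two item rating vectors (users as axes)."""
    common = a.keys() & b.keys()
    dot = sum(a[u] * b[u] for u in common)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# How similar is the candidate to the items the user already rated?
for seen in ("Forrest Gump", "The Hateful Eight", "Up"):
    print(seen, cosine(ratings["The Shawshank Redemption"], ratings[seen]))
```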
  8. Knowledge-based Explanations (KBExp)
     • Knowledge-based recommendation: a method that recommends items based on preferences the user specifies explicitly (e.g., "I want to watch an action movie"). A minimal constraint-filtering sketch follows below.
     • Example explanation: "The movie 'Indiana Jones and the Kingdom of the Crystal Skull' is recommended because you want to watch an Action movie directed by Steven Spielberg and starring Harrison Ford."
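At its simplest, knowledge-based recommendation filters the catalog by the constraints the user states. A minimal sketch; the catalog entries and field names are illustrative:

```python
# Hypothetical catalog entries; fields are illustrative.
catalog = [
    {"title": "Indiana Jones and the Kingdom of the Crystal Skull",
     "genre": "Action", "director": "Steven Spielberg", "star": "Harrison Ford"},
    {"title": "Legends of the Fall",
     "genre": "Drama", "director": "Edward Zwick", "star": "Brad Pitt"},
]

# Preferences the user stated explicitly.
constraints = {"genre": "Action", "director": "Steven Spielberg",
               "star": "Harrison Ford"}

# Keep only the items that satisfy every stated constraint.
matches = [m["title"] for m in catalog
           if all(m.get(field) == value for field, value in constraints.items())]
print(matches)  # -> ['Indiana Jones and the Kingdom of the Crystal Skull']
```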
  9. Comparing the Three Types of Explanations
     • Feature-based Explanations (FBExp)
       • Features: recommendations are computed from the similarity between user preferences and item characteristics.
       • Evaluation criteria: Transparency (explains how the system works), Trust (increases user trust in the system).
     • Item-based Explanations (IBExp)
       • Features: recommends similar items based on ratings the user gave to movies watched in the past.
       • Evaluation criterion: Efficiency (helps users make decisions faster).
     • Knowledge-based Explanations (KBExp)
       • Features: recommends items using explicitly specified user preferences.
       • Evaluation criteria: Persuasiveness (how well the recommendation matches the user's specified preferences), Satisfaction (increases satisfaction with recommended items).
  10. Experiment Setup
      • Objective: compare explanations based on each recommendation method with LLM-generated explanations to answer the research questions.
      • Research Questions:
        1. Do users prefer LLM-generated explanations compared to those from existing methods?
        2. How do users rate the quality of LLM-generated explanations compared to those from existing methods?
        3. What features of LLM-generated explanations are valued by users?
  11. Experiment Setup
      • Dataset
        • Movie information and reviews: MovieLens Latest Small
        • Detailed movie information: TMDB API
      • Generation of recommendations
        • Recommended items are determined by a simple algorithm based on each baseline method; a data-loading sketch follows below.
        • The LLM is not involved in selecting the recommended items.
      (Figure: screenshots of the user study.)
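For context, a minimal sketch of loading the MovieLens "latest-small" release with pandas. The file names and columns follow the official GroupLens distribution; the local path is an assumption:

```python
import pandas as pd

# MovieLens "latest-small" ships as CSV files.
movies = pd.read_csv("ml-latest-small/movies.csv")    # movieId, title, genres
ratings = pd.read_csv("ml-latest-small/ratings.csv")  # userId, movieId, rating, timestamp

# Genres are pipe-separated strings, e.g. "Adventure|Animation|Children".
movies["genres"] = movies["genres"].str.split("|")
print(movies.head())
```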
  12. Experiment Setup
      • LLM used in the experiment: Llama2-13B
        • An open-source LLM with high benchmark scores at the time the research was conducted.
        • Temperature = 0.01, max tokens = 2000
      • Generation of explanations
        • Baseline methods: explanations reflecting the features of each recommendation method are generated from a template.
        • LLM: explanations are generated using an individual prompt template for each recommendation method (shown in a figure on the slide); a hedged generation sketch follows below.
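A hedged sketch of how such an explanation could be generated with the stated settings (temperature=0.01, max tokens=2000). The prompt wording here is illustrative, not the paper's template, and the model id is the public Hugging Face release of Llama-2-13B chat:

```python
from transformers import pipeline

# Load the generation pipeline (requires access to the gated Llama-2 weights).
generator = pipeline("text-generation", model="meta-llama/Llama-2-13b-chat-hf")

# Illustrative prompt standing in for the paper's per-method templates.
prompt = (
    "The movie 'Legends of the Fall' was recommended to a user who likes the "
    "genres Romance, Drama, War, and Western. Write a short, personalized "
    "explanation of why this movie fits their preferences."
)

# Settings from the slide: near-deterministic sampling, generous token budget.
out = generator(prompt, do_sample=True, temperature=0.01, max_new_tokens=2000)
print(out[0]["generated_text"])
```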
  13. Experiment Setup
      • Participants: 97, divided into 3 groups (FBExp, IBExp, KBExp) to compare explanations.
      • Each participant compares the following two explanations:
        • Baseline: explanation generated from the template
        • LLM-Exp: explanation generated by the LLM from the template input
  14. Evaluation Metrics
      All questions were answered on a 5-point Likert scale.
      • Metrics common to all explanation types:
        • Understandability of the explanation
        • Overall satisfaction with the recommendation
        • Effectiveness of the explanation in helping to evaluate the movie
      • Metrics specific to each explanation type:
        • Feature-based (FBExp): Transparency of the recommendation, Trust in the system
        • Item-based (IBExp): Efficiency in supporting decision-making
        • Knowledge-based (KBExp): Persuasiveness of the recommendation, Satisfaction with the item
  15. Evaluation Metrics
      • Explanation preference
        • Which of the two displayed explanations do you prefer (or do you prefer neither)?
        • If you have a preferred explanation, rate its characteristics on a 5-point scale: Clarity, Creativity, Level of detail, Time, General quality, Consideration of preferences, Length of the presented explanation.
      • Example FBExp explanations:
        • Baseline: "The movie 'Legends of the Fall' is recommended to you because you like Romance, Drama, War, and Western genres, and the movie 'Legends of the Fall' is similar to the ones you liked before."
        • LLM: "We recommend 'Legends of the Fall' as it aligns with your preferred genres of Drama, Romance, and War. This epic tale set in the early 20th century follows the lives of two brothers and their families, exploring themes of love, loss, and loyalty amidst the backdrop of World War I. With its sweeping landscapes and emotional depth, this film is sure to captivate you with its timeless storytelling."
  16. Contents
      1. Background / Research Goal
      2. Explanation Types
      3. Experiment
      4. Results
      5. Conclusion
  17. Results - RQ1
      • Q: Do users prefer LLM-generated explanations compared to those from existing methods?
      • A: Users clearly tend to prefer the LLM-generated explanations.
      (Figure: selection of preference between the baseline explanation and the LLM-generated explanation.)
  18. Results - RQ2
      • Q: How do users rate the quality of LLM-generated explanations compared to those from existing methods?
      • A: LLM-generated explanations received higher ratings on almost all metrics.
      (Figure: mean ratings for each evaluation metric, on a 5-point scale.)
  19. Results - RQ3
      • Q: What characteristics of LLM-generated explanations are valued by users?
      • A: Significantly higher ratings were observed for many characteristics; only the "Length" characteristic received a low rating.
      (Figure: mean ratings for each characteristic, with p-values from a binomial test on the likelihood of each characteristic receiving a high rating. A sketch of such a test follows below.)
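To illustrate the kind of binomial test described: did a characteristic receive "high" ratings (say, 4 or 5 on the 5-point scale) more often than chance? A minimal sketch with scipy; the counts below are illustrative, not the paper's data:

```python
from scipy.stats import binomtest

# Illustrative counts: of 97 participants, suppose 70 gave a characteristic
# a "high" rating (4 or 5 on the 5-point scale).
n_high, n_total = 70, 97

# One-sided binomial test: is a high rating more likely than chance (p = 0.5)?
result = binomtest(n_high, n_total, p=0.5, alternative="greater")
print(result.pvalue)
```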
  20. Discussion
      • The explanations generated by the LLM were rated highly, particularly for creativity, level of detail, and personalization.
      • The LLM's vast background knowledge may explain why the explanations include information not found in the baselines.
      • Limitations
        • A single domain (movies)
        • A fixed, single LLM model and prompt
        • No use of external knowledge, e.g., via RAG
  21. Conclusion
      • The authors investigated the effectiveness of post-hoc explanations generated by LLMs in recommender systems.
      • The experimental results showed that, compared to baseline explanations, LLM explanations were preferred more often and their quality was rated significantly higher.
      • In particular, LLM explanations tended to be rated highly for creativity and detail, suggesting that an LLM's inherent knowledge and natural language capabilities can enhance the user experience.