H&M 23rd place solution

I placed 23rd in the H&M competition on Kaggle. This is a schematic overview of the solution.

Competition URL: https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/
Solution discussion: https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/discussion/324085

Kyohei Uto

May 27, 2022

Transcript

  1. H&M Personalized Fashion Recommendations
    Copyright 2022 @kuto_bopro
    Main process
    ① Generate Item Candidates
    ② Create Features
    ③ Learn Ranking Model (LGBM Ranker)
    ④ Ensemble
    ⑤ Post Processing
    Candidates Strategies
    Most popular: most popular items by segment (all, age, etc.) in the last 5 days
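The "Most popular" strategy (most popular items by segment in the last 5 days) could be sketched roughly as below. This is a minimal illustration, not the team's code: the function name `popular_by_segment`, the `segment_of` mapping, the tuple layout of `transactions`, and the `top_n=12` default are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta


def popular_by_segment(transactions, segment_of, end_date, days=5, top_n=12):
    """Return {segment: [article_id, ...]} ranked by purchase count.

    transactions: iterable of (customer_id, article_id, purchase_date)
    segment_of:   dict customer_id -> segment label (hypothetical, e.g. an age bin)
    The key "all" holds the ranking over every customer.
    """
    start = end_date - timedelta(days=days - 1)  # inclusive window of `days` days
    counts = defaultdict(Counter)
    for customer_id, article_id, day in transactions:
        if start <= day <= end_date:
            counts["all"][article_id] += 1
            segment = segment_of.get(customer_id)
            if segment is not None:
                counts[segment][article_id] += 1
    return {seg: [a for a, _ in c.most_common(top_n)] for seg, c in counts.items()}


# Toy data (invented for illustration only).
transactions = [
    ("c1", "a1", date(2020, 9, 22)),
    ("c1", "a2", date(2020, 9, 21)),
    ("c2", "a2", date(2020, 9, 20)),
    ("c3", "a3", date(2020, 9, 1)),  # outside the 5-day window, ignored
    ("c3", "a2", date(2020, 9, 19)),
]
segment_of = {"c1": "20s", "c2": "20s", "c3": "40s"}
popular = popular_by_segment(transactions, segment_of, end_date=date(2020, 9, 22))
```

Keeping an "all" ranking next to the per-segment ones gives a fallback for customers whose segment is unknown.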
    ② Create Features
    Create the features below by week and merge them into the generated candidates dataset.
    ④ Ensemble
    Blend 6 model predictions, each created from different candidates and features.
    ⑤ Post Processing
    ・For cold-start customers: use age-based most popular items as predictions.
    ・For all customers: remove items with a low offline sales ratio from the predictions of customers who prefer offline.
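The ensemble and post-processing steps might look something like the following sketch. The slide does not say how the 6 predictions are blended or what counts as a "low" offline sales ratio, so the weighted reciprocal-rank blend, the `min_ratio` threshold, and every function and parameter name here are assumptions for illustration.

```python
from collections import defaultdict


def blend(model_predictions, weights=None, top_n=12):
    """Blend ranked lists from several models with a weighted reciprocal-rank score.

    model_predictions: list of dicts, customer_id -> ranked list of article_ids.
    (The blending rule is an assumption; the slide only says predictions are blended.)
    """
    weights = weights or [1.0] * len(model_predictions)
    customers = set().union(*model_predictions)
    blended = {}
    for customer in customers:
        scores = defaultdict(float)
        for preds, weight in zip(model_predictions, weights):
            for rank, article in enumerate(preds.get(customer, []), start=1):
                scores[article] += weight / rank
        ranked = sorted(scores, key=lambda a: (-scores[a], a))
        blended[customer] = ranked[:top_n]
    return blended


def fill_cold_start(predictions, all_customers, age_bin_of, popular_by_age, top_n=12):
    """Give customers without predictions the popular items of their age bin."""
    filled = dict(predictions)
    for customer in all_customers:
        if not filled.get(customer):
            age_bin = age_bin_of.get(customer)
            filled[customer] = popular_by_age.get(age_bin, [])[:top_n]
    return filled


def drop_low_offline_items(predictions, prefers_offline, offline_ratio, min_ratio=0.1):
    """For offline-leaning customers, drop items that are rarely sold offline.

    min_ratio is a made-up threshold; items with unknown ratio are kept.
    """
    cleaned = {}
    for customer, items in predictions.items():
        if customer in prefers_offline:
            items = [a for a in items if offline_ratio.get(a, 1.0) >= min_ratio]
        cleaned[customer] = items
    return cleaned


# Toy data (invented): two models instead of six, one warm and one cold customer.
model_a = {"c1": ["a1", "a2"]}
model_b = {"c1": ["a2", "a3"]}
blended = blend([model_a, model_b])
filled = fill_cold_start(blended, ["c1", "c2"], {"c2": "20s"}, {"20s": ["a9", "a1"]})
final = drop_low_offline_items(filled, {"c1"}, {"a2": 0.02, "a1": 0.6})
```

Blending on ranks rather than raw scores is one common choice when the models were trained on different candidate sets and their scores are not directly comparable.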
    Dataset & CV Strategy
    Legend: train data week, valid data week, create candidates and features week
    transactions week: 104w, 103w, 102w, 101w, 100w, ..., 0w
    Generate item candidates for each customer and each week with multiple strategies.
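A week-based split along these lines can be sketched as follows. The transcript does not preserve which weeks play which role, so this assumes that larger week numbers are more recent, that week 104 is held out for validation, and that weeks 100 to 103 are training targets; `build_weekly_datasets` and the tuple layout are hypothetical.

```python
def build_weekly_datasets(transactions, target_weeks):
    """For each target week, split transactions into history and labels.

    transactions: iterable of (customer_id, article_id, week), where week is an
                  int and a larger value means a more recent week (assumption).
    Returns {target_week: (history, labels)}: history holds the transactions
    strictly before the target week (used for candidates and features), and
    labels maps customer_id -> set of articles bought during the target week.
    """
    transactions = list(transactions)
    datasets = {}
    for target in target_weeks:
        history = [t for t in transactions if t[2] < target]
        labels = {}
        for customer, article, week in transactions:
            if week == target:
                labels.setdefault(customer, set()).add(article)
        datasets[target] = (history, labels)
    return datasets


TRAIN_WEEKS = [100, 101, 102, 103]  # assumption: training target weeks
VALID_WEEK = 104                    # assumption: most recent week held out

# Toy data (invented for illustration only).
transactions = [("c1", "a1", 99), ("c1", "a2", 100), ("c2", "a3", 104)]
datasets = build_weekly_datasets(transactions, TRAIN_WEEKS + [VALID_WEEK])
```

Building candidates and features only from `history` keeps information from the target week out of the model inputs.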
    User based CF: top similar items by user-based CF and LightGCN
    Different color: items in a different color from past purchased items
    Past purchased: past purchased items
    Item based CF: top similar items of past purchased items by item-based CF
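Two of these strategies, "Past purchased" and "Item based CF", could be combined as in the sketch below. The slide does not specify the similarity measure, so cosine similarity over co-purchase counts is an assumption, as are the function names and the `top_k` parameter.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_similarities(purchases):
    """Cosine similarity between items, computed from co-purchase counts.

    purchases: dict customer_id -> set of article_ids bought in the past.
    """
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in purchases.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), n in pair_count.items():
        score = n / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def generate_candidates(customer, purchases, sims, top_k=2):
    """Union of 'past purchased' and 'item based CF' candidates, tagged by strategy."""
    past = purchases.get(customer, set())
    candidates = {item: "past_purchased" for item in past}
    scores = defaultdict(float)
    for item in past:
        for other, score in sims.get(item, {}).items():
            if other not in past:
                scores[other] += score  # sum similarity over all purchased items
    for other in sorted(scores, key=lambda o: (-scores[o], o))[:top_k]:
        candidates[other] = "item_cf"
    return candidates


# Toy data (invented for illustration only).
purchases = {
    "c1": {"a1", "a2"},
    "c2": {"a1", "a2", "a3"},
    "c3": {"a2", "a3"},
    "c4": {"a1"},
}
sims = item_similarities(purchases)
candidates = generate_candidates("c4", purchases, sims, top_k=1)
```

Tagging each candidate with the strategy that produced it makes it possible to feed the strategy, or the CF score itself, to the ranking model as a feature.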
    Team: ZKMRD
    CF: Collaborative Filtering
    Best single model
    CV: 0.0390, Public LB: 0.0325
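For context on these scores: the competition evaluated submissions with MAP@12 (mean average precision over up to 12 predicted articles per customer). The slide does not show how the team computed its CV score, so the following is only a minimal sketch of the metric, with hypothetical function names.

```python
def average_precision_at_k(actual, predicted, k=12):
    """AP@k for one customer; `actual` is a set, `predicted` a ranked list."""
    if not actual:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for position, item in enumerate(predicted[:k], start=1):
        if item in actual and item not in seen:
            hits += 1
            score += hits / position  # precision at this position
        seen.add(item)  # repeated predictions are not counted twice
    return score / min(len(actual), k)


def map_at_k(actual_by_customer, predicted_by_customer, k=12):
    """Mean AP@k over customers that bought at least one item."""
    customers = [c for c, items in actual_by_customer.items() if items]
    if not customers:
        return 0.0
    total = sum(
        average_precision_at_k(
            actual_by_customer[c], predicted_by_customer.get(c, []), k
        )
        for c in customers
    )
    return total / len(customers)


# Toy data (invented for illustration only).
actual = {"c1": {"a1", "a3"}, "c2": {"a2"}}
predicted = {"c1": ["a1", "a2", "a3"], "c2": ["a9", "a2"]}
score = map_at_k(actual, predicted)
```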
    Key points about dataset
    ・Use only the generated candidate examples, not all positive examples
    ・Use only customers with at least one positive example in their candidates
    ・Remove items that are not currently on sale from the candidates
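These three dataset rules can be sketched as a single filtering step. The input names (`candidates`, `purchases_in_target_week`, `on_sale`) and the row layout are hypothetical; how the team decided which items are on sale is not stated on the slide.

```python
def build_training_rows(candidates, purchases_in_target_week, on_sale):
    """Label candidates and keep only customers with at least one positive.

    candidates: dict customer_id -> list of candidate article_ids
    purchases_in_target_week: dict customer_id -> set of purchased article_ids
    on_sale: set of article_ids currently on sale (hypothetical input)
    Returns a list of (customer_id, article_id, label) rows.
    """
    rows = []
    for customer, items in candidates.items():
        bought = purchases_in_target_week.get(customer, set())
        # Purchases that no strategy generated never become rows: only
        # generated candidates are labeled, and off-sale items are dropped.
        labeled = [(customer, a, int(a in bought)) for a in items if a in on_sale]
        # Skip customers whose candidates contain no purchased item.
        if any(label for _, _, label in labeled):
            rows.extend(labeled)
    return rows


# Toy data (invented for illustration only).
candidates = {"c1": ["a1", "a2", "a9"], "c2": ["a1", "a3"]}
purchases_in_target_week = {"c1": {"a2", "a7"}, "c2": {"a5"}}
on_sale = {"a1", "a2", "a3"}
rows = build_training_rows(candidates, purchases_in_target_week, on_sale)
```

In this toy example the purchase of `a7` is ignored because no strategy generated it, and `c2` is dropped because none of its candidates was bought.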
    CF features
    ・score by item based CF
    ・score/rank by LightGCN (GCN based CF model)
    Article dynamic attributes
    ・trend value
    ・weekly popular ranking
    ・purchase count (1 day ago, 2 days ago, last week, ...)
    ・purchase count by segment (age, item group)
    ・popular rank by segment (age, item group)
    ・mean sales channel
    Article static attributes
    ・basic attributes in articles.csv
    ・BERT sentence vector
    Customer dynamic attributes
    ・purchase count
    ・last purchase flag
    ・Days/weeks since last purchase
    ・price/discount of purchased items
    ・mean sales channel
    ・purchase rate by item segment
    ・repurchase rate
    Customer static attributes
    ・basic attributes in customers.csv
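A few of the customer dynamic attributes listed above could be computed as in this sketch. The exact definitions used by the team are not given on the slide: here "repurchase rate" is assumed to be the share of purchases that repeat an article the customer already bought, and "mean sales channel" is assumed to be the mean of `sales_channel_id`. The function name and tuple layout are hypothetical.

```python
from datetime import date


def customer_features(history, as_of):
    """Compute a few per-customer features from purchases made before `as_of`.

    history: iterable of (customer_id, article_id, purchase_date, sales_channel_id)
    """
    per_customer = {}
    for customer, article, day, channel in history:
        if day >= as_of:
            continue  # skip anything on or after the cutoff date
        per_customer.setdefault(customer, []).append((article, day, channel))

    features = {}
    for customer, rows in per_customer.items():
        articles = [article for article, _, _ in rows]
        last_day = max(day for _, day, _ in rows)
        features[customer] = {
            "purchase_count": len(rows),
            "days_since_last_purchase": (as_of - last_day).days,
            "mean_sales_channel": sum(ch for _, _, ch in rows) / len(rows),
            # assumed definition: fraction of purchases that are repeats
            "repurchase_rate": 1 - len(set(articles)) / len(articles),
        }
    return features


# Toy data (invented for illustration only).
history = [
    ("c1", "a1", date(2020, 9, 1), 1),
    ("c1", "a1", date(2020, 9, 10), 2),
    ("c1", "a2", date(2020, 9, 12), 2),
    ("c1", "a3", date(2020, 9, 20), 2),  # after the cutoff, ignored
]
features = customer_features(history, as_of=date(2020, 9, 16))
```

Computing such features separately for each target week, using only earlier transactions, matches the "create features by week" step of the main process.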
    Private LB: 0.0329 (23rd)
