Building & Deploying CF

Buzzvil
November 17, 2021

By Peter Kim

Transcript

  1. Building & Deploying CF
    Improving Buzzvil’s Retargeting & my life quality by 10X

  2. About myself
    ● Loyalty! Loyalty! (Korean military salute)
    ● ML Engineer (~3 yrs)
    ● Ad Display Team

  4. How did I spend my time?
    ● 99% Data Processing & Ops
    ● 1% Machine Learning

  5. How do I spend my time now?
    ● 90% Data Processing & Ops (down from 99%)
    ● 10% Machine Learning (up from 1%)

  6. Monthly bookings of KRW 1 billion+ (>20% of total revenue)

  7. Retargeting

  8. Retargeting as a recommendation problem

  9. As with any problem, there are many solutions.
    ● Best Selling
    ● Most Recently Viewed / Carted / Purchased
    ● BERT
    ● Collaborative Filtering (CF)

  11. CF Intuition: Find similar users!

  12. 0 for Not Purchased, 1 for Purchased
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1

  13. Q: Which products did User D buy?
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1

  15. Item-to-Item CF: Find similar items (instead of similar users)!
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1

  16. E.g. Similarity of Product 2 & Product 5?
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1

  17. E.g. Similarity of Product 2 & Product 5?
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1
    Sim(P2, P5) = 2

  18. E.g. Similarity of Product 2 & Product 1?
              Product 1  Product 2  Product 3  Product 4  Product 5
    User A    1          0          1          0          1
    User B    0          1          0          1          1
    User C    1          1          1          0          0
    User D    0          1          0          0          1
    Sim(P2, P1) = 1
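
The two values above are co-purchase counts: the number of users who bought both products, which is the dot product of the two product columns. A minimal sketch that reproduces them from the table (the `PURCHASES` dictionary and the helper names are illustrative, not from the talk):

```python
# Purchase table from the slides: one row per user, one column per product.
PURCHASES = {
    "User A": [1, 0, 1, 0, 1],
    "User B": [0, 1, 0, 1, 1],
    "User C": [1, 1, 1, 0, 0],
    "User D": [0, 1, 0, 0, 1],
}


def product_column(number):
    """Return the 0/1 purchase vector of Product `number` (1-based)."""
    return [row[number - 1] for row in PURCHASES.values()]


def co_purchase_count(p, q):
    """Count users who bought both products (dot product of two columns)."""
    return sum(x * y for x, y in zip(product_column(p), product_column(q)))


print(co_purchase_count(2, 5))  # 2, matches Sim(P2, P5)
print(co_purchase_count(2, 1))  # 1, matches Sim(P2, P1)
```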

  19. Many similarity functions can be used between two vectors
    Product 4 = [0, 1, 0, 0]
    Product 2 = [0, 1, 1, 1]
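
As an illustration of that choice, here is a sketch of three common similarity functions applied to the two vectors on this slide. The slide does not list specific functions at this point; dot product, cosine, and Jaccard are picked here as typical examples, and the function names are hypothetical.

```python
from math import sqrt

product_4 = [0, 1, 0, 0]
product_2 = [0, 1, 1, 1]


def dot(u, v):
    """Co-occurrence count for 0/1 vectors."""
    return sum(x * y for x, y in zip(u, v))


def cosine(u, v):
    """Dot product normalized by both vector lengths."""
    norm = sqrt(dot(u, u) * dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def jaccard(u, v):
    """Users who bought both, divided by users who bought either."""
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    return both / either if either else 0.0


print(dot(product_4, product_2))      # 1
print(cosine(product_4, product_2))   # 1 / sqrt(3), about 0.577
print(jaccard(product_4, product_2))  # 1 / 3, about 0.333
```

Unlike the raw count, cosine and Jaccard discount items that many users buy, so a popular item does not look similar to everything.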

  20. So how do we build a recommendation system with this?

  21. 1. Match each of the user's purchased items to similar items
    2. Combine them into a recommendation list
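
The two steps can be sketched on the toy purchase table from the earlier slides. This is one possible reading, assuming co-purchase counts as the similarity and a simple sum to combine scores; the talk does not specify either, and all names are hypothetical.

```python
PRODUCTS = ["Product 1", "Product 2", "Product 3", "Product 4", "Product 5"]
PURCHASES = {
    "User A": [1, 0, 1, 0, 1],
    "User B": [0, 1, 0, 1, 1],
    "User C": [1, 1, 1, 0, 0],
    "User D": [0, 1, 0, 0, 1],
}


def item_similarity(i, j):
    """Number of users who bought both item i and item j (0-based)."""
    return sum(row[i] * row[j] for row in PURCHASES.values())


def recommend(user, top_k=2):
    owned = PURCHASES[user]
    scores = {}
    # Step 1: match each purchased item to the items the user has not bought.
    for bought, has_it in enumerate(owned):
        if not has_it:
            continue
        for candidate, already_owned in enumerate(owned):
            if already_owned:
                continue
            # Step 2: combine by summing similarity over all purchased items.
            scores[candidate] = scores.get(candidate, 0) + item_similarity(bought, candidate)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return [PRODUCTS[c] for c in ranked[:top_k]]


print(recommend("User A"))  # ['Product 2', 'Product 4']
```

User A bought Products 1, 3, and 5; Product 2 scores 1 + 1 + 2 = 4 and Product 4 scores 0 + 0 + 1 = 1, so Product 2 is recommended first.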

  22. Coupang

  23. Netflix

  24. Youtube

  25. Amazon

  26. For Buzzvil?
    ● Interim results of the second test are positive; re-testing to double-check
    For more details, see the Confluence documents and Redash: Doc #1, Doc #2,

  27. ItemCF was popularized by Amazon, 20 years ago

  28. Scalable algorithm to compute item similarities

  29. Let's compute item similarities from 30 days of SSG purchase data
                    Product 1  ...  Product 400,000
    User 1          1          ...  1
    ...             ...        ...  ...
    User 600,000    0          ...  1

  30. Version 1: Python Naive Version
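
The code shown on this slide is not part of the transcript. A minimal sketch of what a naive version could look like, assuming one dense 0/1 vector per item and a plain Python loop over every item pair (the function names and the toy data are hypothetical):

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two dense vectors, touching every element."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u) * sum(y * y for y in v))
    return dot / norm if norm else 0.0


def naive_item_similarities(columns):
    """Score every unordered item pair; `columns` maps item -> dense vector."""
    return {
        (a, b): cosine(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Toy columns taken from the 4-user example earlier in the deck.
COLUMNS = {
    "P1": [1, 0, 1, 0],
    "P2": [0, 1, 1, 1],
    "P3": [1, 0, 1, 0],
    "P4": [0, 1, 0, 0],
    "P5": [1, 1, 0, 1],
}
sims = naive_item_similarities(COLUMNS)
print(len(sims))  # 10 pairs for 5 items
```

With n items the loop visits n(n-1)/2 pairs. For 400,000 items that is roughly 8 × 10^10 pairs, and each dense comparison walks 600,000 entries.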

  31. Running time = 7,200 hours. That's only 300 days... hmm

  32. Find the bottleneck!

  34. Q: How to optimize the cosine similarity computation?

  35. Q: How to optimize the cosine similarity computation?
    Product 4 = [0, 1, 0, 0]
    Product 2 = [0, 1, 1, 1]

  36. Hint: What is the difference between these two vectors?
    Product 2 = [0, 1, 1, 1]
    vs.
                    Product 1  ...  Product 400,000
    User 1          1          ...  1
    ...             ...        ...  ...
    User 600,000    0          ...  1

  37. Hint: What is the difference between these two vectors?
    Product 2 = [0, 1, 1, 1]
    vs.
                    Product 1  ...  Product 400,000
    User 1          1          ...  1
    ...             ...        ...  ...
    User 600,000    0          ...  1
    Sparse!

  39. Using sparse vectors reduces both memory and time taken
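    from scipy.sparse import csr_matrix
    from sklearn.metrics.pairwise import cosine_similarity
    # Store the item vectors in compressed sparse row (CSR) format
    a = csr_matrix(...)
    b = csr_matrix(...)
    # cosine_similarity accepts sparse input directly
    sim = cosine_similarity(a, b)  # 6x faster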
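
To see why sparsity helps, here is a library-free sketch of the same idea: keep only the positions of the non-zero entries and compute cosine similarity from their overlap. It is an illustration on the toy vectors from the earlier slides, not how `csr_matrix` is implemented, and the function names are hypothetical.

```python
from math import sqrt


def to_sparse(column):
    """Keep only the row indices where the 0/1 vector is non-zero."""
    return {row for row, value in enumerate(column) if value}


def sparse_cosine(a, b):
    """Cosine similarity of two 0/1 vectors stored as index sets."""
    if not a or not b:
        return 0.0
    # For 0/1 vectors the dot product is the overlap size and each
    # squared norm is the number of non-zero entries.
    return len(a & b) / sqrt(len(a) * len(b))


product_2 = to_sparse([0, 1, 1, 1])
product_4 = to_sparse([0, 1, 0, 0])
print(sparse_cosine(product_2, product_4))  # 1 / sqrt(3), about 0.577
```

The work now scales with the number of purchases per item rather than with the total number of users.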

  40. Using sparse vectors reduces both memory and time taken
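    Naive version now takes 1,200 hours (50 days, down from 300 days)
    It can finish before I am discharged from the military
    from scipy.sparse import csr_matrix
    from sklearn.metrics.pairwise import cosine_similarity
    # Store the item vectors in compressed sparse row (CSR) format
    a = csr_matrix(...)
    b = csr_matrix(...)
    # cosine_similarity accepts sparse input directly
    sim = cosine_similarity(a, b)  # 6x faster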

  41. 50 days is still not fast enough for production!

  42. Embarrassingly Parallel Workload

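  43. Ray: Python Distributed Applications API
    import ray
    ray.init()
    @ray.remote
    def compute_partial(...):
        # compute partial similarities table
        ...
    # A remote function is launched with .remote(), which returns a future
    futures = [compute_partial.remote(...), compute_partial.remote(...), ...]
    results = ray.get(futures)  # block until every partial table is ready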
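
Each chunk of items can be scored against the full catalog without talking to any other chunk, which is what makes the workload embarrassingly parallel. A minimal sketch of that partitioning follows. It uses the standard library's `concurrent.futures` in place of Ray and co-purchase counts on the toy table as the similarity; the function names and chunking scheme are hypothetical, not the talk's implementation.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

# Toy data: each item maps to the set of users who bought it.
ITEMS = {
    "P1": {"A", "C"},
    "P2": {"B", "C", "D"},
    "P3": {"A", "C"},
    "P4": {"B"},
    "P5": {"A", "B", "D"},
}


def compute_partial(item_ids, items):
    """Score one chunk of items against every other item (independent work)."""
    return {
        (i, j): len(items[i] & items[j])
        for i in item_ids
        for j in items
        if i != j
    }


def chunked(ids, n_chunks):
    """Split the item ids into n_chunks round-robin slices."""
    ids = list(ids)
    return [ids[k::n_chunks] for k in range(n_chunks)]


def all_similarities(items, n_chunks=2, map_fn=map):
    """Run compute_partial per chunk via map_fn and merge the partial tables."""
    table = {}
    for partial in map_fn(compute_partial, chunked(items, n_chunks), repeat(items)):
        table.update(partial)
    return table


def all_similarities_parallel(items, workers=2):
    """Same computation with one worker process per chunk."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return all_similarities(items, n_chunks=workers, map_fn=pool.map)
```

`all_similarities(ITEMS)` runs the chunks serially. `all_similarities_parallel(ITEMS)` distributes them across processes and should be called under `if __name__ == "__main__":` on platforms that spawn worker processes.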

  44. 7,200 hours → 1,200 hours → 4 hours
    Python multiprocessing with Ray on an r5.12xlarge instance (48 cores) takes ~4 hours

  45. 7,200 hours → 1,200 hours → 4 hours
    Python multiprocessing with Ray on an r5.12xlarge instance (48 cores) takes ~4 hours
    Now ready for production!
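
The running times quoted across the slides can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the ones on the slides.

```python
HOURS_PER_DAY = 24

# Running times quoted on the slides, in hours.
naive_hours = 7200
sparse_hours = 1200
parallel_hours = 4

print(naive_hours / HOURS_PER_DAY)    # 300.0 days
print(sparse_hours / HOURS_PER_DAY)   # 50.0 days
print(naive_hours / sparse_hours)     # 6.0x from sparse vectors
print(sparse_hours / parallel_hours)  # 300.0x from the sparse run to the 48-core run
print(naive_hours / parallel_hours)   # 1800.0x overall
```

The 300x step is larger than the 48 cores alone would explain, so it presumably includes further single-core optimizations; the slides do not break this down.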

  46. How many EC2 instances do we need?
    train-ssg
    train-emart
    train-hs
    train-ns
    infer-ssg
    infer-emart
    infer-hs
    infer-ns

  47. How do we manage this?
    $ ssh
    $ tmux
    $ git pull
    $ crontab

  48. Do I have to do all of this by myself?
    train-ssg
    train-emart
    train-hs
    train-ns
    infer-ssg
    infer-emart
    infer-hs
    infer-ns
    $ ssh
    $ tmux
    $ git pull
    $ crontab

  50. The wheel has been invented but it’s not yet ready
    Go, DY!

  51. Until then, let’s make my own.

  52. Inspiration
    ● DAGs & Scheduling from Airflow
        ○ Decoupling scheduling and task details
    ● Serverless from Lambda
        ○ No more SSH, Tmux, Crontab
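
The decoupling idea can be sketched with the standard library alone. This is not PeterFlow, whose code is not shown in the transcript. The job names come from the earlier slide, the train-before-infer dependencies are an assumption, and `run_job` is a hypothetical stub standing in for whatever launches the real task.

```python
from graphlib import TopologicalSorter

# Assumed dependency graph: each job maps to the jobs it must wait for.
DAG = {
    "train-ssg": set(),
    "infer-ssg": {"train-ssg"},
    "train-emart": set(),
    "infer-emart": {"train-emart"},
}


def run_job(name, log):
    """Stub task runner: a real one would start the job; this one records it."""
    log.append(name)


def run_dag(dag, runner=run_job):
    """Scheduling only: walk the DAG in dependency order and hand off each job."""
    log = []
    for job in TopologicalSorter(dag).static_order():
        runner(job, log)
    return log


order = run_dag(DAG)
print(order.index("train-ssg") < order.index("infer-ssg"))  # True
```

`run_dag` knows nothing about what a job does, and `run_job` knows nothing about ordering, so either side can change without touching the other.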

  53. Let’s see it in action!

  54. Lessons Learned
    ● Use sparse vectors to save memory and time
    ● Optimize algorithms from the single-core level, then multi-core
    ● Deploying & managing ML is hell but most of it can be automated
    ● Building your own tools can increase your life quality by 10X
        ○ Python + boto3 makes it very easy. PeterFlow is <200 lines.

  55. 10X Life Quality Improvement
