Scaling ML embedding models to serve a billion queries

This talk offers a deeper look at the scale, challenges, and solutions involved in powering embedding-based visual search at eBay. It walks the audience through the model architecture, the application architecture for serving users, the workflow pipelines built to produce the embeddings used by Cassini, eBay's search engine, and the unique challenges faced along the way. It also shares key insights specific to embedding handling and how to scale systems to provide real-time, clustering-based solutions for users.

What You Will Learn:

The audience will learn how to productionize embedding-based data pipelines, the key challenges and potential solutions, and the advantages and disadvantages of different quantization algorithms. The audience will also get a deeper view of how data pipelines and workflows are modeled for optimal scale.

Senthilkumar Gopal

October 23, 2022

Transcript

  1. Scaling embedding models to serve a billion queries
    Senthilkumar Gopal
    @sengopal

  2. © 2022 eBay. All rights reserved.
    Journey of a Query @ eBay

  3. Search @ eBay
    How can we discover items without describing them?
    This is a problem across many domains where search is a core functionality.
    Question to ponder: can we provide users with the ability to “discover” through visual cues instead?
    https://unsplash.com/photos/2oUiUu5QAys

  4. Current Search Experience
    Nice Kilim Pillow for my couch!
    Is this a kilim pillow? Or an orange kilim pillow? Perhaps an orange throw kilim pillow?

  5. k nearest neighbours search - A thought experiment
    Let’s represent an item TITLE as a 2-dimensional vector
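
The thought experiment can be sketched in a few lines of Python. This is only an illustration: the 2-dimensional coordinates below are made up, and the helper name `k_nearest` is hypothetical rather than anything shown in the talk.

```python
import math

def k_nearest(query, items, k=2):
    """Return the k titles whose vectors are closest to `query` (Euclidean distance)."""
    ranked = sorted(items, key=lambda title: math.dist(query, items[title]))
    return ranked[:k]

# Hypothetical 2-D vectors for item titles; the coordinates are invented.
ITEMS = {
    "orange kilim pillow": (0.9, 0.8),
    "orange throw pillow": (0.8, 0.6),
    "leather sofa": (0.1, 0.9),
    "running shoes": (0.2, 0.1),
}

# A query vector that lands close to the two pillow titles.
nearest = k_nearest((0.85, 0.75), ITEMS, k=2)
```

Sorting every item by distance is the exhaustive version of the search; it is fine for four items but is exactly what stops scaling as the inventory grows.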

  6. So what is an embedding then?
    https://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html
    Represents Semantic Similarity
    Similarity (sofa, couch)
    A real word example [R^768]
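
One common way to score the similarity of two embeddings is cosine similarity. The sketch below assumes that measure (the slide does not say which one is used) and uses invented 4-dimensional vectors in place of real 768-dimensional ones.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings; the values are invented for illustration only.
EMBEDDINGS = {
    "sofa": [0.90, 0.10, 0.80, 0.20],
    "couch": [0.85, 0.15, 0.75, 0.25],
    "banana": [0.10, 0.90, 0.05, 0.70],
}

sofa_couch = cosine_similarity(EMBEDDINGS["sofa"], EMBEDDINGS["couch"])
sofa_banana = cosine_similarity(EMBEDDINGS["sofa"], EMBEDDINGS["banana"])
```

With these toy values, `sofa_couch` is close to 1 while `sofa_banana` is much lower, which is the behaviour a semantic embedding is expected to show.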

  7. So what is an embedding then?
    Large Language Models - GPT-3 [175 B parameters]
    ○ 45 TB text data - Wikipedia and books
    Neural network learns word associations from a large corpus.
    ○ Detects synonymous words.
    ○ Suggests words for a partial sentence.
    https://en.wikipedia.org/wiki/Word2vec
    https://ruder.io/word-embeddings-1/
    You shall know a word by the company it keeps
    - (Firth, J. R. 1957:11)

  8. So what is an embedding then?
    http://web.stanford.edu/class/cs224n/slides/cs224n-2022-lecture02-wordvecs2.pdf
    numbers
    companies
    books

  9. So what is an embedding then?
    http://web.stanford.edu/class/cs224n/slides/cs224n-2022-lecture02-wordvecs2.pdf

  10. What about an image?
    https://en.wikipedia.org/wiki/Convolutional_neural_network

  11. Model Architecture
    Multiple Modalities - Inspired by CLIP *
    https://openai.com/blog/clip/

  12. How do we “learn” an embedding?
    Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
    https://en.wikipedia.org/wiki/Convolutional_neural_network
    Text Encoder, Image Encoder
    R^768
    That's what an embedding looks like!
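
The slides do not show the training objective. Since the architecture is described as inspired by CLIP, a plausible sketch is a CLIP-style symmetric contrastive loss over paired text and image embeddings. Everything below is an assumption for illustration: the function names, the tiny 2-dimensional vectors, and the temperature of 0.07.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_sum - row[i]
    return total / len(logits)

def contrastive_loss(text_embs, image_embs, temperature=0.07):
    """Symmetric loss: pair i of text/image should score higher than any mismatch."""
    texts = [normalize(v) for v in text_embs]
    images = [normalize(v) for v in image_embs]
    logits = [
        [sum(a * b for a, b in zip(t, im)) / temperature for im in images]
        for t in texts
    ]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))

TEXTS = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(TEXTS, [[1.0, 0.1], [0.1, 1.0]])
shuffled = contrastive_loss(TEXTS, [[0.1, 1.0], [1.0, 0.1]])
```

The loss is small when each text embedding sits next to its own image embedding (`aligned`) and large when the pairs are swapped (`shuffled`), which is the pressure that pulls both encoders into one shared space.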

  13. Why do we need ANN?
    Exhaustive search: curse of dimensionality
    ANN: Approximate Nearest Neighbours
    All problems start with SCALE
    https://en.wikipedia.org/wiki/Proximity_analysis#/media/File:Euclidean_Voronoi_diagram.svg
    http://ann-benchmarks.com/index.html#algorithms
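
To make the contrast with exhaustive search concrete, here is a minimal sketch of one family of ANN techniques: partition the vectors into Voronoi-style cells around centroids and scan only the cells nearest to the query. The talk does not state which ANN algorithm eBay uses, so this is an assumption; the centroids are hand-picked here, whereas in practice they would typically be learned (for example with k-means).

```python
import math

def nearest_index(point, candidates):
    """Index of the candidate closest to `point`."""
    return min(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))

def build_cells(vectors, centroids):
    """Assign every vector index to the cell of its nearest centroid."""
    cells = {i: [] for i in range(len(centroids))}
    for idx, vec in enumerate(vectors):
        cells[nearest_index(vec, centroids)].append(idx)
    return cells

def exhaustive_search(query, vectors):
    """Baseline: compare the query against every vector."""
    return nearest_index(query, vectors)

def ann_search(query, vectors, centroids, cells, n_probe=1):
    """Scan only the n_probe cells whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    candidates = [idx for cell in order[:n_probe] for idx in cells[cell]]
    if not candidates:
        return None
    return min(candidates, key=lambda idx: math.dist(query, vectors[idx]))

# Invented 2-D data; real embeddings would have hundreds of dimensions.
VECTORS = [(0.1, 0.1), (0.2, 0.15), (0.9, 0.9), (0.8, 0.95), (0.15, 0.9)]
CENTROIDS = [(0.15, 0.12), (0.85, 0.92), (0.15, 0.9)]
CELLS = build_cells(VECTORS, CENTROIDS)

approx = ann_search((0.82, 0.9), VECTORS, CENTROIDS, CELLS, n_probe=1)
exact = exhaustive_search((0.82, 0.9), VECTORS)
```

Here the approximate and exact answers agree, but a query near a cell boundary can miss its true neighbour; raising `n_probe` trades speed back for recall.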

  14. combining the elements together

  15. How does this function?
    Display all inventory matching my visual appeal
    Is this an orange throw kilim pillow?
    Quickly pivot to entire inventory using a visual first cue

  16. Data Engineering

  17. Design
    Trained model

  18. Data Ingestion

  19. Data Ingestion
    Challenges
    - Speed vs. resource trade off
    - Storage
    - Download errors
    - Downstream dependencies
    https://unsplash.com/photos/wrrgZwI7qOY
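
Download errors are often handled by retrying with exponential backoff. The sketch below assumes that approach; the talk does not describe the actual ingestion code, and `download_with_retry`, its parameters, and the injected `fetch` callable are all hypothetical.

```python
import time

def download_with_retry(fetch, url, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on OSError with exponentially growing pauses.

    `fetch` is any callable that returns the downloaded bytes; `sleep` is
    injectable so the backoff can be observed without actually waiting.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except OSError as err:
            last_error = err
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise last_error
```

Capping the number of attempts keeps one bad image URL from stalling the pipeline, which touches on the speed versus resource trade-off listed above.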

  20. ML Platform and Inference

  21. Cassini Indexing
    Trotman, Andrew, Jon Degenhardt, and Surya Kallumadi. "The Architecture of eBay Search." eCOM@SIGIR. 2017.

  22. Orchestration
    https://unsplash.com/photos/yUJVHiYZCGQ

  23. Workflow Orchestration using Apache Airflow
    Processing modes
    - BULK
    - DELTA
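
The slide names the two modes without defining them. A common reading, assumed here, is that BULK reprocesses the whole inventory while DELTA processes only items changed since the previous run. The function and the `updated_at` field below are hypothetical, and the sketch is plain Python rather than Airflow code.

```python
from datetime import datetime

def select_items(items, mode, last_run=None):
    """Pick the items a pipeline run should process for the given mode."""
    if mode == "BULK":
        return list(items)  # full rebuild: every item
    if mode == "DELTA":
        if last_run is None:
            raise ValueError("DELTA mode needs the previous run timestamp")
        return [item for item in items if item["updated_at"] > last_run]
    raise ValueError(f"unknown mode: {mode}")

# Invented inventory records for illustration.
INVENTORY = [
    {"id": 1, "updated_at": datetime(2022, 10, 1)},
    {"id": 2, "updated_at": datetime(2022, 10, 20)},
    {"id": 3, "updated_at": datetime(2022, 10, 22)},
]

bulk_ids = [i["id"] for i in select_items(INVENTORY, "BULK")]
delta_ids = [i["id"] for i in select_items(INVENTORY, "DELTA", last_run=datetime(2022, 10, 15))]
```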

  24. Challenges with Apache Airflow
    Challenge | Solution
    Multiple Spark versions | Define task level parameters
    Multiple Docker image versions | Python virtual environment packages
    Different platforms, zones, and network flakiness | Retries, system monitoring

  25. A/B Testing
    How do we test different models in production?
    Trained model
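
One widely used way to split traffic between models is deterministic hash bucketing, so that a given user always sees the same variant. The talk does not say how eBay assigns traffic, so the sketch below is an assumption; the variant names and the salt are hypothetical.

```python
import hashlib

def assign_variant(user_id, variants=("model_a", "model_b"), salt="visual-search-exp"):
    """Map a user to a variant deterministically by hashing salt + user id."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same user always lands in the same bucket across requests.
first = assign_variant("user-42")
second = assign_variant("user-42")
```

Changing the salt reshuffles the buckets, which is useful when a new experiment should not inherit the split of a previous one.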

  26. Data Publishing for A/B Tests using Airflow

  27. Model Drift
    • Seasonality
    • Aging of the models
    Actions
    • Metrics monitoring
    • Downstream evaluation
    • Retraining
    Evolution
    Data Drift
    • Data Integrity
    • Data pipelines
    Actions
    • Fault tolerance
    • Monitoring of time, CPU, memory, disk
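
Metrics monitoring can be as simple as comparing a recent window of a quality metric against a baseline window. The sketch below assumes a relative-change check with a fixed tolerance; the metric values, the 10% tolerance, and the name `drift_alert` are all illustrative rather than taken from the talk.

```python
from statistics import mean

def drift_alert(baseline, current, tolerance=0.1):
    """Return True when the current mean moves more than `tolerance` (relative) from baseline."""
    base = mean(baseline)
    cur = mean(current)
    if base == 0:
        return cur != 0
    return abs(cur - base) / abs(base) > tolerance

# Invented daily values of some downstream metric.
stable = drift_alert([0.30, 0.32, 0.31], [0.31, 0.30, 0.32])
drifted = drift_alert([0.30, 0.32, 0.31], [0.20, 0.22, 0.21])
```

A check like this only signals that something changed; deciding whether it is seasonality, an aging model, or a broken data pipeline still needs the downstream evaluation listed above.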

  28. Key takeaways
    Similarity, Scalability, Monitoring

  29. Questions?
    https://unsplash.com/photos/4V1dC_eoCwg
    slides are available at https://bit.ly/ebay-ml
