
Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective (HPCA18)


An introduction to the research paper "Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective".
Covers Facebook's MLaaS and datacenter design for machine learning.

Shunya Ueta

March 07, 2018



Transcript

  1. Applied Machine Learning at Facebook
    : A Datacenter Infrastructure Perspective
    International Symposium on High-Performance Computer Architecture (HPCA) 18
    Shunya Ueta (@hurutoriya) 2018-03-07


  2. Abstract
    “This paper describes the hardware and software infrastructure
    that supports machine learning at global scale.”
    Machine learning serves 2.1 billion users (ML-as-a-Service).
    Ranking posts for News Feed, Speech and Text Translations, and
    Photo and Real-time Video Classification
    FAIR System & Network


  3. What Is the Contribution?
    ● Within MLaaS, Computer Vision represents only a small fraction of
    the resource requirements.
    ● FB relies upon an incredibly diverse set of ML approaches.
    ○ e.g. SVM, GBDT, Logistic Regression (LR)
    ● Inference runs mainly on CPUs; training uses both CPUs and GPUs.


  4. MLaaS Pipeline Design at Facebook


  5. Major Services Leveraging Machine Learning
    1. News Feed : Ranking algorithms. Almost every user visits News Feed.
    2. Ads: ML to determine which ads to display to a given user
    a. “Practical lessons from predicting clicks on ads at facebook,” ADKDD14
    3. Search : Videos, Photos, People, Events, etc.
    4. Sigma : the general classification and anomaly detection framework
    5. Lumos : extracts high-level attributes and embeddings from an image and its content
    6. Facer : Facebook’s face detection and recognition framework.
    7. Language Translation : Supports translations for more than 45 languages. [link]
    8. Speech Recognition : provides automated captioning for video


  6. Machine Learning Models
    - LR and SVM are efficient to train and use for prediction.
    - MLP : News Feed ranking, CNN : CV, RNN/LSTM : NLP
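
The claim that LR is efficient to train and use for prediction can be illustrated with a minimal sketch. This is a generic logistic regression trained with plain stochastic gradient descent, not Facebook's implementation; the function names, learning rate, epoch count, and toy data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(features, labels, lr=0.1, epochs=200, seed=0):
    """Fit weights and bias with plain stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [0.0] * len(features[0])
    bias = 0.0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def predict(weights, bias, x):
    """Prediction costs one dot product plus one sigmoid."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Toy, made-up data: the label is 1 when the first feature exceeds the second.
X = [[0.0, 1.0], [1.0, 0.0], [0.2, 0.9], [0.9, 0.1], [0.1, 0.8], [0.8, 0.3]]
Y = [0, 1, 0, 1, 0, 1]
W, B = train_logistic_regression(X, Y)
```

Each training step and each prediction touches every feature once, which is why a model of this kind is cheap enough to run on CPUs.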


  7. MLaaS inside Facebook


  8. FBLearner Flow


  9. FBLearner Flow


  10. FBLearner Flow


  11. DNN Framework
    ● PyTorch is optimized for research.
    Focuses on flexibility, debugging, and dynamic neural networks, which
    enables rapid experimentation.
    Not optimized for production and mobile deployments.
    ● Caffe2 is optimized for production.
    Performance, cross-platform support, and coverage for
    CNN, RNN, MLP
    Can use third-party packages such as cuDNN, MKL, and Metal


  12. Transferring Research Results to Production with ONNX
    ● Decoupling Research and Production Frameworks (PyTorch ←→ Caffe2)


  13. RESOURCE IMPLICATIONS OF MACHINE LEARNING [link]


  14. Compute Type and Locality
    Distributed Training : P. Goyal et al.; Takuya Akiba et al.


  15. RESOURCE REQUIREMENTS OF ONLINE INFERENCE WORKLOADS.


  16. Future of MLaaS at Facebook
    ● ML workloads benefit from SIMD and from specialized convolution or matrix
    multiplication engines.
    ● Model compression, Quantization, and High-bandwidth memory
    ○ "Deep Compression: Compressing Deep Neural Networks with Pruning,
    Trained Quantization and Huffman Coding", Song Han et al., ICLR16
    ○ "Binarynet: Training deep neural networks with weights and activations
    constrained to +1 or -1", Matthieu Courbariaux et al.
    ○ "Ternary Neural Networks for Resource-Efficient AI Applications", Hande
    Alemdar et al.
    ● Related Work :
    ○ "TFX: A TensorFlow-Based Production-Scale Machine Learning Platform"
    KDD17
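
To make the quantization idea from this slide concrete, here is a minimal sketch of two simple schemes: symmetric 8-bit linear quantization with one shared scale, and ternarization with a fixed magnitude threshold. These are deliberately simplified stand-ins, not the trained quantization of Deep Compression or the training procedures of the BinaryNet and ternary-network papers. The function names, the threshold, and the example weights are assumptions.

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale (symmetric)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


def ternarize(weights, threshold):
    """Constrain each weight to -1, 0, or +1 by a fixed magnitude threshold."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]


weights = [0.42, -1.27, 0.05, 0.9, -0.33]  # made-up example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
ternary = ternarize(weights, threshold=0.3)
```

With 8-bit codes each weight takes one byte instead of four for a 32-bit float, and the rounding error per weight is bounded by half the scale. Ternary weights shrink storage further at the cost of a much coarser approximation.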


  17. Conclusion
    ● MLaaS at Facebook serves 2.1 billion users!!
    ● Within MLaaS, Computer Vision represents only a small fraction of
    the resource requirements.
    ● FB relies upon an incredibly diverse set of ML approaches.
    ○ e.g. SVM, GBDT, Logistic Regression (LR)
    ● Inference runs mainly on CPUs; training uses both CPUs and GPUs.
