Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective (HPCA 2018)

An introduction to the research paper "Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective".
Covers Facebook's MLaaS and datacenter design for machine learning.

Shunya Ueta

March 07, 2018

Transcript

  1. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective

    International Symposium on High-Performance Computer Architecture (HPCA) 2018 | Shunya Ueta (@hurutoriya) | 2018-03-07
  2. Abstract

    “This paper describes the hardware and software infrastructure that supports machine learning at global scale.”
    • Machine learning is served to 2.1 billion users (ML-as-a-Service).
    • Workloads include ranking posts for News Feed, speech and text translation, and photo and real-time video classification.
    • FAIR System & Network
  3. What Are the Contributions?

    • Within MLaaS, computer vision represents only a small fraction of the resource requirements.
    • FB relies upon an incredibly diverse set of ML approaches.
      ◦ e.g. SVM, GBDT, Logistic Regression (LR)
    • Inference runs mainly on CPUs; training uses both CPUs and GPUs.
  4. Major Services Leveraging Machine Learning

    1. News Feed: ranking algorithms. Almost every user visits News Feed.
    2. Ads: ML determines which ads to display to a given user.
       a. “Practical Lessons from Predicting Clicks on Ads at Facebook,” ADKDD14
    3. Search: videos, photos, people, events, etc.
    4. Sigma: the general classification and anomaly detection framework.
    5. Lumos: extracts high-level attributes and embeddings from an image and its content.
    6. Facer: Facebook’s face detection and recognition framework.
    7. Language Translation: supports translations for more than 45 languages. [link]
    8. Speech Recognition: provides automated captioning for video.
  5. Machine Learning Models

    - LR and SVM are efficient to train and use for prediction.
    - MLP: News Feed ranking; CNN: computer vision (CV); RNN/LSTM: natural language processing (NLP)
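To illustrate why LR is considered cheap to train and to use for prediction, here is a minimal pure-Python sketch. It is not taken from the paper or the slides: the function names (`sigmoid`, `lr_predict`, `lr_sgd_step`) and the learning rate are assumptions chosen for illustration. Prediction is a single dot product followed by a sigmoid, and one training step touches each weight exactly once.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def lr_predict(weights, bias, features):
    """Return P(y = 1 | features): one dot product plus one sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def lr_sgd_step(weights, bias, features, label, lr=0.1):
    """One stochastic gradient step on the log loss for a single example.

    The learning rate default is an illustrative assumption.
    """
    error = lr_predict(weights, bias, features) - label
    new_weights = [w - lr * error * x for w, x in zip(weights, features)]
    new_bias = bias - lr * error
    return new_weights, new_bias
```

Both functions run in time linear in the number of features, which is one reason such models remain attractive when predictions must be served at very high request rates.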
  6. DNN Frameworks

    • PyTorch is optimized for research. It focuses on flexibility, debugging, and dynamic neural networks, which enable rapid experimentation. It is not optimized for production and mobile deployments.
    • Caffe2 is optimized for production. It focuses on performance, cross-platform support, and coverage for CNNs, RNNs, and MLPs. It can use third-party packages such as cuDNN, MKL, and Metal.
  7. Transferring Research Results to Production with ONNX

    • Decoupling research and production frameworks (PyTorch ←→ Caffe2)
  8. Future of MLaaS at Facebook

    • ML workloads benefit from SIMD and from specialized convolution or matrix multiplication engines.
    • Model compression, quantization, and high-bandwidth memory
      ◦ “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding,” ICLR16, Song Han et al.
      ◦ “BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1,” Matthieu Courbariaux et al.
      ◦ “Ternary Neural Networks for Resource-Efficient AI Applications,” Hande Alemdar et al.
    • Related work:
      ◦ “TFX: A TensorFlow-Based Production-Scale Machine Learning Platform,” KDD17
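As a rough illustration of what weight quantization means, the following pure-Python sketch shows three simple schemes: sign binarization to +1 or -1, threshold ternarization to -1, 0, or +1, and uniform 8-bit quantization. These are generic textbook versions, not the training procedures of the papers listed above: the ternarization threshold is a hypothetical hand-picked parameter, and the uniform quantizer is a stand-in, not the trained quantization of Deep Compression. All function names are assumptions.

```python
def binarize(weights):
    """Map each weight to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1; weights with |w| <= threshold become 0.

    The threshold is a hand-picked illustrative parameter.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def quantize_uniform(weights, bits=8):
    """Uniform affine quantization to unsigned integer codes of the given width."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    # Guard against a zero range when all weights are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize_uniform(codes, scale, lo):
    """Recover approximate float weights from integer codes."""
    return [lo + c * scale for c in codes]
```

With 8 bits, each weight is stored in one byte instead of four, at the cost of a reconstruction error of at most half a quantization step per weight. Binary and ternary weights shrink storage further and replace most multiplications with sign flips or skips.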
  9. Conclusion

    • MLaaS at Facebook serves 2.1 billion users!
    • Within MLaaS, computer vision represents only a small fraction of the resource requirements.
    • FB relies upon an incredibly diverse set of ML approaches.
      ◦ e.g. SVM, GBDT, Logistic Regression (LR)
    • Inference runs mainly on CPUs; training uses both CPUs and GPUs.