Applied machine learning at facebook a datacenter infrastructure perspective HPCA18

Applied Machine Learning at Facebook : A Datacenter Infrastructure Perspective
International Symposium on High-Performance Computer Architecture (HPCA) 2018 | Shunya Ueta (@hurutoriya) | 2018-03-07

Abstract
• “This paper describes the hardware and software infrastructure that supports machine learning at global scale.”
• Machine learning as a service (MLaaS) serves 2.1 billion users.
• Workloads: ranking posts for News Feed, speech and text translation, and photo and real-time video classification.
• FAIR System & Network

What’s the Contribution?
• Within MLaaS, computer vision represents only a small fraction of the resource requirements.
• FB relies on an incredibly diverse set of ML approaches.
  ◦ e.g. SVM, GBDT, logistic regression (LR)
• Inference runs mainly on CPUs; training uses both CPUs and GPUs.

MLaaS Pipeline Design at Facebook

Major Services Leveraging Machine Learning
1. News Feed: ranking algorithms. Almost every user visits News Feed.
2. Ads: ML determines which ads to display to a given user.
   a. “Practical Lessons from Predicting Clicks on Ads at Facebook,” ADKDD14
3. Search: videos, photos, people, events, etc.
4. Sigma: the general classification and anomaly detection framework.
5. Lumos: extracts high-level attributes and embeddings from an image and its content.
6. Facer: Facebook’s face detection and recognition framework.
7. Language Translation: supports translations for more than 45 languages. [link]
8. Speech Recognition: provides automated captioning for video.

Machine Learning Models
- LR and SVM are efficient to train and use for prediction.
- MLP: ranking News Feed; CNN: CV; RNN/LSTM: NLP
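To make the "efficient to train and use for prediction" point concrete, here is a minimal sketch of logistic regression in plain Python. It is not Facebook's implementation: the toy data, the function names (`train_logistic_regression`, `predict_proba`), and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(xs, ys, lr=0.1, epochs=200):
    """Fit weights and bias with plain stochastic gradient descent."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss with respect to the logit
            for i in range(n_features):
                w[i] -= lr * err * x[i]
            b -= lr * err
    return w, b


def predict_proba(w, b, x):
    """Inference is one dot product followed by a sigmoid."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy, made-up data: the label is 1 when the two features sum to more than 1.
random.seed(0)
xs = [[random.random(), random.random()] for _ in range(200)]
ys = [1 if x[0] + x[1] > 1.0 else 0 for x in xs]

w, b = train_logistic_regression(xs, ys)
```

Each training update and each prediction costs one pass over the feature vector, which is part of why a model like this is cheap to serve on CPUs.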

MLaaS inside Facebook

FBLearner Flow

DNN Framework
• PyTorch is optimized for research. It focuses on flexibility, debugging, and dynamic neural networks, which enables rapid experimentation. It is not optimized for production and mobile deployments.
• Caffe2 is optimized for production. It focuses on performance, cross-platform support, and coverage for CNN, RNN, and MLP. It can use third-party libraries such as cuDNN, MKL, and Metal.

Transferring Research Results to Production with ONNX
• Decoupling research and production frameworks (PyTorch ←→ Caffe2)

RESOURCE IMPLICATIONS OF MACHINE LEARNING [link]

Compute Type and Locality
Distributed Training: P. Goyal et al.; Takuya Akiba et al.

RESOURCE REQUIREMENTS OF ONLINE INFERENCE WORKLOADS.

Future of MLaaS at Facebook
• ML workloads benefit from SIMD and from specialized convolution or matrix multiplication engines.
• Model compression, quantization, and high-bandwidth memory
  ◦ “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding”, ICLR16, Song Han et al.
  ◦ “BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1”, Matthieu Courbariaux et al.
  ◦ “Ternary Neural Networks for Resource-Efficient AI Applications”, Hande Alemdar et al.
• Related work:
  ◦ “TFX: A TensorFlow-Based Production-Scale Machine Learning Platform”, KDD17
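The quantization bullet above can be illustrated with a small sketch. This is a deliberately simplified, post-training rounding scheme and not the method of any of the cited papers (which, for example, train with the constraints in place or combine pruning with Huffman coding). The function names, the symmetric int8 scheme, the ternary threshold, and the weight values are all assumptions made for illustration.

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [qi * scale for qi in q]


def binarize(weights):
    """Constrain each weight to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold):
    """Constrain each weight to -1, 0, or +1; small weights become 0."""
    return [0 if abs(w) < threshold else (1 if w > 0 else -1) for w in weights]


weights = [0.42, -1.27, 0.03, 0.88, -0.15]  # made-up example values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Storing `q` takes one byte per weight instead of four for 32-bit floats, and the binary and ternary variants need even fewer bits, which is what makes these techniques attractive when memory capacity and bandwidth are the bottleneck.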

Conclusion
• MLaaS at Facebook serves 2.1 billion users!!
• Within MLaaS, computer vision represents only a small fraction of the resource requirements.
• FB relies on an incredibly diverse set of ML approaches.
  ◦ e.g. SVM, GBDT, logistic regression (LR)
• Inference runs mainly on CPUs; training uses both CPUs and GPUs.
