and Data Mining (SIGKDD)
• Part of the Association for Computing Machinery (ACM)
• SIGKDD has been an official ACM SIG since 1998
• ACM SIGKDD has hosted an annual conference since 1995
• Considered the most influential forum for knowledge discovery and data mining research
GDP Labs Confidential
- 17th 2017
• Consists of:
  ◦ Tutorial
  ◦ Workshop
  ◦ Main Conference
    ▪ Keynote
    ▪ Paper presentations
• Sponsored by many well-known tech companies from around the world
Data Science Methods to Effect Business Change
• A/B Testing at Scale: Accelerating Software Innovation
• Workshop on Causal Discovery
• Anomaly Detection in Finance
• Platforms and Infrastructure
• Applied Machine Learning
• Deep Learning
• Intelligent Systems and Data Science
• KDD Panel: The Future of Artificially Intelligent Assistants
• Clustering
• Web Applications
to Effect Business Change
• Advanced analytic entry points
  ◦ The Technology Directive
  ◦ The Field of Dreams
  ◦ The Ambitious Executive
  ◦ The Smart Competitor
• Are you asking the right questions?
  ◦ Questions with business value
• Agile Approach to Data-Driven Decision Making
  ◦ What is agile?
  ◦ Managing uncertainty
• Data science is expensive, but if you don’t do it, you will be left behind
Danielle Leighton, Lindsay Brin, Janet Forbes (T4G Limited)
Golovin Google Research
• Used in many applications
  ◦ A/B Testing
  ◦ Machine Learning
  ◦ Physical Design
  ◦ Robotics
• Vizier
  ◦ Easy to use
  ◦ Reliable
  ◦ Scalable
  ◦ Flexible
  ◦ State of the art
• They even used Vizier to tune a chocolate chip cookie recipe
Heng Tze Cheng Google Research
• Productionizing an ML pipeline is hard
• End to end (from data to serving)
• Design principles:
  ◦ One ML platform for many products
  ◦ Continuous training and serving
  ◦ Human in the loop
  ◦ Reliable and scalable
• Steps:
  ◦ Data analysis
  ◦ Data validation
  ◦ Data transformation
  ◦ Trainer
  ◦ Model validation & evaluation
  ◦ Serving
• Documentation -> passive way
• Education -> active way (teach, talk)
• Automation -> key principle
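
The steps listed on this slide can be illustrated with a toy end-to-end pipeline. This is only a minimal sketch in plain Python, not the API of the platform presented in the talk: every function name, the one-feature least-squares "trainer", the sample data, and the error threshold used as a validation gate are assumptions made for illustration.

```python
from statistics import mean

# Hypothetical toy pipeline; names and data are illustrative only.

def analyze(rows):
    """Data analysis: summary statistics of the single feature."""
    xs = [x for x, _ in rows]
    return {"min": min(xs), "max": max(xs), "mean": mean(xs)}

def validate(rows, stats):
    """Data validation: reject empty data or a constant feature."""
    if not rows:
        raise ValueError("no training rows")
    if stats["min"] == stats["max"]:
        raise ValueError("feature is constant; cannot scale or fit")

def transform(x, stats):
    """Data transformation: min-max scale using training statistics."""
    return (x - stats["min"]) / (stats["max"] - stats["min"])

def train(rows, stats):
    """Trainer: one-feature least-squares fit on the scaled feature."""
    xs = [transform(x, stats) for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var
    return slope, y_bar - slope * x_bar

def evaluate(model, rows, stats):
    """Model validation & evaluation: mean squared error on held-out rows."""
    slope, intercept = model
    return mean((slope * transform(x, stats) + intercept - y) ** 2
                for x, y in rows)

def serve(model, stats):
    """Serving: return a prediction function that reuses the training transform."""
    slope, intercept = model
    return lambda x: slope * transform(x, stats) + intercept

# Made-up data following y = 2x + 1.
train_rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
holdout_rows = [(4.0, 9.0), (5.0, 11.0)]

stats = analyze(train_rows)
validate(train_rows, stats)
model = train(train_rows, stats)
mse = evaluate(model, holdout_rows, stats)
if mse > 1e-6:  # assumed quality gate: a bad model is never served
    raise RuntimeError("model failed validation; not serving")
predict = serve(model, stats)
print(predict(6.0))
```

The serving function reuses the statistics computed during data analysis, so training and serving apply the same transformation. The gate before `serve` is one simple way to automate the model validation step.
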
Facebook
• Data drives all products at Facebook
• FBLearner Flow
  ◦ Helps non-ML people use ML
  ◦ 70% of users are not AI experts
  ◦ 25% of engineers are active users
• Applied Machine Learning at Facebook
  ◦ Computer Vision
  ◦ DeepText
  ◦ Speech and Video
  ◦ AI-Powered Camera
• What’s next
  ◦ Multimodal learning
  ◦ Transfer learning
  ◦ Multilingual modeling
  ◦ Weakly supervised / unsupervised learning
by machines will overtake data produced by people
• Industrial Machine Learning
  ◦ Preventive Maintenance
  ◦ Anomaly / Failure detection
  ◦ Etc.
• Failures at the industrial level are more dangerous than at the social level
• Optimization metric
  ◦ Higher accuracy != higher value
  ◦ Drive towards higher precision / lower false positive rate (FPR)
• Models
  ◦ Physical model
  ◦ Data-driven model
• Interpretability and accuracy trade-off
• Small improvements help a lot
• Hard to convince everyone to transition
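
The point that higher accuracy does not automatically mean higher value can be shown with a small worked example. This is a sketch only: the two confusion matrices below are made-up numbers, and `Confusion`, `model_a`, and `model_b` are hypothetical names, not results from the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Confusion:
    """Counts from a binary failure detector (positive = failure flagged)."""
    tp: int  # failures correctly flagged
    fp: int  # false alarms
    tn: int  # normal cases correctly passed
    fn: int  # missed failures

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    @property
    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    @property
    def false_positive_rate(self) -> float:
        negatives = self.fp + self.tn
        return self.fp / negatives if negatives else 0.0

# Illustrative counts on the same 1000 cases (200 failures, 800 normal).
model_a = Confusion(tp=190, fp=60, tn=740, fn=10)
model_b = Confusion(tp=100, fp=5, tn=795, fn=100)

for name, m in (("A", model_a), ("B", model_b)):
    print(f"model {name}: accuracy={m.accuracy:.3f} "
          f"precision={m.precision:.3f} FPR={m.false_positive_rate:.4f}")
```

In this made-up case, model A has the higher accuracy, but model B raises far fewer false alarms (higher precision, lower FPR). If each false alarm stops a production line, model B may be the more valuable model despite its lower accuracy.
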
Systems
• Recent Advances in Feature Selection: A Data Perspective
• 2017 Edition of AdKDD and TargetAd
• Three Principles of Data Science: Predictability, Stability, and Computability
• Supervised Learning
• Deep Learning
• Intelligent Systems and Data Science
• Hands-On Tutorial: Declarative, Large-Scale Machine Learning with Apache SystemML
• Hands-On Tutorial: TensorFlow