Slide 1

Slide 1 text

KDD 2017 Oey, Michael Julio Gregorius Edward

Slide 2

Slide 2 text

KDD
● Held by the Special Interest Group on Knowledge Discovery and Data Mining (SIGKDD)
● Part of the Association for Computing Machinery (ACM)
● SIGKDD has been an official ACM SIG since 1998
● ACM SIGKDD has hosted an annual conference since 1995
● Considered the most influential forum for knowledge discovery and data mining research
GDP Labs Confidential

Slide 3

Slide 3 text

KDD
2012 Beijing
2013 Chicago
2014 New York
2015 Sydney
2016 San Francisco
2017 Halifax

Slide 4

Slide 4 text

KDD 2017
● Held in Halifax, Canada
● August 13-17, 2017
● Consists of:
  ○ Tutorials
  ○ Workshops
  ○ Main conference
    ■ Keynotes
    ■ Paper presentations
● Sponsored by many well-known tech companies from around the world

Slide 5

Slide 5 text

KDD 2017
● From Theory to Data Product: Applying Data Science Methods to Effect Business Change
● A/B Testing at Scale: Accelerating Software Innovation
● Workshop on Causal Discovery
● Anomaly Detection in Finance
● Platforms and Infrastructure
● Applied Machine Learning
● Deep Learning
● Intelligent Systems and Data Science
● KDD Panel: The Future of Artificially Intelligent Assistants
● Clustering
● Web Applications

Slide 6

Slide 6 text

From Theory to Data Product: Applying Data Science Methods to Effect Business Change
Danielle Leighton, Lindsay Brin, Janet Forbes (T4G Limited)
● Advanced analytics entry points
  ○ The Technology Directive
  ○ The Field of Dreams
  ○ The Ambitious Executive
  ○ The Smart Competitor
● Are you asking the right questions?
  ○ Questions with business value
● Agile approach to data-driven decision making
  ○ What is agile?
  ○ Managing uncertainty
● Data science is expensive, but if you don’t do it, you will be left behind

Slide 7

Slide 7 text

Google Vizier: A Service for Black-Box Optimization
Daniel Golovin, Google Research
● Used in many applications
  ○ A/B testing
  ○ Machine learning
  ○ Physical design
  ○ Robotics
● Vizier
  ○ Easy to use
  ○ Reliable
  ○ Scalable
  ○ Flexible
  ○ State of the art
● They tried using Vizier to tune a chocolate chip cookie recipe
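To make "black-box optimization" concrete, here is a minimal sketch of the idea using plain random search. It is not Vizier's API or algorithm: the function names, the parameter names, and the toy "cookie" objective are all hypothetical and chosen only for illustration. The point is that the optimizer only sees parameter values and a score, never the internals of the function being tuned.

```python
import random


def random_search(objective, bounds, n_trials=200, seed=0):
    """Minimize `objective` by sampling parameters uniformly inside `bounds`.

    The optimizer treats `objective` as a black box: it only observes
    (parameters, score) pairs, never gradients or internal structure.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def cookie_loss(params):
    """Hypothetical stand-in for a taste test: lower is better."""
    return (params["sugar_g"] - 120.0) ** 2 + (params["bake_minutes"] - 11.0) ** 2


# Hypothetical search space for the toy objective above.
bounds = {"sugar_g": (50.0, 200.0), "bake_minutes": (5.0, 20.0)}
best_params, best_score = random_search(cookie_loss, bounds)
print(best_params, best_score)
```

A production service would replace the uniform sampling with a smarter suggestion strategy, but the interface stays the same: suggest parameters, observe a score, repeat.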

Slide 8

Slide 8 text

TFX: TensorFlow Extended - A Production-Scale ML Platform
Heng Tze Cheng, Google Research
● Productionizing an ML pipeline is hard
● End to end (from data to serving)
● Design principles:
  ○ One ML platform for many products
  ○ Continuous training and serving
  ○ Human in the loop
  ○ Reliable and scalable
● Steps:
  ○ Data analysis
  ○ Data validation
  ○ Data transformation
  ○ Trainer
  ○ Model validation & evaluation
  ○ Serving
● Documentation -> passive way
● Education -> active way (teach, talk)
● Automation -> key principle
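The pipeline steps listed above can be sketched as a chain of small functions with two gates: bad data never reaches the trainer, and a new model is only pushed to serving if it is at least as good as the one already serving. This is a toy illustration in plain Python, not the TFX API; every function name, the validation rule, and the threshold "model" are assumptions made for the example.

```python
from statistics import mean


def validate_data(rows):
    """Data validation: reject missing features or unexpected labels (assumed rule)."""
    return all(r["x"] is not None and r["y"] in (0, 1) for r in rows)


def transform(rows):
    """Data transformation: center the single feature around its mean."""
    mu = mean(r["x"] for r in rows)
    return [{"x": r["x"] - mu, "y": r["y"]} for r in rows], mu


def train(rows):
    """Trainer: a trivial threshold classifier on the centered feature."""
    return lambda x: 1 if x > 0 else 0


def evaluate(model, rows):
    """Model evaluation: accuracy on the given rows (no holdout, for brevity)."""
    return mean(1.0 if model(r["x"]) == r["y"] else 0.0 for r in rows)


def run_pipeline(rows, serving_accuracy):
    """Run all steps; return a servable bundle or None if a gate fails."""
    if not validate_data(rows):
        return None  # data validation gate
    transformed, mu = transform(rows)
    model = train(transformed)
    accuracy = evaluate(model, transformed)
    if accuracy < serving_accuracy:
        return None  # model validation gate
    return {"model": model, "mean": mu, "accuracy": accuracy}


rows = [{"x": 1, "y": 0}, {"x": 2, "y": 0}, {"x": 8, "y": 1}, {"x": 9, "y": 1}]
bundle = run_pipeline(rows, serving_accuracy=0.9)
print(bundle["accuracy"] if bundle else "not pushed")
```

Running such a function on every new batch of data is one simple way to picture the "continuous training and serving" principle.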

Slide 9

Slide 9 text

Designing AI at Scale to Power Everyday Life
Rajesh Parekh, Facebook
● Data drives all products at Facebook
● FBLearner Flow
  ○ Helps non-ML people use ML
  ○ 70% of users are not AI experts
  ○ 25% of engineers are active users
● Applied Machine Learning at Facebook
  ○ Computer Vision
  ○ Deep Text
  ○ Speech and Video
  ○ AI-Powered Camera
● What’s next
  ○ Multi-modal learning
  ○ Transfer learning
  ○ Multilingual modeling
  ○ Weakly supervised / unsupervised learning

Slide 10

Slide 10 text

Industrial Machine Learning
Josh Bloom, General Electric
● Data produced by machines will overtake data produced by people
● Industrial Machine Learning
  ○ Preventive maintenance
  ○ Anomaly / failure detection
  ○ Etc.
● Mistakes at the industrial level are more dangerous than at the social level
● Optimization metric
  ○ Higher accuracy != higher value
  ○ Drive towards higher precision and a lower false positive rate (FPR)
● Models
  ○ Physical model
  ○ Data-driven model
● Interpretability vs. accuracy trade-off
● Small improvements help a lot
● Hard to convince everyone to transition
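A small worked example shows why "higher accuracy != higher value" when failures are rare. The confusion-matrix counts below are invented for illustration (10,000 machine-hours containing 100 real failures); they are not from the talk.

```python
def metrics(tp, fp, tn, fn):
    """Compute standard metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical model that never raises an alarm: it is right 99% of the time
# simply because failures are rare, yet it detects nothing.
never_alarm = metrics(tp=0, fp=0, tn=9900, fn=100)

# Hypothetical detector that catches 80 of the 100 failures at the cost of
# 150 false alarms: lower accuracy, but it is the only one with any value.
detector = metrics(tp=80, fp=150, tn=9750, fn=20)

for name, m in (("never_alarm", never_alarm), ("detector", detector)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

Precision and FPR expose what accuracy hides: every false alarm in an industrial setting can mean an unnecessary inspection or shutdown, so those are the numbers worth driving.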

Slide 11

Slide 11 text

KDD 2017
● Deep Learning for Personalized Search and Recommender Systems
● Recent Advances in Feature Selection: A Data Perspective
● 2017 Edition of AdKDD and TargetAd
● Three Principles of Data Science: Predictability, Stability, and Computability
● Supervised Learning
● Deep Learning
● Intelligent Systems and Data Science
● Hands-On Tutorial: Declarative, Large-Scale Machine Learning with Apache SystemML
● Hands-On Tutorial: TensorFlow

Slide 12

Slide 12 text

Online Advertising

Slide 13

Slide 13 text

How ads work

Slide 14

Slide 14 text

How ads work

Slide 15

Slide 15 text

How ads work

Slide 16

Slide 16 text

How ads work
Advertiser, Publisher

Slide 17

Slide 17 text

Ad Pricing
● Popular models
  ○ CPM -> cost per mille (per thousand impressions)
  ○ CPC -> cost per click
  ○ CPA -> cost per acquisition
● Others
  ○ CPF -> cost per follower
  ○ CPV -> cost per view
  ○ CPI -> cost per install
  ○ CPD -> cost per download
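The three popular pricing models differ only in which event is billed. A minimal sketch of the arithmetic follows; the function name, rates, and campaign numbers are hypothetical and only serve to show how the same campaign is charged under each model.

```python
def campaign_cost(model, rate, impressions=0, clicks=0, acquisitions=0):
    """Return what the advertiser pays under the given pricing model.

    CPM bills per 1,000 impressions, CPC per click, CPA per acquisition.
    """
    if model == "CPM":
        return rate * impressions / 1000
    if model == "CPC":
        return rate * clicks
    if model == "CPA":
        return rate * acquisitions
    raise ValueError(f"unknown pricing model: {model}")


# Hypothetical campaign: 200,000 impressions, 1,000 clicks, 20 acquisitions.
cpm_cost = campaign_cost("CPM", rate=2.50, impressions=200_000)
cpc_cost = campaign_cost("CPC", rate=0.40, clicks=1_000)
cpa_cost = campaign_cost("CPA", rate=15.00, acquisitions=20)

print(f"CPM: {cpm_cost:.2f}  CPC: {cpc_cost:.2f}  CPA: {cpa_cost:.2f}")
```

Moving from CPM to CPC to CPA shifts risk from the advertiser to the publisher or ad platform, since payment depends on progressively rarer user actions.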

Slide 18

Slide 18 text

Criteo

Slide 19

Slide 19 text

Criteo

Slide 20

Slide 20 text

Attribution Model

Slide 21

Slide 21 text

Attribution Probability

Slide 22

Slide 22 text

Attribution-Aware Bidder

Slide 23

Slide 23 text

Bidding

Slide 24

Slide 24 text

Result