
ML Data Products

Trends, specifics, and general aspects

Flavio Clesio

May 14, 2021

Transcript

  1. ML Data Products: Trends, specifics, and general aspects

    Flavio Clesio / @flavioclesio / flavioclesio.com / flavioclesio at gmail dot com / 05.2021
  2. About me

    • Staff Data/ML Engineer @ Artsy
    • MSc. Production Engineering (Computational Intelligence)
    • Some of my thoughts: flavioclesio.com
    • Some conference talks (Strata Hadoop World, Spark Summit, PAPIS.io, The Developers Conference, etc.)
    • Independent Researcher (Applied Machine Learning) in spare time
    flavioclesio / Flávio Clésio
  3. Disclaimer #1

    Topics that won't be discussed in this talk:
    • Agile / Scrum / Kanban / CMMI Methodologies
    • Corporate Culture
    • Spotify “Squad Model”
    • Slides with Terminator pictures
    • Regulatory aspects related to AI
  4. Disclaimer #2

    All views expressed in this presentation are my own. They have not been reviewed or approved by my current, past, or future employers. I do not speak on behalf of any company. All views expressed here are based on my personal, empirical experience in the industry in recent years. Do not take any part of those views as hard science, a best-practices playbook for socio-technological systems, cautionary tales, BroScience, or any kind of science at all. The idea here is only to share some experiences from a practitioner's standpoint.
  5. ML Data Products - Main Characteristics

    • Repetition
    • Zero marginal cost
    • High frequency
    • Low latency
    • Network effects or marketplace dynamics
    • Removes user friction
    • Can scale in a platform
    • Low error cost (doesn’t require 100% accuracy)
    • Shines where humans are brittle
    • Rules too complex to handle manually
  6. ML Data Products - Product Role Types

    • Critical or Complementary
    • Private or Public
    • Proactive or Reactive
    • Visible or Invisible
    • Dynamic or Static
    Source: Apple - Defining the Role of Machine Learning in Your App
  7. ML Data Products - Landscape

    • Classifiers
    • Recommender Systems
    • Matching
    • Prediction
    • Image Recognition
    • Text Classifiers
    • Translation
    • Text-to-Speech
    • Sequence Prediction
  8. Trend in Organizations/Corporations: Growing faith that AI can help businesses...

    [1] - The 2020 State of AI and Machine Learning Report
  9. How much money is on the table?

    • ~45% of total economic gains by 2030 will come from product enhancements, stimulating consumer demand. This is because AI will drive greater product variety, with increased personalization, attractiveness and affordability over time. [23]
    • ~45% of work activities could potentially be automated by today’s technologies, and 80% of that is enabled by machine learning. [10]
  10. Current State of Affairs (#1)

    [11] - Why do 87% of data science projects never make it into production?
  11. Current State of Affairs (#2)

    [1] - The 2020 State of AI and Machine Learning Report
  12. Machine Learning Projects: “Unknown unknowns” instead of “Known unknowns”

    [8] - But what is this “machine learning engineer” actually doing?
  13. Machine Learning Projects: Behavior instead of flow

    • Software Engineering Projects: Deterministic and transparent representation of a business flow
    • Machine Learning Projects: Sometimes non-deterministic and opaque; they represent explicit and implicit behaviors embedded in data
  14. Machine Learning Projects: Uncertainty is the only certainty

    • Challenges abound: non-deterministic outcomes, uncertainty, opacity, fairness issues, and other factors make AI a difficult sell to decision-makers and upper management. [4]
  15. Machine Learning Projects: Uncertainty is the only certainty

    • Because it’s so different from traditional software development, where the risks are more or less well-known and predictable, AI rewards people and companies that are willing to take intelligent risks, and that have (or can develop) an experimental culture. [4]
  16. Machine Learning Projects: Main components

    [16] - CD for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications
  17. Machine Learning Projects: Technical Roles and touchpoints

    [16] - CD for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications
  18. Machine Learning Projects: Touchpoints and workflow

    [16] - CD for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications
  19. Machine Learning: Engineering Endeavors

    Machine Learning: The High-Interest Credit Card of Technical Debt [18]
    • Machine Learning and Complex Systems
    • Complex Models Erode Boundaries
    • Data Dependencies Cost More than Code Dependencies
    • System-level Spaghetti
    • Dealing with Changes in the External World
  20. Product Management for Machine Learning

    [13] - Executive Briefing: Why managing machines is harder than you think
  21. Machine Learning Projects: Management Skills

    The Data Expertise of the AI PM [15]
    • Skill: Data Lifecycle and Pipeline Management
    • Skill: Experimentation and Measurement
    • Skill: DS/ML/AI Development Process
  22. Machine Learning Projects: Important questions Product Managers need to ask

    • The data PM understands the technological infrastructure involved in building products at a technical level.
    • What kind of infrastructure is needed to support the product?
    • Do machine learning models need to be scored in real-time or can they be prescored offline?
    • What is the plan for retraining models on new data?
    • How will the model’s success be evaluated over time?
    • What is the complexity cost for implementing the model in production? [9]
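The real-time versus prescored question above can be made concrete with a small sketch. Everything here is hypothetical and not from the talk: `toy_model` is a hand-written stand-in for a trained model, and the feature names `visits` and `is_subscriber` are invented for illustration. The point is only that the prescored path turns the request into a table lookup, while the real-time path runs the model on every request.

```python
from typing import Callable, Dict

Features = Dict[str, float]
Model = Callable[[Features], float]


def toy_model(features: Features) -> float:
    """Hypothetical stand-in for a trained model: a linear score clipped to [0, 1]."""
    raw = 0.1 + 0.02 * features["visits"] + 0.3 * features["is_subscriber"]
    return max(0.0, min(1.0, raw))


def prescore_offline(model: Model, feature_table: Dict[str, Features]) -> Dict[str, float]:
    """Batch job: score every known user ahead of time and keep the results."""
    return {user_id: model(feats) for user_id, feats in feature_table.items()}


def serve_prescored(score_table: Dict[str, float], user_id: str, default: float = 0.0) -> float:
    """Request path for the prescored option: a lookup, with a default for unseen users."""
    return score_table.get(user_id, default)


def serve_realtime(model: Model, features: Features) -> float:
    """Request path for the real-time option: the model runs on each request."""
    return model(features)
```

The trade-off the sketch exposes: prescored results are cheap to serve but go stale between batch runs and have nothing to say about unseen users, while real-time scoring needs fresh features and a model available at request time.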
  23. Machine Learning Projects: Key lessons [24]

    In general, for Machine Learning to make sense for a business, your problem should have these characteristics:
    • Requires complex logic that’s impractical to solve with human-defined rules or heuristics.
    • The problem will be scaling up very fast.
    • Requires personalization at scale.
    • Requires rules that change quickly over time.
    • Has a known, pre-defined end result.
    • Does not require 100% accuracy.
  24. Machine Learning Projects: Key lessons

    • Feasibility analysis can save tons of money that would be wasted in death march projects
    • Start with vertical prototyping on a very narrow use case and expand it in further iterations
    • A good experimentation strategy, such as well-designed A/B testing or multi-armed bandit strategies, can speed up learning and enhancements
    • ML Algorithms + Business Rules/Heuristics is a powerful combination
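As an illustration of the bandit strategies mentioned above, here is a minimal epsilon-greedy sketch. It is one simple bandit policy chosen for brevity, not necessarily what the talk had in mind. The function name, the simulated conversion rates, and the parameter defaults are all assumptions made for this example.

```python
import random
from typing import List, Tuple


def run_epsilon_greedy(
    true_rates: List[float],
    n_rounds: int = 5000,
    epsilon: float = 0.1,
    seed: int = 0,
) -> Tuple[List[int], List[float]]:
    """Simulate an epsilon-greedy bandit over variants with hidden conversion rates.

    true_rates is a hypothetical ground truth used only to simulate user responses.
    Returns how often each variant was shown and how many conversions it collected.
    """
    rng = random.Random(seed)
    n_arms = len(true_rates)
    pulls = [0] * n_arms      # times each variant was shown
    rewards = [0.0] * n_arms  # conversions observed per variant

    for _ in range(n_rounds):
        # Explore with probability epsilon, and also until every variant
        # has been shown at least once (avoids dividing by zero below).
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the variant with the best observed conversion rate.
            arm = max(range(n_arms), key=lambda a: rewards[a] / pulls[a])
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        rewards[arm] += reward

    return pulls, rewards
```

Unlike a fixed-split A/B test, this policy shifts traffic toward the variant that looks best while the experiment is still running, at the cost of less even data across variants.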
  25. Machine Learning Projects: Key lessons

    • A PM who knows the limitations of AI/ML can craft better products
    • Offline evaluation is silver, but user experience evaluation is gold
    • Because user experience evaluation is expensive, formulate testing protocols (e.g. Concierge Tests, Alpha, Stealth Tests, Early Adopters, Smoke Tests, etc.)
    • Embrace failure, but correct course fast
  26. Closing thoughts

    • There are opportunities, and companies are starting to realize this
    • Lack of AI/ML/DS skills is one of the main sources of project failure
    • Vertical prototyping first, and then MVPs
    • Small iterations; if it looks promising, scale
    • Fairness, transparency, and accountability are real issues and need to be considered in any project of this nature
    • Real understanding and pragmatism can cut the hype and help Product & Engineering teams ship
  27. ML Data Products: Trends, specifics, and general aspects

    Flavio Clesio / @flavioclesio / flavioclesio.com / flavioclesio at gmail dot com / 05.2021
  28. References

    [1] - The 2020 State of AI and Machine Learning Report
    [2] - KDD, SEMMA and CRISP-DM: A parallel overview
    [3] - Machine learning requires a fundamentally different deployment approach
    [4] - What you need to know about product management for AI
    [5] - The AI Hierarchy of Needs
    [6] - Rules of Machine Learning: Best Practices for ML Engineering
    [7] - Managing Machine Learning Projects
    [8] - But what is this “machine learning engineer” actually doing?
    [9] - Rise of the Data Product Manager
  29. References

    [10] - The First Wave of Corporate AI Is Doomed to Fail
    [11] - Why do 87% of data science projects never make it into production?
    [12] - Machine Learning Product Management: Lessons Learned
    [13] - Executive Briefing: Why managing machines is harder than you think
    [14] - What you need to know about product management for AI
    [15] - Practical Skills for The AI Product Manager
    [16] - Continuous Delivery for Machine Learning: Automating the end-to-end lifecycle of Machine Learning applications
    [17] - What to Do When AI Fails
    [18] - Machine Learning: The High-Interest Credit Card of Technical Debt
  30. References

    [19] - AI adoption in the enterprise 2020
    [20] - The New Business of AI (and How It’s Different From Traditional Software)
    [21] - AI Playbook
    [22] - A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
    [23] - PwC’s Global Artificial Intelligence Study: Exploiting the AI Revolution
    [24] - When and How to Add Machine Learning to a Product Roadmap
    [25] - CIO Survey: Top 3 Challenges Adopting AI and How to Overcome Them
    [26] - Failure rates for analytics, AI, and big data projects = 85% – yikes!
    [27] - Two years in the life of AI, ML, DL and Java
  31. References

    [28] - Too many machine learning papers?
    [29] - 100+ AI Use Cases & Applications in 2020: In-Depth Guide
    [30] - The macroeconomic impact of artificial intelligence
    [31] - The Future of MLOps … and how did we get here?
    [32] - The age of analytics: Competing in a data-driven world