AI Observability ツ - Examples with Spark, Pandas, and Scikit-Learn

Slide 2

© Kensu, inc. 2021

AI Observability ツ
Examples with Spark, Pandas, and Scikit-Learn

AGENDA

1. “Who’s that dude” (73’)
2. Introduction to `Datastrophes` (10’)
3. Solution: needs and methods (15’)
4. Showtime: implementation examples (15’)

Slide 4

Introduction to `Datastrophes`

Like any project, a data project needs to limit its scope, and doing so requires many assumptions. Those assumptions are made by both the business (about the market) and engineering (about the system), which inevitably leads to Datastrophes.

Catastrophe = denouement of a drama.
Datastrophe = catastrophe with data.
----------------------------------------------------------------------
Datastrophe = denouement of a DAMA (*).

(*) DAta MAnagement

Slide 16

Datastrophes ⇢? AI Winter 🥶🥶🥶

“The AI winter was a result of such hype, due to over-inflated promises by developers, unnaturally high expectations from end-users, and extensive promotion in the media.”
https://www.actuaries.digital/2018/09/05/history-of-ai-winters/

Slide 17

Datastrophes ⇢? AI Winter 🥶🥶🥶

Why?
- POC Syndrome
- ROI drops over time

It is not about garbage in, garbage out, or about data quality, but about trusting what is going on with data in production.

“Datastrophes”
1. Data is hard to find and to use in production
2. Cost of maintenance reduces team capabilities
3. Impact assessments are ineffective, incomplete
4. Data variations are uncontrolled and unknown

“15% of documentation overhead to ensure compliance and Data Catalog usefulness” -- project manager
“Data is not available on time in production” -- data ops
“Data suppliers changed schema, or semantic (business definition), impacting business rules accuracy” -- data engineer
“The data is different than 6 months ago, all predictions are wrong” -- data scientist

Slide 18

AI Observability

Wave-Particle duality

A. Einstein: “It seems as though we must use sometimes the one theory and sometimes the other, while at times we may use either. We are faced with a new kind of difficulty. We have two contradictory pictures of reality; separately neither of them fully explains the phenomena of light, but together they do.”

Slide 19

AI Observability

A Machine Learning model can be seen as:
- Data: it is a bunch of doubles resulting from the training process on the observations (i.e. the known world).
- Application: it is used as a function (e.g. to predict).

Moreover, the behavior of the application part depends on the observations used in the training phase. Our control resides in the hyperparameters we provide (or find) during training.

It is like Java, Scala, Python, R, Go, SQL, etc. code changing automatically with its context, while what it is becoming remains unknown.
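The two views of a model can be sketched in a few lines. This is only an illustration in plain Python, assuming a one-feature ridge regression as a stand-in for a real Scikit-Learn estimator; the names `LinearModel`, `fit`, and `alpha` are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearModel:
    """The 'data' view: a fitted model is just a couple of doubles."""
    slope: float
    intercept: float

    def predict(self, x: float) -> float:
        """The 'application' view: the same doubles used as a function."""
        return self.slope * x + self.intercept


def fit(xs: Sequence[float], ys: Sequence[float], alpha: float = 0.0) -> LinearModel:
    """One-feature ridge regression; alpha is the hyperparameter we control."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)
    intercept = mean_y - slope * mean_x
    return LinearModel(slope, intercept)


# Same code, same hyperparameter, different observations (made-up numbers).
model_then = fit([1, 2, 3, 4], [2, 4, 6, 8])
model_now = fit([1, 2, 3, 4], [8, 6, 4, 2])

print(model_then, model_then.predict(5))
print(model_now, model_now.predict(5))
```

Nothing in the source code changed between `model_then` and `model_now`, yet the function they implement is different, which is the sense in which the code changes with its context.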

Slide 20

AI Observability

I can hear that it is raining cats and dogs.
I see a poor person outside, walking down the street.
However, I don’t have to help, as the umbrella already does the job.

Note: in this case, it is raining Schrödinger's cats

Slide 21

AI Observability

As with Schrödinger's cat, an AI system can be considered, after a certain amount of time, both good and bad simultaneously, unless an observer looks into it and identifies its real state.

However, especially with AI… the question is: What do we have to observe? In other words, which outputs shall we use to infer the internal state?
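One possible answer, offered here as an assumption rather than as the talk's own method, is to observe summary statistics of the data an application reads and compare them with those seen at training time. The sketch below uses plain Python; `Profile`, `profile`, `has_drifted`, and the threshold `max_shift` are hypothetical names, and the numbers are made up.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Optional, Sequence


@dataclass(frozen=True)
class Profile:
    """Observable outputs for one numeric column."""
    count: int
    null_ratio: float
    mean: float
    stdev: float


def profile(values: Sequence[Optional[float]]) -> Profile:
    """Summarize a column; assumes at least one non-null value."""
    present = [v for v in values if v is not None]
    return Profile(
        count=len(values),
        null_ratio=1 - len(present) / len(values),
        mean=mean(present),
        stdev=pstdev(present),
    )


def has_drifted(reference: Profile, current: Profile, max_shift: float = 3.0) -> bool:
    """Flag the column when its mean moved more than max_shift reference stdevs."""
    if reference.stdev == 0:
        return current.mean != reference.mean
    return abs(current.mean - reference.mean) / reference.stdev > max_shift


# Illustrative values only: what the model saw then vs. what it sees now.
training = profile([10.0, 12.0, 11.0, 9.0, None])
production = profile([30.0, 32.0, 31.0, 29.0, 28.0])

print(has_drifted(training, production))
```

A mean-shift rule is deliberately crude; the point is only that the internal state ("is the model still valid?") is inferred from outputs that are cheap to collect while the application runs.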

Slide 23

(some) Links with Data Mesh

● Responsibility
The domain becomes responsible for the data it exposes
→ The consumer shares the responsibility by exposing its usages and constraints

● Data as a Product
Linked to the responsibility, SLAs (SLOs) have to be defined and communicated. But more importantly, their failures must be detected, or anticipated

● Federated Governance
As data products are shared and promoted, (analytical) applications mostly cross several domain boundaries.
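The "Data as a Product" point can be made concrete with a small check. This is a minimal sketch, assuming an SLO expressed as a maximum staleness and a minimum row count; the classes `DataProductSLO` and `Observation`, the function `violations`, and all values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class DataProductSLO:
    """What the producing domain promises (assumed shape)."""
    max_staleness: timedelta
    min_rows: int


@dataclass(frozen=True)
class Observation:
    """What was actually observed when the data was used."""
    last_updated: datetime
    row_count: int


def violations(slo: DataProductSLO, obs: Observation, now: datetime) -> List[str]:
    """Return the names of the objectives that the observation breaks."""
    problems = []
    if now - obs.last_updated > slo.max_staleness:
        problems.append("freshness")
    if obs.row_count < slo.min_rows:
        problems.append("volume")
    return problems


slo = DataProductSLO(max_staleness=timedelta(hours=24), min_rows=1000)
now = datetime(2021, 4, 1, 12, 0)
fresh = Observation(last_updated=datetime(2021, 4, 1, 6, 0), row_count=5000)
stale = Observation(last_updated=datetime(2021, 3, 29, 6, 0), row_count=200)

print(violations(slo, fresh, now))
print(violations(slo, stale, now))
```

Defining the SLO is the easy half; the check only becomes useful when observations are collected automatically at each usage, so that a failure is detected without someone having to look.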

Slide 25

Stay calm and let your codes/tools speak while they run

At least 3 strategies have been used successfully (running in prod 😁):

● Catch events or use APIs of high-end tools (e.g. Tableau): For example, lineage is increasingly implemented nowadays.

● Wrap your preferred libraries with auto-logging capabilities: Spark, Pandas, Dplyr, Spring, and so on can be beefed up with internal log reporting.

● Use the OpenTracing philosophy to capture facets from your data usage that can be reconsolidated later on (trace reporter)
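The wrapping strategy can be sketched with a decorator that reports a few facets each time data is read. This is an illustration only: it wraps a standard-library CSV reader rather than Spark or Pandas, and `observed`, `REPORTED`, and `read_customers` are hypothetical names, with the in-memory list standing in for a real trace reporter.

```python
import csv
import functools
import io
from typing import Callable, Dict, List

# Stand-in for a trace reporter; a real one would ship facets elsewhere.
REPORTED: List[Dict] = []


def observed(source_name: str) -> Callable:
    """Wrap a reader so that each call also reports schema and row count."""
    def decorator(read_fn: Callable) -> Callable:
        @functools.wraps(read_fn)
        def wrapper(*args, **kwargs):
            rows = read_fn(*args, **kwargs)
            REPORTED.append({
                "source": source_name,
                "function": read_fn.__name__,
                "schema": sorted(rows[0].keys()) if rows else [],
                "row_count": len(rows),
            })
            return rows  # the caller sees exactly what it saw before
        return wrapper
    return decorator


@observed("customers.csv")
def read_customers(text: str) -> List[Dict[str, str]]:
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


customers = read_customers("id,country\n1,BE\n2,FR\n")
print(REPORTED)
```

The application code keeps calling `read_customers` as usual; the observations are a side effect of running it, which is the sense of letting the code speak while it runs. The same pattern could presumably be applied around a library's own read and write functions, though how that is done for Spark or Pandas is not shown in the slides.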

Slide 26

Link with Data Intelligence Management (DIM)

Data Management, or more specifically Data Governance, is often thought of as a Data Catalog (metadata repository, glossary, workflow management, …)

DM by essence focuses on the data, for example to let one find a dataset based on its metadata - e.g. where is the customer data?

AI Observability allows an organization to also capture the purposes of the data usages through the lens of the applications. A usage-based catalog then lets one find datasets based on the purpose - e.g. how can I predict my churn?
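A usage-based catalog can be pictured as an index keyed by purpose instead of by metadata. The following is a minimal sketch under that assumption; `UsageCatalog`, its methods, and the application and dataset names are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


class UsageCatalog:
    """Index datasets by the purpose of the applications that used them."""

    def __init__(self) -> None:
        # purpose -> dataset -> applications observed using it for that purpose
        self._usages: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))

    def record(self, application: str, purpose: str, datasets: Iterable[str]) -> None:
        """Register one observed usage of some datasets by an application."""
        for dataset in datasets:
            self._usages[purpose][dataset].add(application)

    def find(self, purpose: str) -> List[str]:
        """Answer 'which datasets have been used for this purpose?'."""
        return sorted(self._usages.get(purpose, {}))


catalog = UsageCatalog()
catalog.record("churn-model-training", "churn prediction", ["customers", "subscriptions"])
catalog.record("monthly-report", "revenue reporting", ["invoices", "customers"])

print(catalog.find("churn prediction"))
```

The entries would come from observations reported by running applications rather than from manual documentation, which is what distinguishes this lookup from a metadata search.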

Slide 27

THANKS!

Ping me on @nooostab or LinkedIn
Check out Kensu DIM on https://kensu.io

🎺 📣 O’Reilly training (4/28): ML Monitoring in Python