Slide 1

Slide 1 text

Dances with unicorns: Agile datascience from exploration to adoption

Slide 2

Slide 2 text

A big thank-you to our sponsors and partners

Slide 3

Slide 3 text

Oman 11.30 AM

Slide 4

Slide 4 text

Paris 9.45 AM

Slide 5

Slide 5 text

<24h From user’s feedback to production

Slide 6

Slide 6 text

One month later...

Slide 7

Slide 7 text

This project has been abandoned

Slide 8

Slide 8 text

That’s me! Wassel Alazhar Consultant, developer, problem solver @wasselovski https://github.com/jcraftsman

Slide 9

Slide 9 text

Agenda
● The full story
● What went wrong?
● What did we learn?
  ○ How to bring value from datascience
  ○ Explore and build
  ○ Efficient collaboration
  ○ Product quality
● Why this talk?
● Takeaways

Slide 10

Slide 10 text

The full story

Slide 11

Slide 11 text

A global energy leader

Slide 12

Slide 12 text

A global energy leader Produce Deliver SELL

Slide 13

Slide 13 text

A global energy leader Produce Deliver SELL Sensors everywhere! All along the value chain

Slide 14

Slide 14 text

The problem to solve Produce Deliver SELL Sensors everywhere! All along the value chain

Slide 15

Slide 15 text

The problem to solve Two new-generation power plants. They are exactly the same, but... Twin A Twin B

Slide 16

Slide 16 text

The problem to solve Twin B performs far better (i.e., it makes money) Twin A Twin B

Slide 17

Slide 17 text

The solution Datascience can help identify better operational models for the power plants

Slide 18

Slide 18 text

The solution BIG DATA + DATA SCIENCE = MAGIC

Slide 19

Slide 19 text

The partner

Slide 20

Slide 20 text

The partner A unicorn is a privately held startup company valued at over $1 billion

Slide 21

Slide 21 text

The bill But wait… How much is that? Never mind. It’s all on me! It’s called innovation. Great!

Slide 22

Slide 22 text

The team Data engineers Data scientists App developers

Slide 23

Slide 23 text

To Silicon Valley Data engineers Data scientists App developers

Slide 24

Slide 24 text

Week after week… Demo after demo

Slide 25

Slide 25 text

It couldn’t be any better

Slide 26

Slide 26 text

SURPRISE Now, it’s all yours! All you have to pay for is the run. Oh! No, thanks. I’m out.

Slide 27

Slide 27 text

Disappointment

Slide 28

Slide 28 text

What went wrong?

Slide 29

Slide 29 text

Building software!

Slide 30

Slide 30 text

What was the Problem to solve? Do you remember the twin power plants? Twin A Twin B

Slide 31

Slide 31 text

Not what we expected... Problem explained quickly (not solved). No actionable findings.

Slide 32

Slide 32 text

Instead, we delivered features! Degradation analysis. Anomaly detection. The software can detect dust in the steam turbine! PCA???

Slide 33

Slide 33 text

Feature ≠ VALUE

Slide 34

Slide 34 text

What did we learn?

Slide 35

Slide 35 text

Stories with a happy ending... Predictive maintenance. Smart buildings. Ice detection. Heating and cooling efficiency.

Slide 36

Slide 36 text

Business use case discovery Don’t start with software! Explore. Observe. Confirm the hypothesis, or not! Discover.

Slide 37

Slide 37 text

Business use case discovery Don’t explore in a dark lab! Get feedback!

Slide 38

Slide 38 text

Business use case discovery A Python notebook is not software! It’s a tool for a study!

Slide 39

Slide 39 text

From study to product delivery Business use case located? Build! VALUE

Slide 40

Slide 40 text

Product delivery Not like this!

Slide 41

Slide 41 text

Wait, what does datascience look like in 2018? How would you write a program for puppy recognition?

Slide 42

Slide 42 text

Wait, what does datascience look like in 2018?
You can:
● Try to define what a puppy face is
● Code all these rules!
Or, use machine learning:
● Show it a lot of examples of puppy faces!
You don’t need to tell the algorithm what to do. All you need is to show it a lot of examples!

Slide 43

Slide 43 text

Wait, what does datascience look like in 2018? Take care of your examples (data pipeline) Verify the results (predictions)
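A minimal sketch of "show examples, then verify the predictions", written in plain Python. Everything in it is an assumption made for illustration: the two features (ear floppiness and roundness), the toy numbers, and the nearest-centroid model are not from the talk. It only shows that no rule about puppy faces is coded by hand and that predictions are checked on examples the model has never seen.

```python
from statistics import mean

# Toy, made-up examples: (ear_floppiness, roundness) -> label.
# In a real project these would come out of the data pipeline.
TRAINING_EXAMPLES = [
    ((0.9, 0.3), "puppy"),
    ((0.8, 0.4), "puppy"),
    ((0.7, 0.2), "puppy"),
    ((0.1, 0.9), "muffin"),
    ((0.2, 0.8), "muffin"),
    ((0.0, 0.7), "muffin"),
]

# Held-out examples, never shown during training, used to verify predictions.
TEST_EXAMPLES = [
    ((0.85, 0.35), "puppy"),
    ((0.15, 0.85), "muffin"),
]


def train(examples):
    """Learn one centroid (average point) per label from the examples."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(axis) for axis in zip(*points))
        for label, points in by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, examples):
    """Share of examples whose prediction matches the expected label."""
    hits = sum(predict(model, features) == label for features, label in examples)
    return hits / len(examples)


model = train(TRAINING_EXAMPLES)
print(predict(model, (0.75, 0.25)))
print(accuracy(model, TEST_EXAMPLES))
```

The quality of `TRAINING_EXAMPLES` drives everything here, which is why the data pipeline deserves as much care as the model.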

Slide 44

Slide 44 text

Putting it all together Discovery: Given a sample of real-world pictures, would it be possible to recognize a puppy face? The answer is 86% yes, 13% muffins, 1% unknown. Product: Play a dog kibble commercial whenever a puppy picture is displayed!

Slide 45

Slide 45 text

Explore and build
Explore:
● Gathering data
● Cleaning data
● Feature engineering
● Defining model
● Training
● Predicting the output
=> Discover what you are able to do with your data
Build:
● Data acquisition
● Data filtering
● Use model configuration
● Use model
● Training (or use a train set)
● Predicting the output
=> Steadily bring value from your data

Slide 46

Slide 46 text

Explore and build iteratively
Explore:
● Gathering data
● Cleaning data
● Feature engineering
● Defining model
● Training
● Predicting the output
=> Discover what you are able to do with your data
Build:
● Data acquisition
● Data filtering
● Use model configuration
● Use model
● Training (or use a train set)
● Predicting the output
=> Steadily bring value from your data
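One way to picture the hand-over between the two columns, as a hedged sketch: the exploration step searches for a model configuration, and the build step only reuses that configuration in a repeatable pipeline. The threshold "model", the function names, and the sensor values below are hypothetical and chosen only to keep the example short.

```python
def clean(readings):
    """Data cleaning / filtering: drop missing values (None)."""
    return [r for r in readings if r is not None]


def explore(labelled_readings, candidate_thresholds):
    """Explore: try several thresholds and keep the one that gets
    the most labelled readings right. Returns a model configuration."""
    def score(threshold):
        return sum(
            (value > threshold) == is_anomaly
            for value, is_anomaly in labelled_readings
        )

    return {"threshold": max(candidate_thresholds, key=score)}


def build_pipeline(config):
    """Build: freeze the configuration found during exploration
    into a function that can run on every new batch of data."""
    def run(raw_readings):
        return [value > config["threshold"] for value in clean(raw_readings)]

    return run


# Exploration on a small labelled sample: (sensor value, is_anomaly).
sample = [(10, False), (12, False), (30, True), (35, True)]
config = explore(sample, candidate_thresholds=[5, 20, 40])

# Product: the same configuration applied to fresh, unlabelled data.
detect_anomalies = build_pipeline(config)
print(config)
print(detect_anomalies([11, None, 33]))
```

Iterating means going back to `explore` whenever new data or user feedback questions the configuration, then rebuilding the pipeline from the new result.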

Slide 47

Slide 47 text

Explore and build iteratively Explore Build Business use case discovery Product delivery

Slide 48

Slide 48 text

Product delivery You’re not done with datascience! Everyone on the team should build together!

Slide 49

Slide 49 text

Building together Code review When? All the time! Who? Everyone! Why? Quality, collective ownership and joy!

Slide 50

Slide 50 text

Building together Pair programming When? All the time! Who? Everyone! Why? Quality, collective ownership and joy!

Slide 51

Slide 51 text

Building together Mob programming When? Whenever you start something new or complex. Who? Everyone! Why? Collective intelligence, collective ownership, quality and joy!

Slide 52

Slide 52 text

Building together TDD Let’s be serious! When? Whenever you change the product’s behaviour. Who? Everyone working on the product! Why? Collective intelligence, collective ownership, quality and joy!

Slide 53

Slide 53 text

Building together TDD Have you ever met a data scientist who writes unit tests and refactors? I did! :) It’s hard to imagine doing TDD during exploratory work though! (i.e., when the target observable behaviour is not yet defined)
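Once the target behaviour is defined, a data-filtering step is an easy place to start with TDD. The sketch below is an assumption-laden illustration, not code from the project: the function name and the rule "drop missing and NaN readings" are hypothetical. In a TDD flow the two tests would be written first, fail, and then drive the implementation.

```python
import math


def remove_invalid_readings(readings):
    """Keep numeric readings; drop None and NaN values."""
    return [r for r in readings if r is not None and not math.isnan(r)]


def test_keeps_valid_readings_in_order():
    assert remove_invalid_readings([1.5, 2.0, 3.25]) == [1.5, 2.0, 3.25]


def test_drops_missing_and_nan_readings():
    raw = [1.0, None, float("nan"), 4.0]
    assert remove_invalid_readings(raw) == [1.0, 4.0]


# Minimal runner so the example works without a test framework.
for test in (test_keeps_valid_readings_in_order, test_drops_missing_and_nan_readings):
    test()
print("all tests passed")
```

With tests like these in place, the filtering code can be refactored by anyone on the team without fear of silently changing the product's behaviour.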

Slide 54

Slide 54 text

Product delivery Spikes and user stories

Slide 55

Slide 55 text

Product delivery essentials Don’t waste time repeating boring stuff! Automate! Make data available for everyone! Don’t treat your infra like pets! Destroy and rebuild! Don’t over-engineer though!

Slide 56

Slide 56 text

Product adoption Stay close to the users! Don’t plan too many features! Incorporate feedback!

Slide 57

Slide 57 text

What is agile anyway?

Slide 58

Slide 58 text

Can datascience be agile? It’s still true! Even for: ● Big data ● AI ● Datascience

Slide 59

Slide 59 text

Why this talk?

Slide 60

Slide 60 text

Myths about datascience Well… Things have slightly changed since then… But not that much!

Slide 61

Slide 61 text

Myths about datascience

Slide 62

Slide 62 text

Myths about datascience

Slide 63

Slide 63 text

Unicorns

Slide 64

Slide 64 text

New unicorns - Same old stories You should draw your entire model before you start coding! Open a ticket! You need to hire a machine learning engineer!

Slide 65

Slide 65 text

Takeaways!

Slide 66

Slide 66 text

Takeaways!
Make people work and learn together!
Business value discovery => product delivery
Explore and build iteratively
Agile is still:
● Short feedback loops
● Small increments
● Take engineering seriously

Slide 67

Slide 67 text

OCTO © 2018 - Reproduction prohibited without prior written permission. OCTO Provence is hiring! It is above all a start-up mindset, backed by tech, agile and change expertise, to support our clients in their digital transformation. Contact us at [email protected]