Dances with unicorns: agile data science, from exploration to adoption

Esprit Agile
December 06, 2018

Talk by Wassel Alazhar at Agile Tour Aix-Marseille 2018.

"Data science is finally coming out of the scientists' caves. Everyone is convinced that this discipline will be a game changer.
Tools and platforms abound, and the startups that made this bet have become unicorns.

However, building applications that deliver real added value remains a real challenge. Several promising projects never made it past the PoC stage.

Last year, I accompanied my client on its adventure with a Californian "unicorn" that produces a Big Data analytics and AI platform.
In this session, we will see how to do agile data science that adds value continuously.

We will look at:

– exploring business value
– collaboration between different areas of expertise (data science, data integration, app development, UX…)
– product quality in an exploratory domain

Agility at every level!"

Transcript

  1. Dances with unicorns
    Agile datascience
    from exploration to adoption

  2. A big thank you to our sponsors and partners

  3. Oman 11.30 AM

  4. Paris 9.45 AM

  5. <24h
    From user feedback to production

  6. One month later...

  7. This project has been
    abandoned

  8. That’s me!
    Wassel Alazhar
    Consultant, developer, problem solver
    @wasselovski
    https://github.com/jcraftsman

  9. Agenda
    ● The full story
    ● What went wrong?
    ● What did we learn?
    ○ How to bring value from datascience
    ○ Explore and build
    ○ Efficient collaboration
    ○ Product quality
    ● Why this talk?
    ● Takeaways

  10. The full story

  11. A global energy leader

  12. A global energy leader
    Produce Deliver SELL

  13. A global energy leader
    Produce Deliver SELL
    Sensors everywhere!
    All along the value chain

  14. The problem to solve
    Produce Deliver SELL
    Sensors everywhere!
    All along the value chain

  15. The problem to solve
    Two new generation power plants
    They are exactly the same
    but...
    Twin A Twin B

  16. The problem to solve
    Twin B performs far better (i.e., makes money)
    Twin A Twin B

  17. The solution
    Data science can help identify
    better operating models
    for the power plants

  18. The solution
    BIG DATA
    +
    DATA SCIENCE
    =
    MAGIC

  19. The partner

  20. The partner
    A unicorn is a privately held startup company valued at over
    $1 billion

  21. The bill
    But wait…
    How much is that?
    Never mind.
    It’s all on me!
    It’s called innovation.
    Great!

  22. The team
    Data engineers Data scientists App developers

  23. To Silicon Valley
    Data engineers Data scientists App developers

  24. Week after week… Demo after demo

  25. It couldn’t be any better

  26. SURPRISE
    Now, it’s all yours!
    All you have to pay
    for is the run.
    Oh!
    No, thanks.
    I’m out of it.

  27. Disappointment

  28. What went wrong?

  29. Building software!

  30. What was the Problem to solve?
    Do you remember the twin power plants?
    Twin A Twin B

  31. Not what we expected...
    Problem solved explained quickly
    No actionable findings

  32. Instead we have delivered features!
    Degradation analysis
    Anomaly detection
    The software can detect dust in the steam turbine!
    PCA???

  33. Feature ≠ VALUE

  34. What did we learn?

  35. Happy ending stories...
    Predictive maintenance
    Smart buildings
    Ice detection
    Heating and cooling efficiency

  36. Business use case discovery
    Don’t start with software!
    Explore
    Observe
    Confirm the hypothesis, or not!
    Discover

  37. Business use case discovery
    Don’t explore in a dark lab!
    Get feedback!

  38. Business use case discovery
    A Python notebook is not software!
    It’s a tool for a study!

  39. From study to product delivery
    Business use case located?
    Build!
    VALUE

  40. Product delivery
    Not like this!

  41. Wait, what does datascience look like in 2018?
    How would you write a program for puppy recognition?

  42. Wait, what does datascience look like in 2018?
    You can:
    ● Try to define what a puppy face is
    ● Code all these rules!
    Or, use Machine learning:
    ● Show a lot of puppy faces examples!
    You don’t need to tell the algorithm what to do.
    All you need is to show it a lot of examples!

  43. Wait, what does datascience look like in 2018?
    Take care of your examples (data pipeline)
    Verify the results (predictions)
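
A minimal sketch of the two points above, assuming a toy dataset invented for illustration (the measurements, labels and function names are hypothetical, not from the talk). Nobody writes rules for what a puppy is: a nearest-centroid model is learned from labelled examples, then checked against examples it has never seen.

```python
from statistics import mean

# Hypothetical labelled examples: (ear_length_cm, snout_length_cm) -> label.
EXAMPLES = [
    ((4.0, 3.0), "puppy"),
    ((4.5, 3.5), "puppy"),
    ((5.0, 3.2), "puppy"),
    ((0.5, 0.2), "muffin"),
    ((0.8, 0.1), "muffin"),
    ((0.3, 0.3), "muffin"),
]


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    rows_by_label = {}
    for features, label in examples:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, examples):
    """Share of examples for which the prediction matches the label."""
    hits = sum(predict(centroids, features) == label for features, label in examples)
    return hits / len(examples)


# Take care of the examples: keep some aside so the results can be verified.
train_set = EXAMPLES[:2] + EXAMPLES[3:5]
test_set = [EXAMPLES[2], EXAMPLES[5]]

model = train(train_set)
print("accuracy on held-out examples:", accuracy(model, test_set))
```

The held-out set is what makes "verify the results" meaningful: accuracy measured on the training examples alone would say little about new pictures.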

  44. Putting it all together
    Discovery:
    Given a sample of real-world pictures, would it be possible to
    recognize a puppy face?
    The answer is 86% yes, 13% muffins, 1% unknown.
    Product:
    Play a dog kibble commercial whenever a puppy picture is
    displayed!

  45. Explore and build
    Explore:
    ● Gathering data
    ● Cleaning data
    ● Feature engineering
    ● Defining model
    ● Training
    ● Predicting the output
    => Discover what you are able to do
    with your data
    Build:
    ● Data acquisition
    ● Data filtering
    ● Use model configuration
    ● Use model
    ● Training (or use a training set)
    ● Predicting the output
    => Steadily bring value from your
    data
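
One possible reading of the Explore/Build split, sketched with a deliberately simple z-score anomaly detector. The sensor values, function names and threshold are assumptions made up for illustration. Exploration studies historical data and outputs a model configuration; the build side only loads that configuration and applies it to newly acquired data.

```python
import json
from statistics import mean, stdev


def clean(readings):
    """Drop missing values (None) from raw sensor readings."""
    return [r for r in readings if r is not None]


def explore(historical_readings, z_threshold=3.0):
    """Explore: study historical data and produce a model configuration."""
    data = clean(historical_readings)
    return {"mean": mean(data), "stdev": stdev(data), "z_threshold": z_threshold}


def detect_anomalies(config, new_readings):
    """Build: apply the configuration found during exploration to new data."""
    data = clean(new_readings)
    limit = config["z_threshold"] * config["stdev"]
    return [r for r in data if abs(r - config["mean"]) > limit]


# Exploration side (e.g. in a notebook): hypothetical temperature history.
history = [20.1, 19.8, None, 20.3, 20.0, 19.9, 20.2]
config_json = json.dumps(explore(history))

# Product side: load the configuration and use it on freshly acquired readings.
anomalies = detect_anomalies(json.loads(config_json), [20.1, 35.0, None, 19.7])
print(anomalies)
```

Passing the configuration as JSON is only one way to hand results from the study to the product; the point is that the build code never repeats the exploratory work.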

  46. Explore and build iteratively
    Explore:
    ● Gathering data
    ● Cleaning data
    ● Feature engineering
    ● Defining model
    ● Training
    ● Predicting the output
    => Discover what you are able to do
    with your data
    Build:
    ● Data acquisition
    ● Data filtering
    ● Use model configuration
    ● Use model
    ● Training (or use a training set)
    ● Predicting the output
    => Steadily bring value from your
    data

  47. Explore and build iteratively
    Explore Build
    Business
    use case
    discovery
    Product delivery

  48. Product delivery
    You’re not done with datascience!
    They should build together!

  49. Building together
    Code review
    When? All the time!
    Who? Everyone!
    Why? Quality, collective
    ownership and joy!

  50. Building together
    Pair programming
    When? All the time!
    Who? Everyone!
    Why? Quality, collective
    ownership and joy!

  51. Building together
    Mob programming
    When? Whenever you start
    something new or complex.
    Who? Everyone!
    Why? Collective
    intelligence, collective
    ownership, quality and joy!

  52. Building together
    TDD
    Let’s be serious!
    When? Whenever you change the
    product’s behaviour.
    Who? Everyone working on the product!
    Why? Collective intelligence,
    collective ownership, quality and joy!

  53. Building together
    TDD
    Have you ever met a data
    scientist who writes unit tests
    and refactors? I have! :)
    It’s hard to imagine doing TDD
    during an exploratory work though!
    (i.e., when the target observable
    behaviour is not yet defined)
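
Once the target behaviour is defined, a data pipeline step can be test-driven like any other code. A hedged sketch using Python's standard `unittest` module; the `drop_invalid_readings` function and its valid range are hypothetical, chosen only to illustrate the idea.

```python
import unittest


def drop_invalid_readings(readings, low, high):
    """Keep only readings inside the plausible range [low, high]."""
    return [r for r in readings if r is not None and low <= r <= high]


class DropInvalidReadingsTest(unittest.TestCase):
    def test_keeps_readings_inside_the_range(self):
        self.assertEqual(
            drop_invalid_readings([10.0, 20.0], low=0.0, high=100.0),
            [10.0, 20.0],
        )

    def test_drops_missing_and_out_of_range_readings(self):
        self.assertEqual(
            drop_invalid_readings([None, -5.0, 50.0, 500.0], low=0.0, high=100.0),
            [50.0],
        )


# Run the tests programmatically so the script does not exit on completion.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DropInvalidReadingsTest)
result = unittest.TextTestRunner().run(suite)
```

In a TDD cycle the two tests would be written first and fail, then the function would be implemented and refactored with the tests as a safety net.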

  54. Product delivery
    Spikes and user stories

  55. Product delivery essentials
    Don’t lose time repeating boring stuff!
    Automate!
    Make data available for everyone!
    Don’t treat your infra like pets!
    Destroy and rebuild!
    Don’t over-engineer though!

  56. Product adoption
    Stay close to the users!
    Don’t plan too many features!
    Incorporate feedback!

  57. What is agile anyway?

  58. Can datascience be agile?
    It’s still true!
    Even for:
    ● Big data
    ● AI
    ● Datascience

  59. Why this talk?

  60. Myths about datascience
    Well… Things have slightly changed since then… But not that much!

  61. Myths about datascience

  62. Myths about datascience

  63. Unicorns

  64. New unicorns - Same old stories
    You should draw your entire
    model before you start coding!
    Open a ticket!
    You need to hire a machine
    learning engineer!

  65. Takeaways!

  66. Takeaways!
    Bring people together!
    Business value discovery => product delivery
    Explore and build iteratively
    Agile is still:
    ● Short feedback loops
    ● Small increments
    ● Take engineering seriously
    work
    learn

  67. OCTO © 2018 - Reproduction prohibited without prior written authorization
    OCTO Provence is hiring!
    Above all, a start-up mindset
    backed by tech, agile & change expertise
    to support our clients in their digital transformation
    Contact us at
    [email protected]
