EuRuKo 2017 - Data driven production apps

Sai Warang
September 29, 2017

A talk geared towards making a legacy Rails app smart and setting up the pillars for future success

Transcript

  1. Data-driven production apps
    Sai Warang @cyprusad

  2. The Problem Statement

  3. The Problem Statement
    inspired by true events

  4. Turning back time
    Platform for merchants to sell online

    Fund merchants who are showing promise to help them do
    better!

    Use data we already have to predict how merchants will do
    in the future

  5. The Setup
    Rails app that allows merchants to set up an online store,
    and that can process orders

    Following standard Rails design patterns to set up your
    application

  6. Model View Controller
    [Diagram: Web and Mobile clients, Controller, Model, View]

  7. Model
    Ruby class that encapsulates behaviour

    Has a persistence layer backed by a relational database

    Lastly, has historical data

  8. Model
    class Shop < ApplicationRecord
      has_many :orders
    end

    class Order < ApplicationRecord
      belongs_to :shop
    end

  9. Controller
    Wrangle out the behaviour and data that live in models

    Serve different consumers with this model data and logic

  10. View
    Templates that describe how to present data

    Doesn’t know where the data that fills the views comes
    from

  11. Cool, basics covered
    So what’s the problem?

  12. Artificial Intelligence

  13. Paradigm shifts
    An app that works as the user expects is no longer enough

    Need to be able to predict sane choices for the user and
    automate repetitive actions in your apps

    Recommendations

  14. ... specifically, we want to
    Predict which merchants are risky to fund and which are
    not

    Predict the amount that a merchant will reasonably be able
    to pay back

    Confidently automate underwriting

  15. The Dream ™
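    # Look up the existing shop (assumes merchant_id is the shop's primary key)
    shop = Shop.find(merchant_id)

    # BigData::Predict is the wished-for API, not an existing library
    funding = BigData::Predict.amount(shop)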

  16. Does MVC suffice?

  17. Engage the genius part of the brain

  18. AI and Machine Learning tools

  20. Wait a minute
    This is getting complex

  21. Wait a minute
    We can already process data within the application

  22. What is the simplest thing?
    Aggregations

    Cron background jobs for more complex calculations
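
As a rough illustration of the "simplest thing", an aggregation such as total sales per shop can be sketched in plain Ruby. The `Order` struct and its `shop_id` and `total` fields are assumptions standing in for the ActiveRecord models; in a Rails app the same result would more likely come from a grouped SQL query run by the cron job.

```ruby
# Hypothetical stand-in for the Order model; field names are assumptions.
Order = Struct.new(:shop_id, :total, keyword_init: true)

# Sum order totals per shop: the kind of aggregation a cron job might run.
def sales_by_shop(orders)
  orders.each_with_object(Hash.new(0)) do |order, totals|
    totals[order.shop_id] += order.total
  end
end

orders = [
  Order.new(shop_id: 1, total: 30),
  Order.new(shop_id: 2, total: 15),
  Order.new(shop_id: 1, total: 20)
]

totals = sales_by_shop(orders)
```

Doing this in memory inside the app is exactly what the caveats on the next slide warn about once the data grows.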

  23. Caveats
    Aggregations

    Cron background jobs for more complex calculations

    Performance issues: complex calculations tie up CPU
    resources in the app, and sooner or later you start
    dropping traffic because of DB locks

  24. What is another simple thing?
    Reporting views and dashboards

  25. More caveats
    Reporting views and dashboards

    Can your data be trusted? Does your app have test data in
    production? (Here’s looking at you, past Sai)

  26. Data quality, Complex analysis, Real time predictions
    Rails app

  30. The Unknown
    [Diagram: Web and Mobile clients, Controller, Model, View]

  31. Data Warehouse

  32. Extract Transform Load
    ETL all your tables into a warehouse

    Timely snapshots of production database

    Sent over to an external system

    Clean your data

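  33.
    # Extract rows changed since each model's last checkpoint.
    # Assumes every table maps to a model, and that @checkpoint and
    # warehouse_extract are defined elsewhere.
    now = Time.current
    ActiveRecord::Base.connection.tables.each do |table|
      klass = table.classify
      next unless now > @checkpoint[klass]

      warehouse_extract(klass.constantize.where(updated_at: @checkpoint[klass]..now))
    end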

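  34.
    # Same extract, now skipping test and secret records.
    # Assumes every table has boolean test and secret columns.
    now = Time.current
    ActiveRecord::Base.connection.tables.each do |table|
      klass = table.classify
      next unless now > @checkpoint[klass]

      changed = klass.constantize.where(updated_at: @checkpoint[klass]..now)
      warehouse_extract(changed.where(test: false, secret: false))
    end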
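
The same incremental extract can be sketched without Rails, which makes the checkpoint and cleaning logic easier to see. Everything here is illustrative: the `Row` struct, the `extract_since` helper and the sample timestamps are assumptions, not part of the talk. Unlike the inclusive range on the slide, this sketch excludes rows updated exactly at the checkpoint so they are not extracted twice.

```ruby
# Hypothetical in-memory rows standing in for a production table.
Row = Struct.new(:id, :updated_at, :test, :secret, keyword_init: true)

# Return rows changed since the checkpoint, skipping test and secret records.
def extract_since(rows, checkpoint, now)
  rows.select do |row|
    row.updated_at > checkpoint && row.updated_at <= now && !row.test && !row.secret
  end
end

now = Time.utc(2017, 9, 29, 12, 0, 0)
checkpoint = now - 3600 # last successful extract, one hour earlier

rows = [
  Row.new(id: 1, updated_at: now - 7200, test: false, secret: false), # already extracted
  Row.new(id: 2, updated_at: now - 600,  test: false, secret: false), # new and clean
  Row.new(id: 3, updated_at: now - 300,  test: true,  secret: false), # test data
  Row.new(id: 4, updated_at: now - 100,  test: false, secret: true)   # secret data
]

extracted = extract_since(rows, checkpoint, now)
checkpoint = now # advance the checkpoint only after the extract succeeds
```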

  35. Benefits of warehousing
    No more constraints of the production environment

    Opens up the possibility of using many Machine Learning
    libraries

    No stress on production database

  36. But...
    Is simply pulling data out of the live production system
    enough?

  37. Dimensional Modelling

  38. Main idea
    Not entity relationships

    Doesn’t follow the schema of production database tables

    Modelled around business domain

  39. Principal components
    Facts (numerical values)

    Dimensions (data points that define the facts)

  40. Star schema
    [Diagram: a periodic snapshot fact at the centre, surrounded
    by five Dimensions]

  41. For example
    Fact: daily sales over a period of time
    Dimensions: Inventory, Date, Product, Order, Shop
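
A minimal sketch of that star schema in plain Ruby may help: a fact row carries numeric measures plus keys into dimension tables, and analysis means slicing the facts by a dimension attribute. The table contents, the `DailySalesFact` struct and the `revenue_by` helper are all hypothetical, and only two of the five dimensions are shown.

```ruby
# Dimension tables: descriptive context keyed by id (all values hypothetical).
SHOPS = { 1 => { name: "Shop A" }, 2 => { name: "Shop B" } }.freeze
DATES = {
  20170928 => { weekday: "Thursday" },
  20170929 => { weekday: "Friday" }
}.freeze

# Fact table: one row per shop per day, numeric measures plus dimension keys.
DailySalesFact = Struct.new(:shop_id, :date_id, :order_count, :revenue, keyword_init: true)

FACTS = [
  DailySalesFact.new(shop_id: 1, date_id: 20170928, order_count: 3, revenue: 120),
  DailySalesFact.new(shop_id: 1, date_id: 20170929, order_count: 5, revenue: 200),
  DailySalesFact.new(shop_id: 2, date_id: 20170929, order_count: 1, revenue: 40)
].freeze

# Sum revenue grouped by one attribute of one dimension.
def revenue_by(facts, dimension, key, attribute)
  facts.each_with_object(Hash.new(0)) do |fact, totals|
    label = dimension.fetch(fact[key])[attribute]
    totals[label] += fact.revenue
  end
end

by_weekday = revenue_by(FACTS, DATES, :date_id, :weekday)
by_shop = revenue_by(FACTS, SHOPS, :shop_id, :name)
```

The point of the shape is that new questions ("revenue by weekday", "revenue by shop") reuse the same fact rows and differ only in which dimension they join.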

  42. Good habits
    Dimensional modelling of data in your warehouse relies on
    you modelling your application well

  43. Data quality, Complex analysis, Real time predictions
    Rails app
    ETL + Dimensional Modelling: ✅ ✅

  45. Data is the new product

  46. HTTP REST APIs
    Data warehouse tooling that generates insights

    HTTP REST API to deliver those insights to your app
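
One way the app side of such an API could look is sketched below. The endpoint URL, the JSON fields and the `fetch_insight` helper are assumptions, since the talk does not describe the actual warehouse API. The HTTP transport is injectable so the parsing can be exercised without a live service.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical endpoint; the real warehouse API is not described in the talk.
INSIGHTS_URL = "https://warehouse.example.com/insights/shops/%d"

# Default transport: a plain HTTP GET returning the response body as a string.
DEFAULT_GET = ->(url) { Net::HTTP.get(URI(url)) }

# Fetch and parse the insight for one shop.
def fetch_insight(shop_id, http_get: DEFAULT_GET)
  body = http_get.call(format(INSIGHTS_URL, shop_id))
  JSON.parse(body, symbolize_names: true)
end

# Usage with a stubbed transport instead of a live warehouse:
stub = ->(_url) { '{"shop_id": 42, "risk": "low", "max_funding": 5000}' }
insight = fetch_insight(42, http_get: stub)
```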

  47. Insight models
    Easily plug into the view layer

    Loose, can be freely modelled (JSON blobs, some fields)

    JOIN tables in the normal flow of application logic

    Native Rails validations (data integrity) when data is fed
    back into Rails
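
As a sketch of what an insight model might check, here is a plain-Ruby stand-in with a loosely modelled `details` blob and two integrity rules. The class name, field names and rules are assumptions; in a Rails app this role would more naturally be played by an ActiveRecord model with its built-in validations.

```ruby
require "json"

# Plain-Ruby stand-in for an insight model (names are hypothetical).
class FundingInsight
  attr_reader :shop_id, :max_funding, :details, :errors

  def initialize(payload)
    @shop_id = payload["shop_id"]
    @max_funding = payload["max_funding"]
    @details = payload.fetch("details", {}) # loosely modelled JSON blob
    @errors = []
  end

  # Check data integrity before the insight is used by the app.
  def valid?
    @errors = []
    @errors << "shop_id must be an integer" unless shop_id.is_a?(Integer)
    unless max_funding.is_a?(Numeric) && max_funding >= 0
      @errors << "max_funding must be a non-negative number"
    end
    @errors.empty?
  end
end

good = FundingInsight.new(JSON.parse('{"shop_id": 1, "max_funding": 5000, "details": {"model": "v1"}}'))
bad = FundingInsight.new(JSON.parse('{"shop_id": 1, "max_funding": -10}'))
```

Calling `valid?` on `good` passes, while `bad` fails with a message about `max_funding`, so a bad prediction is caught before it reaches a view.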

  48. Data quality, Complex analysis, Real time predictions
    Rails app
    ETL + Dimensional Modelling: ✅ ✅
    Web tooling + New models: ✅ ✅

  49. The New Architecture
    [Diagram: Web and Mobile clients, Controller, Model, View,
    Dimensionally modelled data, ETL, Insights]

  50. That’s a lot of pieces
    How is this not fragile?

  51. The Failure Scenarios

  52. In order to succeed, avoid the most common ways to fail

  53. Resiliency questionnaire
    How are we keeping APIs up to date?

    What happens when your ETL cron goes down?

    What happens when your dimension schemas become
    outdated?

    What is the end user experience?

  54. Infrastructure guard rails
    Dimensional modelling validations

    Metrics

    Chat ops that notify when something is not quite right
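
A guard rail of this kind can be quite small. The sketch below flags the ETL as stale when its last successful run is too old and hands a message to a notifier. The two-hour threshold, the helper names and the notifier interface are assumptions; a real notifier might post to a chat webhook.

```ruby
# Assumed threshold: the ETL is considered stale after two hours.
MAX_ETL_AGE = 2 * 60 * 60

# True when the last successful run is older than the allowed age.
def etl_stale?(last_success_at, now, max_age: MAX_ETL_AGE)
  (now - last_success_at) > max_age
end

# The notifier is any callable that accepts a message string.
def check_etl(last_success_at, now, notifier:)
  return :ok unless etl_stale?(last_success_at, now)

  notifier.call("ETL has not succeeded since #{last_success_at.utc}")
  :stale
end

messages = []
notifier = ->(text) { messages << text }
now = Time.utc(2017, 9, 29, 12, 0, 0)

fresh = check_etl(now - 30 * 60, now, notifier: notifier)
stale = check_etl(now - 3 * 60 * 60, now, notifier: notifier)
```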

  55. Human verification
    [Diagram: Web and Mobile clients, Controller, Model, View,
    Dimensionally modelled data, ETL, Insights]

  56. Thank you!
    @cyprusad

    Come chat with me and my colleagues!
