EuRuKo 2017 - Data driven production apps

Sai Warang
September 29, 2017

A talk geared towards making a legacy Rails app smart and setting up the pillars for future success

Transcript

  1. Data-driven production apps
    Sai Warang @cyprusad

  3. The Problem Statement

  4. The Problem Statement
    inspired by true events

  5. Turning back time
    Platform for merchants to sell online

    Fund merchants who are showing promise to help them do better!

    Use data we already have to predict how merchants will do in the future

  6. The Setup
    A Rails app that lets merchants set up an online store and process orders

    Built following standard Rails design patterns

  7. Model View Controller
    Web
    Mobile
    Controller
    Model
    View

  8. Model
    Ruby class that encapsulates behaviour

    Has a persistence layer backed by a relational database

    Lastly, has historical data

  9. Model
    class Shop < ApplicationRecord
      has_many :orders
    end

    class Order < ApplicationRecord
      belongs_to :shop
    end

  10. Controller
    Wrangles the behaviour and data that live in models

    Serves different consumers with this model data and logic

  11. View
    Templates that describe how to present data

    Doesn’t know where the data that fills the views comes from

  12. Cool, basics covered
    So what’s the problem?

  13. Artificial Intelligence

  14. Paradigm shifts
    An app that works the way the user expects is no longer enough

    Apps need to predict sane choices for the user and automate repetitive actions

    Recommendations

  15. ... specifically, we want to
    Predict which merchants are risky to fund and which are not

    Predict the amount that a merchant will reasonably be able to pay back

    Confidently automate underwriting

  16. The Dream ™
    shop = Shop.find(merchant_id)
    funding = BigData::Predict.amount(shop)

  17. Does MVC suffice?

  18. Engage the genius part of the brain

  19. AI and Machine Learning tools

  21. Wait a minute
    This is getting complex

  22. Wait a minute
    We can already process data within the application

  23. What is the simplest thing?
    Aggregations ±

    Cron background jobs for more complex calculations
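
A minimal sketch of the aggregation idea above, in plain Ruby so it runs without Rails. `OrderRow`, its fields and `monthly_sales` are hypothetical names, not taken from the talk. In a Rails app the same roll-up would more likely be a grouped SQL query (for example ActiveRecord's `group(...).sum(...)`), run inline or from a cron job.

```ruby
require "date"

# Hypothetical stand-in for the Order model; the field names are assumptions.
OrderRow = Struct.new(:shop_id, :total_cents, :created_on)

# Sum order totals per shop and calendar month.
# Returns a hash keyed by [shop_id, "YYYY-MM"] with the total in cents.
def monthly_sales(orders)
  orders
    .group_by { |order| [order.shop_id, order.created_on.strftime("%Y-%m")] }
    .transform_values { |group| group.sum(&:total_cents) }
end
```

Every call walks all the orders it is given, which is fine for a sketch and is the kind of work that starts to hurt once it runs against a busy production database.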

  25. Caveats
    Aggregations ±

    Cron background jobs for more complex calculations

    Performance issues: complex calculations tie up CPU in the app, and sooner or later you start dropping traffic because of DB locks

  26. What is another simple thing?
    Reporting views and dashboards

  28. More caveats
    Reporting views and dashboards

    Can your data be trusted? Does your app have test data in production? (Here’s looking at you, past Sai)

  29. Data quality
    Complex analysis
    Real time predictions
    Rails app

  33. The Unknown
    Web
    Mobile
    Controller
    Model
    View

  34. Data Warehouse

  35. Extract Transform Load
    ETL all your tables into a warehouse

    Timely snapshots of production database

    Sent over to an external system

    Clean your data
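
The four steps above can be sketched in plain Ruby, without Rails or a real warehouse. Everything here is an assumption made for illustration: `ProductionRow`, its fields, the `test_data` flag and the in-memory `warehouse` array standing in for the external system.

```ruby
# Hypothetical production row; test_data marks rows that must never reach the warehouse.
ProductionRow = Struct.new(:id, :email, :total_cents, :test_data, :updated_at)

class WarehouseEtl
  attr_reader :warehouse, :checkpoint

  def initialize(checkpoint)
    @checkpoint = checkpoint # time of the last successful run
    @warehouse = []          # stand-in for the external system
  end

  # Extract rows changed since the checkpoint, clean them, load them,
  # then advance the checkpoint so the next run only sees newer changes.
  # Returns the number of rows loaded.
  def run(rows, now)
    changed = rows.select { |row| row.updated_at > @checkpoint && row.updated_at <= now }
    cleaned = changed.reject(&:test_data).map do |row|
      { id: row.id, total_cents: row.total_cents } # email is deliberately left behind
    end
    @warehouse.concat(cleaned)
    @checkpoint = now
    cleaned.size
  end
end
```

Advancing the checkpoint only at the end of a run keeps each snapshot incremental: a second run over the same rows loads nothing new.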

  36. ActiveRecord::Base.connection.tables.each do |table|
        klass = table.classify
        if Time.now > @checkpoint[klass]
          # Extract only the rows changed since the last checkpoint
          warehouse_extract(
            klass.constantize.where(updated_at: @checkpoint[klass]..Time.now)
          )
        end
      end

  37. ActiveRecord::Base.connection.tables.each do |table|
        klass = table.classify
        if Time.now > @checkpoint[klass]
          # Same extract, now skipping test and secret rows
          warehouse_extract(
            klass.constantize.where(updated_at: @checkpoint[klass]..Time.now,
                                    test: false, secret: false)
          )
        end
      end

  38. Benefits of warehousing
    No more constraints from the production environment

    Opens up the possibility of using many Machine Learning libraries

    No stress on the production database

  39. But...
    Is simply pulling data out of the live production system enough?

  40. Dimensional Modelling

  41. Main idea
    Not entity relationships

    Doesn’t follow the schema of production database tables

    Modelled around business domain

  42. Principal components
    Facts (numerical values)

    Dimensions (data points that define the facts)
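
A small sketch of facts and dimensions as plain Ruby structs, assuming a daily sales fact with shop and date dimensions. All names (`ShopDim`, `DateDim`, `DailySalesFact`, `roll_up`) are hypothetical, and a real warehouse would hold these as tables and do the roll-up in SQL.

```ruby
require "date"

# Dimensions: descriptive context, each row keyed by an id.
ShopDim = Struct.new(:id, :name, :country)
DateDim = Struct.new(:id, :date, :weekday)

# Fact: one row per shop per day, holding only keys and numeric measures.
DailySalesFact = Struct.new(:shop_id, :date_id, :order_count, :sales_cents)

# Sum one measure of the facts, grouped by one attribute of a dimension.
# key is the fact column pointing at the dimension, by is the dimension attribute.
def roll_up(facts, dimension_rows, key:, by:, measure:)
  lookup = dimension_rows.map { |row| [row.id, row] }.to_h
  facts.each_with_object(Hash.new(0)) do |fact, totals|
    totals[lookup.fetch(fact[key])[by]] += fact[measure]
  end
end
```

For example, `roll_up(facts, shops, key: :shop_id, by: :country, measure: :sales_cents)` gives sales per country, and swapping in the date dimension gives sales per weekday from the same facts.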

  43. Star schema
    Diagram: a periodic snapshot fact in the centre, linked to five dimensions

  44. For example
    Fact: daily sales over a period of time
    Dimensions: Inventory, Date, Product, Order, Shop

  45. Good habits
    Dimensional modelling of data in your warehouse relies on you modelling your application well

  47. Data quality
    Complex analysis
    Real time predictions
    Rails app

    ETL + Dimensional Modelling ✅ ✅

  49. Data is the new product

  50. HTTP REST APIs
    Data warehouse tooling that generates insights

    HTTP REST API to deliver those insights to your app
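
A hedged sketch of the receiving side: the app parses an insight delivered as JSON and checks it before trusting it. The payload shape (`shop_id`, `risk`, `max_funding_cents`), the `FundingInsight` struct and `parse_funding_insight` are all assumptions for illustration. The talk does not specify the API, and in Rails these checks would more naturally be model validations.

```ruby
require "json"

# Hypothetical insight record: a few typed fields plus the raw JSON blob.
FundingInsight = Struct.new(:shop_id, :risk, :max_funding_cents, :raw)

RISK_LEVELS = %w[low medium high].freeze

# Parse a JSON body from the (assumed) insights API and validate it.
# Returns [insight, errors]; insight is nil when anything is wrong.
def parse_funding_insight(body)
  data = JSON.parse(body)
  return [nil, ["payload must be a JSON object"]] unless data.is_a?(Hash)

  amount = data["max_funding_cents"]
  errors = []
  errors << "shop_id must be an integer" unless data["shop_id"].is_a?(Integer)
  errors << "risk must be one of #{RISK_LEVELS.join(', ')}" unless RISK_LEVELS.include?(data["risk"])
  errors << "max_funding_cents must be a non-negative integer" unless amount.is_a?(Integer) && amount >= 0
  return [nil, errors] unless errors.empty?

  [FundingInsight.new(data["shop_id"], data["risk"], amount, data), errors]
rescue JSON::ParserError
  [nil, ["body is not valid JSON"]]
end
```

Keeping the raw blob next to the validated fields lets the insight stay loosely modelled while the app only relies on the parts it has checked.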

  51. Insight models
    Easily plug into the view layer

    Loose, can be freely modelled (JSON blobs, some fields)

    JOIN tables in the normal flow of application logic

    Native Rails validations (data integrity) when fed back into Rails

  52. Data quality
    Complex analysis
    Real time predictions
    Rails app

    ETL + Dimensional Modelling ✅ ✅
    Web tooling + New models ✅ ✅

  53. The New Architecture
    Web
    Mobile
    Controller
    Model
    View
    Dimensionally modelled data
    ETL
    Insights

  54. That’s a lot of pieces
    How is this not fragile?

  55. The Failure Scenarios

  56. In order to succeed, avoid the most common ways to fail

  57. Resiliency questionnaire
    How are we keeping APIs up to date?

    What happens when your ETL cron goes down?

    What happens when your dimension schemas become outdated?

    What is the end user experience?

  58. Infrastructure guard rails
    Dimensional modelling validations

    Metrics

    Chat ops that notify when something is not quite right
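
One way the guard rails above could look, as a sketch only: a check that turns a stale ETL run or a drifted dimension schema into alert messages that a chat bot could post. The two-hour threshold, the column lists and the `etl_alerts` name are assumptions, not details from the talk.

```ruby
# Assumed threshold; tune it to how often your ETL is expected to run.
MAX_ETL_AGE_SECONDS = 2 * 60 * 60

# Returns a list of human-readable alerts; an empty list means all is well.
def etl_alerts(last_success_at, now, expected_columns, actual_columns)
  alerts = []

  # Metric: how long since the ETL last succeeded?
  age = now - last_success_at
  alerts << "ETL has not succeeded for #{(age / 60).round} minutes" if age > MAX_ETL_AGE_SECONDS

  # Validation: does the dimension still have every column we rely on?
  missing = expected_columns - actual_columns
  alerts << "Dimension is missing columns: #{missing.join(', ')}" unless missing.empty?

  alerts
end
```

Extra columns in the dimension are ignored here on purpose; only missing ones would break consumers of the insights.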

  60. Human verification
    Web
    Mobile
    Controller
    Model
    View
    Dimensionally modelled data
    ETL
    Insights

  61. Thank you!
    @cyprusad

    Come chat with me and my colleagues!
