EuRuKo 2017 - Data driven production apps

Sai Warang
September 29, 2017

A talk geared towards making a legacy Rails app smart and setting up the pillars for future success

Transcript

  1. Data-driven production apps. Sai Warang, @cyprusad

  2. None
  3. The Problem Statement

  4. The Problem Statement (inspired by true events)

  5. Turning back time: Platform for merchants to sell online. Fund merchants who are showing promise to help them do better! Use data we already have to predict how merchants will do in the future.
  6. The Setup: Rails app that allows merchants to set up an online store, and that can process orders. Following standard Rails design patterns to set up your application.
  7. Model View Controller Web Mobile Controller Model View

  8. Model: Ruby class that encapsulates behaviour. Has a persistence layer backed by a relational database. Lastly, has historical data.
  9. Model

    class Shop < ApplicationRecord
      has_many :orders
    end

    class Order < ApplicationRecord
      belongs_to :shop
    end
  10. Controller: Wrangle out the behaviour and data that lives in models. Serve different consumers with this model data and logic.
  11. View: Templates that describe how to present data. Doesn’t know where the data that fills the views comes from.
  12. Cool, basics covered. So what’s the problem?

  13. Artificial Intelligence

  14. Paradigm shifts: Having an app that works as the user expects is not enough. Need to be able to predict sane choices for the user and automate repetitive actions in your apps. Recommendations.
  15. ... specifically, we want to: Predict which merchant is risky to fund and which merchant is not risky to fund. Predict the amount that a merchant will reasonably be able to pay back. Confidently automate underwriting.
  16. The Dream ™

    shop = Shop.new(merchant_id)
    funding = BigData::Predict.amount(shop)

  17. Does MVC suffice?

  18. Engage the genius part of the brain

  19. AI and Machine Learning tools

  20. AI and Machine Learning tools

  21. Wait a minute: This is getting complex

  22. Wait a minute: We can already process data within the application

  23. What is the simplest thing? Aggregations ± Cron background jobs for more complex calculations
  24. Caveats: Aggregations ± Cron background jobs for more complex calculations. Performance issues in the app as it performs complex calculations: CPU resources get tied up, and there will be that one moment when you start dropping traffic because of DB locks.
  25. What is another simple thing? Reporting views and dashboards

  26. More caveats: Reporting views and dashboards. Can your data be trusted? Does your app have test data in production? (Here’s looking at you, past Sai)
  27. Data quality Complex analysis Real time predictions Rails app

  28. Data quality Complex analysis Real time predictions Rails app

  29. Data quality Complex analysis Real time predictions Rails app ⚠

  30. Data quality Complex analysis Real time predictions Rails app ⚠

  31. The Unknown Web Mobile Controller Model View

  32. Data Warehouse

  33. Extract Transform Load: ETL all your tables into a warehouse. Timely snapshots of production database. Sent over to an external system. Clean your data.
  34. ActiveRecord::Base.connection.tables.each do |table|
        klass = table.classify
        if Time.now > @checkpoint[klass]
          warehouse_extract(klass.constantize.where(updated_at: @checkpoint[klass]..Time.now))
        end
      end
  35. ActiveRecord::Base.connection.tables.each do |table|
        klass = table.classify
        if Time.now > @checkpoint[klass]
          warehouse_extract(klass.constantize.where(updated_at: @checkpoint[klass]..Time.now, test: false, secret: false))
        end
      end
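The checkpointed extract on slides 34 and 35 can be tried outside Rails. What follows is a minimal sketch under stated assumptions, not the speaker's implementation: `Row`, `rows_to_extract`, and `extract_pass` are hypothetical names, the rows live in memory rather than in a database, and only the `test` flag from slide 35 is modelled. It also uses a half-open window (after the checkpoint, up to and including now) so a row stamped exactly at the checkpoint is not shipped twice, which differs slightly from the inclusive range on the slide.

```ruby
# Hypothetical in-memory stand-in for a production table row.
Row = Struct.new(:id, :updated_at, :test, keyword_init: true)

# Returns rows changed after the checkpoint and up to `now`,
# skipping rows flagged as test data.
def rows_to_extract(rows, checkpoint:, now:)
  rows.select do |row|
    row.updated_at > checkpoint && row.updated_at <= now && !row.test
  end
end

# Runs one extract pass and returns the batch plus the new checkpoint.
# The checkpoint advances to `now`, so the next pass starts where this one ended.
def extract_pass(rows, checkpoint:, now:)
  batch = rows_to_extract(rows, checkpoint: checkpoint, now: now)
  { batch: batch, checkpoint: now }
end

rows = [
  Row.new(id: 1, updated_at: Time.utc(2017, 9, 1), test: false),
  Row.new(id: 2, updated_at: Time.utc(2017, 9, 20), test: false),
  Row.new(id: 3, updated_at: Time.utc(2017, 9, 21), test: true),
]

result = extract_pass(rows, checkpoint: Time.utc(2017, 9, 10), now: Time.utc(2017, 9, 29))
```

Here `result[:batch]` holds only the row with id 2: row 1 predates the checkpoint and row 3 is test data.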
  36. Benefits of warehousing: No more constraints of the production environment. Opens up the possibility of using many Machine Learning libraries. No stress on production database.
  37. But... Is simply pulling data out of a live production system enough?
  38. Dimensional Modelling

  39. Main idea: Not entity relationships. Doesn’t follow the schema of production database tables. Modelled around business domain.
  40. Principal components: Facts (numerical values). Dimensions (data points that define the facts).
  41. Star schema: Periodic snapshot fact, surrounded by dimensions

  42. For example: Daily sales over a period of time; Inventory; Date; Product; Order; Shop
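As an illustration of a periodic snapshot fact, here is a small sketch that rolls order rows up into daily sales. The `Order` struct, its field names, and `daily_sales_facts` are assumptions made for this example, and treating date, shop, and product as the dimensions is one reading of the slide rather than a confirmed schema.

```ruby
require "date"

# Hypothetical order rows as they might arrive from the extract step.
Order = Struct.new(:shop_id, :product_id, :placed_on, :amount_cents, keyword_init: true)

# Builds one fact row per (date, shop, product): the numeric facts
# (order count, total sales) are keyed by the dimension values.
def daily_sales_facts(orders)
  grouped = orders.group_by { |o| [o.placed_on, o.shop_id, o.product_id] }
  grouped.map do |(date, shop_id, product_id), group|
    {
      date: date,
      shop_id: shop_id,
      product_id: product_id,
      order_count: group.size,
      sales_cents: group.sum(&:amount_cents),
    }
  end
end

orders = [
  Order.new(shop_id: 1, product_id: 10, placed_on: Date.new(2017, 9, 28), amount_cents: 1_500),
  Order.new(shop_id: 1, product_id: 10, placed_on: Date.new(2017, 9, 28), amount_cents: 2_500),
  Order.new(shop_id: 1, product_id: 11, placed_on: Date.new(2017, 9, 29), amount_cents: 900),
]

facts = daily_sales_facts(orders)
```

The two orders for product 10 on the same day collapse into a single fact row, while the order for product 11 gets its own row.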
  43. Good habits: Dimensional modelling of data in your warehouse relies on you modelling your application well
  44. None
  45. Data quality Complex analysis Real time predictions Rails app ⚠

    ETL + Dimensional Modelling ✅ ✅
  46. Data quality Complex analysis Real time predictions Rails app ⚠

    ETL + Dimensional Modelling ✅ ✅
  47. Data is the new product

  48. HTTP REST APIs: Data warehouse tooling that generates insights. HTTP REST API to deliver those insights to your app.
  49. Insight models: Easily plug into the view layer. Loose, can be freely modelled (JSON blobs, some fields). JOIN tables in normal flow of application logic. Native Rails validations (data integrity) when fed back into Rails.
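A rough sketch of checking an insight payload before it is accepted back into the app. In a Rails app this job would normally fall to model validations; plain Ruby is used here only to keep the example self-contained. `FundingInsight`, its fields, `RISK_LEVELS`, and `parse_insight` are all hypothetical names, and the payload shape is an assumption.

```ruby
require "json"

# Hypothetical set of allowed risk labels.
RISK_LEVELS = %w[low medium high].freeze

# Hypothetical insight record; field names are illustrative only.
FundingInsight = Struct.new(:shop_id, :risk, :amount_cents, keyword_init: true) do
  # Returns a list of problems; an empty list means the insight is usable.
  def errors
    problems = []
    problems << "shop_id must be a positive integer" unless shop_id.is_a?(Integer) && shop_id.positive?
    problems << "risk must be one of #{RISK_LEVELS.join(', ')}" unless RISK_LEVELS.include?(risk)
    problems << "amount_cents must be a non-negative integer" unless amount_cents.is_a?(Integer) && amount_cents >= 0
    problems
  end

  def valid?
    errors.empty?
  end
end

# Parses a JSON payload and keeps only the fields the model knows about.
def parse_insight(json)
  data = JSON.parse(json)
  FundingInsight.new(
    shop_id: data["shop_id"],
    risk: data["risk"],
    amount_cents: data["amount_cents"]
  )
end

good = parse_insight('{"shop_id": 42, "risk": "low", "amount_cents": 500000}')
bad  = parse_insight('{"shop_id": 42, "risk": "unknown", "amount_cents": -1}')
```

Rejecting `bad` at this boundary keeps a malformed prediction from reaching the views or any automated decision that relies on it.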
  50. Data quality Complex analysis Real time predictions Rails app ⚠

    ETL + Dimensional Modelling ✅ ✅ Web tooling + New models ✅ ✅
  51. The New Architecture: Web, Mobile, Controller, Model, View, Dimensionally modelled data, ETL, Insights
  52. That’s a lot of pieces. How is this not fragile?

  53. The Failure Scenarios

  54. In order to succeed, avoid the most common ways to fail
  55. Resiliency questionnaire: How are we keeping APIs up to date? What happens when your ETL cron goes down? What happens when your dimension schemas become outdated? What is the end user experience?
  56. Infrastructure guard rails: Dimensional modelling validations. Metrics. Chat ops that notify when something is not quite right.
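One possible guard rail for the "ETL cron goes down" question is a freshness check that raises an alert when a table has not been extracted recently. This is only a sketch: `stale_tables`, `check_freshness`, the six-hour threshold, and the lambda standing in for a chat notifier are all assumptions, not tooling described in the talk.

```ruby
# Returns the names of tables whose last successful extract
# is older than max_age seconds.
def stale_tables(last_extracted_at, now:, max_age:)
  last_extracted_at.select { |_table, at| now - at > max_age }.keys
end

# Calls the notifier with an alert message when anything is stale.
# How the message reaches chat is left to the notifier passed in.
def check_freshness(last_extracted_at, now:, max_age:, notifier:)
  stale = stale_tables(last_extracted_at, now: now, max_age: max_age)
  notifier.call("ETL stale for: #{stale.join(', ')}") unless stale.empty?
  stale
end

messages = []
now = Time.utc(2017, 9, 29, 12)
last_extracted_at = {
  "shops" => Time.utc(2017, 9, 29, 11, 30),
  "orders" => Time.utc(2017, 9, 28, 12),
}

stale = check_freshness(
  last_extracted_at,
  now: now,
  max_age: 6 * 60 * 60, # hypothetical threshold: six hours
  notifier: ->(msg) { messages << msg }
)
```

Passing the notifier in as a callable keeps the check easy to test and leaves the choice of chat or metrics integration open.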
  57. None
  58. Human verification: Web, Mobile, Controller, Model, View, Dimensionally modelled data, ETL, Insights
  59. Thank you! @cyprusad Come chat with me and my colleagues!