Slide 1

Slide 1 text

Data-driven production apps
Sai Warang
@cyprusad

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

The Problem Statement

Slide 4

Slide 4 text

The Problem Statement
inspired by true events

Slide 5

Slide 5 text

Turning back time
Platform for merchants to sell online
Fund merchants who are showing promise to help them do better!
Use data we already have to predict how merchants will do in the future

Slide 6

Slide 6 text

The Setup
Rails app that allows merchants to set up an online store and that can process orders
Following standard Rails design patterns to set up your application

Slide 7

Slide 7 text

Model View Controller
[Diagram: Web, Mobile, Controller, Model, View]

Slide 8

Slide 8 text

Model
Ruby class that encapsulates behaviour
Has a persistence layer backed by a relational database
Lastly, has historical data

Slide 9

Slide 9 text

Model
class Shop < ApplicationRecord
  has_many :orders
end

class Order < ApplicationRecord
  belongs_to :shop
end

Slide 10

Slide 10 text

Controller
Wrangle out the behaviour and data that live in models
Serve different consumers with this model data and logic

Slide 11

Slide 11 text

View
Templates that describe how to present data
Doesn’t know where the data that fills the views comes from

Slide 12

Slide 12 text

Cool, basics covered
So what’s the problem?

Slide 13

Slide 13 text

Artificial Intelligence

Slide 14

Slide 14 text

Paradigm shifts
Having an app that works as the user expects is not enough
Need to be able to predict sane choices for the user and automate repetitive actions in your apps
Recommendations

Slide 15

Slide 15 text

.. specifically, we want to
Predict which merchant is risky to fund and which is not
Predict the amount that a merchant will reasonably be able to pay back
Confidently automate underwriting

Slide 16

Slide 16 text

The Dream ™
shop = Shop.new(merchant_id)
funding = BigData::Predict.amount(shop)

Slide 17

Slide 17 text

Does MVC suffice?

Slide 18

Slide 18 text

Engage the genius part of the brain

Slide 19

Slide 19 text

AI and Machine Learning tools

Slide 20

Slide 20 text

AI and Machine Learning tools

Slide 21

Slide 21 text

Wait a minute
This is getting complex

Slide 22

Slide 22 text

Wait a minute
We can already process data within the application

Slide 23

Slide 23 text

What is the simplest thing?
Aggregations ± Cron background jobs for more complex calculations
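As a rough illustration of the "aggregations" idea, here is a minimal sketch in plain Ruby. The `OrderRow` struct, the `revenue_by_shop` helper, and the sample figures are all hypothetical stand-ins, not code from the talk; in a Rails app the same roll-up would more likely be a grouped query against the orders table.

```ruby
require "date"

# Hypothetical in-memory stand-in for rows of an orders table.
OrderRow = Struct.new(:shop_id, :total, :created_on)

# Sum order totals per shop for orders created on or after `since`.
def revenue_by_shop(orders, since:)
  orders
    .select { |order| order.created_on >= since }
    .group_by(&:shop_id)
    .transform_values { |rows| rows.sum(&:total) }
end

# Made-up sample data for illustration only.
orders = [
  OrderRow.new(1, 120, Date.new(2018, 4, 1)),
  OrderRow.new(1, 80, Date.new(2018, 4, 2)),
  OrderRow.new(2, 50, Date.new(2018, 4, 2)),
  OrderRow.new(2, 999, Date.new(2018, 1, 1)) # outside the window
]

totals = revenue_by_shop(orders, since: Date.new(2018, 4, 1))
```

Run inside the web process, a calculation like this competes with request handling for CPU and database time, which is the reason for moving heavier versions into a cron or background job.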

Slide 24

Slide 24 text

Slide 25

Slide 25 text

Caveats
Aggregations ± Cron background jobs for more complex calculations
Performance issues: complex calculations tie up the app’s CPU resources, and sooner or later DB locks cause you to start dropping traffic

Slide 26

Slide 26 text

What is another simple thing?
Reporting views and dashboards

Slide 27

Slide 27 text

Slide 28

Slide 28 text

More caveats
Reporting views and dashboards
Can your data be trusted?
Does your app have test data in production? (Here’s looking at you, past Sai)

Slide 29

Slide 29 text

Data quality Complex analysis Real time predictions Rails app

Slide 30

Slide 30 text

Data quality Complex analysis Real time predictions Rails app

Slide 31

Slide 31 text

Data quality Complex analysis Real time predictions Rails app ⚠

Slide 32

Slide 32 text

Data quality Complex analysis Real time predictions Rails app ⚠

Slide 33

Slide 33 text

The Unknown
[Diagram: Web, Mobile, Controller, Model, View]

Slide 34

Slide 34 text

Data Warehouse

Slide 35

Slide 35 text

Extract Transform Load
ETL all your tables into a warehouse
Timely snapshots of production database
Sent over to an external system
Clean your data

Slide 36

Slide 36 text

# For every table, extract the rows updated since that model's last checkpoint
ActiveRecord::Base.connection.tables.each do |table|
  klass = table.classify
  if Time.now > @checkpoint[klass]
    warehouse_extract(klass.constantize.where(updated_at: @checkpoint[klass]..Time.now))
  end
end

Slide 37

Slide 37 text

# Same loop, now leaving test and secret rows out of the warehouse
ActiveRecord::Base.connection.tables.each do |table|
  klass = table.classify
  if Time.now > @checkpoint[klass]
    warehouse_extract(klass.constantize.where(updated_at: @checkpoint[klass]..Time.now, test: false, secret: false))
  end
end
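The loop on the slides does not show how the checkpoint moves forward after each run. The following is a minimal sketch of that step in plain Ruby, under the assumption that each extract simply records the time it ran up to. `SourceRow` and `extract_since` are hypothetical names, and the rows are made-up sample data rather than anything from the talk.

```ruby
# Hypothetical stand-in for a production row with an updated_at column
# and a flag marking test data.
SourceRow = Struct.new(:id, :updated_at, :test)

# Return the non-test rows updated after `checkpoint` and up to `now`,
# together with the checkpoint to use for the next run.
def extract_since(rows, checkpoint, now)
  changed = rows.select do |row|
    row.updated_at > checkpoint && row.updated_at <= now && !row.test
  end
  [changed, now]
end

rows = [
  SourceRow.new(1, Time.utc(2018, 4, 1, 9, 0), false),  # before the checkpoint
  SourceRow.new(2, Time.utc(2018, 4, 1, 11, 0), false), # picked up
  SourceRow.new(3, Time.utc(2018, 4, 1, 11, 30), true)  # test data, skipped
]

checkpoint = Time.utc(2018, 4, 1, 10, 0)
batch, checkpoint = extract_since(rows, checkpoint, Time.utc(2018, 4, 1, 12, 0))
batch_ids = batch.map(&:id)

# A second run with the advanced checkpoint finds nothing new.
next_batch, checkpoint = extract_since(rows, checkpoint, Time.utc(2018, 4, 1, 13, 0))
```

Using a half-open window (after the old checkpoint, up to and including the new one) is one way to keep a row from being extracted twice across consecutive runs.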

Slide 38

Slide 38 text

Benefits of warehousing
No more constraint of production environment
Opens up the possibility of using many Machine Learning libraries
No stress on production database

Slide 39

Slide 39 text

But..
Is simply pulling data out of a live production system enough?

Slide 40

Slide 40 text

Dimensional Modelling

Slide 41

Slide 41 text

Main idea
Not entity relationships
Doesn’t follow the schema of production database tables
Modelled around business domain

Slide 42

Slide 42 text

Principal components
Facts (numerical values)
Dimensions (data points that define the facts)

Slide 43

Slide 43 text

Star schema
[Diagram: a periodic snapshot fact table surrounded by five dimension tables]

Slide 44

Slide 44 text

For example
Fact: daily sales over a period of time
Dimensions: Inventory, Date, Product, Order, Shop
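To make the fact/dimension split concrete, here is a small sketch in plain Ruby. It models only two of the dimensions (date and shop), and every name and number in it (`DATE_DIM`, `SHOP_DIM`, `DailySalesFact`, `rollup`, the sample rows) is a hypothetical illustration rather than the speaker's schema.

```ruby
require "date"

# Dimension tables: descriptive attributes, keyed by a surrogate id.
DATE_DIM = {
  1 => { date: Date.new(2018, 4, 1) },
  2 => { date: Date.new(2018, 4, 2) }
}.freeze

SHOP_DIM = {
  10 => { name: "Shop A", country: "CA" },
  11 => { name: "Shop B", country: "US" }
}.freeze

# Fact table: one row per shop per day, holding only keys and numeric measures.
DailySalesFact = Struct.new(:date_key, :shop_key, :order_count, :gross_sales)

FACTS = [
  DailySalesFact.new(1, 10, 3, 300),
  DailySalesFact.new(2, 10, 1, 100),
  DailySalesFact.new(1, 11, 2, 50)
].freeze

# Sum gross sales grouped by one attribute of one dimension.
def rollup(facts, key, dimension, attribute)
  facts
    .group_by { |fact| dimension.fetch(fact[key]).fetch(attribute) }
    .transform_values { |rows| rows.sum(&:gross_sales) }
end

by_country = rollup(FACTS, :shop_key, SHOP_DIM, :country)
by_date = rollup(FACTS, :date_key, DATE_DIM, :date)
```

The same fact rows answer both questions without touching the production schema; adding a new way to slice the data means adding a dimension attribute, not a new table of numbers.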

Slide 45

Slide 45 text

Good habits
Dimensional modelling of data in your warehouse relies on you modelling your application well

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

Data quality Complex analysis Real time predictions Rails app ⚠ ETL + Dimensional Modelling ✅ ✅

Slide 48

Slide 48 text

Data quality Complex analysis Real time predictions Rails app ⚠ ETL + Dimensional Modelling ✅ ✅

Slide 49

Slide 49 text

Data is the new product

Slide 50

Slide 50 text

HTTP REST APIs
Data warehouse tooling that generates insights
HTTP REST API to deliver those insights to your app

Slide 51

Slide 51 text

Insight models
Easily plug into the view layer
Loose, can be freely modelled (JSON blobs, some fields)
JOIN tables in normal flow of application logic
Native Rails validations (data integrity) when fed back into Rails
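One way to picture an insight model is as a loosely structured JSON payload with a few typed fields that get validated on the way back into the app. The sketch below uses plain Ruby and the standard `json` library; `FundingInsight`, its fields, and the sample payloads are assumptions made for illustration. In a Rails app the checks would more likely be expressed as model validations.

```ruby
require "json"

# Hypothetical insight record: a couple of typed fields plus a free-form blob.
class FundingInsight
  attr_reader :shop_id, :amount, :details, :errors

  def initialize(shop_id:, amount:, details: {})
    @shop_id = shop_id
    @amount = amount
    @details = details
    @errors = []
  end

  # Build an insight from the JSON body returned by the (assumed) insights API.
  def self.from_json(body)
    data = JSON.parse(body)
    new(shop_id: data["shop_id"], amount: data["amount"], details: data.fetch("details", {}))
  end

  # Check the typed fields; the details blob is deliberately left unconstrained.
  def valid?
    @errors = []
    @errors << "shop_id must be a positive integer" unless shop_id.is_a?(Integer) && shop_id.positive?
    @errors << "amount must be a non-negative number" unless amount.is_a?(Numeric) && amount >= 0
    @errors.empty?
  end
end

good = FundingInsight.from_json('{"shop_id": 42, "amount": 5000, "details": {"risk": "low"}}')
bad = FundingInsight.from_json('{"shop_id": 42, "amount": -1}')

good_ok = good.valid?
bad_ok = bad.valid?
```

Rejecting a malformed insight at this boundary keeps bad warehouse output from reaching merchants, while the free-form `details` field leaves room for the data side to evolve.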

Slide 52

Slide 52 text

Data quality Complex analysis Real time predictions Rails app ⚠ ETL + Dimensional Modelling ✅ ✅ Web tooling + New models ✅ ✅

Slide 53

Slide 53 text

The New Architecture
[Diagram: Web, Mobile, Controller, Model, View, Dimensionally modelled data, ETL, Insights]

Slide 54

Slide 54 text

That’s a lot of pieces
How is this not fragile?

Slide 55

Slide 55 text

The Failure Scenarios

Slide 56

Slide 56 text

In order to succeed, avoid the most common ways to fail

Slide 57

Slide 57 text

Resiliency questionnaire
How are we keeping APIs up to date?
What happens when your ETL cron goes down?
What happens when your dimension schemas become outdated?
What is the end user experience?

Slide 58

Slide 58 text

Infrastructure guard rails
Dimensional modelling validations
Metrics
Chat ops that notify when something is not quite right
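A guard rail of this kind could be as small as the sketch below, which checks that the last ETL run is recent and that warehouse rows carry the expected keys, then hands any alerts to a notifier. The `guard_rail_alerts` helper, the one-hour threshold, and the lambda standing in for a chat notification are all hypothetical choices for illustration.

```ruby
# Collect alert messages for a stale ETL run or rows missing required keys.
def guard_rail_alerts(last_etl_run:, now:, max_age_seconds:, rows:, required_keys:)
  alerts = []

  age = now - last_etl_run # Time minus Time gives seconds as a Float
  alerts << "ETL is stale: last run #{age.to_i}s ago" if age > max_age_seconds

  rows.each_with_index do |row, index|
    missing = required_keys - row.keys
    alerts << "row #{index} is missing #{missing.join(', ')}" unless missing.empty?
  end

  alerts
end

# Stand-in for a chat notification: just records what would have been sent.
sent = []
notify = ->(message) { sent << message }

alerts = guard_rail_alerts(
  last_etl_run: Time.utc(2018, 4, 1, 10, 0),
  now: Time.utc(2018, 4, 1, 13, 0),
  max_age_seconds: 3600,
  rows: [
    { date_key: 1, shop_key: 10, gross_sales: 300 },
    { date_key: 1, gross_sales: 50 } # shop_key dropped by an outdated schema
  ],
  required_keys: [:date_key, :shop_key, :gross_sales]
)

alerts.each { |alert| notify.call(alert) }
```

Keeping the checks as a pure function that returns messages makes them easy to run from a scheduled job and to test without a real chat integration.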

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

Human verification
[Diagram: Web, Mobile, Controller, Model, View, Dimensionally modelled data, ETL, Insights]

Slide 61

Slide 61 text

Thank you!
@cyprusad
Come chat with me and my colleagues!