Slide 1

Slide 1 text

LIGHTWEIGHT BUSINESS INTELLIGENCE
COREY EHMKE
APRIL 2013

Slide 2

Slide 2 text

WHO AM I?
- A developer with a long memory & a longer history
- An active open source contributor
- A senior engineer at Trunk Club
- A lifelong learner

Slide 3

Slide 3 text

TRUNK CLUB
- Service-oriented startup
- Software optimizes and streamlines business processes
- Technology is a key differentiator
- Engineering provides leverage for scaling the business

Slide 4

Slide 4 text

OUR ENGINEERING GOALS
- Startups make critical decisions on a daily basis
- Better data leads to better decision-making
- Our mission is ∴ to provide this data in a timely and useful form

Slide 5

Slide 5 text

SOME CRITICAL DATA POINTS
- Marketing campaign performance
- Member on-boarding funnel
- Trunk lifecycle
- Stylist interactions
- Product performance

Slide 6

Slide 6 text

BUSINESS INTELLIGENCE: THREE APPROACHES

Slide 7

Slide 7 text

WHAT IS BUSINESS INTELLIGENCE?
- Collection & organization of mission-critical knowledge
- Historical view of business operations
- Tools to support decision making

Slide 8

Slide 8 text

THE NAÏVE APPROACH

Slide 9

Slide 9 text

THE NAÏVE APPROACH
- Reporting out of the transactional database
- Raw SQL embedded in your code for “performance”
- Granting direct db access to stakeholders

Slide 10

Slide 10 text

THE NAÏVE APPROACH: SHORTCOMINGS
- Fighting the schema with complex joins
- Poor performance
- Impact on production resources

Slide 11

Slide 11 text

DON’T TRUST THIS GUY WITH YOUR DATABASE.

Slide 12

Slide 12 text

THE ENTERPRISE APPROACH

Slide 13

Slide 13 text

THE ENTERPRISE APPROACH
- Distinct BI database
- Nightly ETL (extract, transform, load)
- Schema designed for reporting
- Separate hardware and software stack
- Combination of static and dynamic reports

Slide 14

Slide 14 text

THE ENTERPRISE APPROACH: SHORTCOMINGS
- 24-hour delay in information
- Expensive to configure and maintain
- Requires highly specialized resources
- Hard to change your mind or adapt
- Enterprise-y

Slide 15

Slide 15 text

VERY, VERY ENTERPRISE-Y

Slide 16

Slide 16 text

DON’T TRUST THESE GUYS WITH YOUR DATABASE, EITHER.

Slide 17

Slide 17 text

TRADITIONAL BI: THE WRONG HAMMER

Slide 18

Slide 18 text

TRADITIONAL BI: THE WRONG HAMMER
- Waterfall approach
- Painful data migrations
- Brittle ETL processes
- No automated testing
- Logic embedded in the data store

Slide 19

Slide 19 text

IN SHORT... TRADITIONAL BI DOESN’T FEEL AGILE.

Slide 20

Slide 20 text

A LIGHTWEIGHT APPROACH TO BUSINESS INTELLIGENCE

Slide 21

Slide 21 text

THE FOUR NOBLE GOALS OF THE LIGHTWEIGHT BI APPROACH
- Provide real-time data to support decisions
- Leverage familiar technology & infrastructure
- Use existing development staff
- Support an iterative, agile approach

Slide 22

Slide 22 text

TRUST YOUR DEVELOPERS.

Slide 23

Slide 23 text

GETTING STARTED
- Collaborate!
- Determine KPIs that actually matter
- Figure out what sort of questions the business is asking on a daily basis

Slide 24

Slide 24 text

TURN INFERENCES INTO FACTS
- Find a way to tell a story with your data
- Design your schema based on facts
- De-normalize like a boss

Slide 25

Slide 25 text

PRESENT ANSWERS
- Provide a central, single source of truth
- Present the data wherever it’s needed
- Dashboard design is harder than you think
- Plan for iterations & ongoing collaboration

Slide 26

Slide 26 text

PLEASE GODS NO.

Slide 27

Slide 27 text

DON’T MAKE THESE.

Slide 28

Slide 28 text

BRINGING RUBY TO THE PARTY
- Makes agile, test-driven development easy
- Quickly deploy new apps
- Plenty of visualization libraries
- Powerful SQL & NoSQL ORMs
- Great data munging capabilities

Slide 29

Slide 29 text

LEVERAGING MONGODB
- Flexible and dynamic schemas
- Support for native datatypes
- Powerful querying and aggregation
- Fast and performant
- Easy to scale up

Slide 30

Slide 30 text

GO AHEAD... TRY THIS AT HOME
LIGHTWEIGHT BI TECHNIQUES

Slide 31

Slide 31 text

PARALLEL DB DEPLOYMENT
- Modern frameworks support multiple ORMs
- Use SQL for transactions
- Use NoSQL for reporting
- Business logic in your apps, not in your database

Slide 32

Slide 32 text

STATISTICAL MODELS
- Collections of facts, not attributes
- Data spanning deep and wide object graphs
- De-normalized and optimized for reporting

Slide 33

Slide 33 text

STATISTICAL MODELS

    module MemberDataProfiles
      class Performance
        include Mongoid::Document
        include Mongoid::Timestamps

        field :member_id,              :type => Integer
        field :annual_trunk_frequency, :type => Integer, :default => 0
        field :average_trunk_value,    :type => Float,   :default => 0.0
        field :is_customer,            :type => Boolean, :default => false
        field :keep_rate,              :type => Float,   :default => 0.0
        field :last_transaction_date,  :type => Date
        field :member_creation_date,   :type => Date
        field :total_value,            :type => Float,   :default => 0.0
        field :total_number_of_trunks, :type => Integer, :default => 0
      end
    end
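
A minimal sketch of how facts like these might be derived before being stored. It is not taken from the talk: the `Trunk` struct, the `build_performance_facts` helper, and every formula are assumptions made for illustration, and it uses plain Ruby hashes in place of a Mongoid document.

```ruby
require 'date'

# Hypothetical raw rows, standing in for what the transactional database might return.
Trunk = Struct.new(:member_id, :shipped_on, :items_sent, :items_kept, :value_kept)

# Collapse one member's trunks into a flat hash of reporting facts.
# Field names follow the Performance model; the formulas are assumptions.
def build_performance_facts(member_id, trunks)
  sent  = trunks.sum(&:items_sent)
  kept  = trunks.sum(&:items_kept)
  total = trunks.sum(&:value_kept)
  {
    :member_id              => member_id,
    :total_number_of_trunks => trunks.size,
    :total_value            => total.to_f,
    :average_trunk_value    => trunks.empty? ? 0.0 : total.to_f / trunks.size,
    :keep_rate              => sent.zero? ? 0.0 : kept.to_f / sent,
    :is_customer            => kept > 0,
    :last_transaction_date  => trunks.map(&:shipped_on).max
  }
end

trunks = [
  Trunk.new(42, Date.new(2013, 1, 15), 10, 4, 300.0),
  Trunk.new(42, Date.new(2013, 3, 2),   8, 5, 450.0)
]
facts = build_performance_facts(42, trunks)
```

The point of the shape is that a dashboard can read `facts[:keep_rate]` directly instead of joining and aggregating transactional tables at query time.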

Slide 34

Slide 34 text

STREAMING ETL
- Event-triggered, continuous data extraction
- Calculations are defined in code rather than SQL or ETL scripts
- Allow resource-intensive data munging to happen in the background
- Provide near-real-time data

Slide 35

Slide 35 text

STREAMING ETL

    # rabbit_notifier.rb
    class RabbitNotifier
      def self.notify_of_action(model, action, extras = {})
        notify(
          :model_actions,
          model,
          headers(model, extras.merge('action' => action.to_s))
        )
      end
      ...
    end

Slide 36

Slide 36 text

STREAMING ETL

    # listeners.rb
    ListenerConfig.map do
      route :product_shipped,  :to => UpdatesProduct
      route :product_returned, :to => UpdatesProduct
    end
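
The slides do not show how `ListenerConfig` works internally. One way such a routing DSL could be built is sketched below; the class body, the `dispatch` method, and the toy `UpdatesProduct` handler are all hypothetical, and no message queue is involved.

```ruby
# Hypothetical stand-in for the ListenerConfig used on the slide.
class ListenerConfig
  @routes = {}

  class << self
    attr_reader :routes

    # Evaluate the block so that bare `route` calls resolve to this class.
    def map(&block)
      instance_eval(&block)
    end

    # Register the handler class given under :to for an event name.
    def route(event_name, options)
      @routes[event_name] = options.fetch(:to)
    end

    # Hand the raw JSON payload to whichever handler is registered.
    def dispatch(event_name, json)
      handler = @routes.fetch(event_name) do
        raise ArgumentError, "no route for #{event_name}"
      end
      handler.with(json)
    end
  end
end

# Toy handler; the real UpdatesProduct updates a Mongoid document instead.
class UpdatesProduct
  def self.with(json)
    "handled #{json}"
  end
end

ListenerConfig.map do
  route :product_shipped,  :to => UpdatesProduct
  route :product_returned, :to => UpdatesProduct
end

result = ListenerConfig.dispatch(:product_shipped, '{"name":"Oxford shirt"}')
```

In this sketch a queue consumer would only need to call `dispatch` with the event name and payload; adding a new event type is one more `route` line.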

Slide 37

Slide 37 text

STREAMING ETL

    # updates_product.rb
    def self.with(json)
      product = Product.init_with_params(extracted_params(json))
      product.update_stats!
    end

    def self.extracted_params(json)
      JSON.parse(json, :symbolize_names => true)
    end

Slide 38

Slide 38 text

STREAMING ETL

    # product.rb
    class Product
      include Mongoid::Document
      include Mongoid::Timestamps
      include Products::Calculations

      field :name
      field :price,             :type => Float,   :default => 0.0
      field :cost,              :type => Float,   :default => 0.0
      field :profit_margin,     :type => Float,   :default => 0.0
      field :keep_rate,         :type => Float,   :default => 0.0
      field :quantity_shipped,  :type => Integer, :default => 0
      field :quantity_returned, :type => Integer, :default => 0
      field :profitability,     :type => Float,   :default => 0.0
      field :trunkability,      :type => Float,   :default => 0.0
      field :positive_feedback, :type => Array,   :default => []
      field :negative_feedback, :type => Array,   :default => []
      ...

Slide 39

Slide 39 text

STREAMING ETL

    # product.rb cont’d
      ...

      def self.init_with_params(params={})
        product = Product.where(:name => params[:name]).first
        product ||= Product.new(params)
      end

      def update_stats!
        calculate_stats
        save
      end
    end
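
`update_stats!` relies on `calculate_stats` from `Products::Calculations`, which the slides include but never show. The sketch below guesses at what such a mixin might contain. Both formulas are illustrative assumptions rather than Trunk Club's definitions, and a plain `Struct` stands in for the Mongoid document so the example runs on its own.

```ruby
# Hypothetical version of the Products::Calculations mixin.
module Products
  module Calculations
    def calculate_stats
      # Assumed definition: share of shipped units that were not returned.
      self.keep_rate =
        if quantity_shipped.zero?
          0.0
        else
          (quantity_shipped - quantity_returned).to_f / quantity_shipped
        end
      # Assumed definition: margin as a fraction of the selling price.
      self.profit_margin = price.zero? ? 0.0 : (price - cost) / price
    end
  end
end

# Plain-Ruby stand-in for the Mongoid-backed Product.
Product = Struct.new(:name, :price, :cost, :quantity_shipped,
                     :quantity_returned, :keep_rate, :profit_margin) do
  include Products::Calculations
end

product = Product.new('Oxford shirt', 80.0, 32.0, 10, 4, 0.0, 0.0)
product.calculate_stats
```

Keeping the arithmetic in a module like this is what lets the calculations live in tested application code rather than in SQL or ETL scripts.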

Slide 40

Slide 40 text

APIS EVERYWHERE
- Provide easy access to your data
- Allow reuse of data in novel ways
- Quickly build dashboards and data explorers
- Use the Faceted gem to make building APIs easy

Slide 41

Slide 41 text

LIGHTWEIGHT BUSINESS INTELLIGENCE: EXAMPLES

Slide 42

Slide 42 text

LEVIATHAN

Slide 43

Slide 43 text

OUR TECHNOLOGY ECOSYSTEM

Slide 44

Slide 44 text

LEVIATHAN
- Records events from all applications
- Subscribes to all message queues
- Collects and displays real-time data
- Browse, search, & drill-down interface
- Longitudinal analysis with dynamic cohorts

Slide 45

Slide 45 text

LEVIATHAN: EVENT MODEL

    class Event
      include Mongoid::Document
      include Mongoid::Timestamps

      field :label
      field :application
      field :details, :type => Hash, :default => {}

      def self.record!(label, application, params={})
        Event.create(
          :label => label,
          :application => application,
          :details => params
        )
      end

      def self.search_details(criteria={})
        where("details.#{criteria.keys.first}" => /#{criteria.values.first}/i)
      end
    end
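
To show the intent of `record!` and `search_details` without a running MongoDB, here is an in-memory sketch. The `EventLog` class is hypothetical and only mimics the behavior described on the slide. It also wraps the search term in `Regexp.escape`, a small hardening the slide's version does not include, so that characters such as `.` in a search term match literally.

```ruby
# In-memory stand-in for the Event collection; no MongoDB involved.
class EventLog
  Event = Struct.new(:label, :application, :details)

  def initialize
    @events = []
  end

  def record!(label, application, params = {})
    event = Event.new(label, application, params)
    @events << event
    event
  end

  # Case-insensitive substring match on the first key in criteria.
  # Regexp.escape keeps the search term from being read as pattern syntax.
  def search_details(criteria = {})
    key, value = criteria.first
    pattern = /#{Regexp.escape(value.to_s)}/i
    @events.select { |event| event.details[key].to_s =~ pattern }
  end
end

log = EventLog.new
log.record!('trunk_shipped', 'fulfillment', 'member_email' => 'a.smith@example.com')
log.record!('trunk_shipped', 'fulfillment', 'member_email' => 'aXsmith@example.com')
matches = log.search_details('member_email' => 'A.SMITH')
```

Without the escape, the `.` in the search term would match any character and both events would be returned.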

Slide 53

Slide 53 text

LEADERBOARD

Slide 54

Slide 54 text

LEADERBOARD
- Real-time sales performance
- Prominently placed in sales team area
- Visibility = competition

Slide 56

Slide 56 text

PRODUCT INFORMANT

Slide 57

Slide 57 text

PRODUCT INFORMANT
- Inventory performance metrics
- Extracts facts from a dozen RDBMS tables
- Delivers on decision support

Slide 60

Slide 60 text

SUMMING UP
- Real-time data is valuable and possible
- Step outside of purely relational thinking
- Use the technologies you’re most familiar with

Slide 61

Slide 61 text

SUMMING UP
- Wrap your data in easy-to-use APIs
- Build micro-apps to deliver stakeholder-specific dashboards
- Be agile, not enterprisey

Slide 62

Slide 62 text

KEEP IN TOUCH!
- [email protected]
- @bantik on Twitter
- bantik.github.com
- trunkclub.com/corey