Opower(ing) Energy Efficiency

Anton Vattay 10/06/2015

What we do at Opower
• Motivate and enable a sustainable energy future!
• Reduce energy usage by working with utilities to inform their customers about their energy usage and how to reduce it.
• We send out reports and various alerts, and provide a web presence.
• Since we started, we have saved ~6 terawatt-hours!

Segmentation & Targeting at Opower
• We provide the tools that allow our users to
  ▪ Explore our customer population
  ▪ Break it down into smaller populations
  ▪ Customize the experience for those customers
  ▪ Do it all visually, without deep technical expertise in our datastore

Engineering is people!
Joanna Kochaniak, Jan Rubio, Nayyara Samuel, Jamie Swogger, Salman Suhail, Nicholas Grippin, Franklin Zheng, Ravjot Pasricha, Anton Vattay, Ben Siemon

The Problem
• Unique Customers
  ▪ Optimize energy savings
  ▪ Many attributes
• Utility Requirements
  ▪ Respect their structure
• Prove our Savings
  ▪ Randomized Control Tests
  ▪ Verified by 3rd parties
• Customer
  ▪ Recipient
  ▪ Did not opt out
  ▪ Active Account
  ▪ Is a home owner
• Has a larger house
• Content: Improve Insulation, New Water Heater, etc.

The Old Way
• Datastores
  ▪ MySQL cluster
  ▪ HDFS
• Ad-hoc selection
• Hours of execution
• Hard to generate
• No Attribute Library
• select * from customer
  ▪ inner join account…
  ▪ inner join service point
  ▪ inner join preferences
  ▪ inner join blah…
  ▪ left join report
  ▪ where blah is null
  ▪ etc…
• Come back in a few hours

How do we solve this?
• Use a search index!
  ▪ Provide a place to fuse our data
  ▪ Query data quickly, on the order of seconds
• Create a DSL to model the problem
• Create a UI to visualize it
• Why Elasticsearch?
  ▪ Easiest to set up
  ▪ Extensive documentation
  ▪ Good scaling story

The Solution

How did we get there?
• How do we get our data into Elasticsearch?
• How do we support validating energy savings?
• How do we represent and enforce our data schema?
• How do we represent belonging to some group?
• How do we represent the document hierarchy?
• How does this DSL look?
• How does the UI work?
• How does Elasticsearch scale with our needs?

Data Import / Refresh
• Guiding principles
  ▪ Idempotent
  ▪ Disposable
  ▪ Create, don’t update
  ▪ Concurrency (r/w speed)
• MapReduce and batch
  ▪ Read elementary components
  ▪ Compose into hierarchy
  ▪ (Over)Write to ES
  ▪ Can also do minor transforms
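One way to picture the "idempotent" and "(over)write" principles is a bulk request whose document ids come straight from the source data, so re-running the import replaces documents instead of duplicating them. The sketch below only builds the newline-delimited body that the Elasticsearch bulk API accepts; the index name, the `customer_id` field, and the helper names are assumptions made for illustration, not details from the talk.

```python
import json


def bulk_lines(index, customers):
    """Yield bulk-API lines: one action line, then one source line per customer.

    The document id is taken from the data itself (hypothetical
    `customer_id` field), so a repeated run overwrites the same documents.
    """
    for customer in customers:
        action = {"index": {"_index": index, "_id": customer["customer_id"]}}
        yield json.dumps(action, sort_keys=True)
        yield json.dumps(customer, sort_keys=True)


def bulk_body(index, customers):
    """Join the lines into a newline-terminated bulk request body."""
    return "\n".join(bulk_lines(index, customers)) + "\n"


customers = [
    {"customer_id": "c-1", "home_owner": True},
    {"customer_id": "c-2", "home_owner": False},
]
first_run = bulk_body("customers-v1", customers)
second_run = bulk_body("customers-v1", customers)
```

Because the body is a pure function of the input, a failed batch can simply be run again, which is roughly what "disposable" and "create, don’t update" suggest.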

Balanced Random Population Splits
• Start with a balanced population
  ▪ Measuring savings is easier at the end.
  ▪ Must balance the variance of many attributes
• What is “good enough” balance?
• Solution
  ▪ Run many random splits (thousands)
  ▪ Index candidates (lists of customer ids)
  ▪ Query Elasticsearch for variance of each candidate
  ▪ Choose the best balanced candidate
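The candidate-selection loop described above can be sketched in plain Python. This is a simplified stand-in, not the talk's implementation: the statistics are computed in memory rather than by querying Elasticsearch, and the balance score (sum of absolute differences in attribute means between the two groups) is an assumed metric chosen for brevity. The attribute names are hypothetical.

```python
import random
from statistics import mean


def imbalance(customers, treatment_ids, attributes):
    """Sum of absolute differences in attribute means between the two groups."""
    treatment = [c for c in customers if c["id"] in treatment_ids]
    control = [c for c in customers if c["id"] not in treatment_ids]
    return sum(
        abs(mean(c[a] for c in treatment) - mean(c[a] for c in control))
        for a in attributes
    )


def best_split(customers, attributes, n_candidates=1000, seed=0):
    """Draw many random half/half splits and keep the best balanced one."""
    rng = random.Random(seed)
    ids = [c["id"] for c in customers]
    half = len(ids) // 2
    best_ids, best_score = None, float("inf")
    for _ in range(n_candidates):
        candidate = frozenset(rng.sample(ids, half))
        score = imbalance(customers, candidate, attributes)
        if score < best_score:
            best_ids, best_score = candidate, score
    return best_ids, best_score


# Hypothetical population with two numeric attributes.
data_rng = random.Random(42)
population = [
    {"id": i, "usage_kwh": data_rng.uniform(200, 900), "house_sqft": data_rng.uniform(800, 3000)}
    for i in range(20)
]
treatment, score = best_split(population, ["usage_kwh", "house_sqft"], n_candidates=200)
```

In practice the attributes would likely need normalizing before being summed, since they are on different scales; that step is left out here.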

Document Schema
• Guiding Principles
  ▪ Idempotent refresh
  ▪ Immutable / Overwrite Only
  ▪ Mapping not canonical
• Class + Annotations
  ▪ Document hierarchy
  ▪ Attribute metadata
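One reading of "mapping not canonical" is that an annotated class is the source of truth and the Elasticsearch mapping is derived from it. The sketch below illustrates that idea with Python dataclass field metadata standing in for annotations; the class, the field names, and the `es_type` and `label` metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Customer:
    # Metadata plays the role of annotations: an index type plus a display label.
    active: bool = field(
        default=False, metadata={"es_type": "boolean", "label": "Active account"}
    )
    house_sqft: int = field(
        default=0, metadata={"es_type": "integer", "label": "House size"}
    )


def mapping_for(cls):
    """Derive a mapping fragment from the class instead of hand-maintaining it."""
    return {
        "properties": {f.name: {"type": f.metadata["es_type"]} for f in fields(cls)}
    }


def attribute_labels(cls):
    """Collect the human-readable labels, e.g. for an attribute library in a UI."""
    return {f.name: f.metadata["label"] for f in fields(cls)}


customer_mapping = mapping_for(Customer)
```

With this shape, regenerating the mapping after a class change is a rebuild rather than a migration, which fits the overwrite-only principle.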

Membership Representation
• As a basic field: requires rewriting the customer document
  ▪ Changes often
• Documents of customer ids
  ▪ Terms Lookup Filter was slow
• “Marker Document”
  ▪ Child document of Customer
  ▪ Tiny data-less document
  ▪ Independent update and query
Diagram labels: Group 1 = 1, 2, 5, 8, 9, …, n; Customer1; Group 1
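A marker document can be queried with Elasticsearch's `has_child` query, which selects parent documents that have at least one matching child. The sketch below builds such a query body as a plain dictionary; the child type name `group_marker`, the `group_id` field, and the id scheme are assumptions for illustration, since the slides do not give the actual names.

```python
def marker_document(customer_id, group_id):
    """A tiny child document whose only job is to say 'this customer is in this group'."""
    return {
        "id": f"{customer_id}:{group_id}",  # deterministic, so re-adding overwrites
        "parent": customer_id,
        "source": {"group_id": group_id},
    }


def members_of(group_id, marker_type="group_marker"):
    """Query body selecting customers that have a marker child for the group."""
    return {
        "has_child": {
            "type": marker_type,
            "query": {"term": {"group_id": group_id}},
        }
    }


marker = marker_document("customer-1", "group-1")
query = members_of("group-1")
```

Adding or removing a customer from a group then touches only the marker, never the larger customer document.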

Document/Attribute Hierarchy
• All queries are relative to a root “Customer” object
• Mix of child and nested documents
  ▪ Children for independent refresh
• Remaining challenge: Querying
  ▪ Want: Active/ELEC
  ▪ Get: One active GAS, one inactive ELEC
Diagram: Customer with two Utility Accounts (Active? F, Type: ELEC) and (Active? T, Type: GAS)
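The querying challenge can be reproduced without Elasticsearch. In the sketch below, which assumes the mismatch comes from each condition being evaluated against the customer's accounts independently, the customer from the diagram matches "active" and "ELEC" even though no single account is both. The field names are hypothetical.

```python
# The customer from the diagram: one inactive ELEC account, one active GAS account.
customer = {
    "accounts": [
        {"active": False, "type": "ELEC"},
        {"active": True, "type": "GAS"},
    ]
}


def matches_independently(customer):
    """Each condition may be satisfied by a different account."""
    accounts = customer["accounts"]
    has_active = any(a["active"] for a in accounts)
    has_elec = any(a["type"] == "ELEC" for a in accounts)
    return has_active and has_elec


def matches_same_account(customer):
    """Both conditions must hold on one and the same account."""
    return any(a["active"] and a["type"] == "ELEC" for a in customer["accounts"])


unwanted_hit = matches_independently(customer)  # True: the unwanted match
intended = matches_same_account(customer)       # False: no active ELEC account
```

Getting the second behavior requires the conditions to be scoped to the same account object when the query is generated.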

Presenting Queries Visually

DSL Design (SRL)
• Guiding Principles
  ▪ Model a dataflow
  ▪ Encapsulate against future backwards-incompatible changes.
    o Have done multiple Elasticsearch upgrades
  ▪ Easy to generate
• Each population (node in the tree)
  ▪ Generate Elasticsearch filter to get counts or aggregations.
• Function
  ▪ Split on Attribute
  ▪ Balanced Random Split
  ▪ Merges
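To make the "each population is a node in the tree" idea concrete, here is a minimal sketch in which every node adds one condition and compiles to a filter by collecting the conditions on the path back to the root. It is not SRL itself: the class, its methods, and the attribute names are hypothetical, and only a "split on attribute" function is shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Population:
    name: str
    condition: Optional[dict] = None        # filter clause introduced at this node
    parent: Optional["Population"] = None   # None for the root population

    def split(self, name, attribute, value):
        """Split on attribute: a child population narrowed by one term condition."""
        return Population(name, {"term": {attribute: value}}, self)

    def to_filter(self):
        """Compile the path from the root to this node into one Elasticsearch filter."""
        clauses = []
        node = self
        while node is not None:
            if node.condition is not None:
                clauses.append(node.condition)
            node = node.parent
        clauses.reverse()  # root-first order, purely for readability
        return {"bool": {"must": clauses}} if clauses else {"match_all": {}}


everyone = Population("all customers")
owners = everyone.split("home owners", "home_owner", True)
large_homes = owners.split("larger houses", "house_size", "large")
compiled = large_homes.to_filter()
```

Keeping the tree separate from the generated filter is what lets the compile step change across Elasticsearch upgrades without touching stored population definitions.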

Scaling Elasticsearch
• Started with 0.90.7
  ▪ Upgrade often.
  ▪ Metrics
• Some war stories
  ▪ Child document id cache (< 1.3.2)
    o The summer of manual cache clears
  ▪ The incredibly slow terms lookup query
  ▪ An endless cycle of garbage collection
  ▪ An incredibly deep and foreboding search queue and UX implications
  ▪ Aliasing across all indexes can cause a field cache overflow

It all comes together
• Elasticsearch is really critical in bringing our vision to life
  ▪ Currently used for selecting almost every utility with Opower
  ▪ Used daily by non-technical users
  ▪ Enabled difficult selections
  ▪ Met cost saving goals for operations
  ▪ Deliver unique campaigns to our customers
  ▪ Saved energy!
• It’s been two years
  ▪ Planning on 2.0 upgrade
  ▪ Long term core technology

Dedicated to Anthony Vattay Sr.
