Building a Weather Data Services Platform on Riak (RICON East 2013)

Presented by Sathish Gaddipati at RICON East 2013

In this talk Sathish will discuss the size, complexity and use cases surrounding weather data services and analytics, which will entail an overview of the architecture of such systems and the role of Riak in these patterns.

About Sathish

Sathish is a senior technology executive with a strong entrepreneurial drive who enjoys linking technology capabilities with business needs. He has hands-on experience with complex technology transformation initiatives and with leading large, highly capable global teams, along with in-depth knowledge of state-of-the-art technologies and their application in multiple industry settings.

Basho Technologies

May 13, 2013
Transcript

  1. Building Weather Data Services Platform on Riak
    Sathish Gaddipati
    VP - Data Management

  2. The Weather Company

  3. The Weather Company’s Core
    Science
    Tech.
    Data

  4. What to Expect?
    Use Cases
    Architecture Components
    Computing Challenges
    Objectives
    RIAK
    Data Governance
    API Management
    Next Steps With RIAK

  5. Weather Data Service - Use Cases
    WDS segments: Insurance, Retail, Wind Energy
    Data provided:
    - Tornado and flood forecasts
    - Weather warnings
    - Historical weather trends
    - Wind Speed Forecast
    - Historical Wind Speeds
    - Precipitation forecast
    - Temperature forecast
    - Extreme weather forecast
    Used for:
    - Max. Premium Rate
    - Min. Claims
    - PPC
    - Maintenance
    - Inventory Management
    - Distribution

  6. Weather Data Service - Use Cases
    WDS segments: Media, Ad. World, Energy Exchange
    Data provided:
    - Hourly Forecasts
    - Daily Forecast
    - Current conditions
    - Temperature Forecast
    - Historical trends
    - Real-time conditions
    - Forecasts
    - Customer location
    Used for:
    - Weather data
    - Weather content
    - Bidding
    - Demand Forecasting
    - Impression serving
    - Improved Ad. Exchange

  7. Weather Data Service - Use Cases
    WDS segments: Mobile Apps, Hospitality, Government
    Data provided:
    - Hourly Forecasts
    - Daily Forecast
    - Current conditions
    - Weather Forecast
    - Historical Forecast
    - Watches and warnings
    - Current conditions
    - Forecasts
    - Airline delays/cancellations
    Used for:
    - Weather data
    - Weather content
    - Local weather
    - National Weather
    - Room rates
    - Revenue optimization

  8. Weather Data Service - Use Cases
    WDS segments: Internal (Digital & Cable), Weather Analytics, Airlines
    Data provided:
    - Hourly Forecasts
    - Daily Forecast
    - Current conditions
    - Historical trends
    - Historical data
    - Correlations between consumer spend and weather conditions
    - Air turbulence and wind speeds
    - Weather forecasts
    - Current conditions
    Used for:
    - Weather Data
    - Weather Content
    - Business Impact
    - Consumer Behavior
    - Optimal Routes
    - Flight Schedules

  9. Weather Data Services Platform

  10. Weather Data Services - Top 10 Objectives
    1. Reduce time to deploy and market new data sets
    2. Reduce operating cost of data services
    3. Centralize data services across the company
       - Eliminate duplicate data feeds, storage and APIs
       - Provide system of record for TWC weather data products
    4. Provide visibility of data access
       - Who is accessing what data and how frequently
       - Metering
    5. Provide data governance process and framework
    6. Serve world’s best weather forecast across all products
    7. Low latency, highly scalable APIs
    8. Secured access to data
    9. Centralized and scalable architecture
    10. Consistent “rich” content across platforms

  11. Weather Data Services - Top 10 Computing Challenges
    1. Distribute thousands of gridded binary files to multiple locations across the globe within 5 minutes
    2. Serve more than a billion data services API requests/day
    3. Metering and authentication of API calls with low latency
    4. Process multiple TBs of data every day
    5. Ensure business continuity
    6. Leverage data caching
    7. Store petabytes of historical data
    8. Mesh weather data with consumer behavior and derive analytics
    9. Build flexible data ingestion platform to manage 100s of data feeds from external parties
    10. Maintain above systems within OPEX budget
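
As a rough sizing aid for challenge 2, the sketch below converts a daily request volume into an average per-second rate. The one-billion figure comes from the slide; the peak-to-average multiplier is a purely hypothetical assumption added for illustration and is not a number from the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rps(requests_per_day: int) -> float:
    """Average requests per second for a given daily request volume."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day: int, peak_factor: float) -> float:
    """Estimated peak rate; peak_factor is a hypothetical multiplier."""
    return average_rps(requests_per_day) * peak_factor


if __name__ == "__main__":
    daily = 1_000_000_000  # "more than a billion" API requests/day (slide 11)
    print(f"average: {average_rps(daily):,.0f} req/s")
    # The 3x factor is an assumption, chosen only to show the calculation.
    print(f"peak at 3x (assumed): {peak_rps(daily, 3.0):,.0f} req/s")
```

Evenly spread over a day, a billion requests works out to roughly 11,600 requests per second; real traffic is rarely even, so capacity would have to be planned against whatever peak factor is actually observed.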

  12. Data Service Platform Development Approach
    Data Governance and Organization (3 Months)
    Fast to Market API (6-8 Months)
    Current Systems
    SUN Platform
    Data Governance and Organization

  13. Top Architecture Considerations
    1. Non-blocking data ingestion
    2. Pull and push data service
    3. Load-balanced data processing across data centers
    4. Use memory-based data storage for real-time data systems
    5. Easily scalable, highly available and easy to maintain large historical data sets
    6. Data caching to achieve low latency
    7. To ensure business continuity, process in parallel across two geographical locations
    8. Use COTS-based API management for authentication, metering and developer onboarding
    9. Data replication from one location to multiple locations: 60+ GB of data within 5 mins
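
To put the replication target (60+ GB within 5 minutes) in network terms, here is a small back-of-the-envelope calculation. It assumes decimal gigabytes and ignores protocol overhead and compression; the number of destinations in the usage example is hypothetical, since the slide does not say how many locations receive the data.

```python
def required_gbps(gigabytes: float, seconds: float) -> float:
    """Sustained throughput in gigabits/sec needed to move the data in time.

    Assumes 1 GB = 10**9 bytes and no protocol overhead.
    """
    return gigabytes * 8 / seconds


def fan_out_gbps(gigabytes: float, seconds: float, destinations: int) -> float:
    """Outbound throughput if the source sends a full copy to each destination."""
    return required_gbps(gigabytes, seconds) * destinations


if __name__ == "__main__":
    window = 5 * 60  # 5 minutes, from the slide
    print(f"per destination: {required_gbps(60, window):.1f} Gbit/s")
    # Three destinations is an assumed value for illustration only.
    print(f"3 destinations (assumed): {fan_out_gbps(60, window, 3):.1f} Gbit/s")
```

Under these assumptions, each destination needs about 1.6 Gbit/s sustained for the full five minutes, which is why the replication path is listed as an architecture consideration in its own right.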

  14. Historical Data Service Platform - RIAK
    - Easy administration
    - Data center to data center replication
    - Ease of scaling
    - High availability
    - Text and numeric data
    - KV Store
    - More reads than writes
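
For readers unfamiliar with Riak's key-value model, the sketch below shows how a read of one historical record might be addressed over Riak's HTTP interface, which exposes objects at `/buckets/<bucket>/keys/<key>` (default port 8098) and accepts an `r` read-quorum query parameter. The host name, bucket name, and the station-plus-date key scheme are hypothetical; the talk does not describe the actual data model.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def observation_key(station_id: str, date_iso: str) -> str:
    """Hypothetical key scheme: one object per station per day."""
    return f"{station_id}_{date_iso}"


def riak_object_url(host: str, bucket: str, key: str,
                    r: Optional[int] = None, port: int = 8098) -> str:
    """Build the URL for fetching a single object via Riak's HTTP API."""
    path = f"/buckets/{quote(bucket, safe='')}/keys/{quote(key, safe='')}"
    # `r` is the number of replicas that must answer the read.
    query = f"?{urlencode({'r': r})}" if r is not None else ""
    return f"http://{host}:{port}{path}{query}"


if __name__ == "__main__":
    key = observation_key("KATL", "2013-05-13")
    # Host and bucket below are placeholders, not real endpoints.
    print(riak_object_url("riak.example.internal", "historical_obs", key, r=1))
```

A read-heavy workload like the one on the slide maps naturally onto this pattern: each request is a single GET by key, which is also the kind of request Apache Bench can generate directly.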

  15. RIAK Test Environment
    Riak Cluster: node1, node2, node3, node4, node5, node6
    - M1.xlarge
    - 4 cores
    - 15 GB RAM
    - 4 EBS 1000 IOPS volumes
    - RAID 10
    Load Balancer
    Load Nodes (3), each:
    - C1.medium
    - 2 cores
    - 1.7 GB RAM
    Zone # 1, Zone # 2

  16. RIAK Test Results - 600 Concurrent User Tests
    Test configuration: Apache Bench, -n 20000 -c 100
    6 terminal sessions running the above,
    so concurrent user load is 100 * 6 (c * #terminals) = 600

    Concurrent user load: 600

    Node      Requests per second (mean)   Response time (mean)   CPU utilization
    Riak 1    979                          102 ms                 25-30%
    Riak 2    891                          112 ms                 35-40%
    Riak 3    881                          113 ms                 35-40%
    Riak 4    906                          110 ms                 20-25%
    Riak 5    968                          103 ms                 35-40%
    Riak 6    984                          101 ms                 25-30%
    Summary   Total: 5609/sec              Average: 106 ms        Well below 60%

  17. RIAK Test Results - 1800 Concurrent User Tests
    Test configuration: Apache Bench, -n 20000 -c 300
    6 terminal sessions running the above,
    so concurrent user load is 300 * 6 (c * #terminals) = 1800

    Concurrent user load: 1800

    Node      Requests per second (mean)   Response time (mean)   CPU utilization
    Riak 1    923                          324 ms                 35-40%
    Riak 2    896                          334 ms                 35-40%
    Riak 3    907                          330 ms                 35-40%
    Riak 4    939                          319 ms                 35-40%
    Riak 5    964                          311 ms                 35-40%
    Riak 6    965                          310 ms                 35-40%
    Summary   Total: 5594/sec              Average: 324 ms        Well below 60%
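
The per-node figures in the two result slides can be cross-checked with a short calculation. The sketch assumes each row corresponds to one Apache Bench session driving one node at concurrency `c`, which is an interpretation of the slides rather than something they state; under that reading, mean response time should be close to `c` divided by that session's requests per second (Little's law).

```python
from typing import Dict, List

# Mean requests/sec per row, copied from the two result slides.
RESULTS: Dict[int, List[int]] = {
    100: [979, 891, 881, 906, 968, 984],  # -c 100, 600-user test
    300: [923, 896, 907, 939, 964, 965],  # -c 300, 1800-user test
}


def total_concurrency(c: int, sessions: int) -> int:
    """Concurrent user load as computed on the slides: c * #terminals."""
    return c * sessions


def estimated_response_ms(c: int, requests_per_second: float) -> float:
    """Little's law: mean time in system = concurrency / throughput."""
    return 1000.0 * c / requests_per_second


if __name__ == "__main__":
    for c, rates in RESULTS.items():
        users = total_concurrency(c, len(rates))
        print(f"{users} users, total {sum(rates)} req/s")
        for node, rps in enumerate(rates, start=1):
            est = estimated_response_ms(c, rps)
            print(f"  Riak {node}: {rps} req/s, estimated {est:.1f} ms")
```

The estimates land within a millisecond or two of the response times reported on the slides. Read this way, tripling concurrency left total throughput near 5,600 requests/sec while mean response time roughly tripled, which is what the formula predicts once throughput stops growing.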

  18. RIAK Read Test Results

  19. Data Services - Prerequisites
    Data services
    Data Organization
    Process & Governance
    Technology
    SUN Platform
    Data Services Org.
    Processes and Governance around Existing and New Data Services Opportunities
    Data Gaps
    Data Quality

  20. Data Services Governance
    B2B Data Requests
    B2C Data Policy Requests
    Weather FX Data Requests
    Other Data Requests
    Data Services Org.
    Data Acquisition
    Data API Develop.
    IT Capacity
    Security & Privacy
    Cross Channel Impact
    Cost Estimates

  21. Data Services Governance (DSG)
    DSG
    DG Sponsor
    Data Stakeholders
    Data Steward
    Facilitator
    Data Services Organization
    COO
    Divisional Heads
    Data Enthusiasts
    Data Stakeholders
    Data Experts

  22. API Management
    - Metering
    - Authentication
    - Developer Onboarding
    - Billing Interface
    - User Analytics

    - Mashery
    - Layer 7
    - WSO2
    - Oracle
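
To make the first two capabilities concrete, here is a toy illustration of authentication followed by metering. It is not how any of the listed products work internally; the class, the key-to-client mapping, and the endpoint names are all hypothetical.

```python
from collections import Counter
from typing import Dict


class ApiGateway:
    """Toy gateway: authenticate by API key, then count calls per client."""

    def __init__(self, api_keys: Dict[str, str]) -> None:
        self._api_keys = dict(api_keys)   # api_key -> client name
        self._usage: Counter = Counter()  # (client, endpoint) -> call count

    def handle(self, api_key: str, endpoint: str) -> int:
        """Return an HTTP-like status: 401 for unknown keys, 200 otherwise."""
        client = self._api_keys.get(api_key)
        if client is None:
            return 401  # rejected calls are not metered
        self._usage[(client, endpoint)] += 1
        return 200

    def usage_for(self, client: str) -> Dict[str, int]:
        """Who accessed what, and how often: the input to billing."""
        return {ep: n for (c, ep), n in self._usage.items() if c == client}


if __name__ == "__main__":
    gateway = ApiGateway({"key-123": "acme-insurance"})  # hypothetical client
    gateway.handle("key-123", "/forecast/hourly")
    gateway.handle("key-123", "/forecast/hourly")
    print(gateway.handle("unknown", "/forecast/hourly"))  # 401
    print(gateway.usage_for("acme-insurance"))
```

The per-client counts are the kind of data that the billing interface and user analytics items on the slide would consume.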

  23. Next Steps With Riak
    - Replication Tests
    - Caching on top of Riak
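
One common way to place a cache in front of a key-value store is a read-through cache with a time-to-live. The sketch below illustrates that general pattern only; the talk does not say which caching approach was planned, and the `fetch` callable stands in for whatever client call would actually read from Riak.

```python
import time
from typing import Callable, Dict, Tuple


class ReadThroughCache:
    """Serve repeated reads from memory; go to the backing store on a miss."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch      # hypothetical stand-in for a Riak read
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, key: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]      # fresh hit: no call to the store
        value = self._fetch(key)  # miss or expired: read through
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def fake_store(key: str) -> bytes:
        print(f"store read for {key}")
        return key.encode()

    cache = ReadThroughCache(fake_store, ttl_seconds=60)
    cache.get("KATL_2013-05-13")  # reads from the store
    cache.get("KATL_2013-05-13")  # served from memory
```

Historical weather records change rarely, so a long TTL would be a reasonable starting point for that data; forecasts and current conditions would need a much shorter one.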

  24. Questions?