Scaling up Business Intelligence from the scratch and to 15 countries worldwide / Budapest BI Forum - Oct 15, 2015

This talk describes our experience of setting up data reporting and Business Intelligence processes for an international company: starting with an Excel file and a bunch of SQL queries, then switching from an in-house reporting solution to centralised hosted reports, in order to build a flexible system for monitoring the company's KPIs.

Attendees will learn from our experience how to integrate Tableau into the processes of a company, how to build independent ETL subsystems that scale to petabyte size and other useful learnings.

We will cover our early days with cloud solutions that do not provide a DWH platform and therefore cannot be expected to meet production requirements. In the talk, we will go through the process of automatically duplicating our Tableau datasources to Amazon Redshift. That enables us to be more flexible with scaling data, to be sure about our backup strategies, and much more. We will also introduce our Python toolchain, which helps us with the daily management of our BI.
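To make the idea of re-creating datasource structure in Amazon Redshift more concrete, here is a minimal sketch. It is not the toolchain from the talk: the sample XML, the `TYPE_MAP` mapping, and the function names `read_columns` and `create_table_sql` are all assumptions made for illustration. It reads column names and types from a simplified datasource description and generates a matching `CREATE TABLE` statement.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from datasource column types to Redshift column types.
TYPE_MAP = {
    "string": "VARCHAR(256)",
    "integer": "BIGINT",
    "real": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "datetime": "TIMESTAMP",
}

# Simplified, made-up datasource description; real files carry far more detail.
SAMPLE_DATASOURCE = """
<datasource name="orders">
  <column name="[order_id]" datatype="integer" />
  <column name="[country]" datatype="string" />
  <column name="[revenue]" datatype="real" />
  <column name="[created_at]" datatype="datetime" />
</datasource>
"""


def read_columns(xml_text):
    """Return the datasource name and a list of (column, datatype) pairs."""
    root = ET.fromstring(xml_text)
    columns = [
        (col.get("name").strip("[]"), col.get("datatype"))
        for col in root.iter("column")
    ]
    return root.get("name"), columns


def create_table_sql(table, columns):
    """Build a CREATE TABLE statement; unknown types fall back to VARCHAR."""
    lines = [
        f'    "{name}" {TYPE_MAP.get(datatype, "VARCHAR(256)")}'
        for name, datatype in columns
    ]
    return f'CREATE TABLE IF NOT EXISTS "{table}" (\n' + ",\n".join(lines) + "\n);"


table, columns = read_columns(SAMPLE_DATASOURCE)
ddl = create_table_sql(table, columns)
print(ddl)
```

Generating the DDL from the datasource description, rather than writing it by hand, is what would allow a migration to run without manual input; loading the data itself is left out of this sketch.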

Sergii Khomenko

October 15, 2015

Transcript

  1. Scaling up Business Intelligence from the scratch and to 15 countries. Sergii Khomenko, Data Scientist, [email protected], @lc0d3r. Budapest BI Forum - October 15, 2015
  2. Sergii Khomenko. Data scientist at one of the biggest fashion communities, Stylight. Data analysis and visualisation hobbyist, working on problems not only during working hours but also in free time, for fun and for personal data visualisations. Speaker at Berlin Buzzwords 2014, ApacheCon Europe 2014, Puppet Camp London, Berlin Buzzwords 2015, Tableau Conference on Tour. Upcoming talks: October 28-30, 2015 - Crunch Practical Big Data Conference, Budapest - Building data pipelines: from simple to more advanced - hands-on experience
  3. Stylight – Make Style Happen • Profitable Leads: Stylight provides its partners with high-quality leads, enabling partner shops to leverage Stylight as an ROI-positive traffic channel. • Inspiration: Stylight offers shoppable inspiration that makes it easy to know what to buy and how to style it. • Branding & Reach: Stylight offers a unique opportunity for brands to reach an audience that is actively looking for style online. • Shopping: Stylight helps users search and shop fashion and lifestyle products smarter across hundreds of shops. • Core Target Group: Stylight helps aspiring women between 18 and 35 to evolve their style through shoppable inspiration.
  4. Experienced & Ambitious Team. Innovative cross-functional organisation with flat hierarchy builds a unique team spirit. • +200 employees • 40 PhDs/Engineers • 28 years average age • 63% female • 23 nationalities • 0 suits
  5. Pros and Cons • Data consistency • Not flexible structure – report change • Difficult to scale • Time-consuming – for new ad-hoc • Maintain and support – in-house development
  6. Pros and Cons. In the short term: • More flexible - add that advanced feature • Easy to add alerting
  9. Pros and Cons • Easy to start using • Works for free • All datasources in one place • Unified routine
  10. Pros and Cons • Hard to scale • Not production ready • No backups • No control over data • No control over failures
  11. • Tableau is a good DS editor • We already have so many DS • Current tasks and sprints
  13. Picture of the old setup
  14. • We have all DS accessible • We know where data comes from • Structure re-creation • Migration without any manual input
  15. Picture of the old setup
  16. Benefits • Control over backups • Control over refreshes • Scale DWH up to petabyte scale • Easy to add new ETL stages (EMR) • More open for new challenges
  17. Cross-Functional Team. Department: mission-oriented team with all resources and the fewest dependencies. Product Team: builds the software the department or its customers use. Squad: team that executes the product development. [Diagram labels: Department, Product Team, Squad, PO, Engineer, Engineer, Designer, Data Scientist, Head of Business Role Business Role]
  18. Cross-Functional Team • You build it - you run it • You check your numbers (domain knowledge) • You provide your data as interface layer • Data report comes after data tracking
  19. Make it even more awesome • Data definition unifications - ibis? • Pipeline unification - Luigi? • Flexible to integrate new things • Open Source our Python toolchain • Tableau replacement: re:dash, AWS QuickSight
  20. Related talks • Helping Data Teams with Puppet / Puppet Camp London • Secure Data Scalability at Stylight with Tableau Online and Amazon Redshift / Tableau Conference on Tour - Berlin
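
The slides mention independent ETL stages and a wish for pipeline unification. As a rough illustration of that idea only, the sketch below orders a few stages by their dependencies and runs them in turn. It does not use Luigi; Python's standard `graphlib` module stands in for a real workflow tool, and the stage names and placeholder actions are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names; each stage lists the stages it depends on.
STAGES = {
    "extract_orders": set(),
    "extract_clicks": set(),
    "load_to_warehouse": {"extract_orders", "extract_clicks"},
    "refresh_reports": {"load_to_warehouse"},
}


def run_order(stages):
    """Return the stages in an order that respects every dependency."""
    return list(TopologicalSorter(stages).static_order())


def run_pipeline(stages, actions):
    """Run each stage's action in dependency order and collect the results."""
    results = {}
    for stage in run_order(stages):
        results[stage] = actions[stage]()
    return results


# Placeholder actions that only report what a real stage would do.
ACTIONS = {name: (lambda name=name: f"{name}: done") for name in STAGES}

order = run_order(STAGES)
results = run_pipeline(STAGES, ACTIONS)
print(order)
```

Declaring dependencies as data means a new stage can be added by extending `STAGES` and `ACTIONS` without touching the runner, which is the property a dedicated workflow tool would provide at larger scale.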