Secure Data Scalability at Stylight with Tableau Online and Amazon Redshift / Tableau Conference on Tour - Berlin - Jun 9, 2015

Tableau provides a rich feature set, in particular Tableau Server/Online data storage. It is easy and cheap to use, allowing for a quick start and fast growth. However, it is not a permanent storage solution. We experienced this painfully first hand at the beginning of the year.

We provide a quick overview of how we initially set up our Tableau infrastructure and how this limited us. The focus is on the steps we came up with to improve our working environment.

The talk is targeted at companies that rely on a Tableau environment. We will go through the process of automatically duplicating your Tableau data sources to Amazon Redshift and introduce our Python toolchain for daily management. It lets you scale your data more flexibly, be confident in your backup strategy, and more.
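The duplication step described above ends with loading data into Amazon Redshift. The following is a minimal sketch of that last step, assuming the data source has already been exported as a gzipped CSV file to S3. The function name, table, bucket, and IAM role are hypothetical and are not taken from the toolchain presented in the talk. The code only builds the text of a `COPY` statement, which would then be run through any Redshift SQL client.

```python
def build_copy_statement(table, s3_uri, iam_role):
    """Return a Redshift COPY statement that loads a gzipped CSV file from S3."""
    # Values are interpolated naively; a real tool should validate identifiers.
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1\n"
        "GZIP;"
    )


# Hypothetical names, for illustration only.
statement = build_copy_statement(
    table="reporting.daily_sales",
    s3_uri="s3://example-bucket/tableau/daily_sales.csv.gz",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

Keeping statement generation separate from execution makes it easy to log or review what a daily job is about to run.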

http://ontour15.tableau.com/berlin/schedule/tuesday#session-1456

Sergii Khomenko

June 09, 2015

Transcript

  1. BACKING UP OUR TABLEAU - STEPS TOWARDS A RELIABLE REPORTING SOLUTION
    Sergii Khomenko, Data Scientist
    Dr. Konstantin Wemhöner, Head of Business Intelligence
  2. WHAT IS STYLIGHT? A SHORT INTRODUCTION
    STYLIGHT.de Page 2 / 42
  3. CONTENT meets COMMERCE
    NEW: The best place to discover & shop fashion. STYLIGHT.DE
  4. GLOBAL INSPIRATION – LOCAL COMMERCE
    AVAILABLE IN 14 COUNTRIES: Germany, Austria, Switzerland, Netherlands, France, Italy, Sweden, UK, Spain, Australia, Brazil, US, Norway, Belgium
  5. STYLIGHT ON THE GO
    WHENEVER. WHEREVER.
  6. PROUD TO BLEED PURPLE
    FACTS AND FIGURES
    • Founded: 2008 in Munich
    • Offices: Munich, London, New York
    • Investors: Holtzbrinck Ventures, Tengelmann Ventures, Seven Ventures
    • Business Partners: 350+ partner shops worldwide with 6000+ brands
    • Total Employees: 160+ (over 19 nationalities from 4 continents)
    [Chart: total number of employees, 2010 to 2015; axis ticks at 50, 100, 150]
  7. GROSS MERCHANDISE VALUE
    [Chart: $360 MILLION, $270 MILLION, $175 MILLION, $50 MILL; years 2011 to 2014]
  8. WHY WE CHOSE TABLEAU ONLINE?
    • Easy to start using
    • Works for free
    • All data sources in one place
    • Unified routine
  9. WHY WE CHOSE TABLEAU ONLINE?
    • Combination of local and online/cloud sources (Google Analytics, JDBC…)
    • Sharing across continents, instantaneously
    • Easy distribution of reports with Tabcmd
  10. LOADING AND MONITORING BEFORE OUTAGE
    • 25 workbooks online with 119 views from 80 data sources
    • Scheduled mails
    • All refreshes scheduled manually
  11. SERVER OUTAGE JANUARY 2015
    • Started with empty scheduled mail reports (9th Jan)
    • Monday: >80% of views not working
    • No clear communication from Tableau
    • Server outage during our scheduled refreshes
  12. FIRST THINGS FIRST: FIREFIGHTING
    Replacement of all data sources in workbooks
    Steps: Open, Local copy, New extract, Replace
  13. HOW TO REBUILD A BROKEN DATA SOURCE?
    Biggest issue: workbooks could not be opened due to a broken data source
    Understand how a Tableau data extract (TDE) is built
    Find a way to extract and recreate the essential parts of a TDE
  14. ISSUES, PLANS
    • We have all DS accessible
    • We know where data comes from
    • Structure re-creation
    • Migration without any manual input
  15. BENEFITS
    • Control over backups
    • Control over refreshes
    • Scale DWH up to petabyte scale
    • Easy to add new ETL stages (EMR)
    • More open for new challenges
  16. POSITIVE OUTCOMES
    • Number of data sources reduced by 30%
    • Speed increased by a factor of >100 by using Redshift
    • Scalable infrastructure for a growing company
    • More flexible connection of tables via Redshift
  17. IMPROVING IT TO THE NEXT LEVEL!
    • Open Source our Python tools
    • Internal DWH mapping server
    • Flexible to integrate new things
    • Google Spreadsheet integration
  18. HOW TO REACH US
    TOOLS, TUG MUNICH: Sergii Khomenko [email protected] @lc0d3r
    GENERAL INFO, BI JOBS: Dr. Konstantin Wemhöner [email protected] @kwarks85
    STYLIGHT Engineering: @CodeTailors
  19. STYLIGHT
    Nymphenburger Straße 86, 80636 Munich, Germany
    Join us on Facebook: facebook.com/stylight
    Follow us on Twitter: twitter.com/stylight
    Follow us on Instagram: instagram.com/stylight
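
Slides 13 and 14 mention understanding how a data source is built and re-creating its structure without manual input. The following is a minimal sketch of that idea, assuming the column definitions can be read from a workbook's XML (`.twb` files are XML). The sample XML is a simplified, hypothetical fragment, the Tableau-to-Redshift type mapping is an assumption, and the function names are illustrative; none of this is the authors' actual toolchain.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment of a workbook (.twb) file.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource name="sales_extract" caption="Daily Sales">
      <column name="[order_date]" datatype="date" />
      <column name="[brand]" datatype="string" />
      <column name="[clicks]" datatype="integer" />
      <column name="[revenue]" datatype="real" />
    </datasource>
  </datasources>
</workbook>
"""

# Assumed mapping from Tableau data types to Redshift column types.
TYPE_MAP = {
    "string": "VARCHAR(256)",
    "integer": "BIGINT",
    "real": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "date": "DATE",
    "datetime": "TIMESTAMP",
}


def read_columns(twb_xml):
    """Map each data source name to its list of (column, datatype) pairs."""
    root = ET.fromstring(twb_xml.strip())
    result = {}
    for ds in root.iter("datasource"):
        result[ds.get("name")] = [
            (col.get("name").strip("[]"), col.get("datatype"))
            for col in ds.findall("column")
        ]
    return result


def build_create_table(table, columns):
    """Build a CREATE TABLE statement; unknown types fall back to VARCHAR(256)."""
    lines = [
        f'    "{name}" {TYPE_MAP.get(datatype, "VARCHAR(256)")}'
        for name, datatype in columns
    ]
    return f"CREATE TABLE {table} (\n" + ",\n".join(lines) + "\n);"


for name, columns in read_columns(SAMPLE_TWB).items():
    print(build_create_table(name, columns))
```

Deriving the table definition from the workbook metadata, rather than typing it by hand, is one way to approach the "migration without any manual input" goal from slide 14.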