Big Data Wrangling: Getting Down to Business

Keynote talk for a "business" audience at Big Data London 2016

Joe Hellerstein

November 04, 2016

Transcript

  1. Big Data Wrangling: Getting Down to Business. Joe Hellerstein, Co-Founder and Chief Strategy Officer
  2. Berkeley

  3. Outline: 1. Data to the People! 2. Celebrate Diversity 3. Tech for the People
  4. See you at the Lake! Marketing, Finance, Product, IT & Cyber, Web & Social, Customer Data, Sales, Support
  5. A Common Platform to Store & Analyze Diverse Data. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Customer Satisfaction, Marketing, Support, Sales
  6. Your business users feel… Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Customer Satisfaction, Marketing, Support, Sales
  7. Data-Driven Population: Data Scientists, Data Engineers, SQL Analysts, Excel Data Analysts, Business Users
  8. Data Accessibility: Data Scientists, Data Engineers, SQL Analysts, Excel Data Analysts, Business Users. Hundreds of Gigabytes; Petabytes & Terabytes; Megabytes & Gigabytes
  9. Your business users feel…

  10. Data Scientists, Data Engineers, SQL Analysts, Business Users. Put the data in the hands of the people who know it best. Unlock potential by empowering people.
  11. Celebrate Diversity

  12. The Single Source of Truth. “There is no point in bringing data … into the data warehouse environment without integrating it.” (Bill Inmon, Building the Data Warehouse)
  16. The Value of Diversity. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Marketing, Support, Customer Satisfaction, Sales
  17. The Value of Diversity. Data Lake: Supply Chain, Marketing, Customer Satisfaction. Inventory Management: per supermarket chain! Customer Behavior: Inventory correlations with social media campaigns. Brand Management: Buying patterns by demographics, politics.
  18. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Marketing, Support, Customer Satisfaction, Sales. Move beyond the single source of truth. Foster agility to generate diverse value.
  19. Tech for the People

  20. Driving Value: To drive more value here, you have to make this more efficient. Optimizing & Publishing, Enriching & Blending, Cleaning, Structuring, Discovery, Ingestion. 80% of the time spent; 20%.
  21. Why are Business Users Disconnected from Big Data?

  22. It’s an Interaction Problem

  23. Turn of the Century UX for Data Transformation: - Schemas and annotations - Box-and-arrow programming - Batch execution
  24. Research Roots: Potter’s Wheel, 2001: + Real data, sampled on the fly + Menu-driven transforms + Immediate execution and feedback [Raman & Hellerstein, VLDB 01]
  25. Research Roots: Open Source Data Wrangler, 2011: + Predictive Transformation + Immediate feedback on multiple choices - Browser-sized data sets [Kandel, Heer & Hellerstein, CHI 11]
  26. Tech Challenge: Make Working with Raw Data at Scale… Interactive: Provide immediate feedback for how transformations impact the data on the fly. Intelligent: Suggestions and cues guide based on data, activity and history. You decide based on visual feedback. Visual: See what’s in your data; use visualization to drive transformation. Delightful: Directly manipulate data in collaboration with algorithms.
  27. All of This Led to

  28. Interactive Exploration
  29. Predictive Transformation

  30. 1. Data to the People! 2. Celebrate Diversity 3. Tech for the People
  31. Context Matters 4

  32. Data Relativism and Context Services. http://bit.ly/datarelativism16 (Figure labels: The Knight: IT; The Warrior: Data Science; The Queen: Management; Loch Data; Probability; Prediction.)
  33. Open Source Context Services: ground, a broader context for big data. http://www.ground-context.org Wrangle Aggressively, Govern Collectively. (Figure labels: The Warrior: Data Science; The Knight: IT; The Queen: Management.)
  34. Stop by Trifacta’s booth #306. Download Trifacta Wrangler for Free: trifacta.com/start-wrangling