Big Data Wrangling: Getting Down to Business

Keynote talk for a "business" audience at Big Data London 2016

Joe Hellerstein

November 04, 2016

Transcript

  1. Marketing, Finance, Product, IT & Cyber, Web & Social, Customer Data, Sales, Support. See you at the Lake!
  2. A Common Platform to Store & Analyze Diverse Data. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Customer Satisfaction, Marketing, Support, Sales
  3. Your business users feel… Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Customer Satisfaction, Marketing, Support, Sales
  4. Data Accessibility. Data Scientists, Data Engineers, SQL Analysts, Excel Data Analysts, Business Users. Hundreds of Gigabytes, Petabytes & Terabytes, Megabytes & Gigabytes
  5. Data Scientists, Data Engineers, SQL Analysts, Business Users. Put the data in the hands of the people who know it best. Unlock potential by empowering people.
  6. The Single Source of Truth. “There is no point in bringing data … into the data warehouse environment without integrating it.” (Bill Inmon, Building the Data Warehouse)
  7. “There is no point in bringing data … into the data warehouse environment without integrating it.” (Bill Inmon, Building the Data Warehouse) Building
  9. “There is no point in bringing data … into the data warehouse environment without integrating it.” (Bill Inmon, Building the Data Warehouse)
  10. The Value of Diversity. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Marketing, Support, Customer Satisfaction, Sales
  11. The Value of Diversity. Data Lake: Supply Chain, Marketing, Customer Satisfaction. Inventory Management: per supermarket chain! Customer Behavior: Inventory correlations with social media campaigns. Brand Management: Buying patterns by demographics, politics.
  12. Data Lake: Finance, Supply Chain, IT & Cyber, Web & Social, Marketing, Support, Customer Satisfaction, Sales. Move beyond the single source of truth. Foster agility to generate diverse value.
  13. Driving Value. To drive more value here, you have to make this more efficient. Stages: Optimizing & Publishing, Enriching & Blending, Cleaning, Structuring, Discovery, Ingestion. Labels: 80% of the time spent, 20%
  14. Turn of the Century UX for Data Transformation: - Schemas and annotations - Box-and-arrow programming - Batch execution
  15. Research Roots: Potter’s Wheel, 2001: + Real data, sampled on the fly + Menu-driven transforms + Immediate execution and feedback [Raman & Hellerstein, VLDB01]
  16. Research Roots: Open Source Data Wrangler, 2011: + Predictive Transformation + Immediate feedback on multiple choices - Browser-sized data sets [Kandel, Heer & Hellerstein, CHI 11]
  17. Tech Challenge: Make Working with Raw Data at Scale… Interactive: Provide immediate feedback for how transformations impact the data on-the-fly. Intelligent: Suggestions and cues guide based on data, activity and history. You decide based on visual feedback. Visual: See what’s in your data; use visualization to drive transformation. Delightful: Directly manipulate data in collaboration with algorithms.
  18. Data Relativism and Context Services: http://bit.ly/datarelativism16 [Figure labels: THE KNIGHT (IT), THE WARRIOR (DATA SCIENCE), THE QUEEN (MANAGEMENT), PROBABILITY, PREDICTION, LOCH DATA]
  19. Open Source Context Services: ground, A broader context for big data. http://www.ground-context.org Wrangle Aggressively. Govern Collectively. [Figure labels: THE WARRIOR (DATA SCIENCE), THE KNIGHT (IT), THE QUEEN (MANAGEMENT)]