Accommodating Big Data in Visual Analytics: Pairing Computation with Cognition

This talk covers opportunities and challenges related to big data visual analytics, with an emphasis on the kinds of questions developers, designers, and analysts alike must consider in the era of big data visualization.

Presented at AAG 2013 in Los Angeles, CA.

Joshua Stevens

April 12, 2013

Transcript

  1. Accommodating Big Data in Visual Analytics: Pairing Computation with Cognition
    Joshua Stevens, Alan MacEachren
    Plus: Big Data Social Science @ Penn State
  2. Outline
    • Introduction + Context
    • Computation: Considering End-users & Cognition
    • Big Data @ Penn State
    • Relating These Topics
    • Questions
  3. Big data are more than big.
    “Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content...” - IBM
  4. Introduction + Context
    • The 4 V’s of ‘big data’ are the norm for GIScientists
    • Ex: Climate models, terrain, networks, mobility patterns...
    • Many platforms are ready for cloud, AWS, and HPC
  5. Introduction + Context
    • Cartographers and their tools are ready, too.
    “Among many capabilities the HTML5 standard provides, there is one crucial for improving GIS, and that is HTML5 Canvas”
    Ravnić, D. HTML5 Canvas: An Open Standard for High Performing GIS Map Visualization in Web Browsers. Directions Magazine, April 5, 2012.
  6. Computation
    • Computation will influence visualization from (at least) 2 angles:
    1. Data wrangling, calculation, and storage
    2. Choosing the right technology for representation and visualization
  7. Computation: Calculation + Storage
    • “Each day, we create 2.5 quintillion bytes of data...” - IBM
    • 12 TB/day are tweets
    • ...more than 2 billion copies of Wikipedia, or
    • Algorithms and statistical techniques are essential
    • Basic example:
    • Clustering is simple and common in GIS, but scales poorly with N (i.e., not O(n) or O(n log n))
    • Typical solution: multiple machines (parallel and distributed GIS)
    (Stacked floppy disks x 19)
  8. Computation: Representation
    • Tools matter.
    • But...choose toolchains over tools.
    From Stevens, J., Smith, J., and M. Idris (2012). NVizABLE: A Network Visualization and Big Data Learning Environment.
  9. Attack computation from 3 perspectives: Find success in the middle.
    1: Machine-level
    2: HPC
    3: Representation technology
  10. Attack computation from 3 perspectives: Find success in the middle. (maybe)
    1: Machine-level
    2: HPC
    3: Representation technology
  11. Approaching Computation
    • Important questions (I think):
    1. Should we divide efforts between machine-level and HPC? Or focus on both simultaneously?
    2. Bigger is (probably) not always better. How do we determine when big is big enough?
    3. How should visualization goals influence our answers?
  12. Computation Cognition
    • Many analytical evaluations and UI/UX case studies focus on carefully controlled data and settings (for obvious reasons)
    • But...such scenarios are rare in big data.
    Bakshy, E. Showing Support for Marriage Equality on Facebook. Facebook Data Science. March 29, 2013.
  13. Computation Cognition
    • What this means for cognition...
    • Designers must expect the unexpected, then build tools that support edge cases (and beyond)
    • Think through computational issues, visualization goals, and users’ POV
    • Example: Incremental Visualization
    • Danyel Fisher, Igor Popov, Steven M. Drucker, and MC Schraefel, Trust Me, I'm Partially Right: Incremental Visualization Lets Analysts Explore Large Datasets Faster, in Proceedings of the 2012 Conference on Human Factors in Computing Systems (CHI 2012), 5 May 2012.
  14. Computation Cognition
    • Key considerations and questions emerge
    1. How do non-experts interpret visualizations that continuously change?
    2. Are incremental approaches effective in geographic displays?
    3. What should it be like to interact with ≥ millions of data points?
    • When, if ever, should points be interactive? Tie to computation and representation (e.g., Canvas vs SVG)
  15. “Overview first, zoom and filter, then details on demand.” - Shneiderman (1996)
    Still applicable in the era of big data?
  16. Big Data @ Penn State
    • NSF IGERT in Big Data Social Science
    • PI: Burt Monroe (Poli Sci)
    • Co-PIs:
    • Alan MacEachren (Geog)
    • Lee Giles (IST)
    • Melissa Hardy (Soc and Demography)
    • Aleksandra Slavkovic (Stats and Public Health)
    • Project Coordinator: Dee Bagshaw
    • www.bdss.psu.edu
  17. Big Data @ Penn State
    • 7 PhD Fellows in initial cohort
    • Dual-degree program w/ BDSS-specific courses
    • Years 2 and 3
    • Research rotations in year 3
    • Summer externships (at least 1 non-academic)
    Beatrice Abiero, Health Policy and Demographics
    Molly Ariotti, Political Science
    Muhammed Idris, Political Science
    Jennifer (Smith) Mason, Geography
    Joshua Stevens, Geography
    Stephanie Wilson, Human Development and Family Studies
    Mo Yu, Information Science and Technology
  18. Big Data @ Penn State
    • Workshops, demos, and hackathons
    • Projects and publications
    Stevens, J., Smith, J., and M. Idris (2012). NVizABLE: A Network Visualization and Big Data Learning Environment.
    Yanomine, J., and J. Stevens (2012). Political Events in Afghanistan: Analysis of 200 Million Events in the GDELT Database.
    Presented at AAG by M. Idris on Tuesday and upcoming NetMob @ MIT, May 1-3, 2013.
    Reported in Foreign Policy, April 10, 2013. http://ideas.foreignpolicy.com/posts/2013/04/10/what_can_we_learn_from_the_last_200_million_things_that_happened_in_the_world
  19. How does this all relate?
    • Computation must consider visualization goals and cognitive impacts
    • No single approach is best; allocation of efforts remains unclear
    • Must prepare students to deal with these issues early in their careers
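
Slide 7 notes that clustering scales poorly with N. One way to see why is to count the work done by a naive method that compares every pair of points. The sketch below is only an illustration under that assumption; it is not the method used in the talk, and the names `pairwise_comparisons` and `closest_pair` are hypothetical.

```python
from itertools import combinations
from math import dist


def pairwise_comparisons(n: int) -> int:
    """Number of unique point pairs a naive all-pairs method must examine."""
    return n * (n - 1) // 2


def closest_pair(points):
    """Brute-force closest pair: examines every unique pair, so O(n^2)."""
    best = None
    checked = 0
    for a, b in combinations(points, 2):
        checked += 1
        d = dist(a, b)  # Euclidean distance between two coordinate tuples
        if best is None or d < best[0]:
            best = (d, a, b)
    return best, checked


if __name__ == "__main__":
    # Growth in pair count as the number of points increases.
    for n in (1_000, 10_000, 100_000, 1_000_000):
        print(f"{n:>9,} points -> {pairwise_comparisons(n):>15,} pairs")
```

A thousandfold increase in points raises the pair count about a millionfold, which is the kind of growth that motivates splitting the work across multiple machines.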
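
The incremental visualization work cited on slide 13 (Fisher et al., CHI 2012) has analysts work with partial results that are refined as more data are processed. A minimal sketch of that general idea, not the authors' system, is a running mean with an approximate 95% interval that is updated batch by batch. The function name `incremental_mean`, the batch size, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def incremental_mean(values, batch_size):
    """Yield (rows_seen, running_mean, ci_half_width) after each batch.

    Uses Welford's online update, so each row is visited once.
    The half-width is a normal-approximation 95% interval and assumes
    the rows arrive in random order.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for start in range(0, len(values), batch_size):
        for x in values[start:start + batch_size]:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        if n > 1:
            std_err = math.sqrt(m2 / (n - 1) / n)
            half_width = 1.96 * std_err
        else:
            half_width = float("inf")
        yield n, mean, half_width


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for a large dataset.
    data = [rng.gauss(50.0, 10.0) for _ in range(100_000)]
    for seen, mean, hw in incremental_mean(data, batch_size=20_000):
        print(f"{seen:>7,} rows: mean = {mean:.2f} +/- {hw:.2f}")
```

A display could redraw a bar and its error band after each yielded tuple. On randomly ordered data the interval generally narrows as more rows arrive, which ties directly to the question on slide 14 about how non-experts read a visualization that keeps changing.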