Slide 1

Slide 1 text

Accommodating Big Data in Visual Analytics: Pairing Computation with Cognition
Plus: Big Data Social Science @ Penn State
Joshua Stevens, Alan MacEachren

Slide 2

Slide 2 text

Outline
• Introduction + Context
• Computation: Considering End-users & Cognition
• Big Data @ Penn State
• Relating These Topics
• Questions

Slide 3

Slide 3 text

Big data are more than big.
“Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content...” - IBM

Slide 4

Slide 4 text

Big data are more than big.
Volume, Variety, Velocity, Vinculation

Slide 5

Slide 5 text

Introduction + Context
• The 4 V’s of ‘big data’ are the norm for GIScientists
• Ex: Climate models, terrain, networks, mobility patterns...
• Many platforms are ready for cloud, AWS, and HPC

Slide 6

Slide 6 text

Introduction + Context
• Cartographers and their tools are ready, too.
“Among many capabilities the HTML5 standard provides, there is one crucial for improving GIS, and that is HTML5 Canvas”
Ravnić, D. HTML5 Canvas: An Open Standard for High Performing GIS Map Visualization in Web Browsers. Directions Magazine, April 5, 2012.

Slide 7

Slide 7 text

Computation
• Computation will influence visualization from (at least) 2 angles:
1. Data wrangling, calculation, and storage
2. Choosing the right technology for representation and visualization

Slide 8

Slide 8 text

Computation: Calculation + Storage
• “Each day, we create 2.5 quintillion bytes of data...” - IBM
• 12 TB/day are tweets
• ...more than 2 billion copies of Wikipedia, or
• Algorithms and statistical techniques are essential
• Basic example:
• Clustering is simple and common in GIS, but scales poorly with n (i.e., not O(n) or O(n log n))
• Typical solution: multiple machines (parallel and distributed GIS)
(Stacked floppy disks x 19)
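To make the scaling point concrete, here is a minimal Python sketch (not from the talk) of the neighbor search that sits at the core of many clustering methods. The function names, the radius parameter, and the grid index are illustrative assumptions. The brute-force version performs n(n-1)/2 distance checks. The grid version is one single-machine tactic that avoids most of them; it is shown for contrast and is not the parallel or distributed approach the slide names.

```python
import math
from collections import defaultdict


def neighbor_pairs_brute_force(points, radius):
    """Count point pairs within `radius` by checking every pair.

    Returns (pairs_found, distance_checks); the check count is n*(n-1)/2.
    """
    pairs = 0
    checks = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            checks += 1
            if math.dist(points[i], points[j]) <= radius:
                pairs += 1
    return pairs, checks


def neighbor_pairs_grid(points, radius):
    """Same count, but only compare points in the same or adjacent grid cells.

    Cells are `radius` wide, so any two points within `radius` of each other
    must fall in the same cell or in neighboring cells.
    """
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    pairs = 0
    checks = 0
    for (cx, cy), members in grid.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                # .get() avoids inserting empty cells while iterating
                for j in grid.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:  # visit each unordered pair exactly once
                            checks += 1
                            if math.dist(points[i], points[j]) <= radius:
                                pairs += 1
    return pairs, checks
```

Both functions return the same pair count; only the number of distance checks differs, and that gap widens as n grows when points are spread out relative to the radius.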

Slide 9

Slide 9 text

Computation: Representation
• Tools matter.
• But...choose toolchains over tools.
From Stevens, J., Smith, J., and M. Idris (2012). NVizABLE: A Network Visualization and Big Data Learning Environment.

Slide 10

Slide 10 text

Attack computation from 3 perspectives: Find success in the middle.
1: Machine-level
2: HPC
3: Representation technology

Slide 11

Slide 11 text

Attack computation from 3 perspectives: Find success in the middle. (maybe)
1: Machine-level
2: HPC
3: Representation technology

Slide 12

Slide 12 text

Approaching Computation
• Important questions (I think):
1. Should we divide efforts between machine-level and HPC? Or focus on both simultaneously?
2. Bigger is (probably) not always better. How do we determine when big is big enough?
3. How should visualization goals influence our answers?

Slide 13

Slide 13 text

Computation Cognition
• Many analytical evaluations and UI/UX case studies focus on carefully controlled data and settings (for obvious reasons)
• But...such scenarios are rare in big data.
Bakshy, E. Showing Support for Marriage Equality on Facebook. Facebook Data Science. March 29, 2013.

Slide 14

Slide 14 text

Computation Cognition
• What this means for cognition...
• Designers must expect the unexpected, then build tools that support edge cases (and beyond)
• Think through computational issues, visualization goals, and users’ POV
• Example: Incremental Visualization
• Danyel Fisher, Igor Popov, Steven M. Drucker, and MC Schraefel, Trust Me, I'm Partially Right: Incremental Visualization Lets Analysts Explore Large Datasets Faster, in Proceedings of the 2012 Conference on Human Factors in Computing Systems (CHI 2012), 5 May 2012.
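The idea behind incremental visualization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the system described by Fisher et al.: the function name, the batch size, and the normal-approximation margin (z = 1.96) are all hypothetical choices. The generator reads a shuffled dataset in batches and, after each batch, yields a running estimate of the mean together with an uncertainty margin that a display could redraw as it tightens.

```python
import math
import random


def incremental_mean(values, batch_size, z=1.96, seed=0):
    """Yield (n_seen, estimate, margin) after each batch of a shuffled pass.

    `margin` is a normal-approximation half-width (z * standard error).
    It ignores the finite-population correction, so it stays above zero
    even once every value has been read.
    """
    rng = random.Random(seed)
    order = list(values)
    rng.shuffle(order)  # random order so each prefix acts as a random sample

    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for start in range(0, len(order), batch_size):
        for v in order[start:start + batch_size]:
            n += 1
            delta = v - mean
            mean += delta / n
            m2 += delta * (v - mean)
        variance = m2 / (n - 1) if n > 1 else 0.0
        margin = z * math.sqrt(variance / n)
        yield n, mean, margin


# Usage: each yielded tuple is one redraw of the partial result.
for n_seen, estimate, margin in incremental_mean(range(1000), batch_size=250):
    print(f"n={n_seen:4d}  mean ~ {estimate:.1f} +/- {margin:.1f}")
```

An analyst can stop as soon as the margin is narrow enough for the decision at hand, which is the trade-off the cited paper examines.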

Slide 15

Slide 15 text

Fisher et al. (2012). Computation Cognition

Slide 16

Slide 16 text

Computation Cognition
• Key considerations and questions emerge
1. How do non-experts interpret visualizations that continuously change?
2. Are incremental approaches effective in geographic displays?
3. What should it be like to interact with ≥ millions of data points?
• When, if ever, should points be interactive? Tie to computation and representation (e.g., Canvas vs SVG)

Slide 17

Slide 17 text

“Overview first, zoom and filter, then details on demand.” - Shneiderman (1996)

Slide 18

Slide 18 text

“Overview first, zoom and filter, then details on demand.” - Shneiderman (1996)
Still applicable in the era of big data?

Slide 19

Slide 19 text

Big Data @ Penn State
• NSF IGERT in Big Data Social Science
• PI: Burt Monroe (Poli Sci)
• Co-PIs:
• Alan MacEachren (Geog)
• Lee Giles (IST)
• Melissa Hardy (Soc and Demography)
• Aleksandra Slavkovic (Stats and Public Health)
• Project Coordinator: Dee Bagshaw
• www.bdss.psu.edu

Slide 20

Slide 20 text

Big Data @ Penn State
• 7 PhD Fellows in initial cohort
• Dual-degree program w/ BDSS-specific courses
• Years 2 and 3
• Research rotations in year 3
• Summer externships (at least 1 non-academic)
Beatrice Abiero, Health Policy and Demographics
Molly Ariotti, Political Science
Muhammed Idris, Political Science
Jennifer (Smith) Mason, Geography
Joshua Stevens, Geography
Stephanie Wilson, Human Development and Family Studies
Mo Yu, Information Science and Technology

Slide 21

Slide 21 text

Big Data @ Penn State
• Workshops, demos, and hackathons
• Projects and publications
Stevens, J., Smith, J., and M. Idris (2012). NVizABLE: A Network Visualization and Big Data Learning Environment.
Yanomine, J., and J. Stevens (2012). Political Events in Afghanistan: Analysis of 200 Million Events in the GDELT Database.
Presented at AAG by M. Idris on Tuesday and upcoming NetMob @ MIT, May 1-3, 2013.
Reported in Foreign Policy, April 10, 2013. http://ideas.foreignpolicy.com/posts/2013/04/10/what_can_we_learn_from_the_last_200_million_things_that_happened_in_the_world

Slide 22

Slide 22 text

How does this all relate?
• Computation must consider visualization goals and cognitive impacts
• No single approach is best; how to allocate effort remains unclear
• Must prepare students to deal with these issues early in their careers

Slide 23

Slide 23 text

Thank you.
josh.stevens@psu.edu | @jscarto
https://speakerdeck.com/jscarto