Slide 1

Slide 1 text

Visualization
By Eitan Lees

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

“The ubiquity of visual metaphors in describing cognitive processes hints at a nexus of relationships between what we see and what we think” - Mackinlay & Card (1999)

Slide 4

Slide 4 text

External Cognition

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

Part 1: Data Wrangling
Part 2: Visual Encodings
Part 3: Graphical Critique
Part 4: Practical Advice

Slide 8

Slide 8 text

Part 1: Data Wrangling

Slide 9

Slide 9 text

Clean data sets are all alike; every unclean data set is unclean in its own way

Slide 10

Slide 10 text

Things to Consider:
- Make Numbers ⇒ Numbers
- Make Dates ⇒ Dates
- Make NaNs ⇒ NaNs
- Make sure strings aren’t corrupted
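The checklist above can be sketched in plain Python. This is a minimal illustration, not a general-purpose cleaner: the raw records, the set of "missing" markers, and the date format are all assumptions made for the example.

```python
import math
from datetime import date, datetime

# Hypothetical raw records, as they might arrive from a CSV file: all strings.
raw_rows = [
    {"date": "1999-01-15", "cases": "745", "country": " Afghanistan "},
    {"date": "2000-01-15", "cases": "N/A", "country": "Brazil"},
]

# Assumed spellings of "missing" for this example.
MISSING = {"", "NA", "N/A", "nan", "NaN", "null"}


def to_number(text: str) -> float:
    """Numbers => numbers; missing markers => NaN."""
    text = text.strip()
    if text in MISSING:
        return math.nan
    return float(text)


def to_date(text: str) -> date:
    """Dates => dates (YYYY-MM-DD is an assumed input format)."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


clean_rows = [
    {
        "date": to_date(row["date"]),
        "cases": to_number(row["cases"]),
        # Stray whitespace is one simple form of string corruption.
        "country": row["country"].strip(),
    }
    for row in raw_rows
]
```

Parsing each column explicitly means a malformed value raises an error at load time instead of surfacing later as a strange-looking plot.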

Slide 11

Slide 11 text

Before we visualize, let’s tidy up

Slide 12

Slide 12 text

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583

Slide 13

Slide 13 text

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583

Variables

Slide 14

Slide 14 text

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583

Variables
Observations

Slide 15

Slide 15 text

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

Slide 16

Slide 16 text

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

Variables

Slide 17

Slide 17 text

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

Variables
Observations

Slide 18

Slide 18 text

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

Slide 19

Slide 19 text

We want to gather the values corresponding to each key.

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583

Slide 20

Slide 20 text

We want to gather the values corresponding to each key.

country      year  key         value
Afghanistan  1999  cases       745
Afghanistan  1999  population  19987071
Afghanistan  2000  cases       2666
Afghanistan  2000  population  20595360
Brazil       1999  cases       37737
Brazil       1999  population  172006362
Brazil       2000  cases       80488
Brazil       2000  population  174504898
China        1999  cases       212258
China        1999  population  1272915272
China        2000  cases       213766
China        2000  population  1280428583

Tidy

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583
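The reshaping on this slide, collecting the value for each key into its own column, can be sketched in plain Python. The function name collect_by_key is hypothetical and only the first four rows of the table are used. In pandas, DataFrame.pivot performs a comparable reshaping.

```python
# Key/value rows copied from the table (first four rows only).
long_rows = [
    ("Afghanistan", 1999, "cases", 745),
    ("Afghanistan", 1999, "population", 19987071),
    ("Afghanistan", 2000, "cases", 2666),
    ("Afghanistan", 2000, "population", 20595360),
]


def collect_by_key(rows):
    """Build one record per (country, year), with one field per key."""
    records = {}
    for country, year, key, value in rows:
        # Create the record the first time this observation is seen.
        record = records.setdefault(
            (country, year), {"country": country, "year": year}
        )
        record[key] = value
    return list(records.values())


tidy_rows = collect_by_key(long_rows)
for row in tidy_rows:
    print(row)
```

Each resulting record is one observation (a country in a year), and each field is one variable.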

Slide 21

Slide 21 text

country      1999    2000
Afghanistan  745     2666
Brazil       37737   80488
China        212258  213766

Slide 22

Slide 22 text

We want to spread the values to the corresponding keys.

country      1999    2000
Afghanistan  745     2666
Brazil       37737   80488
China        212258  213766

Slide 23

Slide 23 text

We want to spread the values to the corresponding keys.

country      1999    2000
Afghanistan  745     2666
Brazil       37737   80488
China        212258  213766

country      year  cases
Afghanistan  1999  745
Afghanistan  2000  2666
Brazil       1999  37737
Brazil       2000  80488
China        1999  212258
China        2000  213766

Slide 24

Slide 24 text

We want to spread the values to the corresponding keys.

country      1999    2000
Afghanistan  745     2666
Brazil       37737   80488
China        212258  213766

Tidy

country      year  cases
Afghanistan  1999  745
Afghanistan  2000  2666
Brazil       1999  37737
Brazil       2000  80488
China        1999  212258
China        2000  213766
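Going from the wide table (one column per year) to the tidy one can be sketched in plain Python as follows. The function name years_to_rows is hypothetical, and the sketch assumes every column other than the identifier holds a year. In pandas, DataFrame.melt performs a comparable reshaping.

```python
# Wide rows copied from the table: one column per year.
wide_rows = [
    {"country": "Afghanistan", "1999": 745, "2000": 2666},
    {"country": "Brazil", "1999": 37737, "2000": 80488},
    {"country": "China", "1999": 212258, "2000": 213766},
]


def years_to_rows(rows, id_field="country", value_name="cases"):
    """Emit one (country, year, cases) record per year column."""
    out = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every output row
            out.append({id_field: row[id_field], "year": int(column), value_name: value})
    return out


tidy_cases = years_to_rows(wide_rows)
for row in tidy_cases:
    print(row)
```

The column headers 1999 and 2000 were values of a variable (year), so they move into the rows where they belong.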

Slide 25

Slide 25 text

Tidy Data

country      year  cases   population
Afghanistan  1999  745     19987071
Afghanistan  2000  2666    20595360
Brazil       1999  37737   172006362
Brazil       2000  80488   174504898
China        1999  212258  1272915272
China        2000  213766  1280428583

Variables (columns), Observations (rows), Values (cells)

Slide 26

Slide 26 text

Part 2: Visual Encodings

Slide 27

Slide 27 text

Compare area of circles

Slide 28

Slide 28 text

Compare length of bars

Slide 29

Slide 29 text

Compare length of bars

Slide 30

Slide 30 text

Compare area of circles

Slide 31

Slide 31 text

Visual Encoding Channels

- Length
- Area
- Slope
- Position
- Angle
- Volume
- Color Value
- Color Hue
- Shape

And many more ...

Slide 32

Slide 32 text

Accuracy ranking of quantitative perceptual tasks (Better to Worse).

Channels shown: Length, Area, Slope, Position, Angle, Volume, Color Value, Color Hue, Shape

Slide 33

Slide 33 text

Data Models

Slide 34

Slide 34 text

Nominal:
- Labels and Categories
- Example: Pill Shape
- Operations: =, ≠

Slide 35

Slide 35 text

Nominal:
- Labels and Categories
- Example: Pill Shape
- Operations: =, ≠

Ordinal:
- Ordered Sets
- Example: Drug Schedule
- Operations: =, ≠, <, >

Slide 36

Slide 36 text

Nominal:
- Labels and Categories
- Example: Pill Shape
- Operations: =, ≠

Ordinal:
- Ordered Sets
- Example: Drug Schedule
- Operations: =, ≠, <, >

Quantitative:
- Numerical Measurement
- Example: Dosage
- Operations: =, ≠, <, >, -, %
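The three data models can be illustrated with Python types that permit roughly the operations listed for each. This is a sketch under assumptions: the category members and the dosage numbers are made up for the example.

```python
from enum import Enum, IntEnum


class PillShape(Enum):
    """Nominal: members can only be tested for equality."""
    ROUND = "round"
    OVAL = "oval"
    CAPSULE = "capsule"


class DrugSchedule(IntEnum):
    """Ordinal: IntEnum is used here only for its ordering."""
    I = 1
    II = 2
    III = 3
    IV = 4
    V = 5


# Nominal supports = and ≠ only.
same_shape = PillShape.ROUND == PillShape.OVAL

# Ordinal adds < and >.
ranks_before = DrugSchedule.II < DrugSchedule.IV

# Quantitative adds arithmetic, such as differences and ratios.
dose_a_mg, dose_b_mg = 500.0, 200.0
dose_difference = dose_a_mg - dose_b_mg
dose_ratio = dose_a_mg / dose_b_mg
```

A plain Enum raises TypeError on `<`, which matches the idea that ordering is not defined for nominal data.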

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape

Slide 40

Slide 40 text

2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape

Suitable for Ordered Data
Suitable for Unordered Data

Slide 41

Slide 41 text

2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape

Suitable for Ordered Data: Position, Area, Color Value

Slide 42

Slide 42 text

2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape

Suitable for Unordered Data: Angle, Color Hue, Shape

Slide 43

Slide 43 text

Bertin’s Levels of Organization

Channel      Levels
Position     N O Q
Size         N O Q
Color Value  N O Q
Texture      N O
Color Hue    N
Angle        N
Shape        N

N = Nominal, O = Ordinal, Q = Quantitative
Note: Q⊂O⊂N
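The table of levels can be treated as a lookup when choosing a channel for a field. The sketch below transcribes the table into a dictionary. The helper name channels_for is hypothetical.

```python
# Levels supported by each channel, transcribed from the table above.
CHANNEL_LEVELS = {
    "position": {"N", "O", "Q"},
    "size": {"N", "O", "Q"},
    "color value": {"N", "O", "Q"},
    "texture": {"N", "O"},
    "color hue": {"N"},
    "angle": {"N"},
    "shape": {"N"},
}


def channels_for(level):
    """Return the channels that support a level: 'N', 'O', or 'Q'."""
    return sorted(
        name for name, levels in CHANNEL_LEVELS.items() if level in levels
    )


print(channels_for("Q"))  # channels usable for quantitative fields
print(channels_for("N"))  # every channel can carry nominal data
```

Because Q⊂O⊂N, the list of usable channels shrinks as the data model moves from nominal to quantitative.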

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Grammar of Graphics

1. Data
2. Transformations
3. Marks
4. Encoding - mapping from fields to mark properties
5. Scale - functions that map data to visual scales
6. Guides - visualizations of scales (axes, legends, etc.)
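To make the six components concrete, here is a toy text-mode bar chart in plain Python. It is a sketch only: the spec dictionary is a hypothetical format rather than any real library's, and the data are the year-2000 rows from the example table earlier in the deck.

```python
# 1. Data: year-2000 rows from the example table.
data = [
    {"country": "Afghanistan", "cases": 2666, "population": 20595360},
    {"country": "Brazil", "cases": 80488, "population": 174504898},
    {"country": "China", "cases": 213766, "population": 1280428583},
]

# 2. Transformation: derive cases per 100,000 people.
for row in data:
    row["rate"] = row["cases"] / row["population"] * 100_000

# 3. Mark, 4. Encoding, 6. Guide: a hypothetical spec format.
spec = {
    "mark": "#",
    "encoding": {"label": "country", "length": "rate"},
    "guide": "cases per 100,000 people (longest bar = maximum)",
}


# 5. Scale: a function from the data domain to a visual range.
def linear_scale(domain_max, range_max):
    """Map values in [0, domain_max] to bar lengths in [0, range_max]."""
    def scale(value):
        return round(value / domain_max * range_max)
    return scale


def render(rows, spec, scale):
    """Draw one mark per row, then the guide as a caption line."""
    label = spec["encoding"]["label"]
    length = spec["encoding"]["length"]
    lines = [f"{row[label]:<12}{spec['mark'] * scale(row[length])}" for row in rows]
    lines.append(spec["guide"])
    return "\n".join(lines)


scale = linear_scale(max(row["rate"] for row in data), 40)
chart = render(data, spec, scale)
print(chart)
```

Swapping the mark, the encoded field, or the scale changes the chart without touching the data, which is the separation the grammar is meant to provide.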

Slide 46

Slide 46 text

Building Blocks of Visualization

Slide 47

Slide 47 text

Part 3: Graphical Critique

Slide 48

Slide 48 text

William Playfair

Most modern statistical graphics can be traced back to William Playfair, a Scottish engineer and political economist.

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

No content

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

Measles cases per 100,000 people

Slide 53

Slide 53 text

Connectivity diagram of characters in Les Misérables

Slide 54

Slide 54 text

Data Visualization is Everywhere!

Slide 55

Slide 55 text

Edward Tufte

Slide 56

Slide 56 text

Edward Tufte

“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.”
- Edward R. Tufte, The Visual Display of Quantitative Information

Slide 58

Slide 58 text

Edward Tufte

“Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space.”
- Edward R. Tufte, The Visual Display of Quantitative Information

(within reason!)

Slide 59

Slide 59 text

Data to Ink Ratio

Slide 60

Slide 60 text

Data to Ink Ratio

Slide 61

Slide 61 text

Data to Ink Ratio

Reasonable?

Slide 62

Slide 62 text

Data to Ink Ratio

Tukey box plot, labeled: max, 75%, 50%, 25%, min
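The five values labeled on this plot (min, 25%, 50%, 75%, max) can be computed with Python's standard library. This is a minimal sketch: the input reuses the six case counts from the example table, and the "inclusive" quantile method is one of several conventions, chosen here as an assumption.

```python
import statistics


def five_number_summary(values):
    """Return the min, quartiles, and max that the box plot labels."""
    # n=4 yields the three quartile cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }


cases = [745, 2666, 37737, 80488, 212258, 213766]
summary = five_number_summary(cases)
for label, value in summary.items():
    print(f"{label:>4}: {value}")
```

Every box plot variant on the following slides encodes these same five numbers. What changes is how much ink is spent drawing them.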

Slide 63

Slide 63 text

Data to Ink Ratio

Tukey

Slide 64

Slide 64 text

Data to Ink Ratio

Tukey, Tufte #1

Slide 65

Slide 65 text

Data to Ink Ratio

Tukey, Tufte #1, Tufte #2

Slide 66

Slide 66 text

Data to Ink Ratio

Tukey, Tufte #1, Tufte #2

Unreasonable?

Slide 67

Slide 67 text

Sparklines

“A sparkline is a small intense, simple, word-sized graphic with typographic resolution … ”
- Edward Tufte, Beautiful Evidence, pp. 46-63.
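A word-sized graphic can be approximated in plain text with Unicode block characters. The sketch below is one possible rendering and not Tufte's own method. The function name sparkline and the sample series are assumptions for illustration.

```python
# Eight block characters of increasing height.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block whose height reflects its rank in the range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero for a constant series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


# Hypothetical monthly readings, embedded in a sentence like a word.
readings = [3, 4, 6, 9, 7, 5, 2, 4, 8, 10, 9, 6]
print(f"Readings this year {sparkline(readings)} peaked in October.")
```

Because the result is an ordinary string, it can sit inline in a sentence or a table cell, which is the "word-sized" property the quote describes.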

Slide 68

Slide 68 text

Small Multiples

“At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives.”
- Edward Tufte, Envisioning Information, p. 67

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

Yeah, well, that's just, like, your opinion, man.

Slide 72

Slide 72 text

Part 4: Practical Advice

Slide 73

Slide 73 text

Ten Simple Rules for Better Figures
By Nicolas P. Rougier

1. Know your audience
2. Identify your message
3. Adapt the figure to the support medium
4. Captions are not optional
5. Do not trust the defaults
6. Use color effectively
7. Do not mislead the reader
8. Avoid “Chartjunk”
9. Message trumps beauty
10. Get the right tool

Slide 78

Slide 78 text

No content