Visualization

Eitan Lees
November 15, 2019

A talk about the theory of visualization

Transcript

  1. Visualization, by Eitan Lees

  2. None
  3. “The ubiquity of visual metaphors in describing cognitive processes hints
    at a nexus of relationships between what we see and what we think”
    - Mackinlay & Card (1999)
  4. External Cognition

  5. None
  6. None
  7. Part 1: Data Wrangling
    Part 2: Visual Encodings
    Part 3: Graphical Critique
    Part 4: Practical Advice
  8. Part 1: Data Wrangling

  9. Clean data sets are all alike; every unclean data set is unclean in its own way
  10. Things to Consider:
    - Make Numbers ⇒ Numbers
    - Make Dates ⇒ Dates
    - Make NaNs ⇒ NaNs
    - Make sure strings aren’t corrupted
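
The checklist above can be illustrated with a small standard-library sketch. Everything in it is an assumption made for illustration: the sample records, the set of strings treated as "missing", and the helper names `to_number` and `to_date` are hypothetical and do not come from the talk.

```python
import math
from datetime import date, datetime

# Hypothetical raw records, as they might arrive from a CSV export:
# every field is still a string.
raw_rows = [
    {"day": "2019-11-15", "cases": "745"},
    {"day": "2019-11-16", "cases": "N/A"},
    {"day": "2019-11-17", "cases": "2,666"},
]

# Assumed set of placeholder strings that stand for "missing".
MISSING = {"", "NA", "N/A", "null", "-"}


def to_number(text):
    """Parse a numeric string; return NaN for known missing markers."""
    text = text.strip()
    if text in MISSING:
        return math.nan
    return float(text.replace(",", ""))  # drop thousands separators


def to_date(text):
    """Parse a YYYY-MM-DD string into a date object."""
    return datetime.strptime(text.strip(), "%Y-%m-%d").date()


clean_rows = [
    {"day": to_date(row["day"]), "cases": to_number(row["cases"])}
    for row in raw_rows
]
```

After this step every `cases` value is a float (with NaN marking the gap) and every `day` is a real date, so later sorting, arithmetic, and plotting behave as expected.
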
  11. Before we visualize, let’s tidy up

  12. country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583
  13. country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583

      Overlay label: Variables
  14. country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583

      Overlay labels: Variables, Observations
  15. country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583
  16. country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583

      Overlay label: Variables
  17. country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583

      Overlay labels: Variables, Observations
  18. country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583
  19. country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583

      country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583

      We want to gather the values corresponding to each key.
  20. country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583

      country      year  key         value
      Afghanistan  1999  cases       745
      Afghanistan  1999  population  19987071
      Afghanistan  2000  cases       2666
      Afghanistan  2000  population  20595360
      Brazil       1999  cases       37737
      Brazil       1999  population  172006362
      Brazil       2000  cases       80488
      Brazil       2000  population  174504898
      China        1999  cases       212258
      China        1999  population  1272915272
      China        2000  cases       213766
      China        2000  population  1280428583

      Overlay label: Tidy
      We want to gather the values corresponding to each key.
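
The reshaping on this slide, from the key/value table to the table with one column per variable, can be sketched in plain Python. This is a minimal illustration, not the speaker's code: the function name `keys_to_columns` and the list-of-dicts representation are assumptions, and only the Afghanistan rows are included to keep it short.

```python
# Key/value rows, copied from the first four rows of the key/value table.
long_rows = [
    {"country": "Afghanistan", "year": 1999, "key": "cases", "value": 745},
    {"country": "Afghanistan", "year": 1999, "key": "population", "value": 19987071},
    {"country": "Afghanistan", "year": 2000, "key": "cases", "value": 2666},
    {"country": "Afghanistan", "year": 2000, "key": "population", "value": 20595360},
]


def keys_to_columns(rows, id_fields=("country", "year")):
    """Turn each distinct `key` into its own column, one row per id combination."""
    tidy = {}  # maps (country, year) -> output row; dicts keep insertion order
    for row in rows:
        ident = tuple(row[field] for field in id_fields)
        out = tidy.setdefault(ident, dict(zip(id_fields, ident)))
        out[row["key"]] = row["value"]
    return list(tidy.values())


tidy_rows = keys_to_columns(long_rows)
# tidy_rows now holds one dict per (country, year) with "cases" and "population" fields.
```

In pandas, `DataFrame.pivot` performs a comparable reshape on a data frame.
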
  21. country      1999    2000
      Afghanistan  745     2666
      Brazil       37737   80488
      China        212258  213766
  22. country      1999    2000
      Afghanistan  745     2666
      Brazil       37737   80488
      China        212258  213766

      We want to spread the values to the corresponding keys.
  23. country      year  cases
      Afghanistan  1999  745
      Afghanistan  2000  2666
      Brazil       1999  37737
      Brazil       2000  80488
      China        1999  212258
      China        2000  213766

      country      1999    2000
      Afghanistan  745     2666
      Brazil       37737   80488
      China        212258  213766

      We want to spread the values to the corresponding keys.
  24. country      year  cases
      Afghanistan  1999  745
      Afghanistan  2000  2666
      Brazil       1999  37737
      Brazil       2000  80488
      China        1999  212258
      China        2000  213766

      country      1999    2000
      Afghanistan  745     2666
      Brazil       37737   80488
      China        212258  213766

      Overlay label: Tidy
      We want to spread the values to the corresponding keys.
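
Going from the wide table (one column per year) to the three-column table can be sketched the same way. Again this is only an illustration under stated assumptions: the helper name `columns_to_rows` and its parameters are hypothetical, and the year column headers are assumed to be strings that parse as integers.

```python
# The wide table from the slide: one column per year.
wide_rows = [
    {"country": "Afghanistan", "1999": 745, "2000": 2666},
    {"country": "Brazil", "1999": 37737, "2000": 80488},
    {"country": "China", "1999": 212258, "2000": 213766},
]


def columns_to_rows(rows, id_field="country", key_name="year", value_name="cases"):
    """Emit one output row per (id, column) pair of the wide table."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is copied into every output row instead
            tidy.append(
                {id_field: row[id_field], key_name: int(column), value_name: value}
            )
    return tidy


tidy_cases = columns_to_rows(wide_rows)
# tidy_cases has six rows, one per country and year.
```

In pandas, `DataFrame.melt` performs a comparable reshape on a data frame.
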
  25. Tidy Data

      country      year  cases   population
      Afghanistan  1999  745     19987071
      Afghanistan  2000  2666    20595360
      Brazil       1999  37737   172006362
      Brazil       2000  80488   174504898
      China        1999  212258  1272915272
      China        2000  213766  1280428583

      Overlay labels: Variables, Observations, Values
  26. Part 2: Visual Encoding

  27. Compare area of circles

  28. Compare length of bars

  29. Compare length of bars

  30. Compare area of circles
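
One piece of arithmetic behind the circle comparison: if a value is encoded as a circle's area, the radius grows only with the square root of the value. The sketch below assumes area encoding (the transcript does not say how the circles on the slides were sized), and the helper name `radius_for_area` is hypothetical.

```python
import math


def radius_for_area(value, area_per_unit=1.0):
    """Radius of a circle whose area equals value * area_per_unit."""
    return math.sqrt(value * area_per_unit / math.pi)


r_small = radius_for_area(10)
r_large = radius_for_area(40)

# A value four times larger only doubles the radius (and the diameter),
# whereas a bar encoding the same values would be four times longer.
radius_ratio = r_large / r_small
```
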

  31. Visual Encoding Channels: Length, Area, Slope, Position, Angle, Volume,
    Color Value, Color Hue, Shape, and many more ...
  32. Accuracy ranking of quantitative perceptual tasks (axis: Better to Worse).
    Channels, as listed in the transcript: Length, Area, Slope, Position, Angle,
    Volume, Color Value, Color Hue, Shape
  33. Data Models

  34. Nominal:
    - Labels and Categories
    - Example: Pill Shape
    - Operations: =, ≠
  35. Nominal:
    - Labels and Categories
    - Example: Pill Shape
    - Operations: =, ≠
    Ordinal:
    - Ordered Sets
    - Example: Drug Schedule
    - Operations: =, ≠, <, >
  36. Nominal:
    - Labels and Categories
    - Example: Pill Shape
    - Operations: =, ≠
    Ordinal:
    - Ordered Sets
    - Example: Drug Schedule
    - Operations: =, ≠, <, >
    Quantitative:
    - Numerical Measurement
    - Example: Dosage
    - Operations: =, ≠, <, >, -, %
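
The three data types and their permitted operations can be mirrored in Python types. This is a rough sketch: the enum members and dosage numbers are made up for illustration, and the slide's "-, %" is read here as "differences and ratios", which is an interpretation.

```python
from enum import Enum
from functools import total_ordering


class PillShape(Enum):
    """Nominal: labels only, so equality is the only meaningful comparison."""
    ROUND = "round"
    OVAL = "oval"


@total_ordering
class DrugSchedule(Enum):
    """Ordinal: categories have a rank, so < and > are meaningful as well."""
    SCHEDULE_I = 1
    SCHEDULE_II = 2
    SCHEDULE_III = 3

    def __lt__(self, other):
        if isinstance(other, DrugSchedule):
            return self.value < other.value
        return NotImplemented


# Nominal: only = and ≠ are defined; PillShape.ROUND < PillShape.OVAL raises TypeError.
same_shape = PillShape.ROUND == PillShape.OVAL

# Ordinal: ordering works, but subtracting two schedules is not defined.
lower_rank = DrugSchedule.SCHEDULE_I < DrugSchedule.SCHEDULE_III

# Quantitative: plain numbers, so differences and ratios are meaningful too.
dose_a_mg, dose_b_mg = 500.0, 250.0
difference_mg = dose_a_mg - dose_b_mg
ratio = dose_a_mg / dose_b_mg
```
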
  37. None
  38. None
  39. 2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape

  40. 2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape
    Labels: Suitable for Ordered Data, Suitable for Unordered Data

  41. 2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape
    Suitable for Ordered Data: Position, Area, Color Value

  42. 2D Plane, Size, Color Value, Texture, Color Hue, Angle, Shape
    Suitable for Unordered Data: Angle, Color Hue, Shape
  43. Bertin’s Levels of Organization

      Position     N O Q
      Size         N O Q
      Color Value  N O Q
      Texture      N O
      Color Hue    N
      Angle        N
      Shape        N

      N = Nominal, O = Ordinal, Q = Quantitative
      Note: Q⊂O⊂N
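
The table of levels lends itself to a simple lookup. The sketch below transcribes the slide's rows into a dictionary; the names `LEVELS` and `channels_for` are hypothetical helpers, not part of any library.

```python
# Levels each visual variable can encode, transcribed from the slide
# (N = nominal, O = ordinal, Q = quantitative).
LEVELS = {
    "Position": "NOQ",
    "Size": "NOQ",
    "Color Value": "NOQ",
    "Texture": "NO",
    "Color Hue": "N",
    "Angle": "N",
    "Shape": "N",
}


def channels_for(level):
    """Return the visual variables the table lists for a data level (N, O, or Q)."""
    return [name for name, levels in LEVELS.items() if level in levels]


quantitative_channels = channels_for("Q")
ordinal_channels = channels_for("O")
nominal_channels = channels_for("N")
```

Because every channel that handles Q also handles O and N in this table, the quantitative list is contained in the ordinal list, which in turn is contained in the nominal list, matching the slide's note.
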
  44. None
  45. Grammar of Graphics
    1. Data
    2. Transformations
    3. Marks
    4. Encoding - mapping from fields to mark properties
    5. Scale - functions that map data to visual scales
    6. Guides - visualizations of scales (axes, legends, etc.)
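
The six components can be made concrete with a toy text renderer. This is a sketch under stated assumptions, not a real plotting library: the `spec` dictionary layout, the `linear_scale` and `render` helpers, and the chosen domain of 0 to 100000 cases are all invented for illustration.

```python
# 1. Data: a few rows in tidy form.
data = [
    {"country": "Afghanistan", "year": 1999, "cases": 745},
    {"country": "Afghanistan", "year": 2000, "cases": 2666},
    {"country": "Brazil", "year": 1999, "cases": 37737},
    {"country": "Brazil", "year": 2000, "cases": 80488},
]

# 2. Transformation: keep a single year.
rows = [r for r in data if r["year"] == 2000]

# 3. Mark and 4. Encoding: one bar per row, fields mapped to mark properties.
spec = {"mark": "bar", "encoding": {"y": "country", "width": "cases"}}


# 5. Scale: a function from the data domain to a visual range (here, characters).
def linear_scale(domain, visual_range):
    (d0, d1), (r0, r1) = domain, visual_range
    return lambda v: r0 + (v - d0) / (d1 - d0) * (r1 - r0)


width = linear_scale(domain=(0, 100_000), visual_range=(0, 50))


def render(rows, spec, scale):
    """Draw one text bar per row according to the encoding in `spec`."""
    enc = spec["encoding"]
    lines = []
    for r in rows:
        bar = "#" * round(scale(r[enc["width"]]))
        lines.append(f"{r[enc['y']]:<12}|{bar}")
    # 6. Guide: a minimal axis showing what the scale's end points mean.
    lines.append(f"{'':<12}|0{'100000':>49}")
    return "\n".join(lines)


chart = render(rows, spec, width)
print(chart)
```

Changing only the encoding or the scale changes the picture without touching the data, which is the separation the list describes.
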
  46. Building Blocks of Visualization

  47. Part 3: Graphical Critique

  48. Most of modern statistical graphics can be traced back to William Playfair,
    a Scottish engineer and political economist.
    William Playfair
  49. None
  50. None
  51. None
  52. Measles cases per 100,000 people

  53. Connectivity diagram of characters in Les Misérables

  54. Data Visualization is Everywhere!

  55. Edward Tufte

  56. Edward Tufte
    “Graphical excellence is that which gives to the viewer the greatest number
    of ideas in the shortest time with the least ink in the smallest space.”
    ― Edward R. Tufte, The Visual Display of Quantitative Information
  58. Edward Tufte
    “Graphical excellence is that which gives to the viewer the greatest number
    of ideas in the shortest time with the least ink in the smallest space.”
    ― Edward R. Tufte, The Visual Display of Quantitative Information
    (within reason!)
  59. Data to Ink Ratio

  60. Data to Ink Ratio

  61. Data to Ink Ratio Reasonable?

  62. Data to Ink Ratio: Tukey (labels: 75%, 50%, 25%, min, max)

  63. Data to Ink Ratio: Tukey

  64. Data to Ink Ratio: Tukey, Tufte #1

  65. Data to Ink Ratio: Tukey, Tufte #1, Tufte #2

  66. Data to Ink Ratio: Tukey, Tufte #1, Tufte #2. Unreasonable?
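
The labels on the Tukey slide (min, 25%, 50%, 75%, max) are the five numbers a box plot summarizes; that the slide shows a box plot is an inference from those labels. A minimal sketch with the standard library follows. The function name `five_number_summary` is hypothetical, and the "inclusive" quartile method is one of several conventions, so other tools may give slightly different quartiles.

```python
import statistics


def five_number_summary(values):
    """Return (min, 25%, 50%, 75%, max) for a sequence of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
# summary == (1, 3.0, 5.0, 7.0, 9)
```

Whatever drawing style is chosen, these are the values the picture has to convey.
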

  67. Sparklines
    “A sparkline is a small intense, simple, word-sized graphic with typographic resolution … ”
    - Edward Tufte, Beautiful Evidence, p. 46-63.
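
A word-sized graphic can be approximated in plain text with Unicode block characters. This is a common text-only stand-in rather than Tufte's own line-based design, and the helper name `sparkline` is hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight bar heights, lowest to highest


def sparkline(values):
    """Map each value to one of eight block characters by relative height."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for a constant series
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


line = sparkline([1, 2, 3, 4, 5, 6, 7, 8])
# line == "▁▂▃▄▅▆▇█"
```

The result is a short string that can sit inline in a sentence or a table cell.
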
  68. Small Multiples
    “At the heart of quantitative reasoning is a single question: Compared to what?
    Small multiple designs answer directly by visually enforcing comparisons of changes,
    of the differences among objects, of the scope of alternatives.”
    - Edward Tufte, Envisioning Information, p. 67
  69. None
  70. None
  71. Yeah, well, that's just, like, your opinion, man.

  72. Part 4: Practical Advice

  73. Ten Simple Rules for Better Figures
    By Nicolas P. Rougier
    1. Know your audience
    2. Identify your message
    3. Adapt the figure to the support medium
    4. Captions are not optional
    5. Do not trust the defaults
    6. Use color effectively
    7. Do not mislead the reader
    8. Avoid “Chartjunk”
    9. Message trumps beauty
    10. Get the right tool
  78. None