Slide 1

Slide 1 text

Data Visualization Dr Georges Hattab Junior group leader and Head of the Bioinformatics division Department of Mathematics & Computer Science

Slide 2

Slide 2 text

Introduction Data visualization is increasingly important, but it requires clear objectives and improved implementation 2

Slide 3

Slide 3 text

Why data visualization?
• Unprecedented amount of data: 2.5 quintillion bytes/day
• Efficient exploration
• Effective communication
• Integral aspects of scientific communication
• Challenge: benefit from the data without being overwhelmed
• 65% of the population are visual learners
• 90% of the information transmitted to the brain is visual
• 70% of sensory receptors are in the eyes
• 50% of the brain is dedicated to visual processing

Slide 4

Slide 4 text

Problem
• The value and utility of this popular form of communication remain unclear, although there is a growing appetite for the visual display of information
• Clear objectives are needed to drive design decisions
• Assess the utility and practicality of visualizations
• What do researchers want and need to see in the data?
• Which computational approaches and visual encodings will best bring out the trends?
Continual refinement of design decisions to meet research objectives

Slide 5

Slide 5 text

Task specificity Miele et al. 2019. Nine quick tips for analyzing network data. https://doi.org/10.1371/journal.pcbi.1007434 5

Slide 6

Slide 6 text

Difficulty of validation • Solution: Use methods from different fields at each level 6 Nested Model of Visualization Design and Validation. Tamara Munzner. IEEE TVCG 2009.

Slide 7

Slide 7 text

Textbook references • Visualization Analysis and Design. Tamara Munzner • Points of View article series. Nature Methods. • Data visualization handbook. Juuso Koponen and Jonatan Hildén 7

Slide 8

Slide 8 text

Examples 8

Slide 9

Slide 9 text

Examples 9 Original image from REUTERS/Simon Scarr, Marco Hernandez.

Slide 10

Slide 10 text

Examples 10

Slide 11

Slide 11 text

Examples 11 Wong, B. Visualizing biological data. Nat Methods 9, 1131 (2012) doi:10.1038/nmeth.2258

Slide 12

Slide 12 text

Pointers before design • The story: Which story do you want to tell and how do you want to tell it? • The overview figure: How do you clarify concepts and quickly understand the overall idea? 12 Original image from REUTERS/Simon Scarr, Marco Hernandez.

Slide 13

Slide 13 text

The story 13 Separation: A hero ventures forth from the world of common day into a region of supernatural wonder. Initiation: Fabulous forces are there encountered and a decisive victory is won. Return: The hero comes back from this mysterious adventure with the power to bestow boons on his fellow man. Info we trust, RJ Andrews. Joseph Campbell, 1949

Slide 14

Slide 14 text

The story 14 Charles Minard. 1869

Slide 15

Slide 15 text

The story 15 John Snow, 1854

Slide 16

Slide 16 text

The story 16 Randy Olson. 2016 https://graphics.wsj.com/infectious-diseases-and-vaccines/ https://community.jmp.com/t5/JMP-Blog/Graph-makeover-Measles-heat-map/ba-p/30550 https://www.visualisingdata.com/2015/02/visualisation-data-like/ https://blogs.sas.com/content/sastraining/2015/02/17/how-to-make-infectious-diseases-look-better/

Slide 17

Slide 17 text

The story 17

Slide 18

Slide 18 text

The story… with a UI 18

Slide 19

Slide 19 text

The overview figure
• Portrays discrete yet connected steps or states
• Accounts for all graphical elements to follow change
• Relationships and hierarchy
• Compact and economical for public understanding

Slide 20

Slide 20 text

The overview figure 20

Slide 21

Slide 21 text

The counterexample of overview 21

Slide 22

Slide 22 text

The overview figure 22 Wong, B. Nat Methods 8, 365 (2011) doi:10.1038/nmeth0511-365. Lieberman-Aiden et al. Science 326, 289–293 (2009).

Slide 23

Slide 23 text

The overview figure 23

Slide 24

Slide 24 text

… or a combination 24

Slide 25

Slide 25 text

There’s more… 25 Courtesy of Christoph Niemann Courtesy of Michelle Rial

Slide 26

Slide 26 text

… or quite a bit serious
• 1 million seconds equal about 11 1/2 days.
• 1 billion seconds equal about 31 3/4 years.
• 1 trillion seconds equal about 31,710 years.
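The conversions on this slide can be checked with a few lines of arithmetic. The sketch below assumes a 365-day year with no leap days and the short-scale meanings of million, billion, and trillion (10^6, 10^9, 10^12); the function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
# Assumption: a 365-day year without leap days.
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a duration in seconds to 365-day years."""
    return seconds / SECONDS_PER_YEAR


for label, seconds in [("1 million", 10**6),
                       ("1 billion", 10**9),
                       ("1 trillion", 10**12)]:
    print(f"{label} seconds = {seconds_to_days(seconds):,.1f} days "
          f"= {seconds_to_years(seconds):,.2f} years")
```

Under these assumptions the slide's figures are rounded: one million seconds is a little over 11 1/2 days, and one billion seconds is slightly under 31 3/4 years.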

Slide 27

Slide 27 text

… or quite a bit serious 27

Slide 28

Slide 28 text

Data visualization • Enable people to explore and explain data through human visual abilities to recognize patterns • Data visualization transforms information into a visual form • This process requires skills from engineering, statistics, graphic design, and other disciplines 28 IBM Design

Slide 29

Slide 29 text

The process 29

Slide 30

Slide 30 text

Layout is next 30

Slide 31

Slide 31 text

Layout Salience to relevance, negative space, Gestalt principles, design process, elements of visual design, storytelling 31

Slide 32

Slide 32 text

Compositional aesthetics 32

Slide 33

Slide 33 text

Infallible proportions 33 Wong. Nat Methods 8, 783 (2011) doi:10.1038/nmeth.1711

Slide 34

Slide 34 text

… all around us 34

Slide 35

Slide 35 text

The grid system
• organizes content
• improves the design process
• organizes typography
• makes collaboration easy
• helps create balanced compositions
• is very flexible

Poster: The 892 unique ways to partition a 3 × 4 grid (Thomas Gaskin)

This poster illustrates a change in design practice. Computation-based design, that is, the use of algorithms to compute options, is becoming more practical and more common. Design tools are becoming more computation-based; designers are working more closely with programmers; and designers are taking up programming.

Above, you see the 892 unique ways to partition a 3 × 4 grid into unit rectangles. For many years, designers have used grids to unify diverse sets of content in books, magazines, screens, and other environments. The 3 × 4 grid is a common example. Yet even in this simple case, generating all the options has, until now, been almost impossible.

Patch Kessler designed algorithms to generate all the possible variations, identify unique ones, and sort them, not only for 3 × 4 grids but also for any n × m grid. He instantiated the algorithms in a MATLAB program, which output PDFs, which Thomas Gaskin imported into Adobe Illustrator to design the poster.

Rules for generating variations
The rule system that generated the variations in the poster was suggested by Bill Drenttel and Jessica Helfand, who noted its relationship to the tatami mat system used in Japanese buildings for 1300 years or more. In 2006, Drenttel and Helfand obtained U.S. Patent 7124360 on this grid system: “Method and system for computer screen layout based on recombinant geometric modular structure”. The tatami system uses 1 × 2 rectangles. Within a 3 × 4 grid, 1 × 2 rectangles can be arranged in 5 ways. They appear at the end of section 6. Unit rectangles (1 × 1, 1 × 2, 1 × 3, 1 × 4; 2 × 2, 2 × 3, 2 × 4; 3 × 3, 3 × 4) can be arranged in a 3 × 4 grid in 3,164 ways. Many are almost the same: mirrored or rotated versions of the same configuration.

The poster includes only unique variations, one version from each mirror or rotation group. Colors indicate the type and number of related non-unique variations. The variations shown in black have 3 related versions; blue, green, and orange have 1 related version; and magenta variations are unique, because mirroring and rotating yields the original, thus no other versions. (See the symmetry table for examples.)

Rules for sorting
The poster groups variations according to the number of non-overlapping rectangles. The large figures indicate the beginning of each group. The sequence begins in the upper left and proceeds from left to right and top to bottom. Each group is further divided into sub-groups sharing the same set of elements. The sub-groups are arranged according to the size of their largest element, from largest to smallest. Squares precede rectangles of the same area; horizontals precede verticals of the same dimensions. Within sub-groups, variations are arranged according to the position of the largest element, proceeding from left to right and top to bottom. Variations themselves are oriented so that the largest rectangle is in the top left. Black dots separate groups by size. Gray dots separate groups by orientation.

Where to learn more
Grids have been described in design literature for at least 50 years. French architect Le Corbusier describes grid systems in his 1946 book, Le Modulor. Swiss graphic designer Karl Gerstner describes a number of grid systems or “programmes” in his 1964 book, Designing Programmes. The classic work on grids for graphic designers is Josef Müller-Brockmann’s 1981 book, Grid Systems. Patch Kessler explores the mathematical underpinnings of grid generation in his paper “Arranging Rectangles”. www.mechanicaldust.com/Documents/Partitions_05.pdf
Thomas Gaskin has created an interactive tool for viewing variations and generating HTML. www.3x4grid.com

Symmetry table
• 26 × Magenta: All three symmetries combined. Unchanged by horizontal reflection, vertical reflection, or 180º rotation.
• 26 × Green: Rotational symmetry. Changed by horizontal and vertical reflection.
• 61 × Blue: Top-bottom symmetry. Changed by horizontal reflection and 180º rotation.
• 76 × Orange: Left-right symmetry. Changed by vertical reflection and 180º rotation.
• 703 × Black: Asymmetric. Changed by horizontal reflection, vertical reflection, and 180º rotation.

Unique variations by number of rectangles
1 rectangle: 1 · 2 rectangles: 3 · 3: 10 · 4: 33 · 5: 90 · 6: 175 · 7: 232 · 8: 201 · 9: 105 · 10: 35 · 11: 6 · 12: 1

Design: Thomas Gaskin. Creative Direction: Hugh Dubberly. Algorithms: Patrick Kessler. Patent: William Drenttel + Jessica Helfand. Copyright © 2011 Dubberly Design Office, 2501 Harrison Street, #7, San Francisco, CA 94110, 415 648 9799
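The poster's counts can be reproduced with a small brute-force search. The sketch below is not Patch Kessler's MATLAB program; it is a simple backtracking enumeration written for illustration, and the function names are hypothetical. It places a rectangle at the first empty cell in reading order, recurses, and then treats two partitions as the same when one is a horizontal reflection, vertical reflection, or 180º rotation of the other.

```python
from itertools import product


def rectangle_partitions(rows, cols):
    """Return every partition of a rows x cols grid into rectangles.

    Each partition is a frozenset of (row, col, height, width) tuples.
    """
    filled = [[False] * cols for _ in range(rows)]
    placed, found = [], []

    def fill(r, c, h, w, value):
        for rr in range(r, r + h):
            for cc in range(c, c + w):
                filled[rr][cc] = value

    def search():
        # The first empty cell in reading order must be the top-left
        # corner of the rectangle that covers it.
        empty = [(r, c) for r in range(rows) for c in range(cols)
                 if not filled[r][c]]
        if not empty:
            found.append(frozenset(placed))
            return
        r, c = empty[0]
        for h in range(1, rows - r + 1):
            for w in range(1, cols - c + 1):
                if any(filled[rr][cc]
                       for rr in range(r, r + h)
                       for cc in range(c, c + w)):
                    break  # wider rectangles of this height overlap too
                fill(r, c, h, w, True)
                placed.append((r, c, h, w))
                search()
                placed.pop()
                fill(r, c, h, w, False)

    search()
    return found


def canonical(partition, rows, cols):
    """Pick one representative among a partition and its mirror images."""
    variants = []
    for flip_rows, flip_cols in product((False, True), repeat=2):
        variants.append(tuple(sorted(
            (rows - r - h if flip_rows else r,
             cols - c - w if flip_cols else c,
             h, w)
            for r, c, h, w in partition)))
    return min(variants)


all_partitions = rectangle_partitions(3, 4)
unique_partitions = {canonical(p, 3, 4) for p in all_partitions}
print(len(all_partitions), "partitions,", len(unique_partitions), "unique")
```

Flipping both rows and columns is the 180º rotation, so the four flag combinations cover the identity and the three symmetries named on the poster. A 3 × 4 grid is not square, so 90º rotations are not considered.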

Slide 36

Slide 36 text

The grid system 36

Slide 37

Slide 37 text

The grid system 37

Slide 38

Slide 38 text

The grid system 38 Kharchenko, P., Alekseyenko, A., Schwartz, Y. et al. Nature 471, 480–485 (2011) doi:10.1038/nature09725

Slide 39

Slide 39 text

Retrospective

Slide 40

Slide 40 text

… or the journey of our eyes 40 Wong. Nat Methods 8, 783 (2011) doi:10.1038/nmeth.1711

Slide 41

Slide 41 text

… or the journey of our eyes 41 Wong. Nat Methods 8, 783 (2011) doi:10.1038/nmeth.1711

Slide 42

Slide 42 text

Salience to relevance Property that depends on the relationship of one object to other objects on a display 42

Slide 43

Slide 43 text

Salience • Salience is the physical property that sets an object apart from its surroundings • In visuals used for presentations, salience should align with relevance • Information encoding needs to be efficient because the audience is expected to listen and read simultaneously 43 Wong. Nat Methods 8, 889 (2011) doi:10.1038/nmeth.1762

Slide 44

Slide 44 text

Selective vision 44

Slide 45

Slide 45 text

Selective vision 45

Slide 46

Slide 46 text

Saliency map 46

Slide 47

Slide 47 text

Tips • create salience by using color, shape, size, and position • information that is presented physically larger is easier to see • elements at a diagonal stand out when all others are oriented vertically and horizontally • against a black-and-white backdrop of elements, colored information attracts attention • unintentionally assigned salience can seriously harm the communication of a clear message 47

Slide 48

Slide 48 text

Discordance salience/relevance Hello World! Wong. Nat Methods 8, 889 (2011) doi:10.1038/nmeth.1762 48

Slide 49

Slide 49 text

Negative space The whitespace or the unmarked areas of a page 49

Slide 50

Slide 50 text

Whitespace
Gaps between text blocks and images
The term stems from the printing practice in which white paper is generally used. Margins and gaps that separate blocks of text make it easier to access written material because they provide a visual structure. Well-planned negative space balances the positive (nonwhite) space and is key to aesthetics.
Image: Mori Kansai (1814-1894), Rabbits, 1881

Slide 51

Slide 51 text

Examples 51 Wong. Nat Methods 8, 5 (2011) doi:10.1038/nmeth0111-5

Slide 52

Slide 52 text

Congested environments 52 Siegenthaler et al. (2019) PLoS Biol 17(8): e3000400. https://doi.org/10.1371/journal.pbio.3000400

Slide 53

Slide 53 text

Congested environments 53 Kharchenko, P., Alekseyenko, A., Schwartz, Y. et al. Nature 471, 480–485 (2011) doi:10.1038/nature09725 Chromatin annotation of the Drosophila melanogaster genome

Slide 54

Slide 54 text

Example solution 54 Chromatin annotation of the Drosophila melanogaster genome Kharchenko, P., Alekseyenko, A., Schwartz, Y. et al. Nature 471, 480–485 (2011) doi:10.1038/nature09725

Slide 55

Slide 55 text

Gestalt principles Or how we organize visual information 55

Slide 56

Slide 56 text

Visual structure 56 A B C D Wong. Nat Methods 7, 863 (2010) doi:10.1038/nmeth1110-863

Slide 57

Slide 57 text

Similarity, proximity, grouping, etc 57 Wong. Nat Methods 7, 863 (2010) doi:10.1038/nmeth1110-863 c B F N U ABCDEFGH

Slide 58

Slide 58 text

The illusion of completion 58 Wong. Nat Methods 7, 941 (2010) doi:10.1038/nmeth1210-941

Slide 59

Slide 59 text

Unified compositions • Graphics and text used as vertices and edges of geometric shapes 
 • Geometric and curvilinear shapes used as flexible guides to align content. 59 Wong. Nat Methods 7, 941 (2010) doi:10.1038/nmeth1210-941

Slide 60

Slide 60 text

Patterns • Our eyes see patterns everywhere • We can tell when things look the same or different • When they are combined, we see something new 60

Slide 61

Slide 61 text

Summary We can see … 
 • patterns with things that have the same shape or color 
 • that things belong together when they are close to one another 
 61

Slide 62

Slide 62 text

Summary 62 We can see … 
 • that things belong together when they move the same way • that things belong together when they are close to one another 


Slide 63

Slide 63 text

Summary 63 We can see … 
 • and enjoy patterns that are neat and even • shapes even when part of the shape is missing

Slide 64

Slide 64 text

We can see … 
 • shapes where things aren’t! Summary 64 • when our eyes and brain work together, there is really no limit to what we can make!

Slide 65

Slide 65 text

Conclusion • From bits to larger units • Structure gives meaning • Perceptual organization based on principles • Principles: Similarity, proximity, connection, enclosure • Structure helps us draw correlations between visual elements 65 Hattab et al. Info+ conference. (2016)

Slide 66

Slide 66 text

Elements of visual design Translate the principles of effective writing to the process of figure design 66

Slide 67

Slide 67 text

Figure creation • apply principles of effective written communication • leverage our training and experience with words • make the process structured and reproducible • assess and optimize each part of a figure “Do not take shortcuts at the expense of clarity” (Strunk and White’s dictum) Krzywinski. Nat Methods 10, 371 (2013) doi:10.1038/nmeth.2444 67

Slide 68

Slide 68 text

Problematic constructs “Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo” Pinker, S. The Language Instinct. (1994) W. Morrow, New York 68 Krzywinski. Nat Methods 10, 371 (2013) doi:10.1038/nmeth.2444

Slide 69

Slide 69 text

Important tips • avoid overwriting because “rich ornate prose is hard to digest, generally unwholesome, and sometimes nauseating” • the visual equivalent is “chartjunk”, as coined by Tufte • visually garnished elements shout at the reader • if you would not write it this way, do not draw it this way • don’t overwhelm the reader • simple shapes provide an elegant representation • use color sparingly Krzywinski. Nat Methods 10, 371 (2013) doi:10.1038/nmeth.2444 69

Slide 70

Slide 70 text

Similarity for common meaning 70 Krzywinski. Nat Methods 10, 371 (2013) doi:10.1038/nmeth.2444

Slide 71

Slide 71 text

Storytelling Relate your data to the world by telling a story 71

Slide 72

Slide 72 text

Why stories? • capacity to delight and surprise • spark creativity by making meaningful connections between data and the ideas, interests and lives of the reader • stories may contain vexing questions, conflict, dead ends, insights and occasional thrilling leaps • When you see these indicators the story is well told 72

Slide 73

Slide 73 text

The story 73 Separation: A hero ventures forth from the world of common day into a region of supernatural wonder. Initiation: Fabulous forces are there encountered and a decisive victory is won. Return: The hero comes back from this mysterious adventure with the power to bestow boons on his fellow man. Info we trust, RJ Andrews. Joseph Campbell, 1949

Slide 74

Slide 74 text

Maintaining focus • Leave out detail that does not advance the plot • Distinguish necessary detail from superfluous detail • Do not show everything • Provide context and support for your story but stay on track • Make use of clever visual elements to help your readers • Consider: What would the headline of your story be? 74

Slide 75

Slide 75 text

Example 75 Krzywinski and Cairo. Nat Methods 10, 687 (2013) doi:10.1038/nmeth.2571

Slide 76

Slide 76 text

• The first two panels of the figure provide the background necessary for this plot twist to be appreciated 
 • The vertical scale is chosen to accentuate the similarity of the death rates for males due to cancer in aggregate and to lung cancer in panels 2 and 3. 
 • In short, it’s a good story!

Slide 77

Slide 77 text

Some more tips
Be sure to
• use multiple panels for the flow
• use colloquial language when addressing a large audience
• not use the complete data
• rely on a visual guideline to maintain focus
• use coherent visual elements
• avoid color unless it is necessary
Visually dull or accentuate
• axes and grids to maintain focus on data trends
• qualitative and quantitative aspects but always be accurate
• the context (e.g., panel 4 compares adult vs youth rates)
• style to meet the journal or publisher style requirements

Slide 78

Slide 78 text

Design process is next A good figure, like good writing, doesn't simply happen—it is crafted. “Revise and rewrite” becomes “revise and redraw”. 78 Search for and find the design Refine Iterate Enjoy life!

Slide 79

Slide 79 text

Design process Develop a visual literacy to construct representations that are appealing and convincing 79

Slide 80

Slide 80 text

Why the word Design? • Design is a requirement not a cosmetic addition • Design is all around us • Industrial design is for objects you use • Graphic design is for designs you read • Well designed objects and figures provide visible clues to their underlying function • Is interaction important? • How easy to use is the provided functionality? 80 Wong, B. The design process. Nat Methods 8, 987 (2011). https://doi.org/10.1038/nmeth.1783

Slide 81

Slide 81 text

The story 81 Charles Minard. 1869

Slide 82

Slide 82 text

Example overview figure 82 • Represents a catalog of gene expression data from human cells treated with chemical and genetic reagents • Accentuate the steps with a mountain between ‘sample preparation’ and ‘data analysis’ (placed at 8:13) • Differentiate steps with color and find the physical location in the institute where the work is carried out • high contrast headings for 4 major features of the poster

Slide 83

Slide 83 text

Motivation • Computer-based visualization systems provide visual representations of datasets designed to help people carry out tasks more effectively 
 • Visualization or VIS is suitable when there is a need to augment human capabilities rather than replace people with computational decision-making methods 83 [Marie Neurath. Too Small To See. 1956]

Slide 84

Slide 84 text

Nested model of visualization
• domain situation 
 - who are the target users?
• abstraction: translate from specifics of domain to vocabulary of vis 
 - what is shown? data abstraction 
 - why is the user looking at it? task abstraction
• idiom 
 - how is it shown? 
 + visual encoding idiom: how to draw 
 + interaction idiom: how to manipulate
• algorithm for efficient computation
[Figure: nested levels, from outermost to innermost: domain, abstraction, idiom, algorithm]
Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009)

Slide 85

Slide 85 text

Problem of validation
Why is validation difficult?
• different ways to get it wrong at each level
 - Domain situation: You misunderstood their needs
 - Data/task abstraction: You’re showing them the wrong thing
 - Visual encoding/interaction idiom: The way you show it doesn’t work
 - Algorithm: Your code is too slow
A Nested Model of Visualization Design and Validation. Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009)

Slide 86

Slide 86 text

Evaluation
• Methods from many fields, qualitative & quantitative 
 - Controlled experiments in lab, field studies of deployed systems
• Fields: anthropology/ethnography, design, computer science, cognitive psychology
• Domain situation: Observe target users using existing tools; Measure adoption
• Data/task abstraction: Observe target users after deployment
• Visual encoding/interaction idiom: Justify design with respect to alternatives; Analyze results qualitatively; Measure human time with lab experiment (lab study)
• Algorithm: Measure system time/memory; Analyze computational complexity
Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009)
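One way to keep the four levels, their typical failure, and their validation methods together is a small lookup structure. The sketch below is an illustration only: the class and function names are hypothetical, and the assignment of validation methods to levels is an assumed reading of the flattened figure on the slides, not a quotation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    threat: str          # how the design can go wrong at this level
    validation: tuple    # methods listed on the evaluation slide


# Ordered from the outermost level to the innermost one.
NESTED_MODEL = (
    Level("domain situation",
          "You misunderstood their needs",
          ("Observe target users using existing tools",
           "Measure adoption")),
    Level("data/task abstraction",
          "You're showing them the wrong thing",
          ("Observe target users after deployment",)),
    Level("visual encoding/interaction idiom",
          "The way you show it doesn't work",
          ("Justify design with respect to alternatives",
           "Analyze results qualitatively",
           "Measure human time with lab experiment")),
    Level("algorithm",
          "Your code is too slow",
          ("Measure system time/memory",
           "Analyze computational complexity")),
)


def inner_levels(level_name):
    """Names of the levels nested inside the given level."""
    names = [level.name for level in NESTED_MODEL]
    return names[names.index(level_name) + 1:]


for level in NESTED_MODEL:
    print(f"{level.name}: {level.threat}")
    for method in level.validation:
        print(f"  validate by: {method}")
```

Keeping the levels in an ordered tuple makes the nesting explicit: a choice made at one level constrains every level returned by `inner_levels`.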

Slide 87

Slide 87 text

Why represent all the data?
Summaries lose information, details matter
– confirm expected and find unexpected patterns
– assess validity of statistical model
Anscombe’s Quartet: four X/Y datasets with identical statistics
• x mean: 9
• x variance: 10
• y mean: 7.5
• y variance: 3.75
• x/y correlation: 0.816
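The identical summary statistics can be verified directly. The sketch below uses the values of Anscombe's published quartet, which are not reproduced on the slide, and it assumes the slide reports population variance (dividing by n), since that is the convention that yields 10 and 3.75. The helper names are illustrative.

```python
from math import sqrt
from statistics import mean, pvariance

# Anscombe's quartet; datasets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
            5.25, 12.50, 5.56, 7.91, 6.89]),
}


def correlation(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs)
                  * sum((y - my) ** 2 for y in ys))
    return covariance / spread


def summary(xs, ys):
    """Summary statistics as listed on the slide."""
    return {
        "x mean": mean(xs),
        "x variance": pvariance(xs),   # population variance (divide by n)
        "y mean": mean(ys),
        "y variance": pvariance(ys),
        "correlation": correlation(xs, ys),
    }


for name, (xs, ys) in QUARTET.items():
    stats = summary(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

All four datasets agree on these numbers to the precision shown on the slide, yet their scatter plots look entirely different, which is the argument for plotting the data rather than relying on summaries.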

Slide 88

Slide 88 text

Why analyze?
• imposes a structure on huge design space
– scaffold to help you think systematically about choices
– analyzing existing as stepping stone to designing new
[Figure: SpaceTree and TreeJuxtaposer compared by Why? (actions: present, locate, identify; target: path between two nodes), What? (tree), and How? (encode, navigate, select, filter, aggregate, arrange)]
Tamara Munzner. Visualization Analysis and Design

Slide 89

Slide 89 text

Examples of How? SpaceTree
SpaceTree: Supporting Exploration in Large Node Link Tree, Design Evolution and Empirical Evaluation. Grosjean, Plaisant, and Bederson. Proc. InfoVis 2002, p 57–64.
Tamara Munzner. Visualization Analysis and Design

Slide 90

Slide 90 text

Examples of How? TreeJuxtaposer
TreeJuxtaposer: Scalable Tree Comparison Using Focus+Context With Guaranteed Visibility. ACM Trans. on Graphics (Proc. SIGGRAPH) 22:453–462, 2003.
Tamara Munzner. Visualization Analysis and Design

Slide 91

Slide 91 text

What to analyze?
[Figure: the What? part of the Why? / What? / How? framework]
• Dataset types: tables (items as rows, attributes as columns, cell containing value), multidimensional tables, networks (nodes as items, links), trees, fields (continuous; grid of positions, value in cell), geometry (spatial; position), clusters, sets, lists
• Data types by dataset: tables (items, attributes); networks & trees (items (nodes), links, attributes); fields (grids, positions, attributes); geometry (items, positions); clusters, sets, lists (items)
• Attribute types: categorical; ordered (ordinal, quantitative)
• Ordering direction: sequential, diverging, cyclic
• Dataset availability: static, dynamic
Tamara Munzner. Visualization Analysis and Design

Slide 92

Slide 92 text

What to analyze?
[Figure: the What? part of the Why? / What? / How? framework, continued]
• Dataset types: tables, multidimensional tables, networks, trees, fields (continuous), geometry (spatial)
• Data types: items, attributes, links (between nodes), positions, grids
• Ordering direction: sequential, diverging, cyclic
• Dataset availability: static, dynamic
Tamara Munzner. Visualization Analysis and Design

Slide 93

Slide 93 text

Types: Datasets and data
[Figure: dataset types and attribute types]
• Dataset types: tables (items as rows, attributes as columns, cell containing value), multidimensional tables (value in cell), networks (nodes as items, links), trees, fields (continuous; grid of positions, value in cell), geometry (spatial; position)
• Attribute types: categorical; ordered (ordinal, quantitative)
• Ordering direction: sequential, diverging, cyclic
• Dataset availability: static, dynamic
Tamara Munzner. Visualization Analysis and Design
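The attribute taxonomy on this slide maps naturally onto a small type definition. The sketch below is one possible encoding, not part of the original material: the class names, the example columns, and the rule that a categorical attribute carries no ordering direction are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AttributeType(Enum):
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"
    QUANTITATIVE = "quantitative"


class OrderingDirection(Enum):
    SEQUENTIAL = "sequential"
    DIVERGING = "diverging"
    CYCLIC = "cyclic"


@dataclass(frozen=True)
class Attribute:
    name: str
    kind: AttributeType
    direction: Optional[OrderingDirection] = None

    def __post_init__(self):
        # Assumption: only ordered attributes have an ordering direction.
        if self.kind is AttributeType.CATEGORICAL and self.direction:
            raise ValueError(f"{self.name}: categorical attributes are unordered")

    @property
    def is_ordered(self) -> bool:
        return self.kind is not AttributeType.CATEGORICAL


# Hypothetical columns of a table dataset (items as rows).
columns = [
    Attribute("species", AttributeType.CATEGORICAL),
    Attribute("shirt_size", AttributeType.ORDINAL, OrderingDirection.SEQUENTIAL),
    Attribute("temperature_change", AttributeType.QUANTITATIVE,
              OrderingDirection.DIVERGING),
    Attribute("hour_of_day", AttributeType.QUANTITATIVE, OrderingDirection.CYCLIC),
]

for column in columns:
    direction = column.direction.value if column.direction else "none"
    print(f"{column.name}: {column.kind.value}, ordered={column.is_ordered}, "
          f"direction={direction}")
```

Classifying each column this way before choosing a chart makes the later encoding decision explicit, for example a diverging color scale for an attribute with a meaningful midpoint.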

Slide 94

Slide 94 text

Types: Datasets and data
[Figure: dataset types and attribute types, continued]
• Dataset types: tables, multidimensional tables, networks, trees, fields (continuous), geometry (spatial)
• Attribute types: categorical; ordered (ordinal, quantitative)
• Ordering direction: sequential, diverging, cyclic
• Dataset availability: static, dynamic
Tamara Munzner. Visualization Analysis and Design

Slide 95

Slide 95 text

Why analyze data?
[Figure: the Why? part of the framework: actions and targets]
Actions
• Analyze: consume (discover, present, enjoy); produce (annotate, record, derive)
• Search: lookup, locate, browse, explore (target known or unknown, location known or unknown)
• Query: identify, compare, summarize
Targets
• All data: trends, outliers, features
• Attributes: one (distribution, extremes); many (dependency, correlation, similarity)
• Network data: topology, paths
Tamara Munzner. Visualization Analysis and Design

Slide 96

Slide 96 text

Why analyze data?
[Figure: the Why? part of the framework, with targets highlighted]
Actions
• Analyze: consume (discover, present, enjoy); produce (annotate, record, derive)
• Search: lookup, locate, browse, explore (target known or unknown, location known or unknown)
• Query: identify, compare, summarize
Targets
• All data: trends, outliers, features
• Attributes: one (distribution, extremes); many (dependency, correlation, similarity)
• Network data: topology, paths
• Spatial data: shape
Tamara Munzner. Visualization Analysis and Design

Slide 97

Slide 97 text

1. Action: Analyze • consume – discover vs present • classic split • aka explore vs explain – enjoy • newcomer • aka casual, social • produce – annotate, record – derive • crucial design choice 97 Tamara Munzner. Visualization Analysis and Design [Figure: Analyze actions. Consume: Discover, Present, Enjoy. Produce: Annotate, Record, Derive.]

Slide 98

Slide 98 text

2. Action: Search • what does user know? – target, location 98 Tamara Munzner. Visualization Analysis and Design [Figure: Search actions. Target known: Lookup (location known), Locate (location unknown). Target unknown: Browse (location known), Explore (location unknown).]

Slide 99

Slide 99 text

3. Action: Query • what does user know? – target, location • how much of the data matters? – one, some, all • analyze, search, query – independent choices for each 99 Tamara Munzner. Visualization Analysis and Design [Figure: Query actions: Identify, Compare, Summarize.]
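The search and query choices on these two slides can be sketched as two independent lookups. This is a minimal illustration, not part of the original deck: the function name `classify_task` and the `Task` tuple are hypothetical, and the one/some/all to identify/compare/summarize pairing is assumed from the order in which the slide lists them.

```python
from typing import NamedTuple

# Search action, keyed by (target_known, location_known).
SEARCH_ACTIONS = {
    (True, True): "lookup",
    (True, False): "locate",
    (False, True): "browse",
    (False, False): "explore",
}

# Query action, keyed by how much of the data matters (assumed pairing).
QUERY_ACTIONS = {"one": "identify", "some": "compare", "all": "summarize"}


class Task(NamedTuple):
    """Hypothetical container for the two independent choices."""
    search: str
    query: str


def classify_task(target_known: bool, location_known: bool, scope: str) -> Task:
    """Pick the search and the query action independently of each other."""
    search = SEARCH_ACTIONS[(target_known, location_known)]
    query = QUERY_ACTIONS[scope]
    return Task(search, query)


# A user who knows what they are looking for, but not where,
# and who wants to compare a handful of items.
print(classify_task(target_known=True, location_known=False, scope="some"))
```

Keeping the two tables separate mirrors the slide's point that the choices are independent: any search action can be combined with any query action.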

Slide 100

Slide 100 text

Targets 100 Tamara Munzner. Visualization Analysis and Design [Figure: Targets. All Data: Trends, Outliers, Features. Attributes: One (Distribution, Extremes), Many (Dependency, Correlation, Similarity). Network Data: Topology, Paths. Spatial Data: Shape.]

Slide 101

Slide 101 text

How to visualize? 101 Tamara Munzner. Visualization Analysis and Design [Figure: How? design choices. Encode: Arrange (Express, Separate, Order, Align, Use); Map from categorical and ordered attributes (Color: Hue, Saturation, Luminance; Size, Angle, Curvature, ...; Shape; Motion: Direction, Rate, Frequency, ...). Manipulate: Change, Select, Navigate. Facet: Juxtapose, Partition, Superimpose. Reduce: Filter, Aggregate, Embed.]

Slide 102

Slide 102 text

How to visualize? 102 Tamara Munzner. Visualization Analysis and Design [Figure: the How? design choices, highlighting Manipulate (Change, Select, Navigate), Facet (Juxtapose, Partition, Superimpose) and Reduce (Filter, Aggregate, Embed).]

Slide 103

Slide 103 text

How to visually encode information? • analyze idiom structure Tamara Munzner. Visualization Analysis and Design 103

Slide 104

Slide 104 text

Definition: Marks and channels Tamara Munzner. Visualization Analysis and Design 104 • marks – geometric primitives • channels – control appearance of marks [Figure: marks: points, lines, areas. Channels: position (horizontal, vertical, both), color, shape, tilt, size (length, area, volume).]

Slide 105

Slide 105 text

How to visually encode information? • analyze idiom structure – as combination of marks and channels Tamara Munzner. Visualization Analysis and Design 105
1: vertical position; mark: line
2: vertical position, horizontal position; mark: point
3: vertical position, horizontal position, color hue; mark: point
4: vertical position, horizontal position, color hue, size (area); mark: point
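The four examples above can be written down as data: each idiom is one mark type plus the channels that carry attributes. The sketch below is only an illustration of that decomposition; the `Idiom` class and the labels "example 1" to "example 4" are hypothetical names, not taken from the slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Idiom:
    """A visual encoding described as a mark plus the channels it uses."""
    label: str                 # hypothetical label, for printing only
    mark: str                  # geometric primitive: point, line, or area
    channels: Tuple[str, ...]  # channels that control the mark's appearance


EXAMPLES = [
    Idiom("example 1", "line", ("vertical position",)),
    Idiom("example 2", "point", ("vertical position", "horizontal position")),
    Idiom("example 3", "point",
          ("vertical position", "horizontal position", "color hue")),
    Idiom("example 4", "point",
          ("vertical position", "horizontal position", "color hue", "size (area)")),
]


def describe(idiom: Idiom) -> str:
    """Summarize how many attributes the idiom can encode, one per channel."""
    return (f"{idiom.label}: {idiom.mark} marks encode "
            f"{len(idiom.channels)} attribute(s) via {', '.join(idiom.channels)}")


for idiom in EXAMPLES:
    print(describe(idiom))
```

Each additional channel lets the same mark carry one more attribute, which is the progression the four examples walk through.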

Slide 106

Slide 106 text

Channels 106 Tamara Munzner. Visualization Analysis and Design [Figure: Channels: Expressiveness Types and Effectiveness Ranks. Magnitude channels (ordered attributes), from most to least effective: position on common scale, position on unaligned scale, length (1D size), tilt/angle, area (2D size), depth (3D position), color luminance, color saturation, curvature, volume (3D size). Identity channels (categorical attributes), from most to least effective: spatial region, color hue, motion, shape.]

Slide 107

Slide 107 text

Channels • expressiveness principle – match channel and data characteristics • effectiveness principle – encode most important attributes with highest ranked channels Tamara Munzner. Visualization Analysis and Design 107 [Figure: Channels: Expressiveness Types and Effectiveness Ranks, repeated from slide 106.]
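The two principles can be sketched as a greedy assignment: pick the channel list that matches the attribute type (expressiveness), then hand the best remaining channel to the most important attribute (effectiveness). This is a simplified illustration under stated assumptions: the channel order is taken from the ranking figure, the function name `assign_channels` is hypothetical, and real designs also have to consider interactions between channels.

```python
# Channel rankings as listed in the figure, best first.
MAGNITUDE_CHANNELS = [
    "position on common scale", "position on unaligned scale",
    "length (1D size)", "tilt/angle", "area (2D size)", "depth (3D position)",
    "color luminance", "color saturation", "curvature", "volume (3D size)",
]
IDENTITY_CHANNELS = ["spatial region", "color hue", "motion", "shape"]


def assign_channels(attributes):
    """Greedily map attributes to channels.

    `attributes` is a list of (name, kind) pairs, most important first,
    where kind is "ordered" or "categorical".
    """
    remaining = {
        "ordered": list(MAGNITUDE_CHANNELS),      # expressiveness: magnitude
        "categorical": list(IDENTITY_CHANNELS),   # expressiveness: identity
    }
    assignment = {}
    for name, kind in attributes:
        if not remaining[kind]:
            raise ValueError(f"no {kind} channel left for {name!r}")
        # Effectiveness: the highest ranked free channel goes first.
        assignment[name] = remaining[kind].pop(0)
    return assignment


# Hypothetical attributes, listed by decreasing importance.
print(assign_channels([
    ("price", "ordered"),
    ("house type", "categorical"),
    ("year", "ordered"),
]))
```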

Slide 108

Slide 108 text

Problem: Visual complexity Tamara Munzner. Visualization Analysis and Design 108 Four strategies: 1. change view over time 2. facet across multiple views 3. reduce items/attributes within single view 4. derive new data to show within view

Slide 109

Slide 109 text

Complexity: Strategies Tamara Munzner. Visualization Analysis and Design 109 [Figure: How? design choices for handling complexity. Manipulate: Change, Select, Navigate. Facet: Juxtapose, Partition, Superimpose. Reduce: Filter, Aggregate, Embed. Shown next to the Analyze action Produce: Derive.]

Slide 110

Slide 110 text

Strategy 1: Change over time Tamara Munzner. Visualization Analysis and Design 110 [Figure: How? design choices, highlighting Manipulate: Change, Select, Navigate.]

Slide 111

Slide 111 text

Idiom: Animated transitions 111 [Using Multilevel Call Matrices in Large Software Projects. van Ham. Proc. IEEE InfoVis, pp. 227–232, 2003.] • smooth transition from one state to another 
 — alternative to jump cuts 
 — support for item tracking when amount of change is limited • example: multilevel matrix views 
 — scope of what is shown narrows down info • middle block stretches to fill space, additional structure appears within • other blocks squish down to increasingly aggregated representations

Slide 112

Slide 112 text

Strategy 2: Facet Tamara Munzner. Visualization Analysis and Design 112 [Figure: How? design choices, highlighting Facet: Juxtapose, Partition, Superimpose.]

Slide 113

Slide 113 text

Strategy 2: Facet 113 Tamara Munzner. Visualization Analysis and Design [Figure: Facet: Juxtapose, Partition, Superimpose. Juxtapose and coordinate multiple side-by-side views: share encoding (same/different), share data (all/subset/none), share navigation, linked highlighting. Superimpose layers.]

Slide 114

Slide 114 text

• see how regions contiguous in one view are distributed within another – powerful and pervasive interaction idiom • encoding: different – multiform • data: all shared [Visual Exploration of Large Structured Datasets. Wills. NTTS, pp. 237–246. IOS Press, 1995.] 114 Idiom: Linked Highlighting

Slide 115

Slide 115 text

• encoding: same • data: subset shared • navigation: shared – bidirectional linking • differences – viewpoint – (size) • overview-detail • System: Google Maps 115 Idiom: Bird’s-eye maps [A Review of Overview+Detail, Zooming, and Focus+Context Interfaces. Cockburn, Karlson, and Bederson. ACM Computing Surveys 41:1 (2008), 1–31.]

Slide 116

Slide 116 text

• encoding: same • data: none shared – different attributes for node colors – (same network layout) • navigation: shared 
 • System: Cerebral 116 Idiom: Small multiples [Cerebral: Visualizing Multiple Experimental Conditions on a Graph with Biological Context. Barsky, Munzner, Gardy, and Kincaid. IEEE Trans. Visualization and Computer Graphics (Proc. InfoVis 2008) 14:6 (2008), 1253–1260.]

Slide 117

Slide 117 text

• benefits: eyes vs memory – lower cognitive load to move eyes between 2 views than remembering previous state with single view • costs: display area, 2 views side by side each have only half the area of one view 117 Strategy 2: Facet: Coordinate views Tamara Munzner. Visualization Analysis and Design [Figure: coordinated-view designs by shared data (all, subset, none) and encoding (same, multiform). Same encoding: redundant, overview/detail, small multiples. Multiform encoding: multiform; multiform, overview/detail; no linkage.]
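The design space for coordinated views can be read as a small lookup table over two choices: how much data the views share and whether they use the same encoding. The mapping below is a reconstruction from the flattened figure labels, cross-checked against the three idiom slides (linked highlighting, bird's-eye maps, small multiples); treat the cell placement as an assumption. The name `view_design` is hypothetical.

```python
# (shared data, encoding) -> name of the coordinated-view design.
COORDINATED_VIEWS = {
    ("all", "same"): "redundant",
    ("all", "multiform"): "multiform",
    ("subset", "same"): "overview/detail",
    ("subset", "multiform"): "multiform, overview/detail",
    ("none", "same"): "small multiples",
    ("none", "multiform"): "no linkage",
}


def view_design(shared_data: str, encoding: str) -> str:
    """Return the design name for a pair of juxtaposed views."""
    try:
        return COORDINATED_VIEWS[(shared_data, encoding)]
    except KeyError:
        raise ValueError(
            "shared_data must be all/subset/none and encoding same/multiform"
        ) from None


# The three idioms from the preceding slides:
print(view_design("all", "multiform"))   # linked highlighting example
print(view_design("subset", "same"))     # bird's-eye map example
print(view_design("none", "same"))       # Cerebral example
```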

Slide 118

Slide 118 text

Strategy 2: Facet: Partition • how to divide data between views – encodes association between items using spatial proximity – major implications for what patterns are visible – split according to attributes • design choices – how many splits • all the way down: one mark per region? • stop earlier, for more complex structure within region? – order in which attributes used to split – how many views 118 Tamara Munzner. Visualization Analysis and Design [Figure: Facet choices: Juxtapose, Partition (into side-by-side views), Superimpose (layers).]

Slide 119

Slide 119 text

Partition: List alignment Tamara Munzner. Visualization Analysis and Design 119
• single bar chart with grouped bars – split by state into regions • complex glyph within each region showing all ages – compare: easy within state, harder across ages
• small multiple bar charts – split by age into regions • one chart per region – compare: easy within age, harder across states
[Figure: both charts show the states CA, TK, NY, FL, IL, PA and the age groups Under 5 Years, 5 to 13 Years, 14 to 17 Years, 18 to 24 Years, 25 to 44 Years, 45 to 64 Years, 65 Years and Over; value axis from 0 to 11.]
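The difference between the two charts comes down to the order in which attributes are used to split the data. A minimal sketch of that idea follows; the `partition` helper and the population values are hypothetical and only there to make the example runnable.

```python
from collections import defaultdict


def partition(records, keys):
    """Split records into nested regions, one nesting level per key."""
    if not keys:
        return records
    first, rest = keys[0], keys[1:]
    groups = defaultdict(list)
    for record in records:
        groups[record[first]].append(record)
    return {value: partition(group, rest) for value, group in groups.items()}


# Made-up values, in millions, for illustration only.
records = [
    {"state": "CA", "age": "Under 5 Years", "population": 2.7},
    {"state": "CA", "age": "65 Years and Over", "population": 4.1},
    {"state": "NY", "age": "Under 5 Years", "population": 1.2},
    {"state": "NY", "age": "65 Years and Over", "population": 2.6},
]

# Grouped bars: one region per state, all ages inside it.
by_state = partition(records, ["state", "age"])
# Small multiples: one region per age group, all states inside it.
by_age = partition(records, ["age", "state"])

print(list(by_state["CA"]))            # ages that sit side by side within CA
print(list(by_age["Under 5 Years"]))   # states that sit side by side within one age
```

Items that end up in the same innermost region are close together on screen, which is why comparisons within the first split attribute are the easy ones.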

Slide 120

Slide 120 text

Strategy 2: Facet: Partition • split by neighborhood, then by type, then by time - years as rows - months as columns • color by price • neighborhood patterns • where it’s expensive • where you pay much more for detached type 120 [Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

Slide 121

Slide 121 text

Strategy 2: Facet: Partition • switch order of splits - type then neighborhood • switch color - by price variation • type patterns - within specific type, which neighborhoods are inconsistent 121 [Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

Slide 122

Slide 122 text

Strategy 2: Facet: Partition • different encodings for second-level regions - choropleth maps 122 [Configuring Hierarchical Layouts to Address Research Questions. Slingsby, Dykes, and Wood. IEEE Transactions on Visualization and Computer Graphics (Proc. InfoVis 2009) 15:6 (2009), 977–984.]

Slide 123

Slide 123 text

Strategy 3: Reduce 123 Tamara Munzner. Visualization Analysis and Design [Figure: How? design choices, highlighting Reduce: Filter, Aggregate, Embed.]

Slide 124

Slide 124 text

Strategy 3: Reduce 124 Tamara Munzner. Visualization Analysis and Design • reduce/increase: filters – pro: straightforward and intuitive to understand and compute – con: out of sight, out of mind • aggregation – pro: inform about whole set – con: difficult to avoid losing signal • not mutually exclusive – combine filter, aggregate – combine reduce, facet, change, derive [Figure: Reducing Items and Attributes. Filter: items, attributes. Aggregate: items, attributes. Embed.]

Slide 125

Slide 125 text

Idiom: Boxplot • static item aggregation • task: find distribution • data: table • derived data – 5 quant attributes • median: central line • lower and upper quartile: boxes • lower and upper fences: whiskers – values beyond which items are outliers – outliers beyond fence cutoffs explicitly shown [Figure: example boxplots for four distributions labelled n, s, k, mm, where n is a standard normal.] [40 years of boxplots. Wickham and Stryjewski. 2012. had.co.nz] 125
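The five derived attributes of a boxplot can be computed directly from the raw values. The sketch below assumes the common Tukey convention of placing the fences 1.5 interquartile ranges beyond the quartiles; the slide does not state the cutoff, so that factor is an assumption, as is the helper name `boxplot_summary`.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Derive median, quartiles, fences, and outliers for one boxplot."""
    # "inclusive" treats the values as the whole population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr   # assumed Tukey-style cutoff
    upper_fence = q3 + whisker * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "lower_quartile": q1,
        "upper_quartile": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Made-up sample with one obvious outlier.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = boxplot_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Different quantile conventions give slightly different quartiles for small samples, so plotting libraries may not reproduce these exact numbers.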

Slide 126

Slide 126 text

Idiom: Dimensionality reduction • specifically applied to docs • attribute aggregation – derive low-dimensional target space from high-dimensional measured space Tamara Munzner. Visualization Analysis and Design 126 [Figure: chain of three tasks. Task 1 (Why: Produce, Derive): in high-dimensional data, out 2D data. Task 2 (Why: Discover, Explore, Identify; How: Encode, Navigate, Select): in 2D data, out scatterplot with clusters & points. Task 3 (Why: Produce, Annotate): in scatterplot with clusters & points, out labels for clusters.]

Slide 127

Slide 127 text

Let’s take a step back! [Figure: nested levels of visualization design: domain, abstraction, idiom, algorithm.] Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009) 127

Slide 128

Slide 128 text

Refine graphical characteristics • Pencil and paper • facilitate thinking and hypothesis generation • inward reflection • outward expression • constructive activity • thinking specific and explicit • demanding activity • to contextualize our understanding spatially Wong, B., Kjærgaard, R. Pencil and paper. Nat Methods 9, 1037 (2012). https://doi.org/10.1038/nmeth.2223 128 Depict Data Studio. The Data Visualization Design Process: A Step-by-Step Guide for Beginners

Slide 129

Slide 129 text

Layout every detail Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye 129

Slide 130

Slide 130 text

Sketch to structure 130 Michele Graffieti & Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye

Slide 131

Slide 131 text

Explore design elements 131 Michele Graffieti & Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye

Slide 132

Slide 132 text

Drawing to refine 132 Michele Graffieti & Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye

Slide 133

Slide 133 text

Final design Michele Graffieti & Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye 133

Slide 134

Slide 134 text

Pointers • In education, drawing improves comprehension of scientific concepts • Students were found to perform markedly better after they had been prompted to generate, justify and refine visual representations of classroom material • Drawing augments our short-term working memory 134 Wong, B., Kjærgaard, R. Pencil and paper. Nat Methods 9, 1037 (2012). https://doi.org/10.1038/nmeth.2223

Slide 135

Slide 135 text

Visual working memory 135 Wong, B., Kjærgaard, R. Pencil and paper. Nat Methods 9, 1037 (2012). https://doi.org/10.1038/nmeth.2223 • Table describes a simple network where connections between the nodes are indicated by filled cells • Connections are arranged as rows and columns • Try to mentally picture the underlying network!

Slide 136

Slide 136 text

Sketching 136 Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye

Slide 137

Slide 137 text

Color is next 137 Newton’s and Goethe’s color wheels

Slide 138

Slide 138 text

Color 138 The property possessed by an object of producing different sensations on the eye as a result of the way it reflects or emits light

Slide 139

Slide 139 text

Color as a representation Newton’s and Goethe’s color wheels

Slide 140

Slide 140 text

Color in culture 140

Slide 141

Slide 141 text

Color as illusion 141 [Figure 1 of the cited article, partially cropped: Perception of color can vary. (a,b) The same color can look different (a), and different colors can appear to be nearly the same by changing the background color (b). (c) The rectangles in the heat map indicated by the asterisks (*) are the same color but appear to be different.] Albers, J. Interaction of Color (Yale University Press, New Haven, Connecticut, USA, 1975)

Slide 142

Slide 142 text

Color constancy 142

Slide 143

Slide 143 text

Color as three numbers • trichromatic cone cells respond to 1 out of 3 frequencies exhibited by photons arriving on their surface • only about 6 to 7 million cones • different cone cell responses: area function of wavelength [Representing Colors as Three Numbers, Stone, IEEE Computer Graphics and Applications, 25(4), July 2005, pp. 78-85] 143

Slide 144

Slide 144 text

Color as cone cell responses • different cone cell responses: area function of wavelength • for a given spectrum
 - multiply by response curve
 - integrate to get response [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015] 144
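The two steps above (multiply the spectrum by a response curve, then integrate) can be sketched numerically with the trapezoidal rule. The Gaussian sensitivity curves below are crude stand-ins with roughly plausible peak wavelengths; they are assumptions for illustration, not measured cone fundamentals, and the helper names are hypothetical.

```python
import math


def cone_response(wavelengths, spectrum, sensitivity):
    """Integrate spectrum * sensitivity over wavelength (trapezoidal rule)."""
    product = [s * r for s, r in zip(spectrum, sensitivity)]
    total = 0.0
    for i in range(1, len(wavelengths)):
        width = wavelengths[i] - wavelengths[i - 1]
        total += 0.5 * (product[i] + product[i - 1]) * width
    return total


def gaussian(peak_nm, width_nm):
    """Toy sensitivity curve; NOT a measured cone response."""
    return lambda nm: math.exp(-0.5 * ((nm - peak_nm) / width_nm) ** 2)


# Hypothetical S, M, L curves sampled every 5 nm over the visible range.
wavelengths = list(range(380, 781, 5))
CONES = {"S": gaussian(440, 25), "M": gaussian(535, 40), "L": gaussian(565, 45)}

# A flat spectrum: equal power at every wavelength.
flat_spectrum = [1.0] * len(wavelengths)

responses = {
    name: cone_response(wavelengths, flat_spectrum,
                        [curve(nm) for nm in wavelengths])
    for name, curve in CONES.items()
}
print(responses)
```

Because each cone collapses a whole spectrum into a single number, two different spectra that produce the same three responses are indistinguishable to the viewer.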

Slide 145

Slide 145 text

• brain sees only cone response
 - different spectra appear the same • varies strongly over the wavelength range between 380 and 800 nm [Figure: spectral sensitivity against wavelength (nm), with the visible spectrum lying between UV and IR.] [Representing Colors as Three Numbers, Stone, IEEE Computer Graphics and Applications, 25(4), July 2005, pp. 78-85]
 [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015] Metamerism: Several similar segments 145

Slide 146

Slide 146 text

Color as three channels: luminance, saturation, hue [Figure, partially cropped: example ramps for saturation, luminance values, and hue.] 146 Wong, B. Color coding. Nat Methods 7, 573 (2010)

Slide 147

Slide 147 text

Color coding • adds dimensionality and richness to scientific communications • simplifies a complex analysis task • typically used to differentiate information into classes • challenge of picking colors that are discriminable • need for systematic approach for color coding 147
[Excerpt from the cited article, cropped at the start:] [...] such a potent differentiator, the appropriate strategy is to choose colors that are discernible from one another but comparable in visibility. Color is a relative medium, and neighboring colors can affect visual perception. For example, it is possible to make the same color look different or different colors appear the same (or nearly the same) by changing only the background color (Fig. 1a,b). The perception of color depends on context, and manipulating the attributes of neighboring colors affects how we see the original color. A heat map requires us to judge the relative brightness of colors in a matrix. The interaction of color can cause a profound effect that makes this graphical representation suffer (Fig. 1c). Every color is described by three properties: hue, saturation and lightness. Hue is the attribute we use to classify a color as red or yellow. Saturation describes the neutrality of a color; a red object with little or no white is said to be very saturated. The lightness of a color tells us about its relative ordering on the dark-to-light scale.
Figure 1 | Perception of color can vary. (a,b) The same color can look different (a), and different colors can appear to be nearly the same by changing the background color (b). (c) The rectangles in the heat map indicated by the asterisks (*) are the same color but appear to be different.

Slide 148

Slide 148 text

Color to categorical data 148 • well suited to represent categorical data • distinguish between experimental conditions • if used by assigning intense or weak colors to specific categories, color can bias the reader. • color is a potent differentiator • appropriate strategy: choose colors that are discernible from one another but comparable in visibility.

Slide 149

Slide 149 text

Categorical color: Discriminability constraints • noncontiguous small regions of color: only 6-12 bins • have a really good reason to use 10 or more categorical colors [Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms. Sinha and Meller. BMC Bioinformatics, 8:82, 2007.] 149 How many categories?

Slide 150

Slide 150 text

[Tableau software blog. How we designed the new color palettes in Tableau 10? Stone, 2016] Tableau10 150

Slide 151

Slide 151 text

Color to quantitative data • define key regions or points in the data range that we intend to highlight before designing a color-coding scheme that varies the 3 components • need to determine the aspects of the data we want to make apparent • hue is impractical • principally need to rely on color value • reserve hue to indicate different segments of the data range • a meaningful range will be the extremes: min and max values • the zero value may also be interesting • different ranges for different contexts: sea level, absolute zero −273.15°Celsius
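One way to follow these points is a diverging scheme: hue only says on which side of the reference value a datum lies, while lightness carries the magnitude. The sketch below is an illustration under stated assumptions, not a recommended palette: the function name `diverging_color`, the red/blue hue choice, and the lightness range 0.95 to 0.35 are all arbitrary, and HLS lightness is only pseudo-perceptual.

```python
import colorsys


def diverging_color(value, vmin, vmax, zero=0.0):
    """Map a value to RGB: hue marks the segment, lightness the magnitude."""
    if not vmin < zero < vmax:
        raise ValueError("zero must lie strictly between vmin and vmax")
    value = min(max(value, vmin), vmax)            # clamp to the data range
    if value >= zero:
        hue = 0.0                                  # red segment (arbitrary)
        magnitude = (value - zero) / (vmax - zero)
    else:
        hue = 2.0 / 3.0                            # blue segment (arbitrary)
        magnitude = (zero - value) / (zero - vmin)
    # Near the reference value colors are light; extremes are dark.
    lightness = 0.95 - 0.60 * magnitude
    return colorsys.hls_to_rgb(hue, lightness, 0.8)


# Temperatures in degrees Celsius, with the freezing point as reference.
for temperature in (-20.0, -5.0, 0.0, 5.0, 20.0):
    r, g, b = diverging_color(temperature, vmin=-20.0, vmax=20.0)
    print(f"{temperature:6.1f} -> ({r:.2f}, {g:.2f}, {b:.2f})")
```

The reference point is a parameter because, as the slide notes, the meaningful zero depends on context.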

Slide 152

Slide 152 text

Color to quantitative data • color not ideal due to ambiguity of how colors should be ordered • is yellow smaller than blue? • could pattern the sequence after the ordering of visible light by wavelength: ROYGBIV • Transitions between colors are uneven, which breaks the correspondence between color and numerical value 152 Gehlenborg, N., Wong, B. Mapping quantitative data to color. Nat Methods 9, 769 (2012)

Slide 153

Slide 153 text

Color blindness • Tritanopia/Tritanomaly: Missing/malfunctioning S-cone (blue). • Deuteranopia/Deuteranomaly: Missing/malfunctioning M-cone (green). • Protanopia/Protanomaly: Missing/malfunctioning L-cone (red). • Monochromatism: either no functioning cones or only one functioning cone type • etc [Figure: the same image as seen with normal vision, tritanomaly, deuteranomaly, protanomaly, protanopia, deuteranopia, and tritanopia.]

Slide 154

Slide 154 text

Red-green color coding in an immunofluorescent image. Nat Methods. 2011 Jun;8(6):441

Slide 155

Slide 155 text

Disambiguation: Luminance, Lightness, Brightness • Luminance is measurable • Lightness is perceived luminance • Brightness is perceived lightness relative to some average level in an image or environment • HSV and HSB are the same. V stands for Value and B stands for Brightness • HSL: L is Lightness • All are employed for the same purpose: make an image more or less lighter!

Slide 156

Slide 156 text

[Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015. http://www.cs.ubc.ca/~tmm/courses/547-15] • RGB 
 - convenient for machines
 - three channels are not separable
 • CIE XYZ
 - from color matching functions
 - perceptually based
 • HSL
 - a simple transformation from RGB
 - good: separates out lightness from hue and saturation
 - bad: lightness not true luminance
 - careful: only pseudo-perceptual 156 Color spaces

Slide 157

Slide 157 text

Color: Luminance, saturation, hue • 3 channels – identity for categorical • hue – magnitude for ordered • luminance • saturation • other common color spaces – RGB: poor choice for visual encoding – HSL: better, but beware • lightness ≠ luminance • transparency – useful for creating visual layers • but cannot combine with luminance or saturation [Figure: example ramps for saturation, luminance values, and hue; the corners of the RGB color cube all have the same L from HLS but different luminance values.] [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015] 157
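The warning that lightness is not luminance can be checked on the corners of the RGB cube. The sketch below compares the L of Python's `colorsys` HLS conversion with relative luminance computed from the usual sRGB formula; it assumes the RGB values are sRGB, which the slides do not state.

```python
import colorsys


def relative_luminance(r, g, b):
    """Relative luminance of an sRGB color with components in [0, 1]."""
    def linearize(c):
        # Undo the sRGB transfer curve before weighting the channels.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)


# The six saturated corners of the RGB cube.
CORNERS = {
    "red": (1, 0, 0), "yellow": (1, 1, 0), "green": (0, 1, 0),
    "cyan": (0, 1, 1), "blue": (0, 0, 1), "magenta": (1, 0, 1),
}

for name, rgb in CORNERS.items():
    _, hls_lightness, _ = colorsys.rgb_to_hls(*rgb)
    print(f"{name:8s} HLS L = {hls_lightness:.2f}   "
          f"luminance = {relative_luminance(*rgb):.4f}")
```

All six corners share the same HLS lightness, yet their luminance differs widely, so an ordered attribute mapped to hue alone in HSL will not read as an ordered ramp.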

Slide 158

Slide 158 text

Colormaps • categorical limits: noncontiguous – 6-12 bins hue/color • far fewer if colorblind – 3-4 bins luminance, saturation – size heavily affects salience • use high saturation for small regions, low saturation for large [Figure: colormap types: binary, categorical, sequential, diverging, and bivariate combinations of these.] after [Color Use Guidelines for Mapping and Visualization. Brewer, 1994. http://www.personal.psu.edu/faculty/c/a/cab38/ColorSch/Schemes.html] [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015. http://www.cs.ubc.ca/~tmm/courses/547-15] Color maps

Slide 159

Slide 159 text

Color usage 159 [Exploiting the Power of the Human Visual System. Stone and Mackinlay, 2009] [Figure: excerpts from the cited talk (Maureen Stone, StoneSoup Consulting; Jock Mackinlay, Tableau Software; July 21, 2009). Contrast hierarchy creates layers: Context, Normal, Urgent, with wrong and right examples (from Larry Arend, colorusage.arc.nasa.gov). Rules for managing attention: silent, whisper, indoor voice, shout. Get it right in black & white: luminance vs hue & chroma (from Vision and the Art of Seeing by Margaret Livingstone). How do we fix this? Maps courtesy of the National Park Service (www.nps.gov).]

Slide 160

Slide 160 text

Color usage [Exploiting the Power of the Human Visual System. Stone and Mackinlay, 2009] [Figure: excerpts from the cited talk (July 21, 2009). Bezold effect (from Stephen Few): spreading, adjacent colors blend; "What color is this?". Tufte’s fundamental uses: to label (primarily hue variation; associated with color names) and to measure (vary lightness & chroma; map to data distribution).]

Slide 161

Slide 161 text

Ordered color: Rainbow is poor default • problems – perceptually unordered – perceptually nonlinear • benefits – fine-grained structure visible and nameable • alternatives – fewer hues for large-scale structure – multiple hues with monotonically increasing luminance for fine-grained – segmented rainbows good for categorical, ok for binned [Transfer Functions in Direct Volume Rendering: Design, Interface, Interaction. Kindlmann. SIGGRAPH 2002 Course Notes] [A Rule-based Tool for Assisting Colormap Selection. Bergman, Rogowitz, and Treinish. Proc. IEEE Visualization (Vis), pp. 118–125, 1995.] [Why Should Engineers Be Worried About Color? Treinish and Rogowitz 1998. http://www.research.ibm.com/people/l/lloydt/color/color.HTM] [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015. http://www.cs.ubc.ca/~tmm/courses/547-15] Ordered color
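The requirement of monotonically increasing luminance can be tested for any list of colors. The sketch below builds a naive rainbow by sweeping HSV hue and compares it with a gray ramp; the rainbow construction, the sRGB assumption, and the helper names are illustrative choices rather than anything defined in the slides.

```python
import colorsys


def relative_luminance(r, g, b):
    """Relative luminance of an sRGB color with components in [0, 1]."""
    def linearize(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
    return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)


def luminance_profile(colors):
    """Luminance of each color along the colormap."""
    return [relative_luminance(*color) for color in colors]


def is_monotonic(values):
    """True if the values never change direction."""
    pairs = list(zip(values, values[1:]))
    return all(a <= b for a, b in pairs) or all(a >= b for a, b in pairs)


STEPS = 9
# Naive rainbow: sweep hue from red towards violet at full saturation.
rainbow = [colorsys.hsv_to_rgb(0.8 * i / (STEPS - 1), 1.0, 1.0)
           for i in range(STEPS)]
# Gray ramp from black to white.
gray = [(i / (STEPS - 1),) * 3 for i in range(STEPS)]

for name, colors in (("rainbow", rainbow), ("gray", gray)):
    profile = luminance_profile(colors)
    print(name, "monotonic luminance:", is_monotonic(profile))
    print("  ", [round(value, 2) for value in profile])
```

The rainbow's luminance rises towards yellow and then falls again, which is one concrete way to see why it is perceptually unordered.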

Slide 162

Slide 162 text

[New matplotlib colormaps. https://bids.github.io/colormap/] Better alternatives

Slide 163

Slide 163 text

The case of the viridis map

Slide 164

Slide 164 text

• what is the color used for? • what type of imagery needs to be colored? • what can we assume about the display? • what can we assume about the user? • what can we assume about the task? [What’s so hard about categorical color? Stone. StoneSoup Consulting] 164 Helpful tips

Slide 165

Slide 165 text

Map other channels! 165 • size – length accurate, 2D area ok, 3D volume poor • angle – nonlinear accuracy • horizontal, vertical, exact diagonal • shape – complex combination of lower-level primitives – many bins • motion – highly separable against static • binary: great for highlighting – use with care to avoid irritation [Figure: example channels. Size: length, area, volume. Angle. Curvature. Shape. Motion: direction, rate, frequency.] [Ch 10: Color. Papers: Colors as Three Numbers. Munzner, 2015. http://www.cs.ubc.ca/~tmm/courses/547-15]

Slide 166

Slide 166 text

166 Real world example using color

Slide 167

Slide 167 text

If you can’t use color wisely, it is best to avoid it entirely. Above all, do no harm. [Exploiting the Power of the Human Visual System. Stone and Mackinlay, 2009] 167

Slide 168

Slide 168 text

No content

Slide 169

Slide 169 text

Plot types is next 169 IBM Design

Slide 170

Slide 170 text

Plot types 170 Creating a simple yet effective plot requires an understanding of data and tasks

Slide 171

Slide 171 text

What is your intent? Adapted from IBM Design 171

Slide 172

Slide 172 text

Bar charts • typically used to visualize quantities associated with a set of items • Bar charts are appropriate for counts • Bar charts encode quantities by length • Stacked bar charts enable comparison across items • Layered bar charts support comparison within categories • Grouped bar charts allow comparison across categories 172 xkcd

Slide 173

Slide 173 text

Bar charts and box plots 173 Streit, M., Gehlenborg, N. Bar charts and box plots. Nat Methods 11, 117 (2014)

Slide 174

Slide 174 text

Box plots • typically used for quantities sampled from a population (vs. a set of counts): the data contains uncertainty! • Bar charts with error bars can be misleading • Bar charts start at zero, so a bar may cover a range that was never observed • Box plots better fit and represent the characteristics of a distribution 174 xkcd
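The quantities a box plot draws can be computed directly. This is a minimal sketch, assuming the common Tukey convention (whiskers reach the most extreme points within 1.5 × IQR of the quartiles), which the slides do not specify; the sample values are made up.

```python
import statistics


def box_plot_stats(values):
    """Quartiles, whisker ends and outliers for one box."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


# Hypothetical sample with one extreme value.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Unlike a bar starting at zero, every element here corresponds to values that were actually observed or derived from them.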

Slide 175

Slide 175 text

Representation of distributions 175 Streit, M., Gehlenborg, N. Bar charts and box plots. Nat Methods 11, 117 (2014)

Slide 176

Slide 176 text

Helpful tips • for better readability, order the items • order bars by height • order boxes by median • use zero as the baseline for bar charts unless there’s a good reason not to • facilitate interpretation and comparison by adding ticks, marks and, if necessary, grid lines to show smaller differences • fill with solid color and forgo outlines • avoid more than 10 colors 176 xkcd
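The two ordering tips can be sketched in a few lines. The category names and values below are hypothetical.

```python
import statistics

# Hypothetical counts for a bar chart.
counts = {"A": 12, "B": 30, "C": 7}
bars_by_height = sorted(counts, key=counts.get, reverse=True)

# Hypothetical samples for a set of box plots.
samples = {"ctrl": [5, 6, 7], "drug1": [1, 2, 9], "drug2": [8, 9, 10]}
boxes_by_median = sorted(samples, key=lambda name: statistics.median(samples[name]))

print(bars_by_height)
print(boxes_by_median)
```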

Slide 177

Slide 177 text

Sets and intersections • sets are a universal concept • examples: 
 - bacteria in a soil sample 
 - enzymes in biochem pathway 
 - variants in a genome 
 - proteins in a serum 
 - genes in a patient cohort • often the task is to identify these sets • other common task: analysis of the similarities and differences of n sets by using the concept of intersection 177 xkcd

Slide 178

Slide 178 text

Euler diagrams • Euler diagrams represent intersecting sets as overlapping shapes: circles, ellipses, etc. • They are drawn so that their area is proportional to the number of elements they represent • an effective way to encode all set intersections is a matrix with a binary pattern, with bars above the columns showing each intersection (sorted, scaled, etc.) 178 Lex, A., Gehlenborg, N. Sets and intersections. Nat Methods 11, 779 (2014)

Slide 179

Slide 179 text

Venn diagrams • Venn diagrams are identical to Euler diagrams, except that they show all possible intersections • this includes empty intersections (which are not drawn in Euler diagrams) 179
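The difference between the two diagram types, and the input for a matrix-based view, can be illustrated by enumerating every combination of sets and the elements that belong to exactly that combination. A minimal sketch with hypothetical sets:

```python
from itertools import combinations


def exclusive_intersections(named_sets):
    """Map each combination of set names to the elements found in
    exactly those sets and in no other set."""
    names = sorted(named_sets)
    result = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            inside = set.intersection(*(named_sets[n] for n in combo))
            others = [named_sets[n] for n in names if n not in combo]
            outside = set().union(*others)
            result[combo] = inside - outside
    return result


# Hypothetical example sets.
sets = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 6}}
regions = exclusive_intersections(sets)

venn_regions = list(regions)                          # all 2^n - 1 regions
euler_regions = [k for k, v in regions.items() if v]  # non-empty only

for combo, members in regions.items():
    print(combo, sorted(members))
```

The region sizes `len(members)` are what the bars above a matrix column would encode, and the tuple of names is the binary pattern of that column.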

Slide 180

Slide 180 text

Example explosion 180 xkcd

Slide 181

Slide 181 text

Let’s go further! Lex, A., Gehlenborg, N. Sets and intersections. Nat Methods 11, 779 (2014) 181

Slide 182

Slide 182 text

Heat maps 182 Toussaint Loua, Atlas statistique de la population de Paris(1873)

Slide 183

Slide 183 text

Heat maps • represent 2D tables of numbers as shades of colors • very popular in biology, to depict gene expression, high-throughput data, multivariate data, etc. • dense and intuitive • hundreds of rows can be displayed on a screen • rely on color encoding and on meaningful reordering of the rows and columns 183 Ramakrishna, C., Corleto, J., Ruegger, P.M. et al. Dominant Role of the Gut Microbiota in Chemotherapy Induced Neuropathic Pain. Sci Rep 9, 20324 (2019)

Slide 184

Slide 184 text

Heatmaps? 184 xkcd

Slide 185

Slide 185 text

Clustering 185 Gehlenborg, N., Wong, B. Heat maps. Nat Methods 9, 213 (2012) • color is a relative medium • clustering reveals patterns and structure in the heat maps • added gaps reveal relationships
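Reordering rows so that similar profiles sit next to each other is what makes structure visible in a heat map. The sketch below uses a simple greedy nearest-neighbour ordering as a stand-in for hierarchical clustering (an assumption made to keep the example dependency-free; it is not the method of the cited article), with a hypothetical 4 × 4 table.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_row_order(rows):
    """Start at row 0, then repeatedly append the closest unused row."""
    remaining = list(range(len(rows)))
    order = [remaining.pop(0)]
    while remaining:
        last = rows[order[-1]]
        nearest = min(remaining, key=lambda i: euclidean(last, rows[i]))
        remaining.remove(nearest)
        order.append(nearest)
    return order


# Hypothetical table with two interleaved row patterns.
rows = [
    [1, 1, 9, 9],
    [9, 9, 1, 1],
    [1, 2, 9, 8],
    [8, 9, 2, 1],
]
order = greedy_row_order(rows)
print(order)
```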

Slide 186

Slide 186 text

Pushing the limit! 186 Gehlenborg, N., Wong, B. Heat maps. Nat Methods 9, 213 (2012)

Slide 187

Slide 187 text

Parallel coordinates 187

Slide 188

Slide 188 text

Parallel coordinates • all lines pass through a small number of points • categorical data is not well suited for parallel coordinates • limited visibility of items when the number gets high • works best for moderate number of dimensions and no more than a few thousand records • quickly recognize patterns • estimate the strength of correlations 188
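Each axis in a parallel coordinates plot usually gets its own scale, so that dimensions with different units can sit side by side. A minimal sketch of per-dimension min-max rescaling follows; the records are hypothetical measurements, and mapping a constant column to 0.5 is an arbitrary choice.

```python
def rescale_columns(records):
    """Rescale every dimension (column) of the records to [0, 1]."""
    columns = list(zip(*records))
    scaled_columns = []
    for column in columns:
        lo, hi = min(column), max(column)
        span = hi - lo
        scaled_columns.append(
            [0.5 if span == 0 else (v - lo) / span for v in column]
        )
    # Transpose back: one polyline (list of axis positions) per record.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical records with three dimensions each.
records = [[5.1, 3.5, 1.4], [7.0, 3.2, 4.7], [6.3, 3.3, 6.0]]
polylines = rescale_columns(records)
print(polylines)
```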

Slide 189

Slide 189 text

Temporal data: Line charts • use inherent properties of time to create effective visualizations • time is unidirectional, provides a natural order for events and has an inherent semantic structure • temporal data are often cyclic and exhibit repeating patterns • the challenge is that time cannot be directly perceived (unlike spatial dimensions) • common approaches may combine: position, brightness, saturation, animation

Slide 190

Slide 190 text

190 Line charts are widely used in all data analysis steps xkcd

Slide 191

Slide 191 text

Relatable vis 191 xkcd

Slide 192

Slide 192 text

Helpful tips Time • very effective visual variable • examples: line and bar charts • mapped to the horizontal axis 
 • take into account the inherent cyclicality • break apart the time dimension into time intervals • they emphasize a recurring pattern 192 Week Streit, M., Gehlenborg, N. Temporal data. Nat Methods 12, 97 (2015)
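Breaking the time dimension into intervals can be sketched as grouping a daily series by weekday, so that a weekly cycle shows up as a profile. The start date and values below are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def mean_by_weekday(start, values):
    """Average a daily series per weekday (0 = Monday ... 6 = Sunday)."""
    buckets = defaultdict(list)
    for offset, value in enumerate(values):
        day = start + timedelta(days=offset)
        buckets[day.weekday()].append(value)
    return {wd: sum(vs) / len(vs) for wd, vs in sorted(buckets.items())}


# Two hypothetical weeks of daily measurements with a weekend peak.
values = [10, 12, 11, 13, 12, 30, 28,
          11, 13, 12, 14, 13, 32, 30]
profile = mean_by_weekday(date(2024, 1, 1), values)
print(profile)
```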

Slide 193

Slide 193 text

Radar charts • use polar coordinates • project the data onto a circular plane • often applied because of visual appeal • produce a continuous curve over the cycles • support comparison of patterns across cycles • harder to interpret due to distortion 193 Streit, M., Gehlenborg, N. Temporal data. Nat Methods 12, 97 (2015)

Slide 194

Slide 194 text

Nightingale rose diagrams 194 Florence Nightingale (1820–1910)

Slide 195

Slide 195 text

Sparklines • term introduced by E Tufte for illustrating data trends • show data in a highly condensed form that still allows for comparison • designed to show qualitative data aspects • don’t require scales or axes • enables integration of a high number of measurements over time 195 Laurence Sterne: The Life and Opinions of Tristram Shandy, Gentleman, 1759–1767, Vol. VI, Chapter Forty. “These were the four lines I moved through my first, second, third, and fourth volumes”.
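A sparkline needs no axes or scales, only the shape of the series. As a minimal sketch, a text sparkline can be built from the eight Unicode block characters; the function name is hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a numeric series as one block character per value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series carries no shape; draw it at the lowest level.
        return BLOCKS[0] * len(values)
    scale = (len(BLOCKS) - 1) / (hi - lo)
    return "".join(BLOCKS[round((v - lo) * scale)] for v in values)


print(sparkline([1, 2, 3, 4, 5, 6, 7, 8]))
print(sparkline([3, 1, 4, 1, 5, 9, 2, 6]))
```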

Slide 196

Slide 196 text

Animation • maps time to time • alternative approach if visual variables such as position and saturation or brightness are already in use • intuitive • limits ability to detect patterns • cannot compare across multiple time points • change blindness makes us lose track of many changing elements 196

Slide 197

Slide 197 text

Animation or not? 197 Horse In Motion, Muybridge (1886)

Slide 198

Slide 198 text

Small multiples 198

Slide 199

Slide 199 text

Small multiples 199

Slide 200

Slide 200 text

Networks • arise from complex Biological or other relational data • mathematically known as graphs • describe a set of pairwise relationships • common plotting is a node-link diagram • typically molecules as nodes and the connections between the nodes as straight or curved lines (or edges) • directed (asymmetric) or undirected (symmetric) 200 Gehlenborg, N., Wong, B. Networks. Nat Methods 9, 115 (2012)

Slide 201

Slide 201 text

Networks: Helpful tips (1) • advantage of preserving local network detail • easy to identify nearest neighbors of a node • easy to trace paths through the network • layout affects how data is perceived • spring-embedded layout creates hubs and clusters (doesn’t scale) • pitfall of the ‘hairball’ effect • alternative is adjacency matrix • one drawback: difficult to understand for non-connected nodes Gehlenborg, N., Wong, B. Networks. Nat Methods 9, 115 (2012) 201

Slide 202

Slide 202 text

Networks: Helpful tips (2) • adjacency matrix: reorder nodes such that as many filled cells as possible appear next to each other • clusters are evident • connections between clusters appear as clumps of information away from the diagonal • if adj. matrix and node-link diagrams are inadequate: limit the representation to a partial network or rely on a statistical metric to describe a certain data aspect Gehlenborg, N., Wong, B. Networks. Nat Methods 9, 115 (2012) 202
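The effect of node order on an adjacency matrix can be seen with a small undirected network. In this sketch the edge list and both node orders are hypothetical; the "clustered" order is chosen by hand rather than computed.

```python
def adjacency_matrix(nodes, edges):
    """Symmetric 0/1 matrix of an undirected graph, in the given node order."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix


# Hypothetical network: a triangle (a, c, e) and a separate pair (b, d).
edges = [("a", "c"), ("c", "e"), ("a", "e"), ("b", "d")]

alphabetical = adjacency_matrix(["a", "b", "c", "d", "e"], edges)
clustered = adjacency_matrix(["a", "c", "e", "b", "d"], edges)

for row in clustered:
    print(row)
```

With the second order the filled cells form two blocks along the diagonal, which is how clusters become evident.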

Slide 203

Slide 203 text

Pathways • describe the connectivity and flow of information in biological systems • examples: molecules, cells, species, global ecological networks, etc. • pathways are networks • elements as nodes and their relationships as edges • requirement: clear depiction of connectivity via pattern 203 Hunnicutt, B., Krzywinski, M. Pathways. Nat Methods 13, 5 (2016)

Slide 204

Slide 204 text

Pathways: Helpful tips (1) • information flow from left to right and top to bottom • diverging from this standard or introducing asymmetry in the layout helps emphasize differences • should be done carefully and sparingly • edges that loop back should be in clockwise direction (b) • placing nodes on grid assists eye movement across (c) • alignment type emphasizes either information flow or source nodes 204 Hunnicutt, B., Krzywinski, M. Pathways. Nat Methods 13, 5 (2016)

Slide 205

Slide 205 text

Pathways: Helpful tips (2) • strong relationships can be illustrated using connection and enclosure • edges group nodes via connection • enclosure can group nodes in shared compartments • associate nodes through similarity (color or shape) or proximity (pixel distance) • highlights parts with grouping • proximity grouping can be done with negative space • need for start and finish to easily identify and examine pathways • labels or names: high visual cost and disrupts grouping 205 Hunnicutt, B., Krzywinski, M. Pathways. Nat Methods 13, 5 (2016)

Slide 206

Slide 206 text

Neural circuit diagrams • network • nodes: brain regions or single neurons • directed edges: axonal connections • edge may encode many variables • designates neurotransmitter type • determines cell excitation, inhibition or modulation of its targets • node position, color and shape encode cell morphology, type, location, etc. 206 Hunnicutt, B., Krzywinski, M. Neural circuit diagrams. Nat Methods 13, 189 (2016)

Slide 207

Slide 207 text

Neural circuits: Helpful tips (1) 207 Hunnicutt, B., Krzywinski, M. Neural circuit diagrams. Nat Methods 13, 189 (2016)

Slide 208

Slide 208 text

Neural circuits: Helpful tips (2) [Supplementary Figure 1: Strategies to add emphasis and information to the circuit shown in Fig. Full region acronyms are shown as in the original.] 208 Hunnicutt, B., Krzywinski, M. Neural circuit diagrams. Nat Methods 13, 189 (2016)

Slide 209

Slide 209 text

Neural circuits: Helpful tips (3) 209 Hunnicutt, B., Krzywinski, M. Neural circuit diagrams. Nat Methods 13, 189 (2016)

Slide 210

Slide 210 text

Treemaps 210

Slide 211

Slide 211 text

Pie charts 211 Drew Skau, Robert Kosara, Arcs, Angles, or Areas: Individual Data Encodings in Pie and Donut Charts, EuroVis 2016

Slide 212

Slide 212 text

Pie charts: Helpful tips (1) 212

Slide 213

Slide 213 text

More Pie charts 213 https://infogram.com/create/pie-chart

Slide 214

Slide 214 text

…in medicine 214 https://www.fda.gov/about-fda/reports/communicating-risks-and-benefits-evidence-based-users-guide

Slide 215

Slide 215 text

Ok last words… 215 https://www.thekitchn.com/apple-pie-recipe-reviews-22956677

Slide 216

Slide 216 text

What is your intent? Adapted from IBM Design 216

Slide 217

Slide 217 text

Elements of a figure is next! 217

Slide 218

Slide 218 text

Elements of a figure 218 Creating a figure requires style, skill, and certain elements

Slide 219

Slide 219 text

Typography Choose typefaces, sizes and spacing to clarify the structure and meaning of the text

Slide 220

Slide 220 text

Typography: Art and technique • affects perception of credibility • frequently conflated with font • Arial is a typeface that includes roman, bold and italic fonts • letterforms: serif, sans serif • primary characteristics • Serif: thinner, formal, easier to read in block text because the ‘feet’ help our eyes follow the line (posters, printed documents) • Sans serif: simpler, informal, and appropriate for headings and labels (slides) 220 Wong, B. Points of view: Typography. Nat Methods 8, 277 (2011)

Slide 221

Slide 221 text

Typography to honor content • pick one and ignore the rest • combine with care • reveals the tone of the doc • clarifies structure and meaning • space among ¶ > line spacing 221

Slide 222

Slide 222 text

Typographical pairings

Slide 223

Slide 223 text

Space and meaning Wong, B. Points of view: Typography. Nat Methods 8, 277 (2011)

Slide 224

Slide 224 text

Typography • font selection shows quickly if content is stately or humble, formal or informal, creative or technical 
 • most docs can be set perfectly with: 
 - one typeface 
 - 2 or 3 type sizes 
 - bold and italics if necessary 
 • typography must draw our attention without interfering with the reading 224

Slide 225

Slide 225 text

Axes, ticks and grids Make navigational elements distinct and unobtrusive to maintain visual priority of data

Slide 226

Slide 226 text

Axes, ticks and grids • figures with quantitative info more accurately understood • helpful navigational elements • provide scale and aid to assess lengths & proportions • must be distinct from primary information • Gestalt principles inform us how to use: line width, color and transparency • keep data-to-ink ratio high • least ink for navigational elements 226

Slide 227

Slide 227 text

Axes • data as foundation • follow its coordinate system • figure axes are critical in orienting the reader • avoid bounding by axes on all sides • containment often mistaken for organization (negative space) • multi-panel figures should maintain fixed scales for comparison • variation in axis ranges is easily overlooked • outliers shouldn’t compress the dynamic range of all the data 227 Krzywinski, M. Axes, ticks and grids. Nat Methods 10, 183 (2013)
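Keeping a fixed scale across the panels of a multi-panel figure amounts to computing one axis range from all panels rather than one range per panel. A minimal sketch, with hypothetical panel data and a hypothetical `padding` parameter:

```python
def shared_limits(panels, padding=0.05):
    """One (low, high) axis range covering the data of every panel.

    padding is the fraction of the data span added on each side.
    """
    lo = min(min(panel) for panel in panels)
    hi = max(max(panel) for panel in panels)
    margin = (hi - lo) * padding
    return lo - margin, hi + margin


# Hypothetical y-values of three panels with very different ranges.
panels = [[2, 4, 6], [10, 20, 30], [1, 3, 5]]
y_limits = shared_limits(panels)
print(y_limits)
```

Applying `y_limits` to every panel makes differences in magnitude visible instead of leaving them hidden in the tick labels.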

Slide 228

Slide 228 text

Ticks • densely labeled figures are hard on the eye • axis ticks burden the figure with repetition • esp. relevant for views of data across large genomes (filled with repeating non-significant zeros) • easy strategy to keep tick label complexity low while maintaining usability 228 Krzywinski, M. Axes, ticks and grids. Nat Methods 10, 183 (2013)
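One way to cut the repeated zeros in genomic tick labels is to move the unit into the axis title and print only the scaled number on each tick. This is a sketch of that idea under the assumption that positions are given in base pairs; the function name is hypothetical.

```python
def genome_tick_labels(positions_bp):
    """Short tick labels plus the unit to put in the axis title."""
    largest = max(abs(p) for p in positions_bp)
    for factor, unit in ((1_000_000, "Mb"), (1_000, "kb")):
        if largest >= factor:
            break
    else:
        factor, unit = 1, "bp"
    labels = [f"{p / factor:g}" for p in positions_bp]
    return labels, unit


labels, unit = genome_tick_labels([0, 5_000_000, 10_000_000, 15_000_000])
print(labels, f"position ({unit})")
```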

Slide 229

Slide 229 text

Grids • establish sight lines to compare proportions and relate positions to axis ticks • grid number is suggestive of the scale of differences • dense grid means minor fluctuations in the data and low uncertainty level • dense grid impedes accurate judgement due to high density • no grid may be better than a bad one • use only when needed 229 Krzywinski, M. Axes, ticks and grids. Nat Methods 10, 183 (2013)

Slide 230

Slide 230 text

Data-ink-ratio 230 Courtesy of Tomasz Przechlewski

Slide 231

Slide 231 text

How to! 231

Slide 232

Slide 232 text

Elements of graphical integrity 232

Slide 233

Slide 233 text

Functional layering 233

Slide 234

Slide 234 text

Labels and callouts Figure labels require the same consistency and alignment in their layout as text

Slide 235

Slide 235 text

Consistency and alignment • deal with complexity by using: 
 - labels to identify components 
 - defining terms and acronyms 
 - focusing reader’s attention 
 235 • labels are annotations • labels position in relation to data points • placement priority scheme Krzywinski, M. Labels and callouts. Nat Methods 10, 275 (2013)

Slide 236

Slide 236 text

Clarity • keep labels concise but clear • move common text to legend • explore ways to present labels in alignment 236 Krzywinski, M. Labels and callouts. Nat Methods 10, 275 (2013) • control spatial variation • if in doubt keep extra space

Slide 237

Slide 237 text

Integration • design figures to incorporate labels and callouts • even the space • group labels intuitively • use a grid system • uniform arrangement/spacing • consistent line lengths, angles, spacing and alignment • limit diversity in length and angle of callout lines 237 Krzywinski, M. Labels and callouts. Nat Methods 10, 275 (2013) Hanahan, D. & Weinberg, R.A. Cell 144, 646–674 (2011)

Slide 238

Slide 238 text

Plotting symbols Choose distinct symbols that overlap without ambiguity and communicate relationships in data

Slide 239

Slide 239 text

Plotting symbols 239 Krzywinski, M., Wong, B. Plotting symbols. Nat Methods 10, 451 (2013)

Slide 240

Slide 240 text

Symbol diversity • data categories encoded with distinct symbols • insufficient symbol contrast impedes identifying them • letters as plotting symbols • some draw more attention • bias assignment of category • color as efficient discriminator Krzywinski, M., Wong, B. Plotting symbols. Nat Methods 10, 451 (2013) 240

Slide 241

Slide 241 text

Natural hierarchies • data points represent genes classified by: 
 - type (gene, non processed pseudogene, processed) 
 - transcription state (on, off) • map salience to relevance • elevate important data using symbols with greater visual weight (fill and/or color) • single color isolates single var

Slide 242

Slide 242 text

Discrimination problem 242

Slide 243

Slide 243 text

Arrows Use well-proportioned arrows sparingly and consistently as a guide through complex information

Slide 244

Slide 244 text

Arrows in diagrams 244 Wong, B. Arrows. Nat Methods 8, 701 (2011)

Slide 245

Slide 245 text

Arrows and meanings • metaphorical uses (increase, decrease) • geometry tells us its purpose • arrows in one figure could have a different purpose • label parts, convey mechanical motion, flow, change, movement or causality • functional relationship of elements 245 Wong, B. Arrows. Nat Methods 8, 701 (2011)

Slide 246

Slide 246 text

Arrows 246

Slide 247

Slide 247 text

Arrows 247

Slide 248

Slide 248 text

‘There are two goals when presenting data: convey your story and establish credibility’ -Edward Tufte 248

Slide 249

Slide 249 text

Clarity is next! 249 Illustration by Tiago Galo

Slide 250

Slide 250 text

Clarity 250 Simplify your presentation to improve clarity

Slide 251

Slide 251 text

Refine, redraw 251

Slide 252

Slide 252 text

Less is more 252

Slide 253

Slide 253 text

Align to a grid 253 [Figure: principles of form and design. Panel labels: modular grid; golden ratio (1.61803399) and golden rectangle; rule of thirds (align focal point to one of the four circles)]
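The golden rectangle and the rule of thirds both reduce to simple arithmetic on the figure dimensions. A minimal sketch follows; the function names are hypothetical, and the four points are the intersections of the third lines, which is where the figure places its four circles.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2  # approximately 1.61803399


def golden_rectangle_height(width):
    """Height of a landscape golden rectangle with the given width."""
    return width / GOLDEN_RATIO


def rule_of_thirds_points(width, height):
    """The four intersections of the horizontal and vertical third lines."""
    xs = (width / 3, 2 * width / 3)
    ys = (height / 3, 2 * height / 3)
    return [(x, y) for y in ys for x in xs]


print(golden_rectangle_height(16.18))
print(rule_of_thirds_points(9, 6))
```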

Slide 254

Slide 254 text

Stay golden 254

Slide 255

Slide 255 text

Think in 3s 255

Slide 256

Slide 256 text

Use form and design principles 256 [Figure: principles of form and design. Panel labels: point, line, plane, volume, geometric, organic, static, active/dynamic, texture, symmetry, asymmetry, radial, gradation, rhythm, space/placement, space/scale, grouping, negative/positive, transparent, opaque, direction, layers, tension, scale, gesture, pattern, repetition, figure/ground ambiguous, figure/ground reversible]

Slide 257

Slide 257 text

… and Gestalt principles 257 [Figure: Gestalt principles of grouping. Panel labels: closure, area, symmetry, proximity, similarity, continuity; hierarchy through color contrast, shape contrast, weight and scale]

Slide 258

Slide 258 text

Simplify 258 Wong, B. Simplify to clarify. Nat Methods 8, 611 (2011)

Slide 259

Slide 259 text

Reduce redundancy 259 Wong, B. Simplify to clarify. Nat Methods 8, 611 (2011)

Slide 260

Slide 260 text

… and how about some fun? 260 Briscoe et al. (2014) Biology Letters Jablonski et al. (2012), Historical Biology, 24:5, 527-536

Slide 261

Slide 261 text

Science communication 261 blog.addgene.org/early-career-researcher-toolbox-free-tools-for-making-scientific-graphics

Slide 262

Slide 262 text

Understanding graphs • accurate interpretation of visual variables • effective graphs should: 
 - accommodate reader needs 
 - focus on human perception strengths • e.g. tough to accurately judge differences between two curves • perceptual system is attuned to detecting minimum distances • shortcoming of judging relative area • e.g. usefulness of bubble charts 262 Wong, B. Design of data figures. Nat Methods 7, 665 (2010)

Slide 263

Slide 263 text

Science communication 263

Slide 264

Slide 264 text

Jacques Bertin 264

Slide 265

Slide 265 text

Cleveland and McGill
Rank  Aspect to compare
1     Positions on a common scale
2     Positions on the same but nonaligned scales
3     Lengths
4     Angles, slopes
5     Area
6     Volume, color saturation
7     Color hue
265 Wong, B. Design of data figures. Nat Methods 7, 665 (2010)
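The ranking can be used as a lookup when choosing among candidate encodings for a quantitative variable. This is a minimal sketch; the dictionary keys and the helper function are hypothetical names, while the ranks are those of the table above.

```python
# Rank 1 is the most accurately judged aspect, rank 7 the least.
ACCURACY_RANK = {
    "position_common_scale": 1,
    "position_nonaligned_scale": 2,
    "length": 3,
    "angle": 4,
    "slope": 4,
    "area": 5,
    "volume": 6,
    "color_saturation": 6,
    "color_hue": 7,
}


def most_accurate(candidates):
    """Pick the candidate encoding with the best (lowest) rank."""
    return min(candidates, key=ACCURACY_RANK.__getitem__)


# Bar length beats bubble area and hue for reading off quantities.
print(most_accurate(["area", "length", "color_hue"]))
```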

Slide 266

Slide 266 text

Uncover trends? 266

Slide 267

Slide 267 text

Visual communication • sci comm with graphs depends on the design decisions made by authors • specifically, encoding info for readers to decode • strong visuals to compose better figures • rely on accurate perceptual tasks • support the visual assessment for better interpretation 267

Slide 268

Slide 268 text

Figurative vs abstract Examples of figure redesigns 268

Slide 269

Slide 269 text

Visualizing Science 269

Slide 270

Slide 270 text

Task and representation 270

Slide 271

Slide 271 text

Data as its core 271

Slide 272

Slide 272 text

Color usage 272

Slide 273

Slide 273 text

Overview first, detail later 273

Slide 274

Slide 274 text

The continuum 274

Slide 275

Slide 275 text

Points of review Examples of figure redesigns 275

Slide 276

Slide 276 text

Microscopy system 276 Wong, B. Points of view: Points of review (part 1). Nat Methods 8, 101 (2011)

Slide 277

Slide 277 text

Microscopy system • intended to illustrate 3 parts • redraw the figure so the threefold nature is apparent even at a glance • Gestalt principles to organize objects into groups • e.g. line connections, space containment, proximity • compartments for structure • the horizontal feature links the system together • negative space as separator, added uniformity using shapes 277 Wong, B. Points of view: Points of review (part 1). Nat Methods 8, 101 (2011)

Slide 278

Slide 278 text

Gene expression Wong, B. Points of view: Points of review (part 1). Nat Methods 8, 101 (2011) 278

Slide 279

Slide 279 text

Gene expression • fitting vertical structure to relate parts to one another • line up arrows for visual completion (connect and order process) • differentiate the central path from other elements using orientation and alignment to create salience • added reagents misaligned or placed at an angle from central molecules • consistent color encoding (green as barcode) Wong, B. Points of view: Points of review (part 1). Nat Methods 8, 101 (2011) 279

Slide 280

Slide 280 text

Data graphs • reading graphs to observe individual data points • keep each in memory to construct an image • fast process thanks to visual perception • graphical encoding supports detection and assembly process • certain tasks easier • e.g. reading bar chart vs pie 280 Wong, B. Points of view: Points of review (part 2). Nat Methods 8, 189 (2011)

Slide 281

Slide 281 text

Visual encodings 281 Wong, B. Points of view: Points of review (part 2). Nat Methods 8, 189 (2011)

Slide 282

Slide 282 text

Multivariate scatter plot 282 Wong, B. Points of view: Points of review (part 2). Nat Methods 8, 189 (2011)

Slide 283

Slide 283 text

Color, color, and more color 283 Wong, B. Points of view: Points of review (part 2). Nat Methods 8, 189 (2011)

Slide 284

Slide 284 text

Color illusions 284 http://www.psy.ritsumei.ac.jp/~akitaoka/color12e.html

Slide 285

Slide 285 text

Brightness illusion 285

Slide 286

Slide 286 text

Dimensionality is next! 286

Slide 287

Slide 287 text

Dimensionality 287 Refers to having vast amounts of variables or high dimensional data

Slide 288

Slide 288 text

Dimensions 288

Slide 289

Slide 289 text

Dimensions The XKCD Guide to the Universe's Most Bizarre Physics 289

Slide 290

Slide 290 text

Dimensions 290

Slide 291

Slide 291 text

Visualization • effective for spatial data, rarely effective for other types • complexity & understandability • higher effectiveness on the 2D plane • rely on non-spatial graphical encodings to add extra dimensions 291 https://3dmapart.com/

Slide 292

Slide 292 text

Urban Design and Planning https://sites.google.com/site/3ddatavisualizationresearch/ 292

Slide 293

Slide 293 text

Complexity 293

Slide 294

Slide 294 text

Understandability Deepen.ai from KITTI labeled dataset 294 https://labs.wsu.edu/kramerlab/lidar-data-visualization/ Niko et al. 107. 854-865. 10.1016/j.energy.2016.04.089 (2016)

Slide 295

Slide 295 text

Space-filling models 295 Gehlenborg, N., Wong, B. Into the third dimension. Nat Methods 9, 851 (2012)

Slide 296

Slide 296 text

3D representation of abstract data Gehlenborg, N., Wong, B. Into the third dimension. Nat Methods 9, 851 (2012) 296

Slide 297

Slide 297 text

Tips (1) • if one data dimension is categorical and there are few categories, use shapes • many approaches to represent nD data on a 2D plane • a matrix of scatter plots can reveal pairwise correlations • also: heat maps and parallel coordinate plots • dimensionality reduction methods (PCA, MDS, etc.) are acceptable, yet with info loss 297 http://www.turingfinance.com/artificial-intelligence-and-statistics-principal-component-analysis-and-self-organizing-maps/
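The numbers behind a scatter plot matrix are the pairwise correlations between dimensions. A minimal sketch using Pearson's correlation coefficient follows (the slides do not name a specific coefficient, so this choice is an assumption); the column names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Correlation for every ordered pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}


# Hypothetical dimensions of a small data set.
columns = {
    "x": [1, 2, 3, 4],
    "double_x": [2, 4, 6, 8],
    "neg_x": [4, 3, 2, 1],
}
corr = correlation_matrix(columns)
print(corr[("x", "double_x")], corr[("x", "neg_x")])
```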

Slide 298

Slide 298 text

Tips (2) • minimize impact of occlusion • animated rotation of objects of interest is common to show hidden surfaces (interactive) • semitransparent surfaces allow looking through or into objects • problem of having unintended artifacts, esp. with color usage • place labels onto the 2D projection, not in the 3D scene (distortion and readability) • take into account data properties • support vis goals with depth cues and consistent encodings 298 Nguyen, Lan Huong, and Holmes. "Ten quick tips for effective dimensionality reduction." PLoS computational biology 15.6 (2019)

Slide 299

Slide 299 text

Power of the plane 299 2D visualizations of multivariate data are most effective when combined

Slide 300

Slide 300 text

Power of the plane • parallel coordinate plots • scatter plots • highly useful 2D plot types for high-dimensional data • representation of data using location on a plane • strengths for highlighting different data aspects • data tasks to show: clusters, trends and outliers 300

Slide 301

Slide 301 text

Parallel coordinate plot • one data set: Iris (R.A. Fisher) • multiple visualizations • parallel coordinate plot handles n data types (a) • quantitative multivariate data over time or m conditions (b) • enables accurate comparisons across dimensions • robust graphical encodings • clear data relationships • limited suitability for data dominated by categorical information or small data ranges 301 Gehlenborg, N., Wong, B. Power of the plane. Nat Methods 9, 935 (2012)

Slide 302

Slide 302 text

Scatter plot • the choice between these two plots depends on the analytical task • the difference lies in how the data is represented • 1 data point in parallel coordinates is 1 line or 1 profile • supports pairwise correlations and other relationships between m dimensions • characteristic shapes of the point clouds • complement each other 302 Gehlenborg, N., Wong, B. Power of the plane. Nat Methods 9, 935 (2012)

Slide 303

Slide 303 text

Interactivity? 303

Slide 304

Slide 304 text

Multidimensional data Often combined with multidimensional analysis to divide complex data into groups

Slide 305

Slide 305 text

Complexity • focus on meaning instead of structure • anchor the figure to relevant domain knowledge content (versus method detail) • which findings are interesting? • what representation would communicate them clearly? • project data onto familiar visual paradigms • e.g., network or pathway to show biological effects • dimensions can be encoded as spatial or visual elements, such as along x and y axes or by color, size or symbol 305 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)

Slide 306

Slide 306 text

Small multiples • effective method for presentation • example: Study of drug effect on a network of signaling proteins 306 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)

Slide 307

Slide 307 text

Effective design • effective figures • use of spatial encoding to present the data domain (protein network) • small multiple maintains functional relationship between the proteins • assess the impact of n variables • incompatible for quantitative variables as it ‘muddles and confounds the analysis’ • small multiple scales well • original actually shows 392 cell type-drug combinations [Figure 5, Overview of inhibitor impact: (a) miniaturized cell type-specific signaling networks showing the effect of a stimulus or inhibitor on each quantified phosphorylation site; (b) impact of all inhibitors under all stimulation conditions for IgM+ B cells; (c) impact of all inhibitors on all cell types after 30 min IFN-α stimulation] 307 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)

Slide 308

Slide 308 text

Design decisions 308 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)

Slide 309

Slide 309 text

Helpful tips • focus the reader’s attention • focus on specific elements in displays of complex data • use a light visual style • use row and column numbers to aid data lookup • organize the presentation of high-dimensional data • leverage existing biological conceptual models • scope of data focused with a narrowed range or a table rearrangement 309 https://material.io/design/communication/data-visualization.html

Slide 310

Slide 310 text

Multidimensional examples 310 From high-dimensional data to storytelling

Slide 311

Slide 311 text

Spatial: China's maritime routes [Figure: map of Chinese maritime shipping routes and maritime-road projects, titled ‘China’s maritime-road projects cluster where disruption to its trade would be most costly’; color scale: increase in length of trade routes if closed to Chinese trade, weighted by cargo value, %. The Economist, Graphic detail, September 28th 2019.] 311

Slide 312

Slide 312 text

Let’s retro-engineer! ➡ Domain knowledge ➡ Task ➡ Data abstraction(s) ➡ Visual encoding(s) [Figure: nested model with validation methods per level. Domain situation: observe target users using existing tools; measure adoption. Data/task abstraction: observe target users after deployment. Visual encoding/interaction idiom: justify design with respect to alternatives; analyze results qualitatively; measure human time with lab experiment (lab study). Algorithm: measure system time/memory; analyze computational complexity.] Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009) 312

Slide 313

Slide 313 text

Spatial: When the lights go out [Figure: map of nocturnal luminosity on the Korean peninsula (March average); scatter plot of nocturnal luminosity per person v GDP per person at PPP, by country, log scales; line chart of North Korea's GDP per person, conventional estimates v luminosity-based estimate. The Economist, Graphic detail, May 4th 2019.] 313

Slide 314

Slide 314 text

Spatial: Playlists and politics [Figure: county-level map of the United States, titled ‘Musical preferences mirror America’s demographic and political divides’; most popular genre relative to national average, by county, share of live music tickets sold (country/folk, dance/electronica, Latin, pop, hip-hop/rap/R&B, rock/alternative), with a population legend. The Economist, Graphic detail, November 16th 2019.] 314

Slide 315

Slide 315 text

Spatial: Ice would suffice [Figure: ‘The Arctic is the epicentre of global warming’; map of Arctic sea ice annual minimum extent and ice thickness; relative volume change; observed temperature change by latitude, °C; diagram contrasting a stronger and a weaker jet stream. The Economist, Graphic detail, September 21st 2019.] 315

Slide 316

Slide 316 text

Non-spatial: Exalted valley [Figure: ‘Today’s biggest tech firms have surpassed their predecessors’ peak’; US technology companies' share of total US stockmarket value, %, annotated with the top-five tech firms in each month; top-five technology companies' share of US non-financial corporate profits, R&D spending among S&P firms, US advertising revenue and federal lobbying spending, %. The Economist, Graphic detail, August 10th 2019.] 316

Slide 317

Slide 317 text

Non-spatial: Teenage wasteland [Figure: ‘Teenagers are avoiding Facebook, as older users flock to it’; share of Americans using platforms at least once per month, by age group (Facebook, Instagram, Snapchat); advertising revenue, $bn; global monthly active users, bn, selected services. The Economist, Graphic detail, July 20th 2019.] 317

Slide 318

Slide 318 text

Spatial or non-spatial? [Figure: ‘A decade after the financial crisis, house prices are at new highs’; real house prices by country, log scale, actual and forecast; house-price forecast, % change on a year earlier, real terms, with confidence intervals. The Economist, Graphic detail, June 29th 2019.] 318

Slide 319

Slide 319 text

Spatial or non-spatial? [Figure: ‘China emits far less greenhouse gas per person than Western countries did at the same stage of economic development’; GDP per person v annual emissions per person, log scales; total annual greenhouse-gas emissions, gigatonnes of CO2 equivalent, with forecast. The Economist, May 25th 2019.] 319

Slide 320

Slide 320 text

Spatial or non-spatial? [Figure: ‘More-digitised countries use less cash. Enthusiastic governments can speed things along’; cash use v internet penetration (% of total transactions conducted in cash v internet users, % of population), by country; number of retail cash transactions per person for South Korea, Denmark, Netherlands and Sweden. The Economist, Graphic detail, August 3rd 2019.] 320

Slide 321

Slide 321 text

Multi-verse! 321

Slide 322

Slide 322 text

Data exploration is next! 322

Slide 323

Slide 323 text

Data integration Integration is often the coordination of processes so that they become part of a whole, or intermix 323

Slide 324

Slide 324 text

Data types • each with own inherent structure • specific visualization techniques • e.g., gene expression matrix values for cell measurements are meaningful as heat map or parallel coordinate plot • challenge of finding a vis that effectively integrates and combines the data types • understanding patterns and processes in research studies relies on data integration 324

Slide 325

Slide 325 text

Merging 2+ graphical forms • need for balance between optimal representation of one data type versus the other • networks are naturally displayed as node-link diagrams or adjacency matrices • Goal: Discover correlations, common trends, or causal relationships • design depends on what the analysis task calls for Gehlenborg, N., Wong, B. Integrating data. Nat Methods 9, 315 (2012) 325

Slide 326

Slide 326 text

Overview states? Gehlenborg, N., Wong, B. Integrating data. Nat Methods 9, 315 (2012) 326

Slide 327

Slide 327 text

Sequential comparison? 327 Gehlenborg, N., Wong, B. Integrating data. Nat Methods 9, 315 (2012)

Slide 328

Slide 328 text

General comparison? 328 Gehlenborg, N., Wong, B. Integrating data. Nat Methods 9, 315 (2012)

Slide 329

Slide 329 text

Summary • suitability of data vis methods strongly depends on question • distinct graphing techniques emphasize different data aspects • ability to see data in discrete form enables deeper understanding • useful to have tools that implement all or at least several in one interface • e.g., Cytoscape plugin Cerebral • suitability to switch between data views and analysis tasks 329 Gehlenborg, N., Wong, B. Integrating data. Nat Methods 9, 315 (2012) https://xkcd.com/373/

Slide 330

Slide 330 text

Panels or dashboards 330 https://xkcd.com/688/ Resolution https://xkcd.com/395/

Slide 331

Slide 331 text

Tasks and automation 331 https://xkcd.com/1319/

Slide 332

Slide 332 text

Data exploration The action of exploring the data and discovering patterns using graphical representation 332

Slide 333

Slide 333 text

Presentation vs Exploration • present known data characteristics • emphasis of identified point(s) of interest • explore data to understand structure • suspicion of regularities or patterns • no knowledge of exactly what they are • provide meaningful overviews to find patterns Wikipedia. Pattern Recognition. Alisneaky, svg version by User:Zirguezi 333

Slide 334

Slide 334 text

Anscombe’s quartet 334 Shoresh, N., Wong, B. Data exploration. Nat Methods 9, 5 (2012) The four sets of numbers in the quartet have many identical summary statistics (mean, variance, etc.)
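The claim about identical summary statistics can be checked with a minimal sketch, assuming the commonly reproduced values of the quartet (Anscombe, 1973). The names `QUARTET`, `pearson`, `summarize` and `SUMMARY` are illustrative and not part of the lecture.

```python
from statistics import mean, variance

# Anscombe's quartet, values as commonly reproduced from Anscombe (1973).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, written out to avoid extra dependencies."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Collect the summary statistics that the four data sets share."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "r": pearson(xs, ys),
    }


SUMMARY = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARY.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows are nearly indistinguishable, which is why the quartet only reveals its four different shapes once the points are plotted.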

Slide 335

Slide 335 text

Anscombe’s quartet 335 Shoresh, N., Wong, B. Data exploration. Nat Methods 9, 5 (2012)

Slide 336

Slide 336 text

The process • iterative • often overview-first, detail-later • graphical organization of the data is guided by expectations and hypotheses • observed patterns refine or germinate new hypotheses • Anscombe’s quartet shows why one should not rely solely on computational metrics Visualize Transform Model Data Communicate 336

Slide 337

Slide 337 text

High-dimensional data • exploratory goal to find ‘classes of behavior’ among multiple components • e.g., genes, populations, samples, etc • create simple representations of low-dimensional data ‘slices’ • useful strategy that restricts complexity • one plot for each component • makes the visual task of finding commonality between plots simpler • ensure consistency (same scale) 337 Shoresh, N., Wong, B. Data exploration. Nat Methods 9, 5 (2012)
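The "same scale" advice for one-plot-per-component layouts can be sketched as follows. This is a minimal illustration, not the method from the cited article: `shared_limits`, the `pad` parameter and the gene names and values are all hypothetical.

```python
def shared_limits(panels, pad=0.05):
    """Return one (low, high) axis range that covers the values of every panel.

    Using the same range in each small multiple keeps the plots comparable.
    `pad` adds a small margin as a fraction of the overall data range.
    """
    values = [value for series in panels.values() for value in series]
    low, high = min(values), max(values)
    margin = (high - low) * pad
    return low - margin, high + margin


# Hypothetical measurements, one series (and therefore one plot) per gene.
panels = {
    "gene_a": [0.2, 0.8, 1.5, 1.1],
    "gene_b": [2.0, 2.4, 3.0, 2.7],
    "gene_c": [0.0, 0.1, 0.3, 0.2],
}

y_limits = shared_limits(panels)
print(y_limits)
```

Each plot would then be drawn with `y_limits` as its y-axis range, so that a flat line in one panel and a steep line in another reflect the data rather than per-panel autoscaling.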

Slide 338

Slide 338 text

Alternative using small multiple 338 Shoresh, N., Wong, B. Data exploration. Nat Methods 9, 5 (2012)

Slide 339

Slide 339 text

Helpful tips • visual burden by simultaneous representation of all the data • limit observation number • sample a subset of the data • less important features may be removed • focus on small number of features • attention on essential info • add features to support story Geological Maps of Henry Lake Quadrangle, Idaho and Montana, 1972 by Irvin J. Whitkind. USGS 339
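Two of the tips above, sampling a subset of observations and keeping only a few features, can be sketched with the standard library. The function names, the fixed seed and the example records are assumptions made for illustration only.

```python
import random


def sample_observations(observations, k, seed=0):
    """Return at most k observations, chosen at random but in original order.

    A fixed seed makes the subset reproducible between runs.
    """
    if k >= len(observations):
        return list(observations)
    rng = random.Random(seed)
    chosen = sorted(rng.sample(range(len(observations)), k))
    return [observations[i] for i in chosen]


def keep_features(record, features):
    """Keep only the named features of one observation."""
    return {name: record[name] for name in features}


# Hypothetical records with more features than the figure needs.
records = [{"id": i, "x": i * 0.5, "y": i ** 2, "note": "n/a"} for i in range(100)]

subset = [keep_features(r, ("x", "y")) for r in sample_observations(records, 10)]
print(len(subset), subset[0])
```

Random sampling is only one way to limit the number of observations; when the story depends on rare points, a targeted selection may be the better choice.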

Slide 340

Slide 340 text

Last lecture is next! 340

Slide 341

Slide 341 text

Lecture landscape Spatial mapping of the lecture 341

Slide 342

Slide 342 text

Introduction 342

Slide 343

Slide 343 text

The process 343

Slide 344

Slide 344 text

Layout, salience, negative space, etc 344

Slide 345

Slide 345 text

The story Separation: A hero ventures forth from the world of common day into a region of supernatural wonder. Initiation: Fabulous forces are there encountered and a decisive victory is won. Return: The hero comes back from this mysterious adventure with the power to bestow boons on his fellow man. Info we trust, RJ Andrews. Joseph Campbell, 1949 345

Slide 346

Slide 346 text

Design process, the nested model, etc 346

Slide 347

Slide 347 text

Nested model of visualization • domain situation: 
 - who are the target users? • abstraction: translate from specifics of domain to vocabulary of vis 
 - what is shown? data abstraction 
 - why is the user looking at it? task abstraction • idiom 
 - how is it shown? 
 + visual encoding idiom: how to draw 
 + interaction idiom: how to manipulate • algorithm for efficient computation [Figure: nested levels of visualization design: domain, abstraction, idiom, algorithm] Munzner. IEEE TVCG 15(6):921-928, 2009 (Proc. InfoVis 2009) 347

Slide 348

Slide 348 text

Layout every detail Giorgia Lupi, 2016. Sketching with Data Opens the Mind’s Eye 348

Slide 349

Slide 349 text

Color 349

Slide 350

Slide 350 text

Categorical data [Tableau software blog. How we designed the new color palettes in Tableau 10? Stone, 2016] Tableau10 350

Slide 351

Slide 351 text

[New matplotlib colormaps. https://bids.github.io/colormap/] Continuous data 351

Slide 352

Slide 352 text

Plot types 352

Slide 353

Slide 353 text

What is your intent? Adapted from IBM Design 353

Slide 354

Slide 354 text

Elements of design 354

Slide 355

Slide 355 text

Natural hierarchies • data points represent genes classified by: 
 - type (gene, non processed pseudogene, processed) 
 - transcription state (on, off) • map salience to relevance • elevate important data using symbols with greater visual weight (fill and/or color) • single color isolates single var
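One way to read "map salience to relevance" is as a small lookup from data attributes to drawing style. The sketch below is an assumption about how that could look, not the design used in the lecture figure: the marker shapes, the colors and the `point_style` function are hypothetical, and only the category names come from the slide.

```python
# Hypothetical marker shape per gene type (category names follow the slide).
MARKERS = {
    "gene": "o",
    "non processed pseudogene": "s",
    "processed": "^",
}


def point_style(gene_type, transcribed):
    """Return an illustrative style for one data point.

    Points with transcription "on" are treated as the relevant ones and get
    more visual weight (fill plus color); the rest stay hollow and grey.
    Color is used for one variable only, the transcription state.
    """
    return {
        "marker": MARKERS[gene_type],
        "filled": transcribed,
        "color": "red" if transcribed else "grey",
    }


print(point_style("gene", True))
print(point_style("processed", False))
```

Because shape carries the gene type and fill and color carry only the transcription state, a reader can pick out the "on" points at a glance without decoding two variables from one channel.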

Slide 356

Slide 356 text

Clarity, figurative vs abstract, etc 356

Slide 357

Slide 357 text

Visualizing Science 357

Slide 358

Slide 358 text

Overview first, detail later 358

Slide 359

Slide 359 text

Dimensionality, power of the plane 359

Slide 360

Slide 360 text

Complexity • focus on meaning instead of structure • anchor the figure to relevant domain knowledge content (versus method detail) • which findings are interesting? • what representation would communicate them clearly? • project data onto familiar visual paradigms • e.g., network or pathway to show biological effects • dimensions can be encoded as spatial or visual elements, such as along x and y axes or by color, size or symbol 360 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)
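The last bullet, encoding dimensions along x and y or by color, size or symbol, can be sketched as a simple assignment of data dimensions to visual channels. The channel order, the `assign_channels` function and the example dimension names are illustrative assumptions.

```python
# Visual channels named on the slide, in an assumed order of preference.
CHANNELS = ("x", "y", "color", "size", "symbol")


def assign_channels(dimensions, channels=CHANNELS):
    """Assign each data dimension to its own visual channel.

    Raises ValueError when there are more dimensions than channels, which is
    a hint to split the figure (for example into small multiples).
    """
    if len(dimensions) > len(channels):
        raise ValueError("more dimensions than available visual channels")
    return dict(zip(channels, dimensions))


# Hypothetical dimensions of a drug-response data set.
encoding = assign_channels(
    ["inhibitor", "cell_type", "percent_inhibition", "potency"]
)
print(encoding)
```

In practice the order matters: the most important dimensions usually go to position (x and y), and less critical ones to color, size or symbol.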

Slide 361

Slide 361 text

Small multiples • effective method for presentation • example: Study of drug effect on a network of signaling proteins 361 Krzywinski, M., Savig, E. Multidimensional data. Nat Methods 10, 595 (2013)

Slide 362

Slide 362 text

Data integration and exploration 362

Slide 363

Slide 363 text

The process • iterative • often overview-first, detail-later • graphical organization of the data is guided by expectations and hypotheses • observed patterns refine or germinate new hypotheses • Anscombe’s quartet shows why one should not rely solely on computational metrics Visualize Transform Model Data Communicate 363

Slide 364

Slide 364 text

The process 364