March 24, 2020
Visual Encoding of Datasets
So far we have seen the presentation (visual encoding) of data types (see lecture 4)
Visualisation techniques and visual encoding of datasets
- tables
- spatial data
  - geometry
  - fields
- networks and trees

Arrange Tables …
Why the arrange design choice?
- most crucial visual encoding, since the use of space dominates a user's mental model of a dataset
- spatial position covers the three most effective channels for ordered attributes (primacy of spatial position channels)
  - planar position against a common scale
  - planar position along an unaligned scale
  - length
- the best channel for categorical attributes (grouping items in the same region) is also about spatial position

Arrange by Keys and Values
The distinction between key and value attributes introduced earlier is very relevant for visually encoding table data
Core design choices for visually encoding tables depend on the semantics of the table attributes (key or value)
- scatterplot: visually encoding two value attributes
- bar chart: visually encoding one key and one value attribute
- heatmap: visually encoding two keys and one value attribute
- …
Keys are typically used to define a spatial region for each item in which one or multiple value attributes are shown

Express Values
The spatial position channel can be used to visually encode quantitative attributes
- each item is encoded with a mark at some position along an axis
- additional attributes are encoded on the same mark via other non-spatial channels (e.g. colour or size)
- glyphs can be used for more complex cases (multiple marks), as shown later in this course

Scatterplot Example (Bubble Plot)
Each point mark represents a country
- horizontal and vertical positions encode life expectancy and infant mortality
  - colour channel for categorical country attribute
  - size channel for quantitative population attribute (bubble plot)
Highly negatively correlated dataset
- points follow a downward-sloping diagonal

Scatterplot Example
Relation between diamond price and weight
Derived attributes (logarithmically scaled) in right figure
- strongly positively correlated attributes
  - calculated regression line often superimposed to support the task

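The log scaling and the superimposed regression line from the diamond example can be sketched as follows. This is a minimal illustration using an ordinary least-squares fit on log10-transformed values; the function name `log_log_fit` and the data are hypothetical, not taken from the lecture.

```python
import math

def log_log_fit(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (slope, intercept)."""
    lx = [math.log10(x) for x in xs]  # derived attribute: log-scaled x
    ly = [math.log10(y) for y in ys]  # derived attribute: log-scaled y
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that follows price = 1000 * weight^2 exactly
weights = [0.5, 1.0, 2.0, 4.0]
prices = [1000 * w ** 2 for w in weights]
slope, intercept = log_log_fit(weights, prices)
```

On log-log axes a power-law relation becomes a straight line, which is why the derived attributes make the positive correlation easier to read.
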
Scatterplots
What (Data)     Table: two quantitative value attributes.
Why (Task)      Find trends, outliers, distribution, correlation; locate clusters.
How (Encode)    Express values with horizontal and vertical spatial position and point marks.
Scale           Hundreds of items.

Separate, Order & Align Regions
Categorical attributes have unordered identity semantics
- encoding them directly with ordered spatial position would imply an order that does not exist and thus violate the principle of expressiveness
Use spatial regions to group items that share the same categorical attribute value
- distribution of regions via three operations
  - separation into regions (based on categorical attribute)
  - aligning of regions (optional, based on ordered attribute)
  - ordering of regions (based on ordered attribute)
One-dimensional list alignment is often used for a single key (separation with one region per item)
- view itself covers a two-dimensional area
  - aligned list of items on one spatial dimension
  - region in which the values are shown on a second dimension (e.g. bar charts)

Bar Chart Example
Categorical species key attribute separates the marks along the horizontal spatial axis
- alphabetical ordering (left picture): easy lookup by name
- data-driven ordering by weight (right picture): easier to see data trends
Separate line marks for weight attribute in each region

Bar Charts
What (Data)     Table: one quantitative value attribute, one categorical key attribute.
Why (Task)      Lookup and compare values.
How (Encode)    Line marks, express value attribute with aligned vertical position, separate key attribute with horizontal position.
Scale           Key attribute: dozens to hundreds of levels.

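The two region orderings from the bar chart example (alphabetical versus data-driven) can be sketched as follows. The helper names `order_bars` and `text_bars` and the species weights are hypothetical; the text rendering only stands in for real line marks.

```python
def order_bars(table, by="key"):
    """table maps key -> value; returns (key, value) pairs in display order."""
    if by == "key":    # alphabetical ordering: easy lookup by name
        return sorted(table.items())
    if by == "value":  # data-driven ordering: trends become visible
        return sorted(table.items(), key=lambda kv: kv[1], reverse=True)
    raise ValueError("by must be 'key' or 'value'")

def text_bars(rows, width=20):
    """One region per key; bar length is aligned to a common scale."""
    largest = max(value for _, value in rows)
    return [f"{key:<8}" + "#" * round(width * value / largest)
            for key, value in rows]

# Hypothetical weights per species
weights = {"pig": 120.0, "cat": 4.0, "dog": 30.0, "horse": 500.0}
by_name = order_bars(weights, by="key")
by_weight = order_bars(weights, by="value")
```

Calling `print("\n".join(text_bars(by_weight)))` shows the same marks in a different region order; only the ordering operation changes, not the encoding.
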
Stacked Bar Chart Example
Bars distributed along the x-axis based on combination of processor and procedure
More complex glyph for each bar
- multiple sub-bars stacked vertically
  - uses colour (type of cache miss) as well as length (number of misses) coding
  - common scale only for lowest bar component
- order of stacking is relevant

Stacked Bar Charts
What (Data)     Multidimensional table: one quantitative value attribute, two categorical key attributes.
Why (Task)      Part-to-whole relationship, lookup values, find trends.
How (Encode)    Bar glyph with length-coded subcomponents of value attribute for each category of secondary key attribute. Separate bars by category of primary key attribute.
Scale           Key attribute (main axis): dozens to hundreds of levels.
                Key attribute (stacked glyph axis): several to one dozen.

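Why only the lowest component sits on a common scale can be seen from how the sub-bars are positioned. The sketch below computes the bottom and top of each stacked segment; `stack_segments` and the cache-miss counts are hypothetical illustrations.

```python
def stack_segments(values):
    """values: (category, amount) pairs in stacking order.

    Returns (category, bottom, top) per segment. Only the first segment
    starts at the shared baseline 0; all others start where the previous
    one ended, so their lengths are judged on unaligned scales.
    """
    segments = []
    bottom = 0
    for category, amount in values:
        segments.append((category, bottom, bottom + amount))
        bottom += amount
    return segments

# Hypothetical miss counts for one processor/procedure combination
bar = stack_segments([("L1", 30), ("L2", 10), ("TLB", 5)])
```

Changing the stacking order changes which category gets the aligned baseline, which is why the order of stacking is relevant.
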
Streamgraph Example
Music listening history example
- one time series per artist, counting the number of times their music was listened to each week
Continuity of the horizontal layers emphasises legibility of individual streams
- deliberate organic silhouette (instead of x-axis as baseline)

Streamgraphs
What (Data)     Multidimensional table: one quantitative value attribute (e.g. counts), one ordered key attribute (time), one categorical key attribute.
What (Derived)  One quantitative attribute (for computation of layer ordering).
Why (Task)      Find trends.
How (Encode)    Use derived geometry showing artist layers across time, layer height encodes counts.
Scale           Key attribute (time, main axis): hundreds of time points.
                Key attribute (artists, not always over entire time axis): dozens to hundreds.

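The idea of derived layer geometry without an x-axis baseline can be sketched with the simplest possible choice: a baseline that centres the stack around zero at every time step. This is an assumption for illustration only; actual streamgraph layouts choose the baseline and the layer order more carefully, which is not reproduced here. The name `symmetric_layers` and the data are hypothetical.

```python
def symmetric_layers(series):
    """series: one equal-length list of counts per layer (e.g. per artist).

    Returns, per layer, a list of (lower, upper) boundaries per time step.
    The stack is centred on zero, so the silhouette is symmetric instead
    of resting on the x-axis.
    """
    n_steps = len(series[0])
    layers = [[] for _ in series]
    for t in range(n_steps):
        total = sum(s[t] for s in series)
        lower = -total / 2          # centred baseline (assumed here)
        for i, s in enumerate(series):
            upper = lower + s[t]    # layer height encodes the count
            layers[i].append((lower, upper))
            lower = upper
    return layers

# Hypothetical weekly listening counts for two artists
layers = symmetric_layers([[2, 4], [2, 0]])
```

A layer with count 0 collapses to zero height at that time step, which matches the note that artists do not always span the entire time axis.
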
Dot Chart vs. Line Chart Example
Dot chart and line chart for the same dataset showing a cat's weight over time
- trends are emphasised by line charts
- dot chart is like a bar chart where the quantitative attribute is encoded with point marks rather than line marks

Dot Charts
What (Data)     Table: one quantitative value attribute, one ordered key attribute.
Why (Task)      Lookup and compare values.
How (Encode)    Express value attribute with aligned vertical position and point marks. Separate/order into horizontal regions by key attribute.
Scale           Key attribute: dozens to hundreds of levels.

Line Charts
Line charts should only be used for ordered key attributes but not for categorical key attributes!
- would imply trends that do not exist (violation of expressiveness principle)
Aspect ratio (width/height)
- we are better at judging angles close to 45°
- banking to 45° idiom computes the best aspect ratio with as many line segments as possible close to 45°
What (Data)     Table: one quantitative value attribute, one ordered key attribute.
Why (Task)      Show trends.
How (Encode)    Dot chart with connection marks between dots.
Scale           Key attribute: hundreds of levels.

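One simple way to bank to 45° is the median-absolute-slope heuristic: pick the width/height ratio so that the median on-screen slope of the line segments has magnitude 1. This is a sketch of that one heuristic, not necessarily the variant meant on the slide; `bank_to_45` and the data are hypothetical.

```python
from statistics import median

def bank_to_45(xs, ys):
    """Return a width/height aspect ratio for a line chart.

    On-screen slope = data slope * (x_range / y_range) * (height / width),
    so a median on-screen slope of 1 requires
    width / height = median(|data slope|) * x_range / y_range.
    Assumes the y values are not all equal.
    """
    points = list(zip(xs, ys))
    slopes = [abs((y1 - y0) / (x1 - x0))
              for (x0, y0), (x1, y1) in zip(points, points[1:])
              if x1 != x0]  # skip vertical segments
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    return median(slopes) * x_range / y_range

# Hypothetical zigzag: every segment rises or falls by the full y range
ratio = bank_to_45([0, 1, 2, 3], [0, 10, 0, 10])
```

For the zigzag above the result is a chart three times as wide as it is tall, so each of the three segments spans a square and is drawn at exactly 45°.
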
Matrix Alignment
Datasets with two keys are often arranged in a two-dimensional matrix alignment
- one key distributed along the rows and one key along the columns
- rectangular cell in the matrix shows item values
Examples of matrix alignments are heatmaps or the scatterplot matrix (SPLOM)

Scatterplot Matrix (SPLOM)
Usually only the lower or upper triangle of the matrix is shown (avoid redundancy)
What (Data)     Table.
What (Derived)  Ordered key attribute: list of original attributes.
Why (Task)      Find correlation, trends, outliers.
How (Encode)    Scatterplots in 2D matrix alignment.
Scale           Attributes: one dozen. Items: dozens to hundreds.

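The derived key (the list of original attributes) and the restriction to one triangle can be sketched as follows. The function `splom_cells` and the attribute names are hypothetical.

```python
from itertools import combinations

def splom_cells(attributes):
    """Return the (row attribute, column attribute) pairs of the lower triangle.

    The attribute list itself acts as the derived ordered key for both the
    rows and the columns. Each unordered pair appears once, because the
    upper triangle would only repeat the same plots with swapped axes.
    """
    indices = range(len(attributes))
    return [(attributes[row], attributes[col])
            for col, row in combinations(indices, 2)]  # col < row

cells = splom_cells(["price", "weight", "depth"])  # hypothetical attributes
```

With n attributes this yields n(n-1)/2 scatterplots, which is one reason the idiom scales to roughly a dozen attributes.
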
Volumetric Grids and Recursive Subdivision
Volumetric grid
- aligns data in three dimensions based on three key attributes
- typically not recommended for non-spatial (abstract) data due to perceptual problems (e.g. occlusion or perspective distortion)
Recursive subdivision
- recursively subdivides a cell via a list or matrix and thereby supports multiple keys
- discussed later in the course

Spatial Axis Orientation
Rectilinear layouts
- regions or items are distributed along orthogonal horizontal and vertical axes
Parallel layouts
- parallel coordinates can be used to visualise many quantitative attributes at once
- provides overview over all attributes
  - also shows the range of values for individual attributes
Radial layouts
- items distributed around a circle using the angle channel in addition to one or multiple linear spatial channels
  - more efficient in showing periodic patterns

Parallel Coordinates
What (Data)     Table: many value attributes.
Why (Task)      Find trends, outliers, extremes, correlation.
How (Encode)    Parallel layout: horizontal spatial position used to separate axes, vertical spatial position used to express value along each aligned axis with connection line marks as segments between them.
Scale           Attributes: dozens along secondary axis. Items: hundreds.

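The parallel layout described above can be sketched by turning one table item into the vertices of its polyline. The function `polyline`, the attribute names and the value ranges are hypothetical; each axis is assumed to span the given (low, high) range with low below high.

```python
def polyline(item, ranges, axis_spacing=1.0):
    """Map one item to polyline vertices in a parallel coordinates view.

    item:   attribute -> value
    ranges: attribute -> (low, high); dict order defines the axis order
    Horizontal position separates the axes; vertical position expresses
    the value, normalised to 0..1 along each aligned axis.
    """
    points = []
    for i, (attribute, (low, high)) in enumerate(ranges.items()):
        y = (item[attribute] - low) / (high - low)
        points.append((i * axis_spacing, y))
    return points

# Hypothetical attributes and ranges
ranges = {"mpg": (10, 50), "hp": (50, 250)}
vertices = polyline({"mpg": 30, "hp": 250}, ranges)
```

Consecutive vertices are joined by connection line marks, so reordering the keys of `ranges` changes which attribute pairs can be compared directly.
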
Radial Bar Chart Example
Radial Bar Charts
What (Data)     Table: one quantitative attribute, one categorical attribute.
Why (Task)      Find periodic patterns.
How (Encode)    Length coding of line marks; radial layout.

Pie Chart Examples
[Figure panels: pie chart, bar chart and polar area chart; pie chart and normalised stacked bar chart (relative contributions of parts to a whole); pie chart versus bar chart accuracy; stacked bar chart]

Pie Charts and Polar Area Charts
Pie Charts
What (Data)     Table: one quantitative attribute, one categorical attribute.
Why (Task)      Part-whole relationship.
How (Encode)    Area marks (wedges) with angle channel; radial layout.
Scale           One dozen categories.
Polar Area Charts
What (Data)     Table: one quantitative attribute, one categorical attribute.
Why (Task)      Part-whole relationship.
How (Encode)    Area marks (wedges) with length channel; radial layout.
Scale           One dozen categories.

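The difference between the two encodings can be sketched by computing the wedge geometry. In the pie chart the angle varies and the radius is constant; in the polar area chart every wedge gets the same angle and the radius varies. The linear radius scaling below follows the length channel named above and is an assumption of this sketch (some polar area charts scale the wedge area instead). Function names and data are hypothetical.

```python
def pie_wedges(values):
    """category -> (start angle, end angle) in degrees; angle encodes value."""
    total = sum(values.values())
    wedges, start = {}, 0.0
    for category, value in values.items():
        sweep = 360 * value / total
        wedges[category] = (start, start + sweep)
        start += sweep
    return wedges

def polar_area_wedges(values):
    """category -> (start angle, end angle, radius); radius encodes value."""
    sweep = 360 / len(values)      # equal angle for every category
    largest = max(values.values())
    return {category: (i * sweep, (i + 1) * sweep, value / largest)
            for i, (category, value) in enumerate(values.items())}

shares = {"a": 1, "b": 1, "c": 2}  # hypothetical data
pie = pie_wedges(shares)
polar = polar_area_wedges(shares)
```
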
Normalised Stacked Bar Charts
What (Data)     Multidimensional table: one quantitative value attribute, two categorical key attributes.
What (Derived)  One quantitative value attribute (normalised version of original attribute).
Why (Task)      Part-whole relationship.
How (Encode)    Line marks with length channel; rectilinear layout.
Scale           One dozen categories for stacked attribute. Several dozen categories for axis attribute.

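The derived attribute of a normalised stacked bar chart is simply each value divided by the total of its bar. A minimal sketch, with the hypothetical helper `normalise_stack` and made-up counts:

```python
def normalise_stack(counts):
    """counts: category -> value for one bar; returns category -> share of the bar."""
    total = sum(counts.values())  # assumes a non-zero total
    return {category: value / total for category, value in counts.items()}

shares = normalise_stack({"a": 1, "b": 3})  # hypothetical counts
```

Every bar then has the same total length, so the chart shows relative contributions only; absolute totals are no longer visible.
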
Spatial Layout Density
Layout can be dense or sparse
Dense layout
- uses small and densely packed marks to provide an overview of as many items as possible
- maximally dense layout uses a single pixel for each point mark
  - only planar position and colour channels can be used
Space-filling layout
- fills all available space in the view
- typically uses area marks for items and containment marks for relationships (e.g. treemaps discussed later)
- maximises the available room for colour coding and might offer space for labels
- disadvantage: cannot make use of white space in the layout

Dense Software Overviews
What (Data)     Text with numbered lines (source code, test results log).
What (Derived)  Two quantitative attributes (test execution results).
Why (Task)      Locate faults, summarise results and coverage.
How (Encode)    Dense layout. Spatial position and line length from text ordering. Colour channels of hue and brightness.
Scale           Lines of text: ten thousand.

Arrange Spatial Data …
Two main spatial data types
- Geometry
- Spatial Fields
  - scalar fields with a single value associated with each cell in the field
  - vector fields with multiple values associated with each cell (e.g. computational fluid dynamics)
  - tensor fields with a matrix associated with each cell capturing more complex structure (e.g. for stress, conductivity etc.)
Given spatial position is the attribute of primary importance
- use provided position as the substrate for the visual layout

Geometry
Geometric data does not necessarily have attributes associated with it
- can be derived from raw source data (e.g. geographic data about the earth)
Geographic Data
- derive a geometry dataset based on abstractions (e.g. filtering, aggregation or level of detail) on the underlying raw data
- cartographic data (non-spatial information) can be used to size code the marks
  - e.g. size of point marks representing cities by their population
Other derived geometry data
- e.g. based on computations on spatial fields

Choropleth Map
What (Data)     Geographic geometry data. Table with one quantitative attribute per region.
Why (Task)      Find clusters.
How (Encode)    Space: use given geometry for area mark boundaries. Colour: sequential segmented colourmap.

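A sequential segmented colourmap first bins the quantitative attribute into a small number of classes and then assigns one colour per class. The sketch below uses equal-interval binning, which is only one of several possible classification schemes; the function name, the region values and the light-to-dark palette are hypothetical.

```python
def equal_interval_classes(values, n_classes):
    """region -> class index (0 = lowest) using equally wide value intervals.

    Assumes the values are not all identical.
    """
    low, high = min(values.values()), max(values.values())
    width = (high - low) / n_classes
    classes = {}
    for region, value in values.items():
        index = int((value - low) / width)
        classes[region] = min(index, n_classes - 1)  # maximum goes in top class
    return classes

# Hypothetical light-to-dark sequential palette and per-region values
PALETTE = ["#eff3ff", "#bdd7e7", "#6baed6", "#2171b5"]
rates = {"A": 0, "B": 4, "C": 10, "D": 7}
classes = equal_interval_classes(rates, len(PALETTE))
colours = {region: PALETTE[index] for region, index in classes.items()}
```

The resulting colour fills the area mark whose boundary comes from the given geometry of each region.
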
Scalar Fields
Scalar spatial field has a single value associated with each spatially defined cell
- e.g. data from medical scans with radio-opacity (CT scan) or proton density (MRI scan)
Isocontours
- use isolines to represent the contours of a particular level of the scalar value
- isolines are close together in regions with fast change

Topographic Terrain Maps
What (Data)     2D spatial field; geographic data.
What (Derived)  Geometry: set of isolines computed from the field.
Why (Task)      Query shape.
How (Encode)    Use given geographic data geometry of points, lines, and region marks. Use derived geometry as line marks (blue in example).
Scale           Dozens of contour levels.

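Deriving isolines from a field comes down to finding where the field crosses a given level between neighbouring samples. The sketch below does this for a single row of a grid with linear interpolation; full contouring methods apply the same idea to every cell edge of a 2D grid. The function `row_crossings` and the sample heights are hypothetical.

```python
def row_crossings(row, level):
    """Positions along one grid row where the interpolated field equals level.

    A crossing lies between samples i and i+1 when one is below and the
    other above the level; its position is found by linear interpolation.
    """
    positions = []
    for i in range(len(row) - 1):
        a, b = row[i], row[i + 1]
        if (a - level) * (b - level) < 0:
            positions.append(i + (level - a) / (b - a))
    return positions

heights = [0, 10, 20, 10]  # hypothetical elevation samples
low_contour = row_crossings(heights, 5)
high_contour = row_crossings(heights, 15)
```

Where the field changes quickly, crossings for successive levels fall close together, which is the reason isolines bunch up in steep regions.
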
Arrange Networks and Trees …
Connections (links) can be represented in different ways
- node-link diagrams with explicit connection marks
  - well-suited for tasks that involve the understanding of the network topology (e.g. shortest path between two nodes or finding all adjacent nodes)
- adjacency matrix
- enclosure with containment marks
  - only works for trees but not for networks

Multilevel Force-Directed Placement (SFDP)
What (Data)     Network.
What (Derived)  Cluster hierarchy on top of original network.
Why (Task)      Explore topology, locate paths and clusters.
How (Encode)    Point marks for nodes, connection marks for links.
Scale           Nodes: 1000-10'000. Links: 1000-10'000. Node/link density: L<4N.

Adjacency Matrix View Example
Network nodes laid out along the horizontal and vertical edges of a square region
- links between nodes indicated by colouring an area mark in the cell forming the intersection of the two nodes' row and column
- additional attribute can be visualised by colouring the matrix cells

Adjacency Matrix View
Better scalability than node-link diagrams
Predictability, stability and support for reordering
Drawback: hard to investigate topological structure (e.g. tracing paths)
What (Data)     Network.
What (Derived)  Table: network nodes as keys, link status between two nodes as values.
Why (Task)      "Explore" strongly connected networks.
How (Encode)    Area marks in 2D matrix alignment.
Scale           Nodes: 1000. Links: one million.

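The derived table (nodes as keys, link status as values) can be sketched as follows. The network is assumed to be undirected, so the matrix is symmetric; `adjacency_matrix` and the example network are hypothetical.

```python
def adjacency_matrix(nodes, links):
    """Derive the link-status table from a node list and a list of links.

    nodes: node names; their order defines the row and column order
    links: (node, node) pairs, treated as undirected (an assumption here)
    Cell [i][j] is 1 if nodes i and j are linked, otherwise 0.
    """
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in links:
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return matrix

links = [("a", "b"), ("b", "c")]  # hypothetical network
matrix = adjacency_matrix(["a", "b", "c"], links)
reordered = adjacency_matrix(["b", "a", "c"], links)
```

Passing the nodes in a different order reorders rows and columns without changing the data, which is the reordering support mentioned above. With N nodes the matrix always has N x N cells, which is consistent with the stated scale of 1000 nodes and one million links.
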
Enclosure
Containment marks are effective for showing complete information about hierarchical structure (instead of pairwise relationships only)
Treemaps as an alternative to node-link diagrams
- hierarchical relationships shown via containment rather than connection marks

Treemaps
What (Data)     Tree.
Why (Task)      Query attributes at leaf nodes.
How (Encode)    Area marks and containment, with rectilinear layout.
Scale           Leaf nodes: one million. Links: one million.

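How containment and area marks combine in a rectilinear treemap can be sketched with the slice-and-dice layout, one of the simplest treemap layouts (the lecture does not say which layout its example uses). The tree is given as nested dicts with numeric leaves; the function names, the tree and the assumption of unique leaf names are all hypothetical.

```python
def subtree_size(node):
    """Sum of leaf values below node (a number is a leaf, a dict is a subtree)."""
    if isinstance(node, dict):
        return sum(subtree_size(child) for child in node.values())
    return node

def slice_and_dice(node, x, y, w, h, horizontal=True):
    """Return leaf name -> (x, y, width, height).

    Each subtree's rectangle is split among its children in proportion to
    their sizes; the split direction alternates per level, so children are
    always contained in their parent's rectangle.
    """
    rects = {}
    total = subtree_size(node)
    offset = 0.0
    for name, child in node.items():
        share = subtree_size(child) / total
        if horizontal:
            cx, cy, cw, ch = x + offset * w, y, share * w, h
        else:
            cx, cy, cw, ch = x, y + offset * h, w, share * h
        if isinstance(child, dict):
            rects.update(slice_and_dice(child, cx, cy, cw, ch, not horizontal))
        else:
            rects[name] = (cx, cy, cw, ch)
        offset += share
    return rects

tree = {"a": 2, "b": {"c": 1, "d": 1}}  # hypothetical tree with leaf sizes
rects = slice_and_dice(tree, 0, 0, 4, 2)
```

Leaf area is proportional to leaf value, so attributes at leaf nodes can be compared by area, while the links of the tree are only implied by nesting.
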
GrouseFlocks Example
Visualisation for compound networks (tree on top of a network)
- network nodes form the leaves of the tree
- tree encoded via containment on top of the original graph (picture on the right)

GrouseFlocks
What (Data)     Network.
What (Derived)  Cluster hierarchy on top of the original network.
How (Encode)    Connection marks for original network, containment marks for cluster hierarchy.

Further Reading
This lecture is mainly based on the book Visualization Analysis & Design
- chapter 7: Arrange Tables
- chapter 8: Arrange Spatial Data
- chapter 9: Arrange Networks and Trees