Data Representation - Lecture 3 - Information Visualisation (4019538FNR)

2 December 2005 Information Visualisation Data Representation Prof. Beat Signer
Department of Computer Science Vrije Universiteit Brussel beatsigner.com

Beat Signer - Department of Computer Science - [email protected] 2
February 27, 2025 Information Visualisation Process [Figure: process diagram with the stages Data, Data Representation and Data Presentation, labelled with mapping, perception and visual thinking, and Interaction]

February 27, 2025 Data Representation and Abstraction ▪ Detailed look at the what part of the earlier what-why-how question → what-why-how analysis framework ▪ Provide a language that is meaningful and useful for vis design ▪ Data is typically described with domain language ▪ in order to find the suitable visual representations, we have to translate the data into more abstract structures that we know how to encode ▪ Data abstraction helps to narrow down the design space

February 27, 2025 Semantics and Types ▪ Many aspects of vis design driven by the kind of data ▪ semantics (real-world meaning) ▪ types (data as well as datasets) ▪ What do the following datasets represent? ▪ 15, 2.7, 27, 27, 15, 10021 ▪ Basil, 7, S, Pear

February 27, 2025 Semantics and Types … [Visualization Analysis & Design, Tamara Munzner, 2014]

February 27, 2025 Data Types ▪ Item ▪ individual discrete entity ▪ e.g. table row or network node ▪ Attribute ▪ also referred to as variable or dimension ▪ property that can be measured, observed or logged ▪ e.g. price or temperature ▪ Link ▪ relationship between items ▪ e.g. between items (nodes) in a network

February 27, 2025 Data Types … ▪ Position ▪ spatial data ▪ e.g. location in two-dimensional or three-dimensional space ▪ Grids ▪ sampling continuous data in terms of geometric and topological relationships between its cells

February 27, 2025 Dataset Types ▪ Dataset ▪ collection of information to be analysed ▪ made out of the five data types ▪ complex combinations of basic dataset types are common

February 27, 2025 Tables ▪ Flat table ▪ row represents an item of data ▪ column represents an attribute of the dataset ▪ a cell contains the value for a given item and attribute ▪ Multidimensional table ▪ indexing into a cell via multiple keys
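The difference between the two table types can be sketched in a few lines of Python. This is only an illustration: the attribute names (fruit, price, size), the sales figures and the helper functions cell and lookup are made up for this example and do not come from the lecture.

```python
# Flat table: each row is an item, each column an attribute (hypothetical data).
flat_table = [
    {"fruit": "apple", "price": 0.5, "size": "M"},
    {"fruit": "pear", "price": 0.7, "size": "L"},
]


def cell(table, row_index, attribute):
    """Return the value stored for a given item (row) and attribute (column)."""
    return table[row_index][attribute]


# Multidimensional table: a cell is indexed via multiple keys,
# here the (hypothetical) keys fruit and year.
sales = {
    ("apple", 2023): 120,
    ("apple", 2024): 135,
    ("pear", 2023): 80,
}


def lookup(table, *keys):
    """Index into a cell of a multidimensional table via all of its keys."""
    return table[keys]


print(cell(flat_table, 1, "price"))
print(lookup(sales, "apple", 2024))
```

In the flat table the row index alone identifies an item, while in the multidimensional table a value can only be found when all keys are given.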

February 27, 2025 Networks and Trees ▪ Network (graph) ▪ defines relationships between two or more nodes (items) via links ▪ nodes can have associated attributes ▪ links can have associated attributes ▪ e.g. people and their friendships or gene interaction network ▪ Trees ▪ hierarchical structure without cycles ▪ each child node has one parent node ▪ e.g. company organisation chart or biological tree of life
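A minimal sketch of how a network with node and link attributes might be stored, together with a check of the two tree properties named above (one parent per child, no cycles). The people, the attribute names and the functions neighbours and is_tree are assumptions made for illustration only.

```python
# Network: nodes (items) and links, both with optional attributes (hypothetical data).
nodes = {"ann": {"age": 31}, "bob": {"age": 27}, "cy": {"age": 45}}
links = [("ann", "bob", {"since": 2019}), ("bob", "cy", {"since": 2021})]


def neighbours(node, links):
    """Return all nodes that share an (undirected) link with the given node."""
    result = set()
    for a, b, _attributes in links:
        if a == node:
            result.add(b)
        elif b == node:
            result.add(a)
    return result


def is_tree(parent_of):
    """Check a child -> parent mapping for a single root and absence of cycles.

    The mapping itself already guarantees that each child has one parent.
    """
    roots = set(parent_of.values()) - set(parent_of)
    if len(roots) != 1:
        return False
    for node in parent_of:
        seen = set()
        while node in parent_of:  # walk up towards the root
            if node in seen:
                return False  # the same node was visited twice: a cycle
            seen.add(node)
            node = parent_of[node]
    return True


org_chart = {"cto": "ceo", "cfo": "ceo", "dev": "cto"}
print(neighbours("bob", links))
print(is_tree(org_chart))
```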

February 27, 2025 Fields ▪ Field ▪ each cell contains measurements or calculations from a continuous domain ▪ continuous data brings along the issues of sampling and interpolation
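The two issues of sampling and interpolation can be illustrated with a one-dimensional sketch. Piecewise-linear interpolation is used here only as one simple choice; the lecture does not prescribe a particular scheme, and the function names sample and interpolate are assumptions.

```python
def sample(f, start, stop, n):
    """Sample a continuous function f at n regularly spaced positions."""
    step = (stop - start) / (n - 1)
    xs = [start + i * step for i in range(n)]
    return xs, [f(x) for x in xs]


def interpolate(xs, ys, x):
    """Estimate the value at x from the two neighbouring samples (linear)."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled domain")
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return ys[i] + t * (ys[i + 1] - ys[i])


# Five samples of a (hypothetical) continuous signal on [0, 4].
xs, ys = sample(lambda x: 2 * x, 0.0, 4.0, 5)
print(interpolate(xs, ys, 2.5))
```

Values between the sampled positions are never stored; they always have to be reconstructed from neighbouring cells.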

February 27, 2025 Fields … ▪ Spatial fields ▪ sampling at spatial positions ▪ e.g. medical scan of a human body or measurements in a wind tunnel ▪ if the spatial position is given with the dataset, we talk about scientific visualisation (scivis) (in contrast to information visualisation (infovis) where the use of space is chosen by the designer) ▪ Grid types ▪ uniform grid: sampling at regular intervals without any need to store the grid geometry or grid topology (connection of cells) ▪ rectilinear grid: supports non-uniform sampling - efficient storage of information with high complexity in some areas and low complexity in others (also store grid geometry)

February 27, 2025 Fields … ▪ Grid types … ▪ structured grid: enables curvilinear shapes where the geometric location of each cell needs to be specified ▪ unstructured grid: complete flexibility but grid geometry as well as grid topology has to be stored explicitly
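A rough sketch of why a uniform grid does not need stored geometry while a rectilinear grid does. The function names and the example coordinates are assumptions chosen for illustration; structured and unstructured grids would additionally need a position per cell and explicit connectivity.

```python
def uniform_position(origin, spacing, index):
    """Uniform grid: a cell position is computed from origin, spacing and index."""
    return tuple(o + s * i for o, s, i in zip(origin, spacing, index))


def rectilinear_position(axes, index):
    """Rectilinear grid: positions are looked up in stored per-axis coordinates."""
    return tuple(axis[i] for axis, i in zip(axes, index))


# Uniform 2D grid: nothing but origin and spacing has to be stored.
print(uniform_position((0.0, 0.0), (0.5, 2.0), (3, 1)))

# Rectilinear 2D grid: dense sampling near x = 0, coarse sampling further away.
axes = ([0.0, 0.1, 0.2, 1.0], [0.0, 5.0])
print(rectilinear_position(axes, (3, 1)))
```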

February 27, 2025 Geometry ▪ Information about the shape of items with spatial positions ▪ points and one-dimensional lines or curves ▪ two-dimensional surfaces or regions ▪ three-dimensional volumes ▪ Geometry datasets do not necessarily have attributes ▪ e.g. contours derived from a spatial field or shapes generated from raw geographic data (e.g. boundaries of a forest) ▪ Shown alone or as backdrop for other data

February 27, 2025 Other Combinations ▪ Cluster ▪ grouping items based on similarity of attributes ▪ Set ▪ unordered group of items ▪ List (array) ▪ ordered group of items ▪ Path ▪ ordered set of segments formed by links connecting nodes in a network ▪ Compound network (multilevel network) ▪ network combined with superimposed tree (with all the nodes of the network as leaves)

February 27, 2025 Dataset Availability ▪ Static file (offline) ▪ entire dataset is available all at once ▪ Dynamic stream (online) ▪ dataset information trickles in over time ▪ addition, update or deletion of items ▪ adds complexity to the vis process - no longer have all data at a given time
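For a dynamic stream, the dataset has to be maintained incrementally as events arrive. The sketch below assumes a hypothetical event format (kind, key, value) and a dictionary as the dataset; neither is defined in the lecture.

```python
def apply_event(dataset, event):
    """Apply one stream event (addition, update or deletion) to the dataset."""
    kind, key, value = event
    if kind in ("add", "update"):
        dataset[key] = value
    elif kind == "delete":
        dataset.pop(key, None)
    else:
        raise ValueError(f"unknown event kind: {kind}")
    return dataset


# Events trickle in over time; the vis only ever sees the current state.
stream = [
    ("add", "item1", 10),
    ("add", "item2", 20),
    ("update", "item1", 15),
    ("delete", "item2", None),
]

dataset = {}
for event in stream:
    apply_event(dataset, event)

print(dataset)
```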

February 27, 2025 Attribute Types ▪ Categorical (nominal) attributes ▪ no implicit ordering (but often hierarchical structure) ▪ external ordering can be superimposed ▪ e.g. different types of fruits

February 27, 2025 Attribute Types … ▪ Ordered attributes ▪ ordinal data - well-defined ordering but cannot do full-fledged arithmetic - e.g. t-shirt size ▪ quantitative data - measurement of magnitude that supports arithmetic comparison (integers as well as real numbers) - e.g. height, temperature or stock price ▪ Ordering directions ▪ sequential - homogeneous range from minimum to maximum value - e.g. mountain heights (from sea level to height of Mount Everest) ▪ diverging - e.g. valleys in the sea and mountains on land

February 27, 2025 Attribute Types … ▪ Ordering directions … ▪ cyclic - values wrap around back to the starting point - e.g. time measurements like the hour of the day or the day of the week ▪ Hierarchical attributes ▪ hierarchical structures within or between multiple attributes ▪ e.g. time series of daily stock prices where time can be aggregated hierarchically (from days to weeks, months and years)
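The attribute types differ in the operations they support, which the following sketch tries to make concrete. The t-shirt sizes, the sea level midpoint and all function names are illustrative assumptions.

```python
# Ordinal: well-defined ordering, but no arithmetic on the values themselves.
SIZES = ["S", "M", "L", "XL"]


def ordinal_less(a, b, order=SIZES):
    """Compare two ordinal values by their position in the ordering."""
    return order.index(a) < order.index(b)


def diverging_offset(value, midpoint=0.0):
    """Diverging: signed offset from a midpoint (e.g. sea level for elevation)."""
    return value - midpoint


def cyclic_distance(a, b, period):
    """Cyclic: shortest distance between two values that wrap around."""
    d = abs(a - b) % period
    return min(d, period - d)


print(ordinal_less("M", "XL"))      # ordering is defined for ordinal data
print(diverging_offset(-11000.0))   # a sea valley lies below the midpoint
print(cyclic_distance(23, 1, 24))   # hours of the day wrap around
```

For cyclic data, the hours 23 and 1 are only two hours apart, although their plain difference is 22.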

February 27, 2025 Key Versus Value Semantics ▪ Type of an attribute does not tell us about its semantics ▪ key attribute (independent attribute) represents an index that is used to look up value attributes (dependent attributes) ▪ key attributes can be categorical or ordinal ▪ value attributes can be categorical, ordinal or quantitative ▪ Flat tables ▪ key might be implicit (simply the index of the row) or explicit (attribute within table with unique values) ▪ Multidimensional tables ▪ multiple keys are required to look up an item ▪ combination of all keys must be unique for each item
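Whether an attribute (or a combination of attributes) can act as a key comes down to uniqueness, which can be checked directly. The rows, the attribute names and the function is_key below are hypothetical and only serve as an illustration.

```python
def is_key(rows, attributes):
    """Return True if the combination of the given attributes is unique per item."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key in seen:
            return False
        seen.add(key)
    return True


# Hypothetical table with one explicit key and several other attributes.
rows = [
    {"order_id": 1, "region": "EU", "year": 2023, "sales": 10},
    {"order_id": 2, "region": "EU", "year": 2024, "sales": 12},
    {"order_id": 3, "region": "US", "year": 2023, "sales": 9},
]

print(is_key(rows, ["order_id"]))        # explicit key of a flat table
print(is_key(rows, ["region"]))          # not unique on its own
print(is_key(rows, ["region", "year"]))  # unique in combination
```

Note that uniqueness in the current data is only a necessary condition: whether an attribute is meant as a key is a question of semantics and cannot be derived from the values alone.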

February 27, 2025 Example: Order Table [Visualization Analysis & Design, Tamara Munzner, 2014]

February 27, 2025 Key Versus Value Semantics … ▪ Fields ▪ independent variable to look up dependent variable ▪ multivariate structure - depends on number of value attributes - scalar field: one attribute per cell - vector field: two or more attributes per cell - tensor field: many attributes per cell ▪ multidimensional structure - depends on number of keys - e.g. 2D or 3D fields

February 27, 2025 Temporal Semantics ▪ Temporal attribute is any kind of information that is related to time ▪ Data about time is complicated to handle ▪ time hierarchy is deeply multiscale (from nanoseconds to hours, decades or millennia) ▪ temporal scales do not all fit into a strict hierarchy (e.g. weeks do not cleanly fit into months) ▪ transformation and aggregation become complex ▪ Time-varying semantics ▪ time is one of the key attributes (as opposed to being a value) ▪ Time-series dataset ▪ ordered sequence of time-value pairs
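The remark that weeks do not cleanly fit into months shows up as soon as a time series is aggregated at both scales. The following sketch uses made-up values and a hypothetical helper mean_by; only the standard datetime module is assumed.

```python
from collections import defaultdict
from datetime import date

# Time-series dataset: ordered sequence of time-value pairs (hypothetical values).
series = [
    (date(2024, 1, 29), 10.0),
    (date(2024, 1, 31), 12.0),
    (date(2024, 2, 1), 14.0),
    (date(2024, 2, 2), 16.0),
]


def mean_by(series, key):
    """Aggregate the values by a temporal key and return the mean per group."""
    groups = defaultdict(list)
    for day, value in series:
        groups[key(day)].append(value)
    return {k: sum(v) / len(v) for k, v in groups.items()}


by_month = mean_by(series, lambda d: (d.year, d.month))
by_week = mean_by(series, lambda d: tuple(d.isocalendar()[:2]))  # (ISO year, ISO week)

print(by_month)
print(by_week)
```

All four days fall into the same ISO week but into two different months, so the weekly aggregate cannot be derived from the monthly one or vice versa.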

February 27, 2025 Task Abstraction ▪ Next we have to investigate the why part of the what-why-how analysis framework ▪ what is the goal of using the vis? ▪ Transform task description from domain-specific language into abstract form ▪ enables reasoning about similarities ▪ Who has the goal? ▪ designer of the vis or the end user?

February 27, 2025 Actions ▪ User goals can be defined by actions at three levels of abstraction ▪ Analyse - consume existing or also produce additional data ▪ Search - what kind of search is involved (are the target and location known)? ▪ Query - need to identify one target, compare some targets or summarise all of the targets?

February 27, 2025 Analyse ▪ Most common use case for vis is to consume information that has already been generated

February 27, 2025 Consume ▪ Discover (Explore) ▪ use vis to find new knowledge that was not previously known ▪ serendipitous observation of unexpected data ▪ may be motivated by theories, models or hypotheses ▪ outcome is to generate a new hypothesis or verify (or disconfirm) an existing hypothesis ▪ need for sophisticated interactive vis idioms since we do not know in advance what the user will need to see ▪ note that why the vis is being used does not dictate how it is designed ▪ Present (Explain) ▪ communication of information, telling a story with data or guiding an audience through a series of cognitive operations - decision making, planning, forecasting or instructional processes ▪ e.g. Gapminder application shown earlier

February 27, 2025 Consume … ▪ Present (Explain) … ▪ output of a discover session might become input for a present session ▪ Enjoy ▪ casual encounter with vis - not driven by need to verify or generate a hypothesis ▪ e.g. Name Voyager

February 27, 2025 Produce ▪ Generate new material which is often immediately used as input for a next instance ▪ Annotate ▪ graphical or textual annotations of existing visualisation elements - annotations of data items might be stored as a new attribute ▪ typically a manual user action ▪ Record ▪ save or capture visualisation elements ▪ screenshots, bookmarks, parameter settings or interaction logs ▪ e.g. graphical history with a snapshot of the output of each task

February 27, 2025 Produce … ▪ Derive ▪ produce new data elements based on existing data elements ▪ strong relationship between the form of the data (attribute and dataset types) and the vis idioms that are effective at presenting it ▪ derived attributes can be used to extend the dataset - from quantitative to ordinal data (water temperature → cold, warm or hot) - adding latitude and longitude to city names (via lookup in separate DB) - arithmetic operations on existing attributes
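Two of the derivations listed above can be sketched as follows: turning a quantitative attribute into an ordinal one, and an arithmetic operation on existing attributes. The thresholds of 15 and 30 degrees Celsius, the attribute names and the measurements are arbitrary assumptions for illustration.

```python
def temperature_class(celsius, cold_below=15.0, hot_from=30.0):
    """Derive an ordinal attribute (cold, warm, hot) from a quantitative one.

    The thresholds are arbitrary and would depend on the domain.
    """
    if celsius < cold_below:
        return "cold"
    if celsius < hot_from:
        return "warm"
    return "hot"


# Hypothetical items with two quantitative attributes.
measurements = [
    {"water_temp": 8.0, "air_temp": 12.0},
    {"water_temp": 22.0, "air_temp": 25.0},
    {"water_temp": 35.0, "air_temp": 30.0},
]

for item in measurements:
    # Quantitative -> ordinal.
    item["water_class"] = temperature_class(item["water_temp"])
    # Arithmetic operation on existing attributes.
    item["temp_difference"] = item["air_temp"] - item["water_temp"]

print([item["water_class"] for item in measurements])
```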

February 27, 2025 Targets ▪ Three high-level targets ▪ Trends ▪ high-level characterisation of a pattern in the data ▪ e.g. increases, decreases, peaks, plateaus, …

February 27, 2025 Targets … ▪ Outliers ▪ data that does not fit well with the backdrop ▪ Features ▪ task-dependent structures of interest

February 27, 2025 Targets … ▪ Single attributes ▪ individual values, minimum or maximum, … ▪ Multiple attributes ▪ dependencies, correlations and similarities

February 27, 2025 Targets … ▪ network topology as well as specific paths

February 27, 2025 Targets … ▪ understanding and comparing geometric shapes

February 27, 2025 Search ▪ Lookup ▪ user knows what they are looking for and where it is

February 27, 2025 Search … ▪ Locate ▪ user knows what they are looking for but does not know where it is ▪ Browse ▪ user does not know exactly what they are looking for but has a location in mind where to look for it ▪ Explore ▪ user does not know what they are looking for or where to search ▪ often beginning from an overview of everything ▪ e.g. searching for outliers in a scatterplot
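The four search types form a two-by-two classification over whether the target and the location are known. A compact way to write this down, with search_type being a hypothetical helper name:

```python
def search_type(target_known, location_known):
    """Classify a search by whether the target and its location are known."""
    if target_known:
        return "lookup" if location_known else "locate"
    return "browse" if location_known else "explore"


for target_known in (True, False):
    for location_known in (True, False):
        print(target_known, location_known, search_type(target_known, location_known))
```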

February 27, 2025 Query ▪ Once a target or set of targets is found, query these targets to identify, compare or summarise the data

February 27, 2025 Query … ▪ Identify ▪ if the search returns known targets (lookup or locate) then identify returns their characteristics ▪ if the search returns targets matching particular characteristics (browse or explore) then identify returns specific references ▪ Compare ▪ comparing multiple targets ▪ more difficult than an identify task and requires more sophisticated vis idioms to support the user ▪ Summarise (overview) ▪ scope is all possible targets

February 27, 2025 Exercise 3 ▪ Preprocessing and Data Analysis Using Python

February 27, 2025 Further Reading ▪ This lecture is mainly based on the book Visualization Analysis & Design ▪ chapter 2 - What: Data Abstraction ▪ chapter 3 - Why: Task Abstraction

February 27, 2025 References ▪ Visualization Analysis & Design, Tamara Munzner, Taylor & Francis Inc (Har/Psc edition), November 2014, ISBN-13: 978-1466508910 ▪ Name Voyager ▪ https://www.babynamewizard.com/voyager/ ▪ M. Brehmer and T. Munzner, A Multi-Level Typology of Abstract Visualization Tasks, IEEE Transactions on Visualization and Computer Graphics 19(12), 2013 ▪ https://doi.org/10.1109/TVCG.2013.124

2 December 2005 Next Lecture Analysis and Validation
