January 21, 2020

# Week 3 - Categorical

Data Visualization Slides

## Transcript

1. ### MB-6151 Data Visualization Week 3

• Vision & Attention
• Intro to Tableau
• Categorical Visualizations
• Hierarchical Visualizations

3. ### Categorical Visualizations

Categorical data can be nominal, qualitative, or ordinal.

Purely categorical data can come in a range of formats:
• raw data: individual observations
• aggregated data: counts for each unique combination of levels
• cross-tabulated data
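The three formats above hold the same information at different levels of summarization. A minimal sketch of moving from one to the next, assuming a made-up set of car observations (the countries, sizes, and counts are illustrative, not data from the slides):

```python
from collections import Counter

# Hypothetical raw data: one (country, size) tuple per observed car.
raw = [
    ("American", "Large"), ("American", "Large"), ("American", "Medium"),
    ("European", "Small"), ("European", "Medium"),
    ("Japanese", "Small"), ("Japanese", "Small"), ("Japanese", "Medium"),
]

# Aggregated data: a count for each unique combination of levels.
aggregated = Counter(raw)

# Cross-tabulated data: rows are countries, columns are sizes.
countries = sorted({country for country, _ in raw})
sizes = ["Small", "Medium", "Large"]
crosstab = {
    country: {size: aggregated[(country, size)] for size in sizes}
    for country in countries
}

for country in countries:
    print(country, crosstab[country])
```

Combinations that never occur in the raw data show up as zeros in the cross-tab, which is one reason the cross-tabulated form is often the easiest starting point for a chart.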

6. ### Bars & Columns

Technically different things.

Modern nomenclature refers to them pretty much interchangeably.

The only major difference is orientation (horizontal v. vertical).

8. ### Bars & Columns

Use a bar chart when:
• Labels are long
• Many categories
9. ### Bars & Columns

Use a column chart when:
• “Default” standard
• Representing data sets over a period of time
11. ### Standard Bar Chart

A bar chart plots numeric values for levels of a categorical feature as bars.

Levels are plotted on one chart axis, and values are plotted on the other axis.

Each categorical value claims one bar, and the length of each bar corresponds to the bar’s value.
12. ### Bar v. Histogram

Bar charts are used to compare variables. Bar charts plot categorical data.

Example: The variables on the horizontal axis are categorical; they provide the names of the exhibitions. The vertical axis indicates time in minutes. The height of each bar represents the median time for that exhibition.

https://www.forbes.com/sites/naomirobbins/2012/01/04/a-histogram-is-not-a-bar-chart/#166f022f6d77
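The bar chart in this example reduces each category to a single number. A rough sketch of that reduction, assuming invented exhibition names and visit times (neither comes from the cited article):

```python
from statistics import median

# Hypothetical visit durations in minutes, grouped by exhibition name.
visits = {
    "Dinosaurs": [12, 25, 31, 40, 18],
    "Space": [5, 9, 14, 22],
    "Oceans": [30, 45, 50],
}

# One bar per category; its height is the median visit time.
bar_heights = {name: median(times) for name, times in visits.items()}

# Crude text rendering: one '#' per 5 minutes, rounded down.
for name, height in bar_heights.items():
    print(f"{name:<10} {'#' * int(height // 5)} {height}")
```

The order of the bars is arbitrary because the categories have no inherent order, which is one practical difference from a histogram.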
13. ### Bar v. Histogram

Histograms show distributions of variables. Histograms plot quantitative data, with ranges of the data grouped into bins or intervals.

Example: The first bin includes visits from 0 up to and including 10 minutes, the second bin includes visits longer than 10 and up to and including 20 minutes, and so on.

https://www.forbes.com/sites/naomirobbins/2012/01/04/a-histogram-is-not-a-bar-chart/#166f022f6d77
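The binning rule in this example puts each boundary value in the lower bin (a visit of exactly 10 minutes counts in the first bin). A small sketch of that rule, where `bin_label` and the sample durations are hypothetical names and values chosen for illustration:

```python
import math
from collections import Counter


def bin_label(minutes, width=10):
    """Return the bin for a visit length; each bin includes its upper edge."""
    # A visit of exactly 0 minutes is placed in the first bin.
    index = max(math.ceil(minutes / width) - 1, 0)
    low, high = index * width, (index + 1) * width
    return f"{low}-{high}"


# Hypothetical visit lengths in minutes.
durations = [0, 4, 10, 11, 20, 20.5, 37]
histogram = Counter(bin_label(d) for d in durations)

for label, count in histogram.items():
    print(label, "#" * count)
```

Unlike bar chart categories, these bins have a fixed order and cover a continuous range, so their sequence cannot be rearranged.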
14. ### Bar Chart Design

Excel gives you a lot of ways to be “cute” or give additional visual detail. Don’t do it.

Rounding the bars or making them 3D causes confusion about where the cutoff point is.
15. ### Bar Chart Design

Consider your pre-attentive processing attributes and Gestalt principles.

Use them sparingly, and only to call out key details.

17. ### Grouped Bar Graph

Again, technically this could be a grouped bar graph or a grouped column graph.

Takes the bar graph one step further and plots two variables instead of one.

Color almost always represents the secondary variable.
18. ### Grouped Bar Graph

Colors and positions should always be consistent.

Potential accessibility problems (we will address these in the future).

Use when you:
• want to look at how the second category variable changes within each level of the first
• want to look at how the first category variable changes across levels of the second

20. ### Stacked Bar Chart

Similar to the grouped bar graph, but the bars in each group combine to form one column.

Helpful when comparing whole categories.

Not helpful when comparing within a category.
21. ### Grouped v. Stacked

Grouped bar charts:
• Comparing between each element in the categories
• Comparing elements across categories
• Hard to tell the difference between the totals of the groups

Stacked bar charts:
• Great for showing a visual aggregate and differences between totals
• Hard to compare sizes within one category

Still confused? visual.ly/blog/how-groups-stack-up-when-to-use-grouped-vs-stacked-column-charts/
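The trade-off comes down to where each bar starts. A minimal sketch of the two layouts, assuming made-up regions, products, and counts; each entry is a hypothetical `(label, start, end)` triple, not the output of any charting tool:

```python
# Hypothetical counts: region is the primary variable, product the secondary.
data = {
    "North": {"A": 30, "B": 20, "C": 10},
    "South": {"A": 15, "B": 25, "C": 35},
}
products = ["A", "B", "C"]

# Grouped: every bar starts at zero, so individual values compare directly.
grouped = {
    region: [(p, 0, values[p]) for p in products]
    for region, values in data.items()
}

# Stacked: each segment starts where the previous one ended.
stacked = {}
for region, values in data.items():
    bottom, segments = 0, []
    for p in products:
        segments.append((p, bottom, bottom + values[p]))
        bottom += values[p]
    stacked[region] = segments

# The top of the last stacked segment is the group total.
totals = {region: segs[-1][2] for region, segs in stacked.items()}
print(totals)
```

In the stacked layout only the first segment shares a common baseline, which is why sizes within a category are harder to compare, while the totals can be read straight off the top of each column.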

23. ### A Graph By Many Names

In statistics, where it originated, it’s called a Mosaic Plot.

In modern data visualization tools, including Tableau, it is known as a Marimekko Chart.

Has a ton of other names: matrix chart, stacked spinogram, spineplot, olympic or submarine chart, a Mondrian diagram, or even shortened to just mekko chart.
24. ### Intro

• A combination of a 100% stacked column chart and a 100% stacked horizontal-bar chart, using a different variable for each
• A variable-width stacked column chart
• A way to show part-to-whole relationships across two variables at once
25. ### Mosaic Plot

A mosaic plot is a graphical display that allows you to examine the relationship among two or more categorical variables.

The area of each box represents the total for that combination of categories.

How to Build a Mosaic Plot, step by step: http://www.pmean.com/definitions/mosaic.htm
26. ### Example

For this plot:
• The proportions on the x-axis represent the number of observations for each level of the X variable, which is country.
• The proportions on the y-axis at right represent the overall proportions of Small, Medium, and Large cars for the combined levels (American, European, and Japanese).
• The scale of the y-axis at left shows the response probability, with the whole axis being a probability of one (representing the total sample).
27. ### Mosaic Plot

Like a histogram, the width of each bin on the X axis represents the number of observations.

The height of each box (on the Y axis) represents the frequency within each category.
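The width and height rules can be worked through numerically. A sketch under assumed data: the counts below are invented for illustration and only borrow the country and car-size labels from the example slide.

```python
# Hypothetical cross-tab: car counts by country (X) and size (Y).
counts = {
    "American": {"Small": 10, "Medium": 30, "Large": 60},
    "European": {"Small": 20, "Medium": 20, "Large": 10},
    "Japanese": {"Small": 30, "Medium": 15, "Large": 5},
}
grand_total = sum(sum(row.values()) for row in counts.values())

tiles = {}
for country, row in counts.items():
    column_total = sum(row.values())
    width = column_total / grand_total  # share of all observations
    for size, n in row.items():
        height = n / column_total  # share within this country
        tiles[(country, size)] = (width, height)

for (country, size), (width, height) in tiles.items():
    print(f"{country:<9} {size:<7} width={width:.2f} height={height:.2f}")
```

Because width times height equals the tile's share of all observations, the heights within each column sum to 100% while the areas across the whole plot sum to 100% as well.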
28. ### Design Tips

• Each category (or group) should be 100%
• The heights within each bin are a percentage, not a count
• You can use multiple variables (example on right has four)
• Too many variables can make it hard to read
29. ### Visualizations

Categorical:
• Bar & Column: Standard, Stacked, Grouped
• Mosaic Plot
• Marimekko Chart

Numerical:
• Scatterplot
• Histogram
• Box Plot
• Line & Area
• Pie

Table can be either.