Week 3 - Categorical

Data Visualization Slides


Ashley Dzick

January 21, 2020


  1. MB-6151 Data Visualization, Week 3

    • Vision & Attention • Intro to Tableau • Categorical Visualizations • Hierarchical Visualizations
  2. Visualizations: Categorical

  3. Categorical Visualizations

    Categorical (qualitative) data can be nominal or ordinal.
    Purely categorical data can come in a range of formats:
    • raw data: individual observations
    • aggregated data: counts for each unique combination of levels
    • cross-tabulated data
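
The three formats listed above hold the same information at different levels of aggregation. A minimal sketch using only the Python standard library; the car records are invented for illustration and are not data from the slides:

```python
from collections import Counter

# Hypothetical raw data: one (origin, size) pair per observed car.
raw = [
    ("American", "Large"), ("American", "Large"), ("American", "Medium"),
    ("European", "Small"), ("European", "Medium"),
    ("Japanese", "Small"), ("Japanese", "Small"), ("Japanese", "Medium"),
]

# Aggregated data: a count for each unique combination of levels.
aggregated = Counter(raw)

# Cross-tabulated data: one row per origin, one column per size.
origins = sorted({origin for origin, _ in raw})
sizes = ["Small", "Medium", "Large"]
crosstab = {o: [aggregated[(o, s)] for s in sizes] for o in origins}

print(f"{'':<10}", sizes)
for origin, counts in crosstab.items():
    print(f"{origin:<10}", counts)
```

Going from raw to aggregated to cross-tabulated loses nothing here, because the data are purely categorical; only the layout changes.
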
  4. Visualizations

    • Bar Graph: Standard, Stacked, Grouped
    • Mosaic Plot
    • Frequency Table


  6. Bars & Columns

    Technically, bar charts and column charts are different things.
    Modern nomenclature uses the two terms pretty much interchangeably.
    The only major difference is orientation (horizontal v. vertical).
  7. Bars & Columns

    Column Chart • Bar Chart

  8. Bars & Columns

    Use a bar chart when:
    • Labels are long
    • There are many categories
  9. Bars & Columns

    Use a column chart when:
    • You want the “default” standard
    • Representing data sets over a period of time
  11. Standard Bar Chart

    A bar chart plots numeric values for the levels of a categorical feature as bars.
    Levels are plotted on one chart axis, and values are plotted on the other axis.
    Each level gets one bar, and the length of each bar corresponds to that level’s value.
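
The mapping from level to bar and from value to bar length can be sketched in a few lines. This is an illustrative text rendering, not how Tableau or Excel draws bars; the function name `text_bar_chart` and the exhibition values are hypothetical, and the sketch assumes all values are positive:

```python
def text_bar_chart(values, width=30):
    """Return one line per level; bar length is proportional to the value."""
    largest = max(values.values())  # assumes at least one positive value
    label_width = max(len(label) for label in values)
    lines = []
    for label, value in values.items():
        bar = "#" * round(width * value / largest)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines

# Hypothetical median visit times (minutes) per exhibition.
median_minutes = {"Dinosaurs": 30, "Space": 20, "Oceans": 10}
chart = text_bar_chart(median_minutes)
print("\n".join(chart))
```

Because every bar starts at the same baseline, lengths can be compared directly, which is the property the design advice later in the deck tries to protect.
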
  12. Bar v. Histogram

    Bar charts are used to compare variables.
    Bar charts plot categorical data.
    Example: the values on the horizontal axis are categorical; they give the names of the exhibitions. The vertical axis indicates time in minutes. The height of each bar represents the median time for that exhibition.
    https://www.forbes.com/sites/naomirobbins/2012/01/04/a-histogram-is-not-a-bar-chart/#166f022f6d77
  13. Bar v. Histogram

    Histograms show distributions of variables.
    Histograms plot quantitative data, with ranges of the data grouped into bins or intervals.
    Example: the first bin includes visits from 0 up to and including 10 minutes, the second bin visits of more than 10 up to and including 20 minutes, and so on.
    https://www.forbes.com/sites/naomirobbins/2012/01/04/a-histogram-is-not-a-bar-chart/#166f022f6d77
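
The binning rule in the example (each bin includes its upper edge) can be written down explicitly. A minimal sketch with made-up visit times; `bin_index` is a hypothetical helper name and the 10-minute bin width comes from the example:

```python
import math
from collections import Counter

def bin_index(minutes, bin_width=10):
    """Index of the bin a visit falls in: 0 covers 0 to 10 inclusive,
    1 covers more than 10 up to and including 20, and so on."""
    return max(0, math.ceil(minutes / bin_width) - 1)

# Hypothetical visit durations in minutes.
visits = [0, 3, 10, 10.5, 12, 20, 27, 31]
counts = Counter(bin_index(m) for m in visits)

for i in sorted(counts):
    low, high = i * 10, (i + 1) * 10
    print(f"{low:>2} to {high:>2} min: {'#' * counts[i]}")
```

Unlike the categories of a bar chart, the bins are ordered and adjacent on a numeric axis, so reordering them would change the meaning of the plot.
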
  14. Bar Chart Design

    Excel gives you a lot of ways to be “cute” or to add visual detail.
    Don’t do it.
    Rounding the bars or making them 3D makes it unclear where each bar ends.
  15. Bar Chart Design

    Consider your pre-attentive processing attributes and Gestalt principles.
    Use them sparingly and to call out key details.

  17. Grouped Bar Graph

    Again, technically this could be a grouped bar graph or a grouped column graph.
    Takes the bar graph one step further and plots two categorical variables instead of one.
    Color almost always represents the secondary variable.
  18. Grouped Bar Graph

    Colors and positions should always be consistent.
    Potential accessibility problems (we will address these in the future).
    Use when you want to look at:
    • how the second category variable changes within each level of the first
    • how the first category variable changes across levels of the second

  20. Stacked Bar Chart

    Similar to the grouped bar graph.
    The segments for each category combine to form one column.
    Helpful when comparing whole categories.
    Not helpful when comparing within a category.
  21. Grouped v. Stacked

    Grouped bar charts
    • Comparing between each element in the categories
    • Comparing elements across categories
    • Hard to compare the totals of each group
    Stacked bar charts
    • Great for showing visual aggregate and differences between totals
    • Hard to compare sizes within one category
    Still confused? visual.ly/blog/how-groups-stack-up-when-to-use-grouped-vs-stacked-column-charts/
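
The trade-off between the two layouts follows from where each bar starts. A minimal sketch that computes bar geometry only (no drawing); the quarterly counts, the function names, and the tuple layout `(category, series, x, bottom, height)` are all assumptions made for illustration:

```python
# Hypothetical counts: primary category -> {secondary category: value}.
data = {
    "Q1": {"North": 4, "South": 6},
    "Q2": {"North": 7, "South": 3},
}
series = ["North", "South"]

def grouped_layout(data, series, bar_width=0.4):
    """Side-by-side bars: every bar starts at zero, offset along the category axis."""
    bars = []
    for i, (category, values) in enumerate(data.items()):
        for j, name in enumerate(series):
            x = i + (j - (len(series) - 1) / 2) * bar_width
            bars.append((category, name, x, 0, values[name]))
    return bars

def stacked_layout(data, series):
    """One column per category: each segment starts where the previous one ended."""
    bars = []
    for i, (category, values) in enumerate(data.items()):
        bottom = 0
        for name in series:
            bars.append((category, name, i, bottom, values[name]))
            bottom += values[name]
    return bars

grouped = grouped_layout(data, series)
stacked = stacked_layout(data, series)
for bar in stacked:
    print(bar)
```

In the grouped layout every bar shares the zero baseline, so individual values are easy to compare but totals are not shown. In the stacked layout the top of each column is the total, but only the first segment sits on the baseline, which is why the upper segments are hard to compare.
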

  23. A Graph By Many Names

    In statistics, where it originated, it is called a Mosaic Plot.
    In modern data visualization tools, including Tableau, it is known as a Marimekko Chart.
    It has a ton of other names: matrix chart, stacked spinogram, spineplot, olympic or submarine chart, Mondrian diagram, or even just mekko chart.
  24. Intro

    • A combination of a 100% stacked column chart and a 100% stacked horizontal-bar chart, using a different variable for each
    • A variable-width stacked column chart
    • A way to show part-to-whole relationships across two variables at once
  25. Mosaic Plot

    A mosaic plot is a graphical display that allows you to examine the relationship among two or more categorical variables.
    The area of each box is proportional to the number of observations in that combination of levels.
    How to Build a Mosaic Plot, step by step: http://www.pmean.com/definitions/mosaic.htm
  26. Example

    For this plot:
    • The proportions on the x-axis represent the number of observations for each level of the X variable, which is country.
    • The proportions on the y-axis at right represent the overall proportions of Small, Medium, and Large cars for the combined levels (American, European, and Japanese).
    • The scale of the y-axis at left shows the response probability, with the whole axis being a probability of one (representing the total sample).
  27. Mosaic Plot

    Like a histogram, the width of each bin on the X axis represents the number of observations.
    The height of each box (on the Y axis) represents the relative frequency within each category.
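
The width and height rules above determine every box in the plot. A minimal sketch of that geometry on a unit square, with no drawing; `mosaic_rectangles` is a hypothetical helper, the car counts are invented, and every X level is assumed to have at least one observation:

```python
def mosaic_rectangles(table):
    """Return (x_level, y_level, x_start, width, y_start, height) tuples.

    table maps each X level to {Y level: count}.
    """
    grand_total = sum(sum(row.values()) for row in table.values())
    rects = []
    x_start = 0.0
    for x_level, row in table.items():
        column_total = sum(row.values())
        width = column_total / grand_total   # share of all observations
        y_start = 0.0
        for y_level, count in row.items():
            height = count / column_total    # share within this X level
            rects.append((x_level, y_level, x_start, width, y_start, height))
            y_start += height
        x_start += width
    return rects

# Hypothetical counts of cars by country of origin and size.
cars = {
    "American": {"Small": 5, "Medium": 10, "Large": 5},
    "European": {"Small": 4, "Medium": 4, "Large": 2},
    "Japanese": {"Small": 12, "Medium": 6, "Large": 2},
}
rects = mosaic_rectangles(cars)
for r in rects:
    print(r[0], r[1], f"width={r[3]:.2f}", f"height={r[5]:.2f}", f"area={r[3] * r[5]:.2f}")
```

Each box's area works out to its count divided by the grand total, so the areas sum to one, and the heights within each column sum to one.
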
  28. Design Tips

    • Each category (or group) should total 100%
    • The heights within each bin are a percentage, not a count
    • You can use multiple variables (the example on the right has four)
    • Too many variables can make it hard to read
  29. Visualizations

    Categorical
    • Bar & Column: Standard, Stacked, Grouped
    • Mosaic Plot
    • Marimekko Chart
    Numerical
    • Scatterplot
    • Histogram
    • Box Plot
    • Line & Area
    • Pie
    Table can be either