Slide 1

Slide 1 text

Exploratory Data Visualization with Altair

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Acknowledgments
- Altair is developed by Jake VanderPlas and Brian Granger in close collaboration with the UW Interactive Data Lab
- Much of this tutorial is from Jake VanderPlas’s PyCon 2018 tutorial
- Special thanks to the Altair Community on GitHub

Slide 4

Slide 4 text

Python Visualization Landscape

Slide 5

Slide 5 text

Python Visualization Landscape

Slide 6

Slide 6 text

Building Blocks of Visualization
1. Data
2. Transformations
3. Marks
4. Encoding - mapping from fields to mark properties
5. Scale - functions that map data to visual scales
6. Guides - visualizations of scales (axes, legends, etc.)
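
The six building blocks above can be sketched as a Vega-Lite-style specification written as a plain Python dict. This is only an illustration: the field names (`petalLength`, `sepalLength`, `species`), the three data rows, and the filter threshold are assumptions chosen for the example, not content from the slides.

```python
import json

# A Vega-Lite-style specification as a plain dict.
# Field names, data rows, and the filter are illustrative assumptions.
spec = {
    # 1. Data: one dict per sample
    "data": {"values": [
        {"petalLength": 1.4, "sepalLength": 5.1, "species": "setosa"},
        {"petalLength": 4.7, "sepalLength": 7.0, "species": "versicolor"},
        {"petalLength": 6.0, "sepalLength": 6.3, "species": "virginica"},
    ]},
    # 2. Transformations: keep only rows that pass a predicate
    "transform": [{"filter": "datum.petalLength > 1"}],
    # 3. Marks: the geometric shape drawn for each row
    "mark": "point",
    # 4. Encoding: which field drives which mark property
    "encoding": {
        "x": {
            "field": "petalLength",
            "type": "quantitative",
            "scale": {"zero": False},             # 5. Scale
            "axis": {"title": "Petal length"},    # 6. Guide (axis)
        },
        "y": {
            "field": "sepalLength",
            "type": "quantitative",
            "scale": {"zero": False},             # 5. Scale
            "axis": {"title": "Sepal length"},    # 6. Guide (axis)
        },
        "color": {
            "field": "species",
            "type": "nominal",
            "legend": {"title": "Species"},       # 6. Guide (legend)
        },
    },
}

# The spec is ordinary JSON-serializable data.
print(json.dumps(spec, indent=2))
```

Each building block has its own place in the specification, which is the sense in which concepts can map directly onto implementation.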

Slide 7

Slide 7 text

Key: Visualization concepts should map directly to visualization implementation

Slide 8

Slide 8 text

Hypothesis: Good implementation can influence good conceptualization

Slide 9

Slide 9 text

~ familiar tools ~

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Strengths:
- Designed like MATLAB: switching was easy
- Many rendering backends
- Can reproduce just about any plot (with a bit of effort)
- Well-tested, standard tool for over a decade
Weaknesses:
- API is imperative & often overly verbose
- Poor support for web/interactive graphics
- Often slow for large & complicated data

Slide 13

Slide 13 text

Example: Statistical Data
Tidy data: i.e. rows are samples, columns are features

Slide 14

Slide 14 text

Example: Statistical Data
Tidy data: i.e. rows are samples, columns are features
“I want to scatter petal length vs. sepal length, and color by species”

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Problem: We’re mixing the what with the how

Slide 18

Slide 18 text

Toward a well-motivated Declarative Visualization
Imperative
- Specify How something should be done.
- Specification & Execution intertwined.
- “Put a red circle here and a blue circle here”
Declarative
- Specify What should be done.
- Separates Specification from Execution
- “Map to a position, and to a color”
Declarative visualization lets you think about the data and relationships, rather than incidental details.
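
The contrast can be sketched in plain Python, with no plotting library. Everything here is a toy: the data rows, the color choices, and the `render` function are hypothetical stand-ins, and the "drawing" is just a list of tuples rather than real graphics.

```python
# Illustrative rows; values and field names are assumptions for this sketch.
rows = [
    {"petalLength": 1.4, "sepalLength": 5.1, "species": "setosa"},
    {"petalLength": 4.7, "sepalLength": 7.0, "species": "versicolor"},
    {"petalLength": 6.0, "sepalLength": 6.3, "species": "virginica"},
]


def draw_imperative(rows):
    """Imperative: spell out HOW, one circle and one color at a time."""
    commands = []
    for row in rows:
        if row["species"] == "setosa":
            color = "red"
        elif row["species"] == "versicolor":
            color = "blue"
        else:
            color = "green"
        commands.append(("circle", row["petalLength"], row["sepalLength"], color))
    return commands


# Declarative: state WHAT maps to what; no loops or color choices here.
spec = {
    "mark": "circle",
    "encoding": {"x": "petalLength", "y": "sepalLength", "color": "species"},
}

PALETTE = ["red", "blue", "green"]


def render(spec, rows):
    """Hypothetical generic renderer: execution lives here, not in the spec."""
    enc = spec["encoding"]
    # Build a color scale from the distinct category values.
    categories = sorted({row[enc["color"]] for row in rows})
    color_scale = {cat: PALETTE[i % len(PALETTE)] for i, cat in enumerate(categories)}
    return [
        (spec["mark"], row[enc["x"]], row[enc["y"]], color_scale[row[enc["color"]]])
        for row in rows
    ]


# Both routes produce the same draw commands for this data.
print(draw_imperative(rows) == render(spec, rows))
```

Changing which field drives the color means editing one entry in `spec`, whereas the imperative version needs its branching logic rewritten.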

Slide 20

Slide 20 text

Altair: Declarative visualization in Python
Based on the Vega and Vega-Lite grammars
https://altair-viz.github.io/

Slide 21

Slide 21 text

Altair for Statistical Visualization

Slide 22

Slide 22 text

Encodings are Flexible:

Slide 23

Slide 23 text

Altair is Interactive

Slide 24

Slide 24 text

...

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

~ From D3 to Vega to Altair ~

Slide 27

Slide 27 text

D3 is everywhere

Slide 28

Slide 28 text

But working with D3 can be challenging ...

Slide 29

Slide 29 text

“Simple” Bar Chart

Slide 30

Slide 30 text

D3 is a JavaScript package that streamlines the manipulation of objects on a webpage

Slide 31

Slide 31 text

Vega is a detailed declarative specification for visualizations, built on D3

Slide 32

Slide 32 text

Vega-Lite is a simpler declarative specification aimed at statistical visualization

Slide 33

Slide 33 text

Altair is a Python API for creating Vega-Lite specifications
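
To make the idea of "a Python API for creating specifications" concrete, here is a toy builder using only the standard library. `MiniChart` is a hypothetical, heavily simplified stand-in written for this sketch; it is not Altair's implementation or API, and the two data rows are made up.

```python
import json


class MiniChart:
    """Hypothetical toy builder that assembles a Vega-Lite-style dict."""

    def __init__(self, values):
        # Inline data: a list of row dicts.
        self._spec = {"data": {"values": list(values)}}

    def mark(self, mark_type):
        self._spec["mark"] = mark_type
        return self  # return self so calls can be chained

    def encode(self, **channels):
        # Each channel is given as "field:type", e.g. "a:nominal".
        encoding = {}
        for channel, shorthand in channels.items():
            field, _, type_ = shorthand.partition(":")
            encoding[channel] = {"field": field, "type": type_}
        self._spec["encoding"] = encoding
        return self

    def to_json(self):
        return json.dumps(self._spec, indent=2)


chart = (
    MiniChart([{"a": "A", "b": 28}, {"a": "B", "b": 55}])
    .mark("bar")
    .encode(x="a:nominal", y="b:quantitative")
)

# The Python calls only build a JSON document; rendering happens elsewhere.
print(chart.to_json())
```

The point of the sketch is the division of labor: Python method calls assemble a JSON specification, and a separate renderer is responsible for drawing it.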

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

~ Thinking about Visualization ~

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Key: Visualization concepts should map directly to visualization implementation
Check out Jeff Heer’s class on data visualization: https://courses.cs.washington.edu/courses/cse442/17au/