Slide 1

Slide 1 text

PyData Amsterdam 2018
Uwe L. Korn
Building customer-visible data science dashboards with Altair / Vega / Vue

Slide 2

Slide 2 text

About me
• Senior Data Scientist at Blue Yonder (@BlueYonderTech)
• Apache {Arrow, Parquet} PMC
• Work in Python, C++11 and SQL
• Data Engineer and Architect with heavy focus around Pandas
xhochy

Slide 3

Slide 3 text

Agenda
1. Use Case
2. Conflict of interests
3. The nice compromise
4. Technical dive-in

Slide 4

Slide 4 text

Photo by Kari Shea on Unsplash

Slide 5

Slide 5 text

Why do we need dashboards?
• Present the output of your machine learning models
• Make insights available to non-technical users
• Repetitive tasks can also be done much faster, even for tech-savvy folks
• "You cannot give your customer just an API"

Slide 6

Slide 6 text

So, have you seen Bokeh?

    from bokeh.io import curdoc
    from bokeh.layouts import column
    from bokeh.models.widgets import TextInput, Button, Paragraph

    # create some widgets
    button = Button(label="Say HI")
    input = TextInput(value="Bokeh")
    output = Paragraph()

    # add a callback to a widget
    def update():
        output.text = "Hello, " + input.value
    button.on_click(update)

    # create a layout for everything
    layout = column(button, input, output)

    # add the layout to curdoc
    curdoc().add_root(layout)

Slide 7

Slide 7 text

Why didn’t we use it?
• It’s really great but…
• It provides an environment to write dashboards purely in Python
• Our frontend devs work in JavaScript et al.
• Bokeh(js) introduces its own dependencies on the frontend
• Building dashboards just for you or your data science team? Use it!

Slide 8

Slide 8 text

What do these UI developers want?
• Work with their native toolchain, i.e. JavaScript, CSS, … not Python
• Choose dependencies freely
• Don’t be constrained by the backend
• Custom widgets should be a concern of the frontend

Slide 9

Slide 9 text

Vega and Vega-Lite
Vega is a declarative format for creating, saving, and sharing visualization designs. With Vega, visualizations are described in JSON, … Vega-Lite is a higher-level version of this grammar approach.
https://vega.github.io/

Slide 10

Slide 10 text

VueJS for the frontend
• Vega is for visualizations, we also need widgets
• Could be substituted by ReactJS / Angular / …
• Provides reactive and composable view components
• Basics can be learned without deep frontend knowledge

Slide 11

Slide 11 text

Demo

Slide 12

Slide 12 text

Vega(-Lite) specs in Python

    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/barchart.json")
    def barchart():
        # Return a complete Vega-Lite spec as the JSON response
        return jsonify({
            "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
            "description": "A simple bar chart with embedded data.",
            "data": {
                "values": [
                    {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
                    {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
                    {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
                ]
            },
            "mark": "bar",
            "encoding": {
                "x": {"field": "a", "type": "ordinal"},
                "y": {"field": "b", "type": "quantitative"}
            }
        })
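Because a Vega-Lite spec is plain JSON, the rows do not have to be hard-coded; they can be assembled from values computed in Python. A minimal sketch using only the standard library. The helper name `bar_chart_spec` and its parameters are hypothetical and not part of the talk's code.

```python
import json

VEGA_LITE_SCHEMA = "https://vega.github.io/schema/vega-lite/v2.json"


def bar_chart_spec(categories, values, x_field="a", y_field="b"):
    """Build a Vega-Lite bar chart spec as a plain dict (hypothetical helper)."""
    # One record per bar, in the same shape as the hard-coded "values" list
    rows = [{x_field: c, y_field: v} for c, v in zip(categories, values)]
    return {
        "$schema": VEGA_LITE_SCHEMA,
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": x_field, "type": "ordinal"},
            "y": {"field": y_field, "type": "quantitative"},
        },
    }


spec = bar_chart_spec(["A", "B", "C"], [28, 55, 43])
payload = json.dumps(spec)  # the string a JSON endpoint would send
```

In a Flask view, the dict returned by such a helper could be passed to `jsonify` in place of the literal spec.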

Slide 13

Slide 13 text

Altair

    import altair as alt
    import pandas as pd

    df = pd.DataFrame({
        'a': ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
        'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
    })

    alt.Chart(df).mark_bar().encode(
        x='a',
        y='b',
    )

Slide 14

Slide 14 text

Code

Slide 15

Slide 15 text

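app.py

    import random

    import altair as alt
    import vega_datasets
    from flask import Flask, jsonify

    app = Flask(__name__)
    cars = vega_datasets.data.cars()

    @app.route("/vega-example")
    def hello():
        columns = [ … ]
        # Plot two randomly chosen columns against each other
        chart = alt.Chart(cars).mark_point().encode(
            x=random.choice(columns),
            y=random.choice(columns)
        )
        # Serialize the chart as a Vega-Lite spec for the frontend
        return jsonify(chart.to_dict())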
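One detail of this handler: two independent `random.choice` calls can pick the same column for both axes. If distinct axes are wanted, `random.sample` is one option. A minimal sketch; the column list below is an assumption for illustration (the slide elides the real one), and `pick_axes` is a hypothetical helper.

```python
import random

# Assumed column list for illustration only; the slide elides the actual list.
COLUMNS = ["Miles_per_Gallon", "Horsepower", "Acceleration", "Displacement"]


def pick_axes(columns, rng=random):
    """Return two distinct column names for the x and y axes."""
    # sample() draws without replacement, so x and y never coincide
    x, y = rng.sample(columns, 2)
    return x, y


x, y = pick_axes(COLUMNS)
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which can help when testing the endpoint.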

Slide 16

Slide 16 text

HelloWorld.vue (I/II)

    import {default as vegaEmbed} from 'vega-embed'

    export default {
      methods: {
        reloadImage () {
          fetch('/vega-example').then(response => {
            response.json().then(spec => {
              vegaEmbed('#vega-box', spec, {actions: false})
            })
          })
        }
      }
    }

Slide 17

Slide 17 text

HelloWorld.vue (II/II)
Reload

Slide 18

Slide 18 text

https://github.com/xhochy/altair-vue-vega-example

Slide 19

Slide 19 text

Altair Basics
Some really simple basics, for more see https://github.com/altair-viz/altair-tutorial

    from vega_datasets import data
    import altair as alt

    cars = data.cars()

Slide 20

Slide 20 text

    alt.Chart(cars).mark_point()

Slide 21

Slide 21 text

    alt.Chart(cars).mark_point().encode(
        x='Miles_per_Gallon'
    )

Slide 22

Slide 22 text

    alt.Chart(cars).mark_tick().encode(
        x='Miles_per_Gallon'
    )

Slide 23

Slide 23 text

    alt.Chart(cars).mark_point().encode(
        x='Miles_per_Gallon'
    )

Slide 24

Slide 24 text

    alt.Chart(cars).mark_point().encode(
        x='Miles_per_Gallon',
        y='Horsepower'
    )

Slide 25

Slide 25 text

    alt.Chart(cars).mark_point().encode(
        x='Miles_per_Gallon',
        y='Horsepower',
        color='Origin'
    )
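Each `encode` argument maps a DataFrame column to a visual channel, and Altair turns that mapping into a Vega-Lite spec. The following is a hand-written sketch of roughly what the x / y / color chart corresponds to, not actual Altair output. The field types are the ones Altair would be expected to infer from the column dtypes (an assumption), and the data URL is a placeholder. Real `chart.to_dict()` output also carries the data and configuration, and its exact shape varies between Altair versions.

```python
import json

# Hand-written approximation of the Vega-Lite spec, not actual Altair output.
scatter_spec = {
    "data": {"url": "data/cars.json"},  # placeholder URL, an assumption
    "mark": "point",
    "encoding": {
        # Numeric columns map to quantitative channels
        "x": {"field": "Miles_per_Gallon", "type": "quantitative"},
        "y": {"field": "Horsepower", "type": "quantitative"},
        # A string column used for color is treated as a category
        "color": {"field": "Origin", "type": "nominal"},
    },
}

print(json.dumps(scatter_spec, indent=2))
```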

Slide 26

Slide 26 text

Summary
• Choose technologies that make everyone involved happy
• Talk to each other
• Tools that work well for you might not work for your team
• Altair is a great visualization library
  • Use it in UIs
  • Use it in Jupyter Notebooks

Slide 27

Slide 27 text

Karlsruhe
24. - 26. October + 2 days of sprints (27/28.10.)
ZKM Karlsruhe, DE
Call for Participation opens next week.
By JOEXX (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

Slide 28

Slide 28 text

Karlsruhe
24. - 26. October, ZKM Karlsruhe
+ 2 days of sprints (27/28.10.)
The whole conference is in English.
More info: http://pycon.de
Call for Proposals OPEN! Tickets soon.

Slide 29

Slide 29 text

I’m Uwe Korn
Twitter: @xhochy
https://github.com/xhochy
Thank you!