
Building customer-visible data science dashboards with Altair / Vega / Vue

There are several tools for building ML dashboards and visualisations. Their focus is often on making things as simple as possible for a (Python) data scientist. Shipping them as part of our product means that other roles, such as frontend developers, get involved. Aspects that ease development for one role create pain for others. We want to show how to balance this using Altair, Vega and Vue.

Uwe L. Korn

May 27, 2018

Transcript

  1. 1
    PyData Amsterdam 2018
    Uwe L. Korn
    Building customer-visible data science
    dashboards with Altair / Vega / Vue

  2. 2
    About me
    • Senior Data Scientist at Blue Yonder (@BlueYonderTech)
    • Apache {Arrow, Parquet} PMC
    • Work in Python, C++11 and SQL
    • Data Engineer and Architect with heavy focus around Pandas
    xhochy
    [email protected]

  3. 3
    Agenda
    1. Use Case
    2. Conflict of interests
    3. The nice compromise
    4. Technical dive-in

  4. 4
    Photo by Kari Shea on Unsplash

  5. 5
    Why do we need dashboards?
    • Present output of your machine learning models
    • Make insights available to non-technical users
    • Repetitive tasks can also be done much faster, even for tech-savvy folks
    • „You cannot give your customer just an API“

  6. 6
    So, have you seen Bokeh?
    from bokeh.io import curdoc
    from bokeh.layouts import column
    from bokeh.models.widgets import TextInput, Button, Paragraph
    # create some widgets
    button = Button(label="Say HI")
    input = TextInput(value="Bokeh")
    output = Paragraph()
    # add a callback to a widget
    def update():
        output.text = "Hello, " + input.value
    button.on_click(update)
    # create a layout for everything
    layout = column(button, input, output)
    # add the layout to curdoc
    curdoc().add_root(layout)

  7. 7
    Why didn’t we use it?
    • It’s really great but…
    • It provides an environment to write dashboards in purely Python
    • Our frontend devs work in JavaScript et al.
    • Bokeh(js) introduces its own dependencies on the frontend
    • Building dashboards just for you or your data science team? Use it!

  8. 8
    What do these UI developers want?
    • Work with their native toolchain, i.e. JavaScript, CSS, … not Python
    • Choose dependencies freely
    • Don’t be constrained by the backend
    • Custom widgets should be a concern of the frontend

  9. 9
    Vega and Vega-Lite
    Vega is a declarative format for creating, saving, and sharing visualization designs. With Vega, visualizations are described in JSON, …
    Vega-Lite is a more high-level version of this grammar approach.
    https://vega.github.io/

  10. 10
    VueJS for the frontend
    • Vega is for visualizations, we also need widgets
    • Could be substituted by ReactJS / Angular / …
    • Provides reactive and composable view components
    • Basics can be learned without deep frontend knowledge

  11. Demo
    11

  12. 12
    Vega(-lite) specs in Python
    from flask import Flask, jsonify
    app = Flask(__name__)
    @app.route("/barchart.json")
    def barchart():
        # Serve a complete Vega-Lite spec, including its data, as JSON
        return jsonify({
            "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
            "description": "A simple bar chart with embedded data.",
            "data": {
                "values": [
                    {"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
                    {"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
                    {"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
                ]
            },
            "mark": "bar",
            "encoding": {
                "x": {"field": "a", "type": "ordinal"},
                "y": {"field": "b", "type": "quantitative"}
            }
        })
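
Because the spec is plain JSON, the backend can sanity-check it before handing it to the frontend. The following is a minimal sketch using only the standard library; `missing_fields` is a hypothetical helper (not part of Flask, Vega or Altair) that assumes the data is embedded under `data.values` as on the slide.

```python
def missing_fields(spec):
    """Return encoding fields that no inline data row provides."""
    rows = spec.get("data", {}).get("values", [])
    available = set()
    for row in rows:
        available.update(row)  # collect the keys of every data row
    wanted = {
        channel["field"]
        for channel in spec.get("encoding", {}).values()
        if "field" in channel
    }
    return sorted(wanted - available)


spec = {
    "data": {"values": [{"a": "A", "b": 28}, {"a": "B", "b": 55}]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "a", "type": "ordinal"},
        "y": {"field": "b", "type": "quantitative"},
    },
}

print(missing_fields(spec))  # []
```

A check like this could run in a unit test, so that a renamed column shows up as a failing test rather than as an empty chart in the browser.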

  13. 13
    Altair
    import altair as alt
    import pandas as pd
    df = pd.DataFrame({
        'a': ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
        'b': [28, 55, 43, 91, 81, 53, 19, 87, 52]
    })
    alt.Chart(df).mark_bar().encode(
        x='a',
        y='b',
    )
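
The Altair snippet describes the same kind of chart as the hand-written Vega-Lite JSON, with the encoding types derived from the data instead of spelled out. As a rough, standard-library-only illustration of that idea (not Altair's actual implementation), the hypothetical helper `bar_chart_spec` below turns column-oriented data into a spec and guesses each field's type; the rule "numbers are quantitative, everything else is nominal" is a simplifying assumption.

```python
import json
from numbers import Number


def infer_type(values):
    """Guess a Vega-Lite type: numeric columns are quantitative, others nominal."""
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "quantitative"
    return "nominal"


def bar_chart_spec(columns, x, y):
    """Build a bar chart spec from a dict of equally long columns."""
    names = list(columns)
    n_rows = len(columns[names[0]])
    rows = [{name: columns[name][i] for name in names} for i in range(n_rows)]
    return {
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": x, "type": infer_type(columns[x])},
            "y": {"field": y, "type": infer_type(columns[y])},
        },
    }


columns = {
    "a": ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
    "b": [28, 55, 43, 91, 81, 53, 19, 87, 52],
}
spec = bar_chart_spec(columns, x="a", y="b")
print(json.dumps(spec["encoding"], indent=2))
```

Note that this sketch labels the string column as nominal, whereas the hand-written spec declares it as ordinal; when the type matters, it is safer to state it explicitly.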

  14. Code
    14

  15. 15
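    import random
    import altair as alt
    import vega_datasets
    from flask import Flask, jsonify
    app = Flask(__name__)
    cars = vega_datasets.data.cars()
    @app.route("/vega-example")
    def hello():
        columns = [

        ]
        # Plot two randomly chosen columns against each other
        chart = alt.Chart(cars).mark_point().encode(
            x=random.choice(columns),
            y=random.choice(columns)
        )
        # Return the chart as a Vega-Lite spec for the frontend
        return jsonify(chart.to_dict())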
    app.py
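
The endpoint picks the two axes independently, so both can end up showing the same column. A small sketch of an alternative, assuming `columns` holds numeric column names of the `vega_datasets` cars table (the names in `NUMERIC_COLUMNS` are an assumption, and `pick_axes` and `scatter_encoding` are hypothetical helpers, not part of the example app):

```python
import random

# Assumed column names: numeric columns of the vega_datasets cars table.
NUMERIC_COLUMNS = ["Miles_per_Gallon", "Horsepower", "Acceleration", "Displacement"]


def pick_axes(columns, rng=random):
    """Pick two different columns for the x and y axes."""
    x, y = rng.sample(columns, 2)  # sample() never returns the same item twice
    return x, y


def scatter_encoding(columns, rng=random):
    """Build the encoding part of a Vega-Lite scatter plot spec."""
    x, y = pick_axes(columns, rng)
    return {
        "x": {"field": x, "type": "quantitative"},
        "y": {"field": y, "type": "quantitative"},
    }


rng = random.Random(42)  # seeded so repeated runs pick the same pair
encoding = scatter_encoding(NUMERIC_COLUMNS, rng)
print(encoding["x"]["field"], "vs", encoding["y"]["field"])
```

Passing the random generator in as a parameter keeps the selection reproducible in tests while the running app can still use the module-level default.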

  16. 16
    HelloWorld.vue (I/II)
    import {default as vegaEmbed} from 'vega-embed'
    export default {
      methods: {
        reloadImage () {
          fetch('/vega-example').then(response => {
            response.json().then(spec => {
              vegaEmbed('#vega-box', spec, {actions: false})
            })
          })
        }
      }
    }

  17. 17
    HelloWorld.vue (II/II)
    Reload

  18. 18
    https://github.com/xhochy/altair-vue-vega-example

  19. 19
    Altair Basics
    Some really simple basics, for more see https://github.com/altair-viz/altair-tutorial
    from vega_datasets import data
    import altair as alt
    cars = data.cars()

  20. 20
    alt.Chart(cars).mark_point()

  21. 21
    alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon'
    )

  22. 22
    alt.Chart(cars).mark_tick().encode(
    x='Miles_per_Gallon'
    )

  23. 23
    alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon'
    )

  24. 24
    alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower'
    )

  25. 25
    alt.Chart(cars).mark_point().encode(
    x='Miles_per_Gallon',
    y='Horsepower',
    color='Origin'
    )
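
The slides build the chart up step by step: a mark first, then one encoding channel after another. The toy class below sketches how such a chained API can accumulate a spec while leaving earlier steps untouched. `TinyChart` is a hypothetical stand-in written for illustration, not Altair's implementation, and it leaves out data and type handling.

```python
import copy


class TinyChart:
    """Hypothetical chart builder: every call returns a new object."""

    def __init__(self, spec=None):
        self._spec = spec if spec is not None else {"encoding": {}}

    def _evolve(self, **changes):
        # Copy the current spec so earlier charts are never modified
        spec = copy.deepcopy(self._spec)
        spec.update(changes)
        return TinyChart(spec)

    def mark_point(self):
        return self._evolve(mark="point")

    def encode(self, **channels):
        # Merge the new channels into the channels set so far
        encoding = dict(self._spec["encoding"])
        encoding.update({name: {"field": field} for name, field in channels.items()})
        return self._evolve(encoding=encoding)

    def to_dict(self):
        return copy.deepcopy(self._spec)


base = TinyChart().mark_point().encode(x="Miles_per_Gallon")
full = base.encode(y="Horsepower", color="Origin")

print(sorted(base.to_dict()["encoding"]))  # ['x']
print(sorted(full.to_dict()["encoding"]))  # ['color', 'x', 'y']
```

Returning a fresh object from each call is what makes it safe to keep `base` around and derive several variants from it.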

  26. 26
    Summary
    • Choose technologies that make everyone involved happy
    • Talk to each other
    • Tools that work well for you might not work for your team
    • Altair is a great visualization library
    • Use it in UIs
    • Use it in Jupyter Notebooks

  27. 27
    By JOEXX (Own work) [CC BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons
    24. - 26. October
    + 2 days of sprints (27/28.10.)
    ZKM Karlsruhe, DE
    Karlsruhe
    Call for Participation opens next week.

  28. 28
    Karlsruhe
    Wed 24. - Fri 26. October
    ZKM Karlsruhe
    + 2 days of sprints (27/28.10.)
    Conference entirely in English.
    More info:
    http://pycon.de
    Call for Proposals OPEN!
    Tickets soon.

  29. 29
    I’m Uwe Korn
    Twitter: @xhochy
    https://github.com/xhochy
    Thank you!
