Building Ambitious Data Visualisations

Ivan Vanderbyl, Co-founder, Flood IO

What is a data visualisation?

“Visual representations of abstract data to amplify cognition”
Source: Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann, 1999.

What's the most natural tool for specifying a visualization?

A configurable chart?

High level charting library?

Low level geometric shapes and graphical markings?

Efficiency vs. expressiveness

Efficiency

Configurable Charting Package® XP Pro

“If we endeavour to develop a charting instead of a graphing program, we will accomplish two things. First, we inevitably will offer fewer charts than people want. Second, our package will have no deep structure. Our computer program will be unnecessarily complex, because we will fail to reuse objects or routines that function similarly in different charts. And we will have no way to add new charts to our system without generating complex new code. Elegant design requires us to think about a theory of graphics, not charts.”
Source: Leland Wilkinson, The Grammar of Graphics

“Elegant design requires us to think about a theory of graphics, not charts.”

Separation of concerns

series: [{
  data: [29.9, 71.5, 106.4, 129.2, 144.0, 176.0, 135.6, 148.5,
    { y: 216.4, marker: { fillColor: '#BF0B23', radius: 10 } },
    194.4]
}]

series: [{
  data: [29.9, 71.5, 106.4, 129.2, 144.0, 176.0, 135.6, 148.5,
    { y: 216.4, marker: { fillColor: '#BF0B23', radius: 10 } },
    194.1, 54.4]
}]

Slide labels: Data, Presentation
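In the config above, the styling of one point lives inside the data array, so data and presentation change together. The following is a minimal sketch in plain JavaScript of one way to keep them apart. The names `values` and `markerFor` are hypothetical, the default marker style is invented for illustration, and it assumes the highlighted point is meant to be the maximum.

```javascript
// Data: plain numbers, no styling mixed in.
const values = [29.9, 71.5, 106.4, 129.2, 144.0, 176.0, 135.6, 148.5, 216.4, 194.1, 54.4];

// Presentation: a (hypothetical) function from datum to marker style.
// Assumption: the highlighted point is meant to be the maximum value.
function markerFor(value, allValues) {
  const isMax = value === Math.max(...allValues);
  return isMax
    ? { fillColor: '#BF0B23', radius: 10 } // highlight style from the slide
    : { fillColor: '#333333', radius: 4 }; // invented default style
}

// Combine the two only at render time.
const points = values.map((y) => ({ y, marker: markerFor(y, values) }));
```

With this split, appending a new value never requires touching the styling by hand.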

Source: http://blog.codinghorror.com/the-php-singularity/

Single Responsibility Principle

scaleGridLineColor : "rgba(0,0,0,.05)",

//Number - Width of the grid lines
scaleGridLineWidth : 1,

//Boolean - Whether to show horizontal lines (except X axis)
scaleShowHorizontalLines: true,

//Boolean - Whether to show vertical lines (except Y axis)
scaleShowVerticalLines: true,

//Boolean - Whether the line is curved between points
bezierCurve : true,

//Number - Tension of the bezier curve between points
bezierCurveTension : 0.4,

//Boolean - Whether to show a dot for each point
pointDot : true,

//Number - Radius of each point dot in pixels
pointDotRadius : 4,

//Number - Pixel width of point dot stroke
pointDotStrokeWidth : 1,

//Number - amount extra to add to the radius to cater for hit detection outside the drawn point
pointHitDetectionRadius : 20,

//Boolean - Whether to show a stroke for datasets
datasetStroke : true,

//Number - Pixel width of dataset stroke
datasetStrokeWidth : 2,

//Boolean - Whether to fill the dataset with a colour
datasetFill : true,

//String - A legend template
legendTemplate : "<ul class=\"<%=name.toLowerCase()%>-legend\"><% for (var i=0; i<datasets.

import { technicalDebt } from "charting-package";

let chartConfig = {
  bigRedLine: true,
  bigRoundDotsOnLines: true,
  poniesEnabled: false,
  puppiesEnabled: true,
  thatFeatureMyBossAskedFor: true,
  notSureWhatThisOptionOptions: false,
  accuracy: "good"
};

Expressiveness

d3-shape
“A small JavaScript library for drawing geometric shapes commonly found in data visualizations.”

LIVE EXAMPLE: ec16.tomster.io/curves/linear

Live Example App: ec16.tomster.io/line-test

ember install ember-cli-d3-shape
Use it with Ember, today.

import { arc, pie } from 'd3-shape';

let arcFn = arc()
  .cornerRadius(8)
  .innerRadius(200)
  .outerRadius(232);

let pieFn = pie().padAngle(5/360);

arcFn(pieFn([80,20])[0]);
arcFn(pieFn([80,20])[1]);
// "M2.1271150941153607...
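To make the pie step less opaque, here is a rough sketch of the core idea behind a pie layout: turning values into start and end angles. It is a simplified stand-in and not d3-shape's implementation, which also handles details such as sorting and padding between slices. The function name `pieAngles` is hypothetical.

```javascript
// Simplified stand-in for a pie layout: no sorting, no padAngle.
// Each value receives a share of the full circle (2π radians)
// proportional to its share of the total.
function pieAngles(values) {
  const total = values.reduce((sum, v) => sum + v, 0);
  let start = 0;
  return values.map((value) => {
    const end = start + (value / total) * 2 * Math.PI;
    const slice = { value, startAngle: start, endAngle: end };
    start = end; // the next slice begins where this one ends
    return slice;
  });
}

const slices = pieAngles([80, 20]);
// slices[0] spans 80% of the circle, slices[1] the remaining 20%.
```

An arc generator then only needs the two angles plus an inner and outer radius to produce the path for each slice.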

import { quantile } from 'd3-array';

quantile([0, 4, 7, 9, 12, 18, 22, 25, 28, 31], 0.95); // 29.65
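The 29.65 above can be reproduced by hand with linear interpolation between the two nearest sorted values. The sketch below is a plain JavaScript illustration of that arithmetic under the assumption that the input is already sorted. The name `quantileSorted` is hypothetical and this is not the library's source.

```javascript
// Linear-interpolation quantile over an already sorted array.
function quantileSorted(sorted, p) {
  const n = sorted.length;
  if (n === 0) return undefined;
  if (p <= 0 || n < 2) return sorted[0];
  if (p >= 1) return sorted[n - 1];
  const position = (n - 1) * p;        // fractional index into the array
  const lower = Math.floor(position);  // index of the value just below
  const fraction = position - lower;   // how far towards the next value
  return sorted[lower] + (sorted[lower + 1] - sorted[lower]) * fraction;
}

const p95 = quantileSorted([0, 4, 7, 9, 12, 18, 22, 25, 28, 31], 0.95);
// position = 9 * 0.95 = 8.55, so the result lies 55% of the way
// from 28 to 31, which is approximately 29.65.
```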

import {
  deviation, extent, histogram,
  thresholdFreedmanDiaconis, thresholdScott, thresholdSturges,
  max, mean, median, min, permute, quantile, range, scan,
  shuffle, sum, ticks, transpose, variance, zip, ...
} from 'd3-array';

Theory of Graphics

"concurrency_mean": [
  { "timestamp": 1450345920000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 200, "label": null },
  { "timestamp": 1450345935000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 200, "label": null },
  { "timestamp": 1450345950000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 400, "label": null },
  { "timestamp": 1450345965000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 400, "label": null },
  ...
],
"response_time_mean": [
  { "timestamp": 1450345920000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1885, "label": null },
  { "timestamp": 1450345935000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 2023, "label": null },
  { "timestamp": 1450345950000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1938, "label": null },
  { "timestamp": 1450345965000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1940, "label": null },
  ...
],
"transaction_rate_mean": [
  { "timestamp": 1450345920000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 465, "label": null },
  { "timestamp": 1450345935000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1074, "label": null },
  { "timestamp": 1450345950000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1256, "label": null },
  { "timestamp": 1450345965000, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 2090, "label": null },
  ...
],

[Chart, built up over three slides. Series: Response Time; Concurrency; Passed / RPM; Failed / RPM]

http://www.caltrain.com/schedules/PDF_Schedules.html

https://bl.ocks.org/mbostock/5544008

The correct representation doesn’t matter as much as the accuracy of the visualisation.

Fast & Furious 6 - Universal Pictures

You can’t exaggerate the presentation without disregarding the underlying data.

The Grammar of Graphics
Wilkinson, Leland. The Grammar of Graphics. Springer Science & Business Media, 2006.

1. Data
2. Transforms
3. Scales
4. Coordinates
5. Elements
6. Guides
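As a rough illustration of how the six stages can line up in code, the sketch below walks a handful of response-time samples from raw records to an SVG path string. It is plain JavaScript with no libraries. The helper names (`linear`, `extent`, `toPixels`) and the 300 by 100 pixel plot size are assumptions made for the example, and the stage boundaries are one possible reading of the list, not a definitive mapping.

```javascript
// 1. Data: raw records (values from the response_time_mean sample).
const records = [
  { timestamp: 1450345920000, value: 1885 },
  { timestamp: 1450345935000, value: 2023 },
  { timestamp: 1450345950000, value: 1938 },
  { timestamp: 1450345965000, value: 1940 }
];

// 2. Transforms: reshape records into [x, y] pairs.
const pairs = records.map((d) => [d.timestamp, d.value]);

// 3. Scales: map a data domain onto a pixel range.
const linear = ([d0, d1], [r0, r1]) => (v) => r0 + ((v - d0) / (d1 - d0)) * (r1 - r0);
const extent = (xs) => [Math.min(...xs), Math.max(...xs)];
const xScale = linear(extent(pairs.map((p) => p[0])), [0, 300]);
const yScale = linear(extent(pairs.map((p) => p[1])), [100, 0]); // flipped: SVG y grows downward

// 4. Coordinates: plain Cartesian, so a point is (x, y) in pixels.
const toPixels = ([x, y]) => [Math.round(xScale(x)), Math.round(yScale(y))];

// 5. Elements: a line, expressed as an SVG path string.
const linePath = 'M' + pairs.map(toPixels).map((p) => p.join(',')).join('L');

// 6. Guides: tick values for an axis, here just the ends of the y domain.
const yTicks = extent(pairs.map((p) => p[1]));
```

Each stage only depends on the output of the one before it, which is what lets a new chart type reuse the scales and guides and swap out just the element.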

Empirical data, e.g. events observed in the real world
Abstract data, e.g. data generated by a modeling function: range(0, 100, 0.25), etc.
Metadata, e.g. data about data
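For the abstract-data case, a small sketch may help: sample points are generated with a range helper and passed through a modeling function. The `range` function here is a minimal stand-in written for this example, and the choice of a sine wave as the model is purely hypothetical.

```javascript
// Minimal stand-in for a numeric range helper: start inclusive, stop exclusive.
function range(start, stop, step) {
  const n = Math.max(0, Math.ceil((stop - start) / step));
  return Array.from({ length: n }, (_, i) => start + i * step);
}

// Hypothetical modeling function: a sine wave sampled every 0.25 units.
const xs = range(0, 100, 0.25);
const modelled = xs.map((x) => [x, Math.sin(x)]);
```

The result has the same [x, y] shape as observed data, so it can be plotted through the same pipeline.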

import Route from "ember-route";
import fetch from "ember-network/fetch";

export default Route.extend({
  model() {
    return fetch("/api/metrics").then((response) => response.json());
  },

  setupController(controller, metrics) {
    controller.setProperties({ metrics });
  }
});

"response_time_mean": [
  { "timestamp": 1450345920000.0, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1885, "label": null },
  { "timestamp": 1450345935000.0, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 2023, "label": null },
  { "timestamp": 1450345950000.0, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1938, "label": null },
  { "timestamp": 1450345965000.0, "flood_id": 1697, "grid_id": 553, "project_id": 1, "value": 1940, "label": null },
  ...
]

[
  [ 1450345920000, 1885 ],
  [ 1450345935000, 2023 ],
  [ 1450345950000, 1938 ],
  [ 1450345965000, 1940 ],
  [ 1450345980000, 1914 ],
  [ 1450345995000, 1996 ],
  [ 1450346010000, 1912 ],
  [ 1450346025000, 2171 ],
  ...
]

// Assumption: import paths follow the same shim style as "ember-route".
import Controller from "ember-controller";
import computed from "ember-computed";

export default Controller.extend({
  /**
   * Represents all the metrics we want to display.
   * Example:
   * {response_time_mean: [{timestamp: TIMESTAMP, value: VALUE, label...}, ...]}
   *
   * @type {Object}
   */
  metrics: {},

  /**
   * Computes response times collection containing [TIMESTAMP, VALUE] pairs.
   */
  responseTimeValues: computed.map('metrics.response_time_mean.[]', (d) => [d.timestamp, d.value]),
});

SCALE
Input domain = [1200, 4000]
Output range = [0, 500]
Example: 1500 maps to 54
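The numbers on this slide follow from simple linear interpolation. Below is a minimal sketch of a linear scale in plain JavaScript. The name `linearScale` is hypothetical, and the sketch leaves out everything a real scale library adds, such as clamping, ticks and inversion.

```javascript
// A linear scale: maps a value in the input interval onto the output interval.
function linearScale([d0, d1], [r0, r1]) {
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const scale = linearScale([1200, 4000], [0, 500]);
const y = scale(1500);
// (1500 - 1200) / (4000 - 1200) * 500 ≈ 53.57, shown rounded to 54 on the slide.
```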

d3-scale github.com/d3/d3-scale

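// Assumption: import paths follow the shim style used elsewhere in the talk.
import Component from "ember-component";
import computed from "ember-computed";
import { scaleLinear, scaleTime } from "d3-scale";
// Coordinates (mixin) and computedExtent (macro) are project helpers;
// their import paths are not shown on the slide.

export default Component.extend(Coordinates, {
  values: [],

  xValues: computed.map('values.[]', (d) => new Date(d[0])),
  yValues: computed.mapBy('values.[]', 'lastObject'),

  xDomain: computedExtent('xValues.[]'),
  yDomain: computedExtent('yValues.[]'),

  xRange: computedExtent('plotArea.left', 'plotArea.width'),
  yRange: computedExtent('plotArea.top', 'plotArea.height'),

  xScale: computed('xRange', 'xDomain', {
    get() {
      const { xDomain: domain, xRange: range } = this.getProperties('xRange', 'xDomain');
      return scaleTime().domain(domain).rangeRound(range);
    }
  }),

  yScale: computed('yRange', 'yDomain', {
    get() {
      const { yDomain: domain, yRange: range } = this.getProperties('yRange', 'yDomain');
      // Copy before reversing so the cached yRange array is not mutated.
      return scaleLinear().domain(domain).rangeRound(range.slice().reverse());
    }
  }),
});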

Boilerplate

github.com/ivanvanderbyl/maximum-plaid

// components/plaid-plot/template.hbs (long lines are cropped on the slide)
{{yield (hash
  line=(component "plaid-line" xScale=xScale yScale=yScale x=plotArea.left y=plotA
  scatter=(component "plaid-scatter" xScale=xScale yScale=yScale x=plotArea.left y
  symbol=(component "plaid-symbol" x=plotArea.left y=plotArea.top)
  bottom-axis=(component "plaid-axis" orientation="bottom" scale=xScale top=plotAr
)}}

{{#plaid-plot xScale yScale plotArea as |plot|}}
  {{plot.bottom-axis}}
  {{plot.line values stroke=stroke strokeWidth="2"}}

  {{#plot.scatter values as |x y|}}
    <circle cx={{x}} cy={{y}} r="2" fill-opacity="0.54"/>
  {{/plot.scatter}}
{{/plaid-plot}}

LIVE EXAMPLE: ec16.tomster.io/timeline

d3-shape features implemented:
- [x] Lines (plaid-line)
- [x] Symbols (plaid-symbol)
- [ ] Areas (plaid-area)
- [ ] Pie
- [ ] Donut
- [ ] Arc
- [ ] Stack Layout
- [-] Curves

Additions:
- [x] Computed data transforms

https://github.com/ivanvanderbyl/maximum-plaid/issues/1

“All sufficiently ambitious applications will eventually require data visualisation”

Thank You

CODE: github.com/ivanvanderbyl/maximum-plaid
DEMO: ec16.tomster.io
SLIDES: ec16.tomster.io/slides
NEXT: HallwayConf (next to this room)
Ember Community Slack: “@ivan”
