Slide 1

Slide 1 text

Advanced Time Series Technology & Use Cases
Mathias Herberts - CTO
@herberts

Slide 2

Slide 2 text

Introduction

Slide 3

Slide 3 text

Time Series are universal and ubiquitous
■ Time Series are all about capturing change, not simply state
■ Time Series help understand the past and predict the future
■ Time Series are the bridges between the physical world and its digital twin
■ Time Series are the memory of the universe we live in
■ Time Series are eating the world

Slide 4

Slide 4 text

What are Time Series?
■ Time Series are sequences of values indexed by time
■ Time is an illusion, any sequence can be seen as a Time Series

Slide 5

Slide 5 text

Where can Time Series be found?
■ Time Series are present in many if not all verticals

Slide 6

Slide 6 text

Why do Time Series require specific tools?
■ Time Series data are different by nature
■ Their production rate is massive and continuous
■ The historical datasets that need to be retained are gigantic
■ The access pattern to Time Series data is unique
■ The type of analysis performed on Time Series data is uncommon
■ Traditional tools MUST be adapted if they are to be used

Slide 7

Slide 7 text

for Machine Data
Storage, Analytics, Visualization

Slide 8

Slide 8 text

Data Model

Slide 9

Slide 9 text

A universal data model

Slide 10

Slide 10 text

Geo Time Series™ data containers

Slide 11

Slide 11 text

Architecture

Slide 12

Slide 12 text

Warp 10™ standalone version
Single jar, no external dependencies
In-memory or disk-based persistence (HDD / SSD)

Slide 13

Slide 13 text

Standalone with datalog replication
[Diagram: three Warp 10™ Standalone instances]

Slide 14

Slide 14 text

Standalone with datalog sharding
[Diagram: three Warp 10™ Standalone instances]

Slide 15

Slide 15 text

Warp 10™ distributed version
[Diagram labels: Metadata index, WarpScript™ analytics engine, Ingestion endpoint, Persistence daemon]

Slide 16

Slide 16 text

Storage

Slide 17

Slide 17 text

A high performance Geo TSDB
■ Simple interaction via HTTP and text format for easy integration
■ Ability to ingest and fetch very long streams of data points
■ Support for WebSocket input and output
■ Fine grained access control via cryptographic tokens
■ Proven scalability with no cardinality problems
■ Support for Univariate and Multivariate data points
■ Distributed throttling mechanisms for number of series and data points rate
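The "HTTP and text format" interaction can be sketched as follows. This is a minimal illustration, not SenX client code: it only builds the request and does not send it. The `/api/v0/update` path and the `X-Warp10-Token` header follow the Warp 10 HTTP API; the host, port, token and series name are placeholders chosen for the example.

```python
import urllib.request


def build_update_request(base_url, write_token, lines):
    """Build (but do not send) a POST carrying data points in text format."""
    body = "\n".join(lines).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v0/update",
        data=body,
        headers={"X-Warp10-Token": write_token},
        method="POST",
    )


req = build_update_request(
    "http://localhost:8080",  # hypothetical host and port
    "WRITE_TOKEN",            # placeholder, a real write token is required
    ["1700000000000000// sensor.temperature{room=kitchen} 21.5"],
)
# urllib.request.urlopen(req) would actually push the data point.
```

One data point per line keeps the payload streamable, which fits the deck's point about ingesting very long streams.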

Slide 18

Slide 18 text

Anatomy of storage engine input
TIMESTAMP/LATITUDE:LONGITUDE/ELEVATION CLASS{LABELS} VALUE
■ Support for time precisions from ns to ms
■ Class and labels support UTF-8 in both names and values
■ Support for 5 types: LONG, DOUBLE, BOOLEAN, STRING, BINARY
64   -Infinity   NaN   4E-05   F   'foo'   b64:UmVmbHV4Cg==
■ Support for nested Multivariate values - each MV is a GTS (Geo Time Series™)
[ 2/42 64/48.0:-4.5/'hello' 128/[ 1 2 3 ] 256/hex:12345 ]
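To make the line layout concrete, here is a small parsing sketch. It is a simplification written for illustration, not the Warp 10 parser: it assumes label names and values contain no commas, braces or spaces (the real format relies on encoding for those), and it does not handle nested multivariate values or `hex:` binaries. The function names are hypothetical.

```python
import base64
import re

LINE_RE = re.compile(
    r"^(?P<ts>-?\d+)/(?P<loc>[^/]*)/(?P<elev>[^ ]*) "
    r"(?P<cls>[^{]+)\{(?P<labels>[^}]*)\} (?P<value>.+)$"
)


def parse_value(text):
    """Map the textual value onto one of the five supported types."""
    if text in ("T", "F"):                       # BOOLEAN
        return text == "T"
    if len(text) >= 2 and text[0] == "'" and text[-1] == "'":
        return text[1:-1]                        # STRING
    if text.startswith("b64:"):                  # BINARY
        return base64.b64decode(text[4:])
    try:
        return int(text)                         # LONG
    except ValueError:
        return float(text)                       # DOUBLE, incl. NaN / -Infinity


def parse_line(line):
    """Split one input line into its timestamp, location, class, labels, value."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("not a valid input line: %r" % line)
    lat = lon = None
    if m["loc"]:
        lat_s, lon_s = m["loc"].split(":")
        lat, lon = float(lat_s), float(lon_s)
    labels = {}
    if m["labels"]:
        for pair in m["labels"].split(","):
            name, value = pair.split("=", 1)
            labels[name] = value
    return {
        "ts": int(m["ts"]),
        "lat": lat,
        "lon": lon,
        "elev": int(m["elev"]) if m["elev"] else None,
        "class": m["cls"],
        "labels": labels,
        "value": parse_value(m["value"]),
    }


point = parse_line("1700000000000000/48.0:-4.5/10 sensor.temp{room=kitchen} 21.5")
```

Location and elevation are optional, so `1700000000000000// sensor.temp{} T` parses as well, with `lat`, `lon` and `elev` left as `None`.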

Slide 19

Slide 19 text

Real world scalability and performance figures
■ Known deployments of over 500M series
■ Ingestion performance of 120M data points per second on a single in-memory instance
■ Historical datasets of several hundred trillion data points
■ Sustained ingestion of several million data points per second per ingress
■ Ingestion of over 300k data points per second on a single thread on a Raspberry Pi 4
■ Random deletions at several million data points per second

Slide 20

Slide 20 text

Analytics

Slide 21

Slide 21 text

Built around a data processing language

Slide 22

Slide 22 text

Full featured language dedicated to Time Series
■ Fully functional concatenative language
■ Turing complete with loops, conditionals, asynchronous transfer of control
■ Supports Geo Time Series as first class citizens
■ Over 980 functions available - from summary statistics to signal processing
■ 6 frameworks - BUCKETIZE, MAP, REDUCE, FILL, APPLY, FILTER
■ Fully extensible and embeddable
■ Ability to call external programs

Slide 23

Slide 23 text

Web IDE and Visual Studio Code Plugin

Slide 24

Slide 24 text

Powerful expressiveness
[ 'TOKEN' 'class' {} NOW 24 h ] FETCH 'gts' STORE // Fetch last 24 hours
[ $gts bucketizer.mean NOW 1 m 0 ] BUCKETIZE 'mean' STORE // Mean over 1-minute buckets
[ $gts mapper.rate 1 0 0 ] MAP 'rate' STORE // Compute rate of change
NEWGTS 'randomwalk' RENAME
0.0 'v' STORE
42 PRNG
1 1000
<%
  10 m * NOW SWAP - NaN DUP DUP
  $v SRAND 0.5 - + 'v' STORE
  $v ADDVALUE
%> FOR
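For readers new to the frameworks, the two operations on this slide (mean per time bucket, then rate of change) can be mimicked in plain Python. This is only a sketch of the idea, under assumptions: buckets are taken as `(end - span, end]` and aligned on `last_bucket`, the rate is expressed per time unit, and the first point gets no rate. The exact conventions of `bucketizer.mean` and `mapper.rate` may differ, and the function names here are hypothetical.

```python
from statistics import mean


def bucketize_mean(points, last_bucket, span):
    """Average (tick, value) points per bucket; a bucket is (end - span, end]."""
    buckets = {}
    for tick, value in points:
        if tick > last_bucket:
            continue                          # newer than the last bucket: ignored
        k = (last_bucket - tick) // span      # whole spans before last_bucket
        end = last_bucket - k * span
        buckets.setdefault(end, []).append(value)
    return [(end, mean(values)) for end, values in sorted(buckets.items())]


def rate_of_change(points):
    """Difference between consecutive values divided by the elapsed ticks."""
    pts = sorted(points)
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(pts, pts[1:])
    ]


raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 11.0), (120, 13.0)]
bucketized = bucketize_mean(raw, last_bucket=120, span=60)
rates = rate_of_change(bucketized)
```

Bucketizing first gives evenly spaced ticks, which makes the rate between consecutive points directly comparable from one bucket to the next.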

Slide 25

Slide 25 text

Complex algorithms available as simple functions
NEWGTS 'randomwalk' RENAME
0.0 'v' STORE
42 PRNG
1 1000
<%
  10 m * NOW SWAP - NaN DUP DUP
  $v SRAND 0.5 - + 'v' STORE
  $v ADDVALUE
%> FOR
DUP 100 LTTB
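LTTB stands for Largest Triangle Three Buckets, a downsampling method that keeps the visual shape of a series. The sketch below is a generic Python rendering of the published algorithm, shown only to illustrate what a single function call hides; it is not the Warp 10 implementation, and the random walk merely mirrors the slide's example with Python's own generator.

```python
import random


def lttb(points, threshold):
    """Downsample (x, y) points, sorted by x, to `threshold` points."""
    n = len(points)
    if threshold < 3 or threshold >= n:
        return list(points)

    every = (n - 2) / (threshold - 2)   # width of the interior buckets
    sampled = [points[0]]               # always keep the first point
    a = 0                               # index of the last selected point

    for i in range(threshold - 2):
        # Average of the next bucket, used as the third triangle vertex.
        nxt_start = int((i + 1) * every) + 1
        nxt_end = min(max(int((i + 2) * every) + 1, nxt_start + 1), n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Pick the point of the current bucket forming the largest triangle.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            px, py = points[j]
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) / 2
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best

    sampled.append(points[-1])          # always keep the last point
    return sampled


rng = random.Random(42)
v = 0.0
walk = []
for x in range(1000):
    v += rng.random() - 0.5             # same step distribution as the slide
    walk.append((x, v))
reduced = lttb(walk, 100)
```

Every retained point is an original data point, so peaks and troughs survive the reduction instead of being averaged away.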

Slide 26

Slide 26 text

Hiding WarpScript complexity in macros
NEWGTS 'randomwalk' RENAME
0.0 'v' STORE
42 PRNG
1 1000
<%
  10 m * NOW SWAP - NaN DUP DUP
  $v SRAND 0.5 - + 'v' STORE
  $v ADDVALUE
%> FOR
'UTC' @senx/cal/byday

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

Complete documentation online at warp10.io

Slide 29

Slide 29 text

Visualization

Slide 30

Slide 30 text

Flexible visualization options

Slide 31

Slide 31 text

Full support for Processing in WarpScript
800 'width' STORE
800 'height' STORE
400.0 'maxspeed' STORE
40000.0 'maxalt' STORE
3.0 2.0 2.0 @orbit/heatmap/kernel/triangular 'kernel' STORE
@orbit/heatmap/palette/classic 'palette' STORE
'TOKEN' 'token' STORE
$width $height '2D' PGraphics
'MULTIPLY' PblendMode
'CENTER' PimageMode
[ $token '~(ALT|CAS)' {} NOW -2000000 ] FETCH
DUP 0 GET LASTTICK 'now' STORE
[ SWAP bucketizer.last $now STU 0 ] BUCKETIZE
// Create heatmap
<%
  7 GET LIST-> DROP 'CAS' STORE 'ALT' STORE
  <% $CAS ISNULL NOT $ALT ISNULL NOT && %>
  <% $kernel $CAS $maxspeed / $width * $ALT $maxalt / 1.0 SWAP - $height * Pimage %>
  IFT
  0 NaN NaN NaN NULL
%> MACROREDUCER 'GRAPHER' STORE
[ SWAP [] $GRAPHER ] REDUCE DROP
// Colorize
Ppixels <% DROP Palpha $palette SWAP GET %> LMAP PupdatePixels
Pencode Pdecode
$width $height '2D' PGraphics
// Do the grid
PnoFill 0 0 $width 1 - $height 1 - Prect
2.0 PstrokeWeight
200.0 Pcolor Pstroke
250.0 $maxspeed / $width * DUP 0 SWAP $height Pline
0 10000 $maxalt / 1.0 SWAP - $height * DUP $width SWAP Pline
SWAP 0 0 Pimage
Pencode

Slide 32

Slide 32 text

Extensibility

Slide 33

Slide 33 text

Macros
Factorizing WarpScript code to separate responsibilities and encourage reusability
<%
  // This is a macro body
%>
■ Macros can be deployed on the server side
■ Macros can be packaged in a jar
■ Macros can access some config elements (MACROCONFIG)
■ Macros can be deployed on a remote server

Slide 34

Slide 34 text

WarpFleet™ Resolver
Enable hosting of macros on remote servers
■ Macros can be hosted on any HTTP server including GitHub
■ Resolution is performed at runtime
■ Support for multiple macro repositories
■ Script execution can modify repositories
■ WarpFleet™ resolver can be disabled altogether
■ Support for versioning via the IMPORT function
■ SenX provides a growing set of macros via its own repo
■ Warp 10 does intelligent caching of fetched macros
■ Support for runtime injection of elements (MACROCONFIG)

Slide 35

Slide 35 text

Extensions
Add, remove or modify WarpScript functions
■ Write new functions in Java (JVM), Go, Rust, C++, C (JNA)
■ Simple API to interact with the WarpScript execution runtime
■ Freedom of licensing for extensions
■ Growing list of existing extensions, contributions welcome!
Barcode, GeoTransforms, Grok, InfluxDB, JDBC, PCap, PMML, Polyglot, Redis, S3, Swift, TensorFlow, EGADS, Elastic, GCode, H2O, Keras, memcached, Parquet, ORC, Neo4J, OpenTSDB, LAS, Pig, Spark
Some commercial ones by SenX: LevelDB, MapMatching, Forecasting, WarpScript Compiler

Slide 36

Slide 36 text

Plugins
Extend Warp 10 by adding new features
■ Plugins are run in the Warp 10 process
■ Plugins can be written in Java (JVM) or in Go, Rust, C, C++ (JNA)
■ Very diverse things can be done using plugins
■ Authentication plugins add new types of credentials
■ No license constraints
Kafka, MQTT, WarpStudio, Zeppelin, HTTP, UDP, TCP, Py4J, InfluxDB Line Protocol
OVH is considering open sourcing plugins to support the PromQL, Graphite, OpenTSDB and InfluxQL query languages. Poke them to make it happen!

Slide 37

Slide 37 text

WarpFleet™
Community site for finding extensions, macro packages and plugins
■ CLI tool on NPM - npm install -g @senx/warpfleet
■ Modules are hosted on Maven repositories
■ Benefit from dependency resolution mechanisms
■ Modules can be fetched by Spark for example
■ Again, contributions more than welcome!

Slide 38

Slide 38 text

Integrations

Slide 39

Slide 39 text

Augment existing tools and frameworks

Slide 40

Slide 40 text

Use Cases

Slide 41

Slide 41 text

Flight data analysis for fleet reliability
Pressure Altitude vs TAS

Slide 42

Slide 42 text

Weather data
1,000,000 cells
400 parameters
208 time steps
86 B data points every 6 hours in 400 M series
Using rank 2 tensors as multivariate values, Warp 10 can store all of GFS in just 1,000,000 Geo Time Series
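The orders of magnitude on this slide can be checked with a few lines of arithmetic. The sketch uses the rounded figures exactly as shown; they give about 83 B data points per run, so the 86 B on the slide presumably comes from unrounded inputs (an assumption, since the deck does not give them). The variable names are illustrative.

```python
cells = 1_000_000        # grid cells, rounded as on the slide
parameters = 400         # weather parameters per cell
time_steps = 208         # forecast time steps per run

# One series per (cell, parameter) pair when values are scalar.
scalar_series = cells * parameters

# Data points produced by each run (every 6 hours).
points_per_run = scalar_series * time_steps

# With a rank 2 tensor (parameters x time steps) as each value,
# one series per cell is enough.
tensor_series = cells
entries_per_tensor = parameters * time_steps

print(f"{scalar_series:,} scalar series")        # 400,000,000
print(f"{points_per_run:,} data points per run") # 83,200,000,000
print(f"{tensor_series:,} series with tensors")  # 1,000,000
print(f"{entries_per_tensor:,} entries/tensor")  # 83,200
```

Packing the parameter and time-step dimensions into the value is what brings the series count down from 400 M to 1,000,000.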

Slide 43

Slide 43 text

Helping racing sailboats fly
Automatic phase extraction by TWA analysis

Slide 44

Slide 44 text

chemtrails-locator.com
200,000 aircraft
15 B positions
Spatio-temporal indexing
150 km / 5 minutes cells
Served entirely by Warp 10

Slide 45

Slide 45 text

sandbox.senx.io

Slide 46

Slide 46 text

@SenXHQ - @Warp10io - @WarpScript senx.io - warp10.io