Warp 10: Collect, store and manipulate sensor data - BreizhCamp 2016

Horacio Gonzalez

March 24, 2016

Transcript

  1. @FinistSeb @LostInBrittany #BzhCmp #Warp10 Warp 10: Collect, store and manipulate sensor data. Horacio Gonzalez, Sébastien Lambour
  2. Horacio Gonzalez, @LostInBrittany, Cityzen Data. Spaniard lost in Brittany, developer, dreamer and all-around geek
  3. Sébastien Lambour, @FinistSeb, Cityzen Data. Runner, 2 kids, geek, handyman, polyglot JVM developer
  4. Warp 10
  5. Introduction: Geo-Time Series™. Image: Spacetime distortions

  6. Time Series. Image: Mike Bostock
  7. Time series storage and analysis. Not suited for your vanilla SQL RDBMS. One simple example: moving averages... Image: Hamza Fessi and ABC Bourse
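
The moving average mentioned on slide 7 is short to write in a general-purpose language. The following is a minimal Python sketch, not code from the talk; the function name `moving_average` and the sample values are illustrative assumptions.

```python
from collections import deque


def moving_average(points, window):
    """Trailing moving average over (timestamp, value) pairs sorted by time.

    The first window-1 outputs are averaged over the points seen so far.
    """
    recent = deque(maxlen=window)  # holds at most `window` latest values
    running_sum = 0.0
    out = []
    for ts, value in points:
        if len(recent) == window:
            running_sum -= recent[0]  # oldest value is about to be evicted
        recent.append(value)
        running_sum += value
        out.append((ts, running_sum / len(recent)))
    return out


# Hypothetical sample data, not taken from the slides.
prices = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]
smoothed = moving_average(prices, window=2)
print(smoothed)  # [(1, 10.0), (2, 15.0), (3, 25.0), (4, 35.0)]
```
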
  8. Geo-Time Series™. Image: AIS Vessel Tracking
  9. Geo-Time Series
  10. Geo-Time Series and the IoT. Image: LinkedIn
  11. IoT means talking things. How fast are they talking?
  12. Internet of very introverted Things: long range transmissions
  13. Internet of introverted Things: Personal Area Network
  14. Internet of shy Things: Local Area Network, cellular networks
  15. Lots of shy things generate a huge lot of data. Image: Universal Studios
  16. Internet of chatty Things: 10 000 Hz, 670 000 sensors, 20 000 metrics per second
  17. Internet of garrulous Things: millions of metrics per second. Image: Google
  18. Warp 10: a software platform for IoT. Warp 10 is a software platform that:
      • Ingests and stores data
      • Manipulates and analyzes data
      • Is dedicated to data from sensors, meters, IoT and any real or virtual probe
  19. Warp 10 general synoptic: Storage, Architecture, Language, Functions, Algorithms, Application access, Visualization, Real Time
  20. #collect How do you get these metrics? Image: Games Radar
  21. Using our own Sensision agent
  22. Using our own Sensision agent, with queue forwarder
  23. Using plugins for other collecting systems
  24. Or simply pushing data directly
  25. Choosing an input format
  26. XML (139 bytes)? JSON (108 bytes)?
  27. Warp 10 GTS Input Format: 57 bytes. But size isn't the most important reason: parsing time is way more important. XML or even JSON parsing is slow and costly; Warp 10 GTS input format isn't
  28. Warp 10 GTS Input Format (* mandatory fields):
      • timestamp (µs by default)
      • latitude:longitude (WGS84)
      • elevation (millimeters)
      • classname*
      • labels (key=value)
      • value* (long, double, boolean or string)
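
The fields listed on slide 28 can be assembled into one text line per datapoint. Below is a minimal Python sketch that assumes the line layout `TS/LAT:LON/ELEV classname{labels} value`, with percent-encoded names, `T`/`F` for booleans and single-quoted strings; these encoding details are assumptions not shown in the transcript and should be checked against the Warp 10 documentation. The helper `format_gts_line`, the class name `fuel.price`, the label and the price are hypothetical.

```python
from urllib.parse import quote


def format_gts_line(classname, value, timestamp=None, lat=None, lon=None,
                    elevation=None, labels=None):
    """Build one datapoint line: TS/LAT:LON/ELEV classname{labels} value.

    Only classname and value are mandatory; other fields are left empty.
    """
    ts = "" if timestamp is None else str(int(timestamp))
    location = "" if lat is None or lon is None else f"{lat}:{lon}"
    elev = "" if elevation is None else str(int(elevation))
    label_str = ",".join(
        f"{quote(str(k), safe='')}={quote(str(v), safe='')}"
        for k, v in sorted((labels or {}).items())
    )
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        encoded = "T" if value else "F"
    elif isinstance(value, (int, float)):
        encoded = str(value)
    else:
        encoded = "'" + quote(str(value), safe="") + "'"
    name = quote(classname, safe="")
    return f"{ts}/{location}/{elev} {name}{{{label_str}}} {encoded}"


# Hypothetical datapoint located at the coordinates used later in the talk.
line = format_gts_line("fuel.price", 1.159, timestamp=1458777600000000,
                       lat=48.115434, lon=-1.636877, labels={"type": "diesel"})
print(line)
# 1458777600000000/48.115434:-1.636877/ fuel.price{type=diesel} 1.159
```

A line like this carries no structural markup to parse beyond a few separators, which is the point slide 27 makes about parsing cost.
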
  29. #store From tiny to huge. Image: Games Radar
  30. Warp 10 on Raspberry Pi B+: 1 000 datapoints per second
  31. Warp 10 on Raspberry Pi 2 B: 3 000 datapoints per second
  32. Warp 10 on a modern server: 120 000 datapoints per second
  33. Warp 10 on a cluster: 3 million datapoints per second (our current record on input traffic)
  34. #analyse From tiny to huge. Image: Amazon
  35. Many time-series solutions: TSAR
  36. But they are only stores... Fetching data is only the tip of the iceberg
  37. Analysing the data: high level analysis must be done elsewhere
  38. Algorithms are resource hungry
  39. Your computer is not a datacenter
  40. Manipulating GTS: to be scalable, analysis must be done in the Warp 10 platform, not on the user's computer
  41. Manipulating GTS: a true GTS analysis toolbox
      ◦ Hundreds of functions
      ◦ Manipulation frameworks
      ◦ Analysis workflow
  42. GTS manipulation: why not a simple REST API?
      • One endpoint per function?
      • How to chain an analysis workflow?
      REST API not suitable for complex manipulations
  43. GTS manipulation: why not a SQL dialect?
      • How do you do a simple moving average in SQL?
      • How do you do geo-time fencing in SQL?
      SQL is not adapted to (G)TS analysis!
  44. GTS manipulation language. Our solution: a GTS manipulation language, WarpScript
  45. A stack-based language
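
To illustrate what "stack-based" means, here is a toy postfix evaluator in Python. It is not WarpScript and implements none of its functions beyond borrowing the common stack-word names `DUP` and `SWAP`; the function `run` and its token set are assumptions made for the sketch. Each token either pushes a value or pops its arguments from the stack and pushes its result.

```python
def run(program):
    """Evaluate a whitespace-separated postfix program and return the stack."""
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in program.split():
        if token in ops:
            b = stack.pop()  # top of the stack is the right-hand operand
            a = stack.pop()
            stack.append(ops[token](a, b))
        elif token == "DUP":  # duplicate the top element
            stack.append(stack[-1])
        elif token == "SWAP":  # exchange the two top elements
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:  # anything else is treated as a number literal
            stack.append(float(token))
    return stack


print(run("3 4 + 2 *"))   # [14.0]
print(run("5 DUP *"))     # [25.0]
print(run("1 2 SWAP -"))  # [1.0]
```

Because every step consumes and produces stack elements, chaining a workflow is just a matter of writing the steps one after another.
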

  46. WarpScript: non-compiled, optimized functions, fast execution
  47. Basic operations
  48. Five frameworks
  49. More than 500 functions
  50. Time series functions
  51. Time series functions
  52. Geo-Time Series functions: geo mapping (WKT), horizontal & vertical speed, horizontal & vertical distance, Haversine, ...
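
The Haversine formula named on slide 52 gives the great-circle distance between two latitude/longitude points. A minimal Python sketch follows; it assumes a spherical Earth with a mean radius of 6 371 km, and the function name `haversine_m` is hypothetical rather than a Warp 10 function.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# One degree of latitude along a meridian is roughly 111 km.
one_degree = haversine_m(0.0, 0.0, 1.0, 0.0)
print(round(one_degree / 1000))  # 111
```
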
  53. Quantum IDE
  54. Enough teasing...
  55. Fuel prices data: 16 297 448 metrics, 11 379 fuel stations, 42 885 Geo Time Series
  56. Basic analysis: average diesel fuel prices in France since 2007. Image: LEGO Ideas
  57. First: fetch data (SQL vs WarpScript)
  58. FETCH gives us a GTS list
  59. FETCH gives us a GTS list: timestamp (microseconds since epoch)
  60. FETCH gives us a GTS list: location (latitude, longitude)
  61. FETCH gives us a GTS list: value
  62. Calculate the average using Groovy:
  63. Calculate the average with WarpScript. 1- Calculate the mean price by station
  64. Calculate the average with WarpScript. BUCKETIZE framework: put the data of a GTS into regularly spaced buckets
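
The bucketizing idea on slide 64 can be sketched in Python as follows. This is an illustration of the concept only, not the WarpScript BUCKETIZE implementation: the function `bucketize_mean` is hypothetical, it labels each bucket by its start tick, and it leaves empty buckets out, all of which are simplifying assumptions.

```python
from collections import defaultdict
from statistics import mean


def bucketize_mean(points, bucket_span):
    """Group (timestamp, value) pairs into fixed-width buckets and average each.

    Buckets are aligned on multiples of bucket_span and labeled by start tick.
    """
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_span].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical irregular series: two points fall into the first bucket.
series = [(0, 1.0), (3, 3.0), (10, 2.0), (25, 4.0)]
regular = bucketize_mean(series, bucket_span=10)
print(regular)  # [(0, 2.0), (10, 2.0), (20, 4.0)]
```

After this step every series has at most one value per bucket, so series from different stations share the same ticks.
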
  65. Calculate the average with WarpScript
  66. Calculate the average with WarpScript. 2- Reduce to get the global average
  67. Calculate the average with WarpScript. REDUCE framework: apply a function on a set of GTS, tick by tick
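
The tick-by-tick reduction described on slide 67 can be sketched in Python like this. It is a conceptual illustration under stated assumptions, not the WarpScript REDUCE framework: each series is modeled as a plain dict from tick to value, and `reduce_series` and the station data are hypothetical.

```python
from statistics import mean


def reduce_series(series_list, func=mean):
    """Apply func across all series at each tick.

    Each series is a dict mapping tick -> value; a series that has no value
    at a given tick is simply skipped for that tick.
    """
    ticks = sorted(set().union(*(s.keys() for s in series_list)))
    return {t: func([s[t] for s in series_list if t in s]) for t in ticks}


# Hypothetical per-station prices, already aligned on common ticks.
station_a = {0: 1.0, 10: 1.25}
station_b = {0: 1.5, 10: 1.75, 20: 2.0}
global_average = reduce_series([station_a, station_b])
print(global_average)  # {0: 1.25, 10: 1.5, 20: 2.0}
```

This only gives meaningful results when the input series share ticks, which is why the bucketizing step comes first.
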
  68. Too verbose? Write it differently
  69. Even more concise
  70. Basic analysis: mean of the last available diesel fuel prices in France. Image: LEGO Ideas
  71. Fetching data (SQL vs WarpScript)
  72. FETCH gives us a GTS list
  73. Mean of those last prices: align ticks with the BUCKETIZE framework, compute the average with REDUCE
  74. Geo-time analysis: find the cheapest fuel station near here (48.115434, -1.636877)
  75. WKT: Well-known text geometry
  76. … WKT in WarpScript
  77. Geo-filtering points of GTS
  78. Geo-filtering points of GTS. MAPPER framework: apply a function on values of a GTS that fall into a sliding window
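
The sliding-window idea on slide 78 can be sketched in Python as below. This is not the WarpScript MAP framework: `map_window` and `within_box` are hypothetical helpers, the window is counted in points rather than time, the geographic area is a plain bounding box instead of a WKT shape, and the sample prices are made up. With a window of a single point, the mapper acts as a geo-filter.

```python
def map_window(points, func, pre=0, post=0):
    """Apply func to the window of points around each point.

    points: list of (timestamp, lat, lon, value) sorted by timestamp.
    The window holds `pre` points before and `post` points after the current
    one. Points for which func returns None are dropped from the result.
    """
    out = []
    for i, (ts, lat, lon, _) in enumerate(points):
        window = points[max(0, i - pre): i + post + 1]
        result = func(window)
        if result is not None:
            out.append((ts, lat, lon, result))
    return out


def within_box(min_lat, min_lon, max_lat, max_lon):
    """Build a single-point mapper (use pre=0, post=0) keeping values in a box."""
    def mapper(window):
        _, lat, lon, value = window[0]
        inside = min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
        return value if inside else None
    return mapper


# Hypothetical prices at three locations; only the first and last are in the box.
points = [(1, 48.11, -1.63, 1.10), (2, 48.85, 2.35, 1.20), (3, 48.12, -1.68, 1.05)]
nearby = map_window(points, within_box(48.0, -1.8, 48.2, -1.5))
print(nearby)  # [(1, 48.11, -1.63, 1.1), (3, 48.12, -1.68, 1.05)]

# The same helper with a wider window: maximum over the current and previous point.
rolling_max = map_window(points, lambda w: max(p[3] for p in w), pre=1)
print([v for _, _, _, v in rolling_max])  # [1.1, 1.2, 1.2]
```
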
  79. The stations near my position
  80. There can only be one
  81. And this is only the surface. Possibilities are endless
  82. Think differently: Geo-Time Series are everywhere
  83. Warp 10 platform and tools
  84. Everything is on GitHub: https://github.com/cityzendata/
  85. Thank you!