
Paris Time Series Meetup #5 - FLoWS

Fifth edition of the Paris Time Series Meetup, in which Mathias Herberts, CTO of SenX, the company behind Warp10, presents FLoWS, an alternative language to WarpScript.

ParisTimeSeries

July 01, 2020

Transcript

  1. How you will Finally Love WarpScript and embrace True Time Series Analytics
    Virtual
    Mathias Herberts - CTO
    2020-07-01
  2. None
  3. What is WarpScript?
    The analytics environment (library + language) of The Most Advanced Time Series Platform
  4. Origins

  5. Paradigm
    ▪ Dynamically typed concatenative language, Turing complete
    ▪ Functional by nature, everything is a function
    ▪ Powerful macro mechanism with early/late binding
    ▪ Fully extensible
    ▪ Sandboxed
    ▪ Designed for Geo Time Series manipulation
  6. In practice
    ▪ Implemented around a stack data structure
    ▪ Functions consume their parameters off the stack
    ▪ Results are produced on the stack
    ▪ Availability of variables and registers
    ▪ Postfix notation (RPN - Reverse Polish Notation)
    ▪ Interpreted with a commercially available compiler
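The stack model described on this slide can be illustrated outside WarpScript. The following is a minimal Python sketch, not WarpScript itself: the `run` helper and the `FUNCTIONS` table are hypothetical names introduced here, and only a handful of operators are modeled. It shows functions popping their parameters off a stack and pushing their results back, driven by a postfix program.

```python
def run(program, functions):
    """Evaluate a whitespace-separated postfix program on a stack."""
    stack = []
    for token in program.split():
        if token in functions:
            arity, fn = functions[token]
            # Pop the parameters; the deepest one becomes the first argument.
            args = [stack.pop() for _ in range(arity)][::-1]
            # Push every result back onto the stack.
            stack.extend(fn(*args))
        else:
            # Anything that is not a known function is treated as a number.
            stack.append(float(token) if "." in token else int(token))
    return stack


# Hypothetical function table: name -> (number of parameters, implementation).
FUNCTIONS = {
    "+": (2, lambda a, b: [a + b]),
    "*": (2, lambda a, b: [a * b]),
    "DUP": (1, lambda a: [a, a]),
    "SWAP": (2, lambda a, b: [b, a]),
}

print(run("1 2 + 4 *", FUNCTIONS))  # [12]
print(run("3 DUP *", FUNCTIONS))    # [9]
```

Note that nothing in the program text says how many values a function consumes or produces; that information lives only in the function table, which is the difficulty the following slides attribute to newcomers.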
  7. Why ?

  8. WarpScript adoption
    ▪ Younger users struggle learning WarpScript
    ▪ Yet they don’t really know what is slowing them down
    ▪ They blame things which are not the root cause of their struggle
    ▪ Result is they don’t persevere
    ▪ And miss lots of great opportunities
    ▪ Took us some time to really understand what needed to be changed
  9. What causes trouble?
    ▪ The RPN is being blamed: 1 2 3 4 5 FUNC
    ▪ But what is disturbing seems to be that you need to know how FUNC behaves
    ▪ Clear parameter identification makes things clearer: 1 2 (3,4,5)FUNC
    ▪ Still no clue about what FUNC produces
    ▪ Mandatory assignment makes it better: (a,b) = (3,4,5)FUNC
    ▪ This is conceptually what FLoWS brings to WarpScript
  10. Quick comparison
    ▪ FLoWS uses prefix notation for function calls
    ▪ Function parameters are specified in parentheses
    ▪ Function results are either ignored, used as single values, assigned or returned
    ▪ Some minor syntax additions
    ▪ Those simple changes make it more accessible while preserving everything
  11. All of FLoWS!

  12. Comments
    Comments are important to understand code!
    // C++ Style comments
    /* C-Style comments */
  13. Numbers
    FLoWS supports LONGs and DOUBLEs
    42 // LONG
    3.14 // DOUBLE
    1.0E-12 // DOUBLE
  14. Booleans
    true // Not false
    false // Not true
    maybe // Syntax error
  15. Character strings
    Percent encoded STRINGs using UTF-8, enclosed in single or double quotes
    'Hello'
    "%F0%9F%96%96" // 🖖
    'Multiline
    Strings'
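To make the percent-encoding on this slide concrete, here is a short Python sketch (Python is used only for illustration, it is not how FLoWS itself decodes strings). It decodes the slide's literal with the standard library's `urllib.parse.unquote`, which interprets the bytes as UTF-8 by default.

```python
from urllib.parse import unquote

encoded = "%F0%9F%96%96"

# The four percent-encoded bytes form a single UTF-8 encoded code point.
decoded = unquote(encoded)

print(len(decoded))        # 1
print(hex(ord(decoded)))   # 0x1f596
```

The four bytes F0 9F 96 96 are therefore one character, U+1F596, rather than four.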
  16. Lists
    Comma separated expressions
    [ 'Hello', 3.1415, 42 ]
    [
      'Hello',
      3.1415,
      42 // Works on multiple lines too
    ]
  17. Maps
    Comma separated list of colon separated pairs of expressions (Key:Value)
    { 'A':65, '@':64, 64:'@' }
    {
      '@':64,
      64:'@',
      'A':65 // Works on multiple lines too
    }
  18. Accessing map and list elements
    Assuming the previous list and map are stored in variables list and map
    map['A'] // 65
    map[64] // '@'
    list[0] // 'Hello'
  19. Function calls
    Comma separated list of expressions as function parameters
    F(1,2,'A',b) // F is the function name
    G() // Parameterless function call
    // Functions can return 0-N values
  20. Assignments
    Assignments assign values to variables
    A = 12
    (x, y) = F(1) // F MUST return two values
    M[0][1] = 3.14 // Assign to list/map element
  21. Macros
    Macros are sequences of statements
    M = (a,b,c) -> { // 3 parameters
      ...
    }
  22. Return values
    Macros can return values just like functions, return as last statement
    M = (a,b,c) -> { // 3 parameters
      return F(a,b), G(c)
    } // return 0 to N values
  23. Strict return values
    Expected number of return values can be enforced
    M = (a,b,c) -> 2 { // 2 return values
      return F(a,b), G(c)
    } // Error if more than 2 return values
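The idea of declaring how many values a macro returns can be approximated in other languages. Below is a rough Python analogy, not the FLoWS mechanism: the `returns` decorator and the function `m` are hypothetical names, and the sketch assumes the count must match exactly, whereas the slide only mentions the case of too many values.

```python
from functools import wraps


def returns(count):
    """Fail unless the wrapped function returns exactly `count` values (assumed rule)."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args):
            result = fn(*args)
            # Normalize a single return value to a 1-tuple.
            values = result if isinstance(result, tuple) else (result,)
            if len(values) != count:
                raise ValueError(
                    f"expected {count} return values, got {len(values)}"
                )
            return values
        return wrapper
    return decorate


@returns(2)
def m(a, b, c):
    # Two return values, mirroring `return F(a,b), G(c)` on the slide.
    return a + b, c * 2


x, y = m(1, 2, 3)
print(x, y)  # 3 6
```

Checking the count at the macro boundary surfaces a mismatch where the macro is defined rather than at a later, unrelated use of the stack.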
  24. Macro calls
    As for functions, comma separated list of expressions as parameters
    @M(1,2,3) // @ prefix like in WarpScript
    (x,y) = @M(1,2,3) // Assign return values
  25. None
  26. Full compatibility
    ▪ All WarpScript functions, including from extensions, are available in FLoWS
    ▪ FLoWS macros are WarpScript macros
    ▪ WarpScript macros can be used as FLoWS macros
    ▪ Variables are interchangeable, bang variables are supported: !$a ⇒ a!
    ▪ FLoWS can call WarpScript and vice versa, macro resolution works the same
    ▪ FLoWS code can be transpiled into WarpScript
  27. Minor differences
    ▪ FLoWS macros have an enforced fixed number of parameters
    ▪ Using some asynchronous transfer of control functions can be tricky ⇒ RETURN NRETURN STOP BREAK CONTINUE
    ▪ Some exotic names cannot be used as is but can via EVAL
    ▪ FLoWS makes extensive use of HIDE and SHOW to manage the stack
    ▪ Variables are scoped to a macro and its descendants, no global variables
  28. Calling FLoWS from WarpScript
    The FLoWS WarpScript extension provides the FLOWS function
    <'
    A = 1
    '>
    FLOWS
    $A // Pushes 1 onto the stack
  29. Calling FLoWS from WarpScript
    42 'B' STORE
    <'
    A = B // Variable B from WarpScript is available in FLoWS
    '>
    FLOWS
    $A // Pushes 42 onto the stack
  30. Calling WarpScript from FLoWS
    Simply use the EVAL function
    B = 42
    EVAL('$B "A" STORE') // WarpScript can access B, FLoWS will see A
    return A // returns 42
  31. Transpiling FLoWS
    The FLoWS WarpScript extension also provides the FLOWS-> function
    <'
    return (a,b) -> { return F(b,a) }
    '>
    FLOWS-> // Generates a macro of the FLoWS code

    <%
      <%
        // BEGIN Macro definition #0
        'L1:9-1:34' SECTION
        SAVE '# 0' STORE
        // Storing macro parameters
        2 FLOWS.ASSERTDEPTH
        [ 'a' 'b' ] STORE
        // F(...)
        'L1:27-1:32' SECTION
        0 HIDE '# 2' STORE
        $b $a F
        '# 2' LOAD SHOW
        '# 0' LOAD RESTORE
      %>
      // END Macro definition #0
    %>
  32. Roadmap

  33. Availability
    ▪ Immediately in the Warp 10 sandbox
    ▪ As a WarpScript extension later this summer
    ▪ Needs the latest Warp 10 code for HIDE/SHOW functions. Will need 2.7.0+
    ▪ License not yet finalized ⇒ Most likely Commons Clause or BSL like
  34. Evolution
    ▪ Documentation will provide FLoWS syntax for all functions
    ▪ Examples will be adapted opportunistically
    ▪ Optional dedicated endpoint will be created for Warp 10 instances
    ▪ WarpStudio and VS Code support to come
    ▪ Tutorials will be written to accelerate onboarding
  35. FLoWS in Action

  36. Typical WarpScript flow
    NOW = NOW()
    GTS = FETCH([ 'TOKEN', '~.*', {}, NOW, -100 ])
    BUCKETIZED = BUCKETIZE([ GTS, bucketizer.last(), NOW, m(1), 0 ])
    return BUCKETIZED
  37. Use of macros
    // Inline anonymous macro
    return LMAP([ 1,2,3,4 ], (n) -> { return *(n,n) }, false)

    // Macro stored in a variable
    SQUARE = (n) -> { return *(n,n) }
    return LMAP([1,2,3,4], SQUARE, false)

    // Remote macro loaded by the WarpFleet resolver
    return @senx/geo/circle(48.0,-4.5,100)
  38. Conditionals
    (value,text) = IFTE(
      cond, // Assumed to contain a boolean
      () -> { return 1, 'true' }, // return two elements for value,text
      () -> { return 0, 'false' } // return two elements for value,text
    )
  39. Loops
    L = []
    FOR(1,10, (i) -> { +!(L,*(i,i)) })
    return L
  40. A more elaborate example
    ▪ Trajectory of a vehicle timestamp,latitude,longitude
    ▪ Determine at each timestamp the traveled distance since start
  41. Solution using the most popular time series platform
    import "math"
    import "experimental"

    planetRadiusKm = 6371.0

    // helper function to convert degrees to radians
    degreesToRadians = (tables=<-) =>
      tables
        |> map(fn: (r) => ({ r with _value: r._value * math.pi / 180.0 }))

    // let's call all latitude values
    LATRAW = from(bucket: "my_bucket")
      |> range($range)
      |> filter(fn: (r) => r._measurement == "gps" and r._field == "latitude" )
      |> degreesToRadians()
      |> aggregateWindow(every: $__interval, fn: mean)
      |> fill(column: "_value", usePrevious: true)

    // let's create the differences of all latitude values and shift them by one
    LATDIFF = from(bucket: "my_bucket")
      |> range($range)
      |> filter(fn: (r) => r._measurement == "gps" and r._field == "latitude" )
      |> degreesToRadians()
      |> aggregateWindow(every: $__interval, fn: mean)
      |> difference(nonNegative: false, columns: ["_value"])
      |> timeShift(duration: -$__interval, columns: ["_start", "_stop", "_time"])
      |> fill(value: 0.0)

    // let's join both lat tables together,
    // so we have current and previous latitudes in one row
    LAT = join(tables: {raw: LATRAW, diff: LATDIFF}, on: ["_time"])
      |> sort()
      |> map(fn: (r) => ({
        _time: r._time,
        lat_curr: r._value_raw,
        lat_last: r._value_raw + r._value_diff
      }))

    // let's do this stuff again for the longitude values
    LONRAW = from(bucket: "my_bucket")
      |> range($range)
      |> filter(fn: (r) => r._measurement == "gps" and r._field == "longitude" )
      |> degreesToRadians()
      |> aggregateWindow(every: $__interval, fn: mean)
      |> fill(column: "_value", usePrevious: true)

    LONDIFF = from(bucket: "my_bucket")
      |> range($range)
      |> filter(fn: (r) => r._measurement == "gps" and r._field == "longitude" )
      |> degreesToRadians()
      |> aggregateWindow(every: $__interval, fn: mean)
      |> difference(nonNegative: false, columns: ["_value"])
      |> timeShift(duration: -$__interval, columns: ["_start", "_stop", "_time"])
      |> fill(value: 0.0)

    LON = join(tables: {raw: LONRAW, diff: LONDIFF}, on: ["_time"])
      |> sort()
      |> map(fn: (r) => ({
        _time: r._time,
        lon_curr: r._value_raw,
        lon_last: r._value_raw + r._value_diff
      }))

    // let's join lats and lons together, filter out NaNs (ugly),
    // apply haversine formula and accumulate the sums to get travel-distance
    join(tables: {lat:LAT, lon:LON}, on: ["_time"])
      |> filter(fn: (r) => r.lat_curr >= -90.0 and r.lon_curr >= -90.0 and r.lat_last >= -90.0 and r.lon_last >= -90.0 )
      |> map(fn: (r) => ({
        _time: r._time,
        _field: "travel-distance",
        _value: (
          2.0 * math.atan2(
            x: math.sqrt(x: (math.sin(x: (r.lat_curr - r.lat_last)/2.0) * math.sin(x: (r.lat_curr - r.lat_last)/2.0)) +
              (math.sin(x: (r.lon_curr - r.lon_last)/2.0) * math.sin(x: (r.lon_curr - r.lon_last)/2.0)) *
              math.cos(x: r.lat_curr) * math.cos(x: r.lat_last)),
            y: math.sqrt(x: 1.0 - (math.sin(x: (r.lat_curr - r.lat_last)/2.0) * math.sin(x: (r.lat_curr - r.lat_last)/2.0)) +
              (math.sin(x: (r.lon_curr - r.lon_last)/2.0) * math.sin(x: (r.lon_curr - r.lon_last)/2.0)) *
              math.cos(x: r.lat_curr) * math.cos(x: r.lat_last))
          )
        ) * planetRadiusKm * 1000.0
      }))
      |> cumulativeSum(columns: ["_value"])
  42. Solution using FLoWS
    GTS = FETCH([ 'TOKEN', 'gps', {}, NOW(), h(2) ])
    DIST = MAP([ GTS, mapper.hdist(), 1, 0, 0 ])
    return MAP([ DIST, mapper.sum(), MAXLONG(), 0, 0 ])
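For readers who want to see the computation itself spelled out, the two-step shape of this solution (a distance between consecutive points, then a running sum) can be sketched in plain Python. This is an illustration under assumptions, not a description of how `mapper.hdist` is implemented: it uses the haversine formula named in the previous slide, a mean Earth radius of 6371 km, and the hypothetical names `haversine_m` and `traveled_distance`.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.atan2(math.sqrt(a), math.sqrt(1 - a))


def traveled_distance(track):
    """Return (timestamp, distance since start) for a time-sorted track.

    `track` is a list of (timestamp, latitude, longitude) tuples.
    """
    total = 0.0
    previous = None
    result = []
    for ts, lat, lon in track:
        if previous is not None:
            # Step 1: distance from the previous point to this one.
            total += haversine_m(previous[0], previous[1], lat, lon)
        # Step 2: running sum of all steps so far.
        result.append((ts, total))
        previous = (lat, lon)
    return result


# Three points along the equator, one degree of longitude apart.
for ts, dist in traveled_distance([(0, 0.0, 0.0), (1, 0.0, 1.0), (2, 0.0, 2.0)]):
    print(ts, round(dist))
```

One degree of longitude along the equator is about 111 km with this radius, so the last point should report roughly twice that.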
  43. The Surprise

  44. None
  45. The Lounge supports both WarpScript and FLoWS
    /flows
    L = []
    FOR(1,10, (i) -> { +!(L,*(i,i)) })
    return L

    /warpscript
    [] 1 10 <% DUP * +! %> FOR

    Execution is performed on the sandbox.senx.io instance
  46. Takeaways

  47. Takeaways
    ▪ FLoWS is a novel and friendly syntax for using the WarpScript library
    ▪ It is fully compatible with WarpScript
    ▪ It is usable where WarpScript is, including in all integrations (Spark, Pig, …)
    ▪ WarpScript code can be generated from FLoWS code
    ▪ Deployed on the Warp 10 sandbox, available as an extension soon
    ▪ Accessible from the Warp 10 Lounge which has just opened ⇒ lounge.warp10.io