The F1 Demo: Streaming Real-time Telemetry Using Apache Kafka and StreamSets

OmniSci
September 05, 2019

By Randy Zwitch, Senior Director of Developer Relations at OmniSci

We have created a demo that streams telemetry data from an F1 video game into OmniSci (a GPU-accelerated relational database) for use in our booth at conferences. Visitors can drive a lap while onlookers watch the data flow into OmniSci via StreamSets and see it visualized in real time by a custom Python Dash app. We'd like to present the StreamSets portion at your conference.

Here is our GitHub repo with the StreamSets portion explained: https://github.com/omnisci/vehicle-telematics-analytics-demo/tree/master/dataengineering.

Transcript

  1. The F1 Demo: Streaming Real-time Telemetry Using Apache Kafka and StreamSets

    DataOps Summit SF - September 5, 2019
  2. Randy Zwitch, Senior Director of Developer Advocacy

    @randyzwitch, randy.zwitch@omnisci.com, /in/randyzwitch/, /randyzwitch
  3. Volume, Agility, Spatio-Temporal

  4. (You’re welcome, Dima)

  5. OmniSciDB: Compiled, Columnar and (Lots of) Cores

    Traditional DBs can be highly inefficient
    - Each operator in SQL treated as a separate function
    - Incurs tremendous overhead and prevents vectorization

    OmniSci compiles queries w/ LLVM to create one custom function
    - Queries run at speeds approaching hand-written functions
    - LLVM enables generic targeting of different architectures (GPUs, X86, ARM, etc.)
    - Code can be generated to run query on CPU and GPU simultaneously
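
As a rough illustration of the point on slide 5, the Python sketch below evaluates one query two ways: operator by operator, with an intermediate list after each step, and as a single fused function. This is not OmniSci's code generation (which targets LLVM, not Python); the query `SELECT SUM(speed * 2) WHERE speed > 100` and the `speed` column are assumptions made up for the example.

```python
from typing import List

def operator_at_a_time(speed: List[float]) -> float:
    """Each SQL operator is its own function call and materializes its output."""
    filtered = list(filter(lambda v: v > 100.0, speed))   # WHERE speed > 100
    projected = list(map(lambda v: v * 2.0, filtered))    # speed * 2
    return sum(projected)                                 # SUM(...)

def fused(speed: List[float]) -> float:
    """The whole query as one loop: a single pass, no intermediate lists."""
    total = 0.0
    for v in speed:
        if v > 100.0:
            total += v * 2.0
    return total

# Hypothetical column values; both strategies must agree on the answer.
column = [80.0, 120.5, 210.0, 99.9, 305.25]
assert operator_at_a_time(column) == fused(column)
```

The two functions return the same result; the fused version simply avoids the per-operator call overhead and temporary buffers that the slide describes.
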
  6. The F1 Demo at NVIDIA GTC 2019

  7. “We need to build something cool for our booth...”

  8. Step 1: Write UDP stream to Kafka https://raw.githubusercontent.com/omnisci/vehicle-telematics-analytics-demo/master/dataengineering/pipelines/UDP736c69c5-0b2b-4e9a-8263-85d8bd5e5fd2.json
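
In the demo this step is a StreamSets pipeline (the JSON export linked above), not hand-written code. For readers who want to see the idea in code, here is a minimal Python sketch of the same pattern: read raw UDP datagrams and pass them on unparsed. The function name, the `f1-raw` topic, and the `send` callable are all assumptions; `send` stands in for whatever Kafka producer you use, and a plain list plays that role here so the example runs without a broker.

```python
import socket
from typing import Callable, List, Tuple

def forward_datagrams(sock: socket.socket,
                      send: Callable[[str, bytes], None],
                      topic: str = "f1-raw",     # hypothetical topic name
                      max_packets: int = 1,
                      bufsize: int = 2048) -> int:
    """Read up to max_packets UDP datagrams and hand each one, unparsed, to send()."""
    forwarded = 0
    while forwarded < max_packets:
        payload, _addr = sock.recvfrom(bufsize)
        send(topic, payload)   # bytes pass through untouched; parsing is a later step
        forwarded += 1
    return forwarded

# Loopback demo: a list stands in for the Kafka producer.
sent: List[Tuple[str, bytes]] = []
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))    # port 0 lets the OS pick a free port
receiver.settimeout(2.0)           # fail fast instead of blocking forever
game = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
game.sendto(b"\x01\x02\x03", receiver.getsockname())
count = forward_datagrams(receiver, lambda t, p: sent.append((t, p)))
game.close()
receiver.close()
```

Keeping this stage free of parsing logic means the raw stream can be replayed later if the parsing rules change.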

  9. Step 2: Parse UDP Packets, Write to Kafka https://raw.githubusercontent.com/omnisci/vehicle-telematics-analytics-demo/master/dataengineering/pipelines/01ParsemessagestoJSONcopy3d6023cc-0620-4312-9957-01f0d91b8302.json
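
The parsing stage turns each binary packet into a JSON message. The sketch below shows one way that could look in Python using `struct`. The packet layout and field names are hypothetical and chosen only for illustration; they are not the game's real telemetry format, and the actual demo does this work inside the linked StreamSets pipeline.

```python
import json
import struct

# Hypothetical layout (NOT the game's real format): little-endian
# uint32 frame id, three float32 values, one int8 gear.
PACKET = struct.Struct("<Ifffb")
FIELDS = ("frame", "speed_kph", "throttle", "brake", "gear")

def packet_to_json(payload: bytes) -> str:
    """Decode one binary packet into a JSON object keyed by field name."""
    if len(payload) != PACKET.size:
        raise ValueError(f"expected {PACKET.size} bytes, got {len(payload)}")
    return json.dumps(dict(zip(FIELDS, PACKET.unpack(payload))))

# Build a sample packet and decode it.
raw = PACKET.pack(42, 287.5, 1.0, 0.0, 7)
message = packet_to_json(raw)
```

Rejecting packets of the wrong length up front keeps malformed datagrams from producing misleading records downstream.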

  10. Step 3: Parse JSON, Write to OmniSci https://raw.githubusercontent.com/omnisci/vehicle-telematics-analytics-demo/master/dataengineering/pipelines/02LoadF1messagestoOmniSci269b8b03-6dd1-4744-980b-bf7008ff714b.json
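
The final stage reads JSON messages and loads them into a table. A minimal Python sketch of that shape follows, assuming rows are sent in batches rather than one at a time. The column names, the batch size, and the `load_rows` callable are assumptions; `load_rows` stands in for a real database client call, which is deliberately left out here, and the demo itself performs this step in the linked StreamSets pipeline.

```python
import json
from typing import Callable, Iterable, List, Tuple

COLUMNS = ("frame", "speed_kph", "gear")   # hypothetical table columns

def load_messages(messages: Iterable[str],
                  load_rows: Callable[[List[Tuple]], None],
                  batch_size: int = 2) -> int:
    """Parse JSON messages into row tuples and flush them in batches."""
    batch: List[Tuple] = []
    total = 0
    for msg in messages:
        record = json.loads(msg)
        # Fix the column order; keys missing from a message become None.
        batch.append(tuple(record.get(col) for col in COLUMNS))
        if len(batch) >= batch_size:
            load_rows(batch)
            total += len(batch)
            batch = []
    if batch:                      # flush the final partial batch
        load_rows(batch)
        total += len(batch)
    return total

# A list collects the batches in place of a database connection.
batches: List[List[Tuple]] = []
msgs = ['{"frame": 1, "speed_kph": 120.0, "gear": 3}',
        '{"frame": 2, "speed_kph": 150.5, "gear": 4}',
        '{"frame": 3, "speed_kph": 171.0}']
loaded = load_messages(msgs, batches.append)
```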

  11. What I Learned

    - Build large pipelines as a series of smaller pipelines
    - Watch your defaults when developing!
    - Avoid serializing to plain text to improve throughput
    - Watch out for Jython issues in multi-threaded pipelines / Use Groovy instead
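
To make the "avoid serializing to plain text" lesson concrete, the sketch below encodes the same record once as JSON text and once as a packed binary struct, then compares the payload sizes. The record and its layout are hypothetical, and payload size is only a rough proxy for throughput; the slides do not say which format the pipelines ended up using.

```python
import json
import struct

# Hypothetical telemetry record, invented for this comparison.
record = {"frame": 42, "speed_kph": 287.5, "throttle": 1.0, "brake": 0.0, "gear": 7}

# Text encoding: field names and digits are repeated in every message.
as_text = json.dumps(record).encode("utf-8")

# Binary encoding: uint32 + three float32 + int8, little-endian, no padding.
as_binary = struct.pack("<Ifffb", *record.values())

print(f"text: {len(as_text)} bytes, binary: {len(as_binary)} bytes")
```

The binary form also skips the number-to-string and string-to-number conversions on each side, which is often where the time goes at high message rates.
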