Visualizing Streams

FTisiot
December 04, 2018

Building a Modern Analytical Platform with Kafka, Apache Drill and Oracle Data Visualization

Transcript

  1. Visualizing Streams (info@rittmanmead.com, www.rittmanmead.com, @rittmanmead)

  2. Francesco Tisiot • BI Tech Lead at Rittman Mead • Verona, Italy • Rittman Mead Blog • 10 Years Experience in BI/Analytics • francesco.tisiot@rittmanmead.com • @FTisiot • Oracle ACE

  3. About Rittman Mead: Rittman Mead is a data and analytics company who specialise in data visualisation, predictive analytics, enterprise reporting and data engineering. We use our skill, experience and know-how to work with organisations across the world to interpret their data. We enable the business, the consumers, the data providers and IT to work towards a common goal, delivering innovative and cost-effective solutions based on our core values of thought leadership, hard work and honesty. We work across multiple verticals on projects that range from mature, large scale implementations to proofs of concept and can provide skills in development, architecture, delivery, training and support.

  4. Visualizing Streams

  5. Let Me Know My Audience: Who Likes + ? +

  6. Let Me Know My Audience: Who Likes + ? +

  7. The Good Old Days… (Photo by Bruno Martins on Unsplash)

  8. I Need This “Precise” KPI! Ok! Let’s Create the Model and ETL the Data! (Photo by Cristina Gottardi on Unsplash)

  9. Predefined KPIs • Structured Reporting Database • Batch Processing • Overnight Load

  10. Self Service Analytics (Photo by Dominik Scythe on Unsplash)

  11. Python • R • Data Scientist (Photo by Lucas Vasques on Unsplash)

  12. Visualize • Business Analyst • Extract • Calculate (Photo by Craig Garner on Unsplash)

  13. IT Driven • Organised • Pre-Defined • OBIEE (Photo by Tiago Muraro on Unsplash)

  14. Business Intelligence Tools: Oracle Analytics

  15. Business Driven • Data Discovery • No Prebuilt Model • Data Visualization • Access To Raw Data (Photo by Samuel Zeller on Unsplash)

  16. Data Visualization: • Information Exploration and Discovery - Single Panel Analytics - Data Mashup - Integrated with OBIEE - DataFlow Component

  17. DataFlow Component: • Transform/Enrich Data - Filter - Aggregate - Join - Store Locally or Push Back - V4 Release • Machine Learning • Essbase Cube

  18. Predefined KPIs • Structured Reporting Database • Batch Processing • Overnight Load

  19. Volume • Velocity • Variety • $$$ • BIG DATA

  20. [no slide text]

  21. Predefined KPIs • Structured Reporting Database • Batch Processing • Overnight Load

  22. Real Time Analytics • Batch vs Stream (Photo by Genessa Panainte on Unsplash)

  23. Any Source • Any Target • Open Formats • Scalable

  24. Data Hub (https://www.confluent.io/product/confluent-platform/)

  25. https://www.confluent.io/product/confluent-platform/

  26. Hub!

  27. Client Library (https://www.confluent.io/product/confluent-platform/)

  28. SQL! (https://www.confluent.io/product/confluent-platform/)

  29. [no slide text]

  30. [no slide text]

  31. KSQL: Currently Limited SQL functions • Can be Extended with UDF! • GA Since March 2018! • Enhancements expected

  32. Sources • Targets • Transformations • Kafka ?

  33. Stream

  34. Time Series Visualization

  35. https://www.rittmanmead.com/blog/2017/11/taking-ksql-for-a-spin-using-real-time-device-data/

  36. Sources • Targets • Transformations • Kafka

  37. How do I Visualise the Data in Kafka? (Photo by Alexandre Debiève on Unsplash)

  38. KSQL • Kafka Consumer

  39. KSQL: Limited set of functions • Can’t be called from “outside” Kafka (Need special adapter)

  40. [no slide text]

  41. Big Data • Traditional BI Tools • SQL-on-Hadoop • ODBC • JDBC (https://www.rittmanmead.com/blog/2017/04/sql-on-hadoop-impala-vs-drill/)

  42. [no slide text]

  43. SQL-on-(almost)Everything

  44. https://www.rittmanmead.com/blog/2017/07/analyzing-wimbledon-twitter-feeds-in-real-time-with-kafka-presto-and-oracle-dvd-v3/

  45. Limitations: Static List of Streams • No AVRO Support

  46. Dynamic List of Streams • No AVRO Support

  47. Define a KSQL Stream as JSON Format: CREATE STREAM STREAM_NAME WITH ( VALUE_FORMAT='JSON' ) AS SELECT … FROM …;

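The statement on slide 47 can be assembled programmatically when several streams need the same treatment. The following is only a minimal sketch: the helper name `json_stream_statement` and the stream and column names in the usage note are hypothetical, and the only KSQL syntax it relies on is the `CREATE STREAM … WITH (VALUE_FORMAT='JSON') AS SELECT … FROM …;` shape shown on the slide.

```python
import re
from typing import Sequence

# Conservative identifier check so that only plain names end up in the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def json_stream_statement(new_stream: str, source_stream: str,
                          columns: Sequence[str]) -> str:
    """Build a KSQL statement that re-serialises a stream as JSON.

    All names are supplied by the caller; nothing here talks to a KSQL server.
    """
    if not columns:
        raise ValueError("at least one column is required")
    for name in (new_stream, source_stream, *columns):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    column_list = ", ".join(columns)
    return (
        f"CREATE STREAM {new_stream} WITH (VALUE_FORMAT='JSON') "
        f"AS SELECT {column_list} FROM {source_stream};"
    )
```

For example, `json_stream_statement("DEVICE_JSON", "DEVICE_AVRO", ["ID", "READING"])` produces a statement that could then be pasted into the KSQL CLI; the stream names here are made up for illustration.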
  48. Use Kafka • Show Tables

  49. Query Data

  50. KSQL vs SQL-on-Hadoop
    KSQL: Continuous Query • Data Resides in Kafka • Limited SQL • No JDBC/ODBC access officially Supported • Examining Data Streams and Stream Processing
    SQL-on-Hadoop: Static Query • External (Static) Tables can be created • Rich set of SQL Functions (Views can be created) • Can be Accessed via ODBC • Ad-hoc Random Access

  51. Latest Apache Drill Release - Kafka Enhancements: Filter Pushdown • PartitionId • MsgOffset • MsgTimestamp

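To make the filter pushdown on slide 51 concrete, here is a hedged sketch of a Drill query that restricts a Kafka topic by partition, offset and timestamp. It assumes the Kafka storage plugin is registered under the name `kafka` and that the metadata columns are called `kafkaPartitionId`, `kafkaMsgOffset` and `kafkaMsgTimestamp`, as in the Drill Kafka plugin documentation; check both against your own Drill release. The helper name and the topic in the usage note are hypothetical.

```python
from typing import Optional


def drill_kafka_query(topic: str,
                      partition_id: Optional[int] = None,
                      min_offset: Optional[int] = None,
                      min_timestamp_ms: Optional[int] = None,
                      plugin: str = "kafka") -> str:
    """Build a Drill SELECT over a Kafka topic with optional metadata filters.

    The predicates use the Kafka metadata columns, which is what allows Drill
    to push them down instead of reading the whole topic.
    """
    if "`" in topic or "`" in plugin:
        raise ValueError("backticks are not allowed in plugin or topic names")
    predicates = []
    if partition_id is not None:
        predicates.append(f"kafkaPartitionId = {int(partition_id)}")
    if min_offset is not None:
        predicates.append(f"kafkaMsgOffset >= {int(min_offset)}")
    if min_timestamp_ms is not None:
        predicates.append(f"kafkaMsgTimestamp >= {int(min_timestamp_ms)}")
    sql = f"SELECT * FROM {plugin}.`{topic}`"
    if predicates:
        sql += " WHERE " + " AND ".join(predicates)
    return sql
```

Calling `drill_kafka_query("device_json", partition_id=0, min_offset=100)` yields a query limited to one partition from a given offset onwards, which is the kind of predicate the slide lists as eligible for pushdown.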
  52. Why Should I use DV to Visualize Streams? (Photo by Alexandre Debiève on Unsplash)

  53. It’s Data! • Self Service Analytics • Real Time Insights • Mash-up • Test and Apply ML

  54. Does DV Provide a Kafka Connection? (Photo by Alexandre Debiève on Unsplash)

  55. But We Can Use Drill! (Photo by Gemma Evans on Unsplash)

  56. #obihackers Kafka Connect DVD Example

  57. Use the Tools for What they are Good at! • Stream Keys and Timestamps • Stream in JSON Format • Aggregated Tables • Complex SQL Functions • Combining Data • Ranking/Ordering • Visualize Data • Mashup • Machine Learning/Advanced Analytics

  58. DVD and Drill • Install MapR ODBC Driver • Create Connection • Select Storage • Select Table • In case of Errors replace "…" with `…`: "dfs.tmp".localkafka becomes `dfs.tmp`.localkafka

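The quoting fix on slide 58 (double-quoted identifiers rewritten with backticks) can be applied mechanically to a generated query. This is a minimal sketch with a hypothetical helper name, not part of DVD or Drill; it assumes that every straight double-quoted span in the statement is an identifier, so it would also rewrite a double quote that happens to sit inside a string literal.

```python
import re

# Matches a straight double-quoted span such as "dfs.tmp".
_DOUBLE_QUOTED = re.compile(r'"([^"]+)"')


def backtick_identifiers(sql: str) -> str:
    """Rewrite "identifier" as `identifier`, the quoting style Drill expects."""
    return _DOUBLE_QUOTED.sub(r"`\1`", sql)
```

Applied to the slide's own example, `backtick_identifiers('SELECT * FROM "dfs.tmp".localkafka')` turns the quoted schema into `dfs.tmp` with backticks and leaves the rest of the query untouched.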
  59. DVD and Drill • Remove Caching • No Data Flows • Logical SQL transformations

  60. [no slide text]

  61. Limits • Creation of a Kafka Consumer for each Query • Reads all Data from Topic (now with pushdown filters) • JSON format (until AVRO support) • Not Suitable for Massive Dashboard Style Reporting

  62. Suggestion …Use the Tools for What they are Good at! Kafka Connect • Data/Insights Discovery • Self Service Analytics • Data Mashup • Small Datasets • Prototype Building • Monitoring/Alerting • Dashboards • Massive Datasets • Consolidated KPIs • Complex Calculations

  63. #Cloud (Photo by Jason Blackeye on Unsplash)

  64. [no slide text]

  65. Final Suggestions…

  66. Modern Analytical Platform • Kafka ‣ Separate Topics for direct Reporting ‣ Include Discovery back to Kafka • Drill ‣ Prototype and Data Mashup • DVD ‣ Visualizations and Personal Data Mashup • Standardised Reporting ‣ Kafka Connect Sink and “Old Days” Tools (Photo by Drew Patrick Miller on Unsplash)

  67. Life Is Too Short to Use the WRONG Technology

  68. Visualizing Streams