
Enabling Rapid Business Insight into Data with Stream Analytics and GoldenGate

Data streaming is rapidly becoming the norm in modern data architectures. Streams can come from stream-enabled applications, or from Oracle GoldenGate streaming changes made to the database in real time to targets including Kafka. This availability of data streams offers great potential in the analytics world, enabling business insight to be realised sooner and action to be taken on the data whilst it is still current. Oracle Stream Analytics (OSA) brings this insight into "Fast Data" to business users through an intuitive web interface, enabling them to filter and analyse data as it arrives, including with predefined algorithms and spatial technology.

In this presentation I will give a live demonstration of how to filter, transform, and analyse streaming data with Oracle Stream Analytics from sources including Kafka. Using Oracle GoldenGate we will see how to stream individual changes from the database made by applications that were never written with streaming in mind. We will discuss how OSA is deployed, including its use of Spark as the runtime engine, and consider OSA's place in the broader analytics architecture alongside Oracle Data Integrator.


Robin Moffatt

March 10, 2017


  1. @rittmanmead 1 Enabling Rapid Business Insight into Fast Data with Oracle Stream Analytics and Oracle GoldenGate Robin Moffatt, Rittman Mead OUGN 2017 Twitter: @rmoff
  2. @rittmanmead Robin Moffatt 2
    • Head of R&D, Rittman Mead
    • Previously OBIEE/DW developer at large UK retailer
    • Previously SQL Server DBA, Business Objects, DB2, COBOL…
    • Oracle ACE
    • Frequent blogger
    • Twitter: @rmoff
    • IRC: rmoff / #obihackers / freenode
  3. @rittmanmead Rittman Mead 3
    • Oracle Gold Partner with offices in the UK and USA
    • 70+ staff delivering Oracle BI, DW, Big Data and Advanced Analytics projects
    • Significant web presence with the Rittman Mead Blog
    • Hadoop R&D lab for “dogfooding” solutions developed for customers
  4. @rittmanmead 4 “Stream” Processing…

  5. @rittmanmead 5 “Stream” Processing… This uses Environment Agency flood and river level data from the real-time data API (Beta)
  6. @rittmanmead Oracle Stream Analytics 6
    • Stream processing runtime with GUI
    • Previously known as Oracle Stream Explorer - Part of Fusion Middleware
    • Custom runtime (not WLS) - Future: Spark Streaming
  7. @rittmanmead Create Connection 7

  8. @rittmanmead Create Stream 8

  9. @rittmanmead Exploration 9

  10. @rittmanmead Derivations and Calculations 10

  11. @rittmanmead Aggregations 11

  12. @rittmanmead Patterns 12
    • Pre-built templates to create Explorations
    • Simple transformation and aggregation
    • Built-in algorithms and spatial functionality
  13. @rittmanmead Join the Streams (but never cross them…)

  14. @rittmanmead References 14

  15. @rittmanmead Join the Streams (but never cross them…)

  16. @rittmanmead Under the Covers… 16

  17. @rittmanmead Join the Streams 17

  18. @rittmanmead Spatial Pattern 18
    • Uses geofencing to overlay stream events and tag them as ‘entering’, ‘leaving’, or ‘staying’ in an area
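The geofence tagging described above can be sketched in a few lines. This is a minimal illustration, not OSA's actual API: the function names, the bounding-box fence, and the coordinates are all hypothetical (real geofences are arbitrary polygons drawn on the map).

```python
def in_fence(lon, lat, fence):
    """Simple bounding-box geofence (hypothetical; real fences are polygons)."""
    min_lon, min_lat, max_lon, max_lat = fence
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def classify(prev_inside, now_inside):
    """Tag an event relative to the fence, in the spirit of the Spatial pattern."""
    if now_inside and not prev_inside:
        return "entering"
    if now_inside and prev_inside:
        return "staying"
    if prev_inside and not now_inside:
        return "leaving"
    return "outside"

fence = (-1.6, 53.7, -1.4, 53.9)  # (min_lon, min_lat, max_lon, max_lat), made up
points = [(-1.7, 53.8), (-1.5, 53.8), (-1.5, 53.85), (-1.3, 53.8)]

tags, inside = [], False
for lon, lat in points:
    now = in_fence(lon, lat, fence)
    tags.append(classify(inside, now))
    inside = now

print(tags)  # ['outside', 'entering', 'staying', 'leaving']
```

Tagging needs the previous state of each entity, which is why this is a stateful stream pattern rather than a simple per-event filter.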
  19. @rittmanmead Map 19

  20. @rittmanmead Spatial Pattern 20

  21. @rittmanmead Stream Processing Pipeline 21 This uses Environment Agency flood and river level data from the real-time data API (Beta)
  22. @rittmanmead Your Database Transactions Are Events!
    [Diagram: Transaction Log → Oracle GoldenGate → Oracle GoldenGate for Big Data (Kafka Connector) → Kafka → Oracle Stream Analytics]
    • Change Data Capture (CDC) streams every change made to the database into Kafka
    • Data resides in Kafka ready for streaming by one or more consumers
    • Each database table is a stream of events in Oracle Stream Analytics
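To make "each table is a stream of events" concrete, here is a toy sketch (the record shape is only loosely modelled on OGG's JSON output; table names, field names, and the `op_type` codes here are illustrative assumptions): each row change becomes an event carrying its source table, so events can be routed into one stream per table.

```python
# Hypothetical change records, loosely shaped like CDC output:
# op_type "I"/"U"/"D" for insert/update/delete, "after" holds the new row image.
changes = [
    {"table": "SOE.ORDERS",    "op_type": "I", "after": {"ORDER_ID": 1, "ORDER_TOTAL": 99.5}},
    {"table": "SOE.CUSTOMERS", "op_type": "U", "after": {"CUSTOMER_ID": 7, "NAME": "Ann"}},
    {"table": "SOE.ORDERS",    "op_type": "I", "after": {"ORDER_ID": 2, "ORDER_TOTAL": 15.0}},
]

def route_by_table(changes):
    """Group change events by source table, as if each table were its own topic."""
    topics = {}
    for c in changes:
        topics.setdefault(c["table"], []).append(c)
    return topics

topics = route_by_table(changes)
print(sorted(topics))             # ['SOE.CUSTOMERS', 'SOE.ORDERS']
print(len(topics["SOE.ORDERS"]))  # 2
```

In the real pipeline this routing is done by GoldenGate writing to per-table Kafka topics; OSA then reads each topic as a stream.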
  23. @rittmanmead Streaming from OGG to Kafka 23
    • Oracle GoldenGate for Big Data is required
    • Set up extract as before
    • Replicat is defined as Kafka Connect, with JSON encoding
      - gg.handler.confluent.type=oracle.goldengate.kafkaconnect.KafkaConnectHandler
      - value.converter=org.apache.kafka.connect.json.JsonConverter
  24. @rittmanmead Handling Nested JSON 24
    • OSA currently only works with “flat” JSON
    • OGG nests the ‘payload’ in a sublevel of the message - e.g. OSA can’t currently reference payload.LOGON_ID
  25. @rittmanmead Logstash to the Rescue! 25
    • Stream processing tool from Elastic
    • Very easy to configure
    • Ref: - 40131532/350613 - logstash-filter-mutate/issues/90

    input {
      kafka {
        zk_connect => 'localhost:2181'
        topic_id => 'ORCL.SOE.ORDERS'
      }
    }
    filter {
      ruby {
        code => "
          event.to_hash.delete_if {|k, v| k != 'payload'}
          event.to_hash.update(event['payload'].to_hash)
          event.to_hash.delete_if {|k, v| k == 'payload'}
        "
      }
    }
    output {
      kafka {
        topic_id => "ORCL.SOE.ORDERS_flat"
        bootstrap_servers => "localhost:9092"
      }
    }
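The payload-promotion that the Logstash ruby filter performs can be sketched in Python. The sample message below is illustrative (table and field names are assumptions, not real OGG output): the row data sits under a `payload` key, and flattening simply promotes those keys to the top level so OSA can reference them directly.

```python
import json

# A hypothetical OGG-for-Big-Data style message: the row data OSA needs
# is nested under 'payload', which OSA cannot reference as payload.LOGON_ID.
msg = json.loads("""{
  "table": "ORCL.SOE.LOGON",
  "op_type": "I",
  "payload": {"LOGON_ID": 42, "CUSTOMER_ID": 7}
}""")

def flatten(msg):
    """Keep only the 'payload' keys, promoted to the top level —
    the same effect as the Logstash ruby filter."""
    return dict(msg["payload"])

flat = flatten(msg)
print(flat)  # {'LOGON_ID': 42, 'CUSTOMER_ID': 7}
```

The flattened message is then what gets written to the `_flat` topic that OSA reads.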
  26. @rittmanmead Flattened JSON 26

  27. @rittmanmead View All Data Changes As They Happen

  28. @rittmanmead Filter and Chart Stream Values 28

  29. @rittmanmead Stream Aggregates 29

  30. @rittmanmead Join Streams 30

  31. @rittmanmead Output to Kafka 31

    $ kafka-console-consumer \
        --zookeeper localhost:2181 \
        --topic order_aggregates
    { "ts": 1487253152764000000, "count_30sec": 16, "avg_order_total_30sec": 8138.75, "max_order_total_30sec": 15102 }
    { "ts": 1487253162686000000, "count_30sec": 17, "avg_order_total_30sec": 5898.5884, "max_order_total_30sec": 15102 }
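The aggregates on the `order_aggregates` topic are windowed count/avg/max values. As a toy illustration of the same computation (the event timestamps and order totals here are made up, not the demo's data), a 30-second tumbling window can be sketched as:

```python
# Hypothetical (timestamp_sec, order_total) events.
orders = [
    (0,  8000.0), (5, 15102.0), (12, 1314.25),  # fall in window [0, 30)
    (31, 5000.0), (45,  200.0),                 # fall in window [30, 60)
]

def tumbling_aggregates(events, window_sec=30):
    """Bucket events into fixed windows and compute count/avg/max per window."""
    windows = {}
    for ts, total in events:
        windows.setdefault(ts // window_sec, []).append(total)
    return {
        w: {
            "count": len(v),
            "avg_order_total": sum(v) / len(v),
            "max_order_total": max(v),
        }
        for w, v in windows.items()
    }

agg = tumbling_aggregates(orders)
print(agg[0])  # {'count': 3, 'avg_order_total': 8138.75, 'max_order_total': 15102.0}
```

OSA computes these windows continuously over the stream rather than over a finite list, but the per-window arithmetic is the same.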
  32. @rittmanmead Stream Processing with Oracle Stream Analytics
    [Diagram: Transaction Log → Oracle GoldenGate → Oracle GoldenGate for Big Data (Kafka Connector) → Kafka → Oracle Stream Analytics; Kafka → HDFS connector → HDFS/Hive → Impala]
  33. @rittmanmead Tool vs Hand-Coding 33
    • Several powerful open-source frameworks - e.g. Spark Streaming, Apache Flink, Apache Kafka Streams, Google DataFlow/Apache Beam
    • Generally requires specialist coders and infrastructure/ops understanding
    • Parallel arguments with ETL tools vs PL/SQL
  34. @rittmanmead Tool vs Hand-Coding 34
    • Faster to prototype and deploy
    • Accessible by non-coders
    • Built-in visualisation capabilities
    • Easier to support
    • Less flexibility
    • Slower to adopt new capabilities of streaming frameworks
    • License costs
  35. Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
    Oracle Stream Analytics – Driving New Innovation
    • Fault Tolerant, Highly Available, Extreme Streaming – “Big Data brings massive flows of Data Streams”
    • New Enabling Runtime Infrastructure: Spark Streaming with Kafka Messaging Layer
  36. Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
    Oracle Stream Analytics – Driving New Innovation
    • Streaming predictive probability scoring – “I don’t know what I don’t know”
    • New PMML Pattern offering that enables OSA integration with ORE (and SparkML)
    [Diagram: Predictive Analytics / Data Warehouse → Streaming Probability Scoring → Oracle Stream Analytics]
  37. @rittmanmead EOF 37
    • Twitter: @rmoff
    • IRC: rmoff @ #obihackers
    #EOF
  38. @rittmanmead Reading and References 38
    • Simple developer download/install process for OSA
    • Docker image exists, enabling you to very quickly get up and running with OSA, on Mac or Windows too!