An Introduction to CEP with StreamInsight

canoas
March 28, 2012

Implementing Complex Event Processing with Microsoft StreamInsight will probably change the way some problems are solved in current Event-Driven Architectures.
Demo: StreamInsight and SignalR concept and source

Transcript

  1. Insight requires one version of the truth

    • Advertising analytics
    • Social media monitoring
    • Web analytics
    • Transactional data
    • CRM / Customer data
    The Data Conversation: relevant experiences need data. Smart iteration requires insight.
  2. Web Analytics Scenario: Real-time Behavioral Targeting

    • Continuously analyze online behavior per user
    • Identify relevant content before the next click
    • Define content behind next click based on detected online behavior
    StreamInsight advantage:
    • Scale to millions of concurrent online users
    • Immediate insight - real time analytics
    • Web logs no longer processed offline in batches
    • Correlate across your web farms and applications
  3. Industry trends

    • Data acquisition costs are negligible
    • Raw storage costs are small and continue to decrease
    • Processing costs are non-negligible
    • Data loading costs continue to be significant
    Cycle: Monitor, Manage, Mine
    • Manage business via KPI-triggered actions
    • Mine historical data
    • Devise new KPIs
    • Monitor KPIs
    • Record raw data (history)
    CEP advantage:
    • Process data incrementally, i.e., while it is in flight
    • Avoid loading while still doing the processing you want
    • Seamless querying for monitoring, managing and mining
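The "process data incrementally, while it is in flight" point on slide 3 can be illustrated without StreamInsight at all. The following is a minimal sketch in plain C#, not StreamInsight code: the `Readings` source, the sample values, and the `Threshold` KPI limit are all hypothetical. It keeps only a running count and sum, so each event updates the KPI as it arrives and nothing is loaded into a store first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event source: yields readings one at a time, as a feed would.
IEnumerable<double> Readings()
{
    foreach (var r in new[] { 120.0, 180.0, 450.0, 90.0, 160.0 })
        yield return r;
}

const double Threshold = 200.0;   // hypothetical KPI limit
var alerts = new List<string>();
long count = 0;
double sum = 0;

foreach (var value in Readings())
{
    // Update the aggregate incrementally; no event is stored or reloaded.
    count++;
    sum += value;
    double runningAvg = sum / count;

    // KPI-triggered action: record an alert as soon as the limit is crossed.
    if (runningAvg > Threshold)
        alerts.Add($"after event {count}: running average {runningAvg:F1} exceeds {Threshold}");
}

foreach (var alert in alerts)
    Console.WriteLine(alert);
```

The state here is two numbers regardless of how many events pass through, which is the property that lets this style of processing avoid the load step.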
  4. Digital Marketing Solution

    Example solution components: Web Servers, StreamInsight, Ad Servers, Database Servers, Search Servers, Facebook, YouTube, Blogs, Twitter, CRM, Transaction Data
  5. This is the streaming data paradigm in a nutshell: ask questions about data in flight.
  6. [Chart: $ value of analytics versus time of interest (years, months, days, hrs, min, sec, present). Labels: Historical Trend Analysis; Forecasting in Enterprises; Web Analytics – Ad placement, Financial Services, Smart Grids, Monitoring – Systems mgmt, Health Care, Manufacturing, etc.]
  7. [Chart: Facts/sec. (100, 1000, 10000, 100000) versus time of interest (years, months, days, hrs, min, sec, present). Labels: Traditional DW Analytics; Active DW analytics; Load time in ETL; ET time in ETL.]

    • Custom-built solutions that carry huge development and customization costs
    • Load barrier is dictated by current choices of the solution, e.g., loading into databases, persisting into files. This is intrinsic because in current approaches no processing can be done till the data is loaded.
  8. Analytical results need to reflect important changes in business reality immediately and enable responses to them with minimal latency

    Database Applications (request, response) versus Event-driven Applications (input stream, output stream):
    • Query Paradigm: ad-hoc queries or requests versus continuous standing queries
    • Latency: seconds, hours, days versus milliseconds or less
    • Data Rate: hundreds of events/sec versus tens of thousands of events/sec or more
    • Query Semantics: declarative relational analytics versus declarative relational and temporal analytics
  9. CEP Target Scenarios

    [Chart: Aggregate Data Rate (Events/sec.) (0, 10, 100, 1000, 10000, 100000, ~1 million) versus Latency (months, days, hours, minutes, seconds, 100 ms, < 1 ms). Regions: Relational Database Applications; Data Warehousing Applications; Operational Analytics Applications, e.g., Logistics, etc.; Web Analytics Applications; Manufacturing Applications; Monitoring Applications; Financial trading Applications.]
  10. [Diagram: StreamInsight Application Development and StreamInsight Application at Runtime. Event sources, Input Adapters, StreamInsight Engine (Standing Queries, Query Logic), Output Adapters, Event targets.]

    Event sources:
    • Devices, Sensors
    • Web servers
    • Event stores & Databases
    • Stock ticker, news feeds
    Event targets:
    • Event stores & Databases
    • Pagers & Monitoring devices
    • KPI Dashboards, SharePoint UI
    • Trading stations
  11. [Diagram labels: Data Stream; Stream Data Store & Archive; Event Processing Engine; Asset Specs & Parameters (Lookup); Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds]

    • Power, Utilities: energy consumption; outages; smart grids; 100,000 events/sec
    • Web Analytics: click-stream data; online customer behavior; page layout; 100,000 events/sec
    • Manufacturing: sensor on plant floor; react through device controllers; aggregated data; 10,000 events/sec
    • Financial Services: stock & news feeds; algorithmic trading; patterns over time; super-low latency; 100,000 events/sec
    • Threshold queries
    • Event correlation from multiple sources
    • Pattern queries
    • Visual trend-line and KPI monitoring
    • Batch & product management
    • Automated anomaly detection
    • Real-time customer segmentation
    • Algorithmic trading
    • Proactive condition-based maintenance
  12. A Scenario: Market Monitor

    [Diagram: Input Adapters (Push, Push, Push, Pull), StreamInsight, Output Adapters]

    Asset Class   Ticker   Exchange   SUM Volume   SUM Bid   SUM Ask
    Stock         MSFT     NASDAQ     100          100       100
    Stock         IBM      NASDAQ     200          200       200
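The Market Monitor output on slide 12 is a grouped aggregate: sums of volume, bid and ask per asset class, ticker and exchange. As a rough sketch of that shape, assuming plain LINQ to Objects rather than a StreamInsight query, the example below groups a handful of made-up ticks. The `ticks` array and its values are hypothetical and were chosen only so the sums match the table on the slide.

```csharp
using System;
using System.Linq;

// Hypothetical ticks; values chosen so the sums reproduce the slide's table.
var ticks = new[]
{
    (AssetClass: "Stock", Ticker: "MSFT", Exchange: "NASDAQ", Volume: 60,  Bid: 40,  Ask: 55),
    (AssetClass: "Stock", Ticker: "IBM",  Exchange: "NASDAQ", Volume: 120, Bid: 90,  Ask: 110),
    (AssetClass: "Stock", Ticker: "MSFT", Exchange: "NASDAQ", Volume: 40,  Bid: 60,  Ask: 45),
    (AssetClass: "Stock", Ticker: "IBM",  Exchange: "NASDAQ", Volume: 80,  Bid: 110, Ask: 90),
};

// Group by the three key columns, then sum each measure within the group.
var summary =
    (from t in ticks
     group t by new { t.AssetClass, t.Ticker, t.Exchange } into g
     select new
     {
         g.Key.AssetClass,
         g.Key.Ticker,
         g.Key.Exchange,
         SumVolume = g.Sum(x => x.Volume),
         SumBid = g.Sum(x => x.Bid),
         SumAsk = g.Sum(x => x.Ask),
     }).ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.AssetClass} {row.Ticker} {row.Exchange} {row.SumVolume} {row.SumBid} {row.SumAsk}");
```

In a streaming engine the same grouping would run as a standing query and the sums would be scoped to a time window; this batch version only shows the grouping and aggregation logic.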
  14. LINQ Example – GROUP&APPLY, WINDOW (Grouping, Window, Project & Aggregate):

        from e3 in MyStream3
        group e3 by e3.i into SubStream
        from win in SubStream.HoppingWindow(FiveMinutes, ThreeSeconds)
        select new { i = SubStream.Key, a = win.Avg(e => e.f) };

    LINQ Example – JOIN, PROJECT, FILTER (Join, Filter, Project):

        from e1 in MyStream1
        join e2 in MyStream2 on e1.ID equals e2.ID
        where e1.f2 == "foo"
        select new { e1.f1, e2.f4 };
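The two queries on slide 14 need a StreamInsight server and adapters to run. To show what they compute, here is a minimal sketch that mimics them with plain LINQ to Objects over in-memory arrays. Everything in it is an assumption made for illustration: the sample events, the integer timestamp field `t` (in seconds), the 6-second window with a 3-second hop, and the simplification that windows start at time 0. It does not use the StreamInsight API and ignores StreamInsight's event lifetimes and window alignment rules.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for MyStream1 and MyStream2.
var myStream1 = new[]
{
    (ID: 1, f1: "a", f2: "foo"),
    (ID: 2, f1: "b", f2: "bar"),
    (ID: 3, f1: "c", f2: "foo"),
};
var myStream2 = new[]
{
    (ID: 1, f4: 10.0),
    (ID: 3, f4: 30.0),
    (ID: 4, f4: 40.0),
};

// JOIN, FILTER, PROJECT: same query shape as on the slide.
var joined =
    (from e1 in myStream1
     join e2 in myStream2 on e1.ID equals e2.ID
     where e1.f2 == "foo"
     select new { e1.f1, e2.f4 }).ToList();

// Hypothetical stand-in for MyStream3; t is the event time in seconds.
var myStream3 = new[]
{
    (t: 0, i: 1, f: 10.0),
    (t: 1, i: 1, f: 20.0),
    (t: 2, i: 2, f: 5.0),
    (t: 4, i: 1, f: 30.0),
    (t: 5, i: 2, f: 15.0),
    (t: 7, i: 1, f: 40.0),
};
const int windowSize = 6;   // assumed window length in seconds
const int hopSize = 3;      // assumed hop in seconds

// Window start times 0, hopSize, 2*hopSize, ... up to the last event.
var windowStarts = Enumerable
    .Range(0, myStream3.Max(e => e.t) / hopSize + 1)
    .Select(k => k * hopSize)
    .ToList();

// GROUP&APPLY with a hopping window: per key, average f over each
// overlapping window [start, start + windowSize), skipping empty windows.
var windowed =
    (from e3 in myStream3
     group e3 by e3.i into subStream
     from start in windowStarts
     let win = subStream.Where(e => e.t >= start && e.t < start + windowSize).ToList()
     where win.Count > 0
     select new { i = subStream.Key, start, a = win.Average(e => e.f) }).ToList();

foreach (var row in joined)
    Console.WriteLine($"join: f1={row.f1} f4={row.f4}");
foreach (var row in windowed)
    Console.WriteLine($"window: i={row.i} start={row.start} avg={row.a}");
```

Because the hop is shorter than the window, an event such as the one at `t = 4` is counted in two windows, which is the defining behavior of a hopping window as opposed to a tumbling one.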
  15. Event processing engines are deployed at multiple places on different scales

    • At the edge – close to the data source
    • In the mid-tier – consolidate related data sources
    • In the data center – historical archive, mining, large scale correlation
    [Diagram: Data Sources (Devices, Sensors, Web servers, Feeds) and multiple StreamInsight instances, with tiers labeled Aggregation & Correlation and Complex Analytics & Mining. Captions: CEP for lightweight processing and filtering; CEP for aggregation and correlation of in-flight events; CEP for complex analytics including historical data.]
  16. Manufacturing • Utilities • Oil & Gas • Financial Services • Web Analytics • Telco

    Scenarios: Alarming Notifications Real-Time Analysis AMI/SmartGrid Outage Management Well Monitoring Operational Intelligence Risk Management Market Monitoring Behavioral Targeting Load Monitoring CDR Aggregation
    ISV: OSIsoft Matrikon ICONICS OSIsoft Matrikon Telvent ICONICS OSIsoft Matrikon Lab49 IMGroup MSFT AdCenter XBox DPE
    SI: Logica Logica Logica Hitachi Consulting Lab49 IMGroup MSFT AdCenter XBox DPE
  17. [Diagram: StreamInsight Application Development and StreamInsight Application at Runtime. Event sources, Input Adapters, StreamInsight Engine (Standing Queries, Query Logic), Output Adapters, Event targets.]

    Event sources:
    • Devices, Sensors
    • Web servers
    • Event stores & Databases
    • Stock ticker, news feeds
    Event targets:
    • Event stores & Databases
    • Pagers & Monitoring devices
    • KPI Dashboards, SharePoint UI
    • Trading stations
    Key points:
    • Flexible adapter SDK with high performance to connect to different event sources and sinks
    • Event-driven applications are fundamentally different from traditional database applications: queries are continuous, consume and produce streams, and compute results incrementally
    • The CEP platform does the heavy lifting for you to deal with temporal characteristics of event stream data
    • Development experience with .NET, C#, LINQ and Visual Studio 2008 and 2010
    • CEP platform from Microsoft to build event-driven applications