Becoming Insight Driven With Big Data

Daan Bakboord
December 04, 2017

Organisations need to innovate to stay ahead of the competition and to survive in the long run. Innovation usually involves IT, and IT needs reliable information to deliver insights. Reliable information is derived from data.
Oracle has a (Big) Data platform which aims to give organisations access to any data source of any data type. That data can be accessed with any type of analysis and in any language. Oracle offers the same platform both on-premises and in the cloud.

This presentation gives an introduction to the Oracle Big Data Cloud offering. It covers which services are offered and how they fit into the Oracle Information Reference Architecture.

The Oracle Cloud offering keeps evolving and growing. This presentation aims to put things in context. Which Oracle Cloud services would one need to fulfill the needs of an organization?

Transcript

  1. 4 Introduction Quistor: Your Business Analytics Partner of Choice Customers

    Worldwide 150+ Analytics & Big Data 12 Years In Business Value Propositions 4 Delivery Centers 170 Employees 10 European Offices 35y Average Age Oracle Platinum Partner Managed Services JD Edwards Digital 24/7 Cloud ExaHotel
  2. 5 Who am I? http://www.daanbakboord.com https://twitter.com/daanbakboord https://nl.linkedin.com/in/daanbakboord Daan Bakboord •

    Oracle Big Data Analytics Consultant @ Quistor – Oracle BI EE (OBIEE) – Oracle Analytics Cloud (OAC, BICS) – Oracle Data Visualization – Oracle Big Data – Oracle BI Applications (OBIA) • Information Architecture – TOGAF – ArchiMate http://blog.daanalytics.nl #obihackers nl.OUG BIWA SIG Lead
  3. 6 Data Driven Decisions Initiatives IT IT IT Reliable Information

    / Insights A B Change / Policy / Strategy Data - Organisations need to take actions to move forward from A to B - Initiatives involve an IT component and are driven by reliable information / insights - It all begins with having access to the right data
  4. 8

  5. 9 Today’s data challenges • Too many different varieties of

    Data Sources • Big Data Technology fragmented and complex • Specialized skills in short supply
  6. 10 Oracle Information Management Reference Architecture – Data Management Data

    Fast Data Events Actions 1 2 3 Streams Data Management Reservoir Factory Warehouse Results People Data Services Smart Things Data Lab Data Science Discovery Data Sets Apps Packaged Custom Business Analytics Visualization Reports Execution Innovation
  7. 11 Traditional Business Intelligence – Oracle Data Warehouse Deployment Choice

    Data Management Reservoir Factory Warehouse • Ideal Database Hardware • Smart System Software • Full-Stack Integration On-Premises Oracle Exadata Customer Data Center Purchased Customer Managed Oracle Cloud Oracle Exadata Cloud Service Oracle Data Center Subscription Oracle Managed Customer Cloud Oracle Exadata Cloud Service Customer Data Center Subscription Oracle Managed Oracle Cloud Autonomous Database Cloud Service Oracle Data Center Subscription Oracle Managed
  8. 12 Traditional Business Intelligence – Data Warehousing Data Management Reservoir

    Factory Warehouse Oracle Database 12.2 – New Features • Better In-Memory capabilities for DWH – Data Scans – Joins – Aggregation • New SQL Features – Approximate Query processing – Faster JSON processing via in-memory – Analytic Views • Common business logic inside the database • New highly-scalable Property Graph analytics
  9. 13 What’s Hadoop? Hadoop is a Software Framework for Storing,

    Processing and Analyzing Big Data • Distributed • Scalable • Fault-tolerant • Open Source
  10. 14 Core Hadoop • Distributed File System HDFS – Stores

    data • Hadoop MapReduce – Processes data • Hadoop YARN – Schedules work
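
To make the "MapReduce processes data" bullet on this slide concrete, here is a minimal single-process sketch of the map, shuffle and reduce steps using a word count. It is plain Python, not the Hadoop API; the function names and the sample input are assumptions made for illustration, and a real cluster would run the map and reduce steps in parallel across nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input: each string stands in for one block of a file stored in HDFS.
blocks = ["big data needs big storage", "data drives insight"]
counts = word_count(blocks)
print(counts)
```

The point of the split is that `map_phase` only ever sees one record and `reduce_phase` only ever sees one key, which is what lets a framework distribute the work.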
  11. 16 Big Data Distributions How to address: • Support

    • Compliance • Performance • Scalability • Security Big Data Distributions Pre-packaged, tested and validated solutions based on Apache Hadoop – Technical Support – Services – Training
  12. 17 Big Data Distributions – Cloudera Enterprise Data Hub •

    Proven, user-friendly technology. – Use Case; Enterprise Data Hub. Let the Hadoop platform serve as a central data repository.
  13. 18 Big Data Distributions – MapR Converged Data Platform •

    Stable platform with a generic file-system and fast processing. – Use Case; Integrated platform with a focus on streaming.
  14. 19 Big Data Distributions – Hortonworks Connected Data Platforms •

    100% Open source with minimal investment. – Use Case; Modernizing your traditional EDW.
  15. 20 Oracle Big Data Appliance Data Management Reservoir Factory Warehouse

    Hardware and Software engineered together Oracle Big Data Appliance includes: • Oracle Sun x86 servers powered by the Intel® Xeon® processor family • InfiniBand and Ethernet connectivity • Cloudera Enterprise – Data Hub Edition (including CDH, Impala, Spark, Kafka, etc.) • Oracle NoSQL Database Community Edition • Comprehensive security, including authentication, authorization, and auditing capabilities • Oracle Linux • Oracle Java JDK • Oracle R Distribution
  16. 21 Oracle Big Data Appliance Data Management Reservoir Factory Warehouse

    Hardware and Software engineered together Oracle Big Data Appliance includes optionally: • Oracle Big Data SQL • Oracle Big Data Connectors: – Oracle SQL Connector for Hadoop – Oracle Loader for Hadoop – Oracle XQuery for Hadoop – Oracle R Advanced Analytics for Hadoop – Oracle Data Integrator • Audit Vault and Database Firewall for Hadoop Auditing • Oracle Data Integrator • Oracle GoldenGate • Oracle NoSQL Database Enterprise Edition • Oracle Big Data Spatial and Graph • Oracle Big Data Discovery
  17. 25 Object Storage Cloud Object Storage Cloud Data in &

    Data out. • Ingest Data through Kafka (Event Hub – Kafka Cloud) • Process it through a processing tier – Oracle Big Data Cloud Service – Oracle Database Cloud • Land it in the Object Storage Cloud Service – Object Storage (Elastic, Fast and Secure) – Archive Storage (Infrequently accessed Data) – Database Backup – Large Dataset Transfer • Work with it via PaaS or Custom Service Database Cloud Event Hub – Kafka Cloud Big Data Cloud Object Storage Cloud Service
  18. 27 Oracle Big Data in the Cloud Services (Compute Edition)

    Big Data Cloud Compute Oracle Big Data Cloud Service (BDCS) • Long-Lived Clusters • Full Cloudera Eco-System • Engineered Systems backbone • Focus on Performance & Control of environment • Big Data Cloud SQL • Big Data Cloud Machine (Customer On-Premise) Big Data Cloud Oracle Big Data Cloud Service Compute Edition (BDCS-CE) • Short-Lived Clusters – POC • Apache Hadoop & Apache Spark • Focus on Flexibility & Simplicity
  19. 28 Oracle Event Hub Cloud Service • Apache Kafka delivered

    as a managed service • Real-time Streaming data into Oracle Object Storage • Integrated with Oracle Data Integration Cloud • Elastic o Dedicated by the nodes o Multi-tenant by the partitions • On Oracle Public Cloud or On-Premise through Cloud@Customer • Open Standards based Event Hub – Kafka Cloud
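
This slide sizes the multi-tenant offering by partitions. As background, the sketch below shows what a partition means in a Kafka-style topic: records with the same key land in the same partition and keep their order there. It is an in-memory stand-in, not a Kafka client; the `Topic` class, the topic name and the use of CRC32 as the hash are assumptions for illustration only.

```python
import zlib
from collections import defaultdict


class Topic:
    """In-memory stand-in for a partitioned topic (not a Kafka client)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)  # partition id -> ordered records

    def partition_for(self, key):
        # Assumption: CRC32 as a stable hash for the sketch.
        # Kafka's own default partitioner uses a different hash function.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, key, value):
        """Append the record to the partition chosen by its key."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition


# Hypothetical topic and events.
topic = Topic("sensor-events", num_partitions=3)
p1 = topic.send("device-42", "temp=20.5")
p2 = topic.send("device-42", "temp=20.7")
topic.send("device-7", "temp=18.0")
device_42_values = [value for key, value in topic.partitions[p1] if key == "device-42"]
```

More partitions allow more consumers to read in parallel, which is why partition count is a natural unit for sizing a shared service.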
  20. 32 “Change is the law of life. And those who

    look only to the past or present are certain to miss the future.” – John F. Kennedy
  21. 34 Big Data Analytics – Oracle Analytics Cloud "The Forrester

    Wave™: Enterprise BI Platforms with Majority Cloud Deployments, Q3 2017” Complete spectrum of Enterprise BI needs – From Self Service to Oracle Essbase MOLAP – Common Enterprise Information Model – Oracle Day by Day Mobile • Smart • Governed • Hadoop / Spark • Search • Visual-based
  22. 35 Oracle Analytics Cloud Service (OAC) – Collaborative, providing efficient

    methods to interact and share information – Connected, to all the data required to support processes and decisions – Complete, providing all the needed analytical capabilities – Choice, about how to deploy, both now and in the future. ”Ask any Question of any Data, in any Environment, using any Device”
  23. 37 Data Lake • Collect & organize large volumes of

    diverse data for later use – raw / original / native / as-is • Preparation & transformation based on the use case – Schema on Read • Benefits – Lower costs – Greater flexibility James Dixon, CTO of Pentaho in 2012
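
The "Schema on Read" bullet on this slide can be illustrated with a small sketch: raw records are kept as they arrived, and a schema is applied only when a specific use case reads them. The sample records, the `read_with_schema` helper and the revenue schema are all hypothetical.

```python
import json

# Raw events are stored exactly as they arrived; no schema is enforced on write.
raw_lines = [
    '{"customer": "A", "amount": "12.50", "channel": "web"}',
    '{"customer": "B", "amount": 7, "extra": {"campaign": "x"}}',
    '{"customer": "A"}',
]


def read_with_schema(lines, schema):
    """Apply a schema (field -> (cast, default)) at read time."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: cast(record[field]) if field in record else default
            for field, (cast, default) in schema.items()
        }


# Hypothetical schema for one use case: revenue per customer.
revenue_schema = {"customer": (str, None), "amount": (float, 0.0)}
rows = list(read_with_schema(raw_lines, revenue_schema))

revenue = {}
for row in rows:
    revenue[row["customer"]] = revenue.get(row["customer"], 0.0) + row["amount"]
print(revenue)
```

A different use case could read the same raw lines with a different schema, for example one that keeps `channel` and ignores `amount`, without reloading any data.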
  24. 38 Data Swamp ”Storing all data does not automatically return

    Value” • Context • Governance – planning (e.g. ingestion still needed?) – rules – processes – health checks • Security • Ownership & Sponsorship • Architecture & Technologies Data Lakes don’t replace the Data Warehouse
  25. 40 • Data Ingestion – Replicate (SaaS & Fusion apps)

    – Incremental – Continuous (Oracle GoldenGate) • Manage Data – Projects – Connections – Data Sets – Data Flows & Sequences Discover Discover & access diverse Data Sources Data Integration Services
  26. 41 Prepare • Prepare Data Sets – Excel like interface

    • Data Flows • Advanced Transforms & Scripts – Time Series Forecast – Sentiment Analysis – Custom Scripts in Python & R • Load data into Essbase Programmatic Integration
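
This slide lists time series forecasting and custom Python scripts among the advanced transforms. As a rough idea of the kind of logic such a script could hold, here is a moving-average forecast in plain Python. It does not follow the packaging format Oracle Analytics Cloud expects for custom scripts, and the function name, parameters and data are assumptions.

```python
def moving_average_forecast(series, window=3, horizon=2):
    """Forecast `horizon` future points; each is the mean of the last `window` values."""
    if len(series) < window:
        raise ValueError("series is shorter than the window")
    history = list(series)
    forecast = []
    for _ in range(horizon):
        next_value = sum(history[-window:]) / window
        forecast.append(next_value)
        history.append(next_value)  # feed the forecast back in for multi-step output
    return forecast


# Hypothetical monthly order counts.
monthly_orders = [10, 12, 14, 16]
forecast = moving_average_forecast(monthly_orders, window=2, horizon=2)
print(forecast)
```

A moving average is about the simplest forecasting method available; it is used here only because it fits in a few lines, not because the product is known to use it.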
  27. 43 Predict • Machine Learning Data Flows • Various Machine-Learning

    Algorithms – Numeric Prediction – Multi-Classifier – Binary Classifier – Clustering – Custom Algorithms Machine Learning models Data Scientists Analysts • Machine to Human – Self-Learning Algorithms • Human to Machine – Define models
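
Of the algorithm families on this slide, numeric prediction is the easiest to show in a few lines. The sketch below fits a straight line by ordinary least squares and uses it to predict a new value. It is a generic textbook method in plain Python, not the algorithm the product ships, and the data and function names are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) model to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: campaign spend (x) against orders (y).
spend = [1, 2, 3, 4]
orders = [3, 5, 7, 9]
model = fit_line(spend, orders)
next_orders = predict(model, 5)
```

In the slide's terms, choosing the features and the model family is the "Human to Machine" part, while fitting `slope` and `intercept` from the data is the part the machine learns.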
  28. 44 Manage & Monitor • Schedule – Data flows –

    Replication – Ingestion • Jobs overview – Statistics – Success or failure information Scheduling and Monitoring
  29. 47 Oracle Data Warehouse Evolution – “Transforming to Big Data”

    ”Data Warehousing is Dead?” Data Management Reservoir Factory Warehouse Combine the best of both worlds • Extend Oracle DWH with Oracle Big Data • Combining (new) Big Data with Enterprise Data • Relational & Hadoop & NoSQL – On-Premises & Cloud • Transactional & Social and Web & IoT • Analytics & Data Mining & Machine Learning
  30. 49