Is Data Warehousing dead?

Daan Bakboord
December 06, 2017

More and more people and devices are connected. Mobile, social and the Internet of Things are causing data volumes to grow. New technologies open up new possibilities, and they also lead to new ways of managing data.
Traditional Data Warehousing may no longer be sufficient to serve these new data needs. Think about structured versus unstructured data, or batch versus streaming data. And what about graph data? Do we need new technologies and architectures to keep up with ever-changing business demands?

This presentation covers Oracle Data Warehousing in the Big Data age. Data Warehousing has been around for many years, and the Oracle Database plays a central role in Oracle Data Warehousing. The Oracle 12c Database is not the relational database it used to be: it is equipped to meet the changing needs of Data Management. What does the Oracle 12c Database have to offer for these changing needs?

Transcript

  1. 2 Google's pizza
    • Hello! Gordon's pizza?
    o No sir, it's Google's pizza.
    • So it's a wrong number? Sorry.
    o No sir, Google bought it.
    • OK. Take my order please.
    o Well sir, you want the usual?
    • The usual? You know me?
    o According to our caller ID data sheet, the last 12 times you ordered pizza with cheeses, sausage, thick crust.
    • OK! This is it ...
    o May I suggest to you this time ricotta, arugula with dry tomato?
    • What? I hate vegetables.
  2. 3 Google's pizza
    o Your cholesterol is not good, sir.
    • How do you know?
    o We crossed the number of your fixed line ☎ with your name, through the subscribers guide. We have the result of your blood tests for the last 7 years.
    • Okay, but I do not want this pizza! I already take medicine ...
    o Excuse me, but you have not taken the medicine regularly. From our commercial database, 4 months ago, you only purchased a box with 30 cholesterol tablets at Drugsale Network.
    • I bought more from another drugstore.
    o It's not showing on your credit card statement.
    • I paid in cash.
  3. 4 Google's pizza
    o But you did not withdraw that much cash according to your bank statement.
    • I have another source of cash.
    o This is not showing as per your last tax form, unless you bought them from an undeclared income source.
    • WHAT THE HELL?
    o I'm sorry, sir, we use such information only with the intention of helping you.
    • Enough! I'm sick of Google, Facebook, Twitter, WhatsApp. I'm going to an island without internet or cable TV, where there is no cell phone line and no one to watch me or spy on me.
    o I understand, sir, but you need to renew your passport first, as it expired 5 weeks ago.
  4. 5 Introduction Quistor: Your Business Analytics Partner of Choice
    • Customers Worldwide 150+
    • Analytics & Big Data
    • 12 Years In Business
    • Value Propositions
    • 4 Delivery Centers
    • 170 Employees
    • 10 European Offices
    • 35y Average Age
    • Oracle Platinum Partner
    • Managed Services
    • JD Edwards
    • Digital
    • 24/7
    • Cloud
    • ExaHotel
  5. 6 Who am I?
    Daan Bakboord
    • Oracle Big Data Analytics Consultant @ Quistor
      – Oracle BI EE (OBIEE)
      – Oracle Analytics Cloud (OAC, BICS)
      – Oracle Data Visualization
      – Oracle Big Data
      – Oracle BI Applications (OBIA)
    • Information Architecture
      – TOGAF
      – Archimate
    http://www.daanalytics.nl
    http://blog.daanalytics.nl
    https://twitter.com/daanbakboord
    https://nl.linkedin.com/in/daanbakboord
    #obihackers
    nl.OUG BIWA SIG Lead
  6. 7 Bloodhound: The 1,000 Mph Car, Powered by Data
    “Oracle is helping Bloodhound smash the land speed record and reach 1,000 mph” http://bit.ly/oracle_bloodhound
    • Artificial Intelligence
    • Augmented Reality
    • Geo Spatial Technology
    • Data Visualisation
    • Virtual Reality
    • Real Time Education
  7. 9
    • Challenge
      – Massive amounts of storage are needed to index the entire web
      – Processing large amounts of data requires a new approach
    • Solution
      – GFS, the Google File System
        o Described in a paper released in 2003
      – Distributed MapReduce
        o Described in a paper released in 2004
  8. 10 What’s Hadoop?
    Hadoop is a Software Framework for Storing, Processing and Analyzing Big Data
    • Distributed
    • Scalable
    • Fault-tolerant
    • Open Source
  9. 11 Core Hadoop
    • Distributed File System HDFS
      – Stores data
    • Hadoop MapReduce
      – Processes data
    • Hadoop YARN
      – Schedules work
    Hadoop Eco System
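To make the "MapReduce processes data" idea concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases, using word counting as the example. It is not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real cluster would run the map and reduce steps in parallel on many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    print(word_count(["big data big value", "data lake data warehouse"]))
```

Because each document is mapped independently and each key is reduced independently, both steps can be distributed, which is the property that lets this model scale out.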
  10. 13 The Data Warehouse
    • Definition (Bill Inmon)
      – “A (Data) Warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process.” William H. Inmon (1990)
    A Data Warehouse is NOT the ultimate goal; it serves to support the ‘decision-making process’. This process should be clear before you can make a definitive decision about the development of the Data Warehouse.
  11. 14 Challenges
    Business
    • What do we want with Analytics?
    • Uniform Data Definitions
    • Ownership
    Technical
    • Complexity of the Source
      – Lack of documentation
    • Combining Data Sources
    • Keeping history
    • Data Quality
      – Shit in – Shit out
    • Performance
  12. 15 Data Lake
    • Collect & organize large volumes of diverse data for later use
      – raw / original / native / as-is
    • Preparation & transformation based on the use case
      – Schema on Read
    • Benefits
      – Lower costs
      – Greater flexibility
    James Dixon, CTO of Pentaho in 2012
  13. 16 Data Swamp
    “Storing all data does not automatically return Value”
    • Context
    • Governance
      – planning (e.g. ingestion still needed?)
      – rules
      – processes
      – health checks
    • Security
    • Ownership & Sponsorship
    • Architecture & Technologies
    Data Lakes don’t replace the Data Warehouse
  14. 17 Data Lake compared to the Data Warehouse
    “What is a Data Lake capable of that a Data Warehouse is not?”
    • Quickly and Cheaply Store & Process any type of data

    DWH “Schema-on-Write”     |                  | Data Lake “Schema-on-Read”
    Create schema before load | Schema (changes) | No schema, just copy the data
    Explicit load operation   | Transformation   | SerDe to extract columns
    Standards                 | Governance       | Loosely structured
    Limited                   | Processing       | Coupled with the data
    Structured                | Data Types       | Un-/semi-structured
    Scale up                  | Scalability      | Scale out
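The difference between the two columns can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite stands in for the warehouse, a list of raw JSON lines stands in for files in a lake, and the names `RAW_EVENTS`, `schema_on_write` and `schema_on_read` are made up for the example.

```python
import json
import sqlite3

# Raw events as they arrive; the second one carries an extra field.
RAW_EVENTS = [
    '{"customer": "alice", "amount": 12.5}',
    '{"customer": "bob", "amount": 7.0, "channel": "mobile"}',
]


def schema_on_write(raw_events):
    """Create the schema first, then load: fields outside the schema are dropped."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT NOT NULL, amount REAL NOT NULL)")
    for line in raw_events:
        record = json.loads(line)
        conn.execute(
            "INSERT INTO sales (customer, amount) VALUES (?, ?)",
            (record["customer"], record["amount"]),
        )
    return conn.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall()


def schema_on_read(raw_events, columns):
    """Keep the raw lines untouched; pick the columns only when reading."""
    return [tuple(json.loads(line).get(col) for col in columns) for line in raw_events]


if __name__ == "__main__":
    print(schema_on_write(RAW_EVENTS))
    print(schema_on_read(RAW_EVENTS, ["customer", "channel"]))
```

The `channel` field never reaches the `sales` table unless the schema is changed and the data reloaded, whereas the schema-on-read path can ask for it at any time, at the price of parsing on every read.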
  15. 18 Use cases for Hadoop compared to the Data Warehouse
    Data Lake is for...
    • Data Discovery
    • Processing & Storing Large (un-/semi-structured) data sets
    Data Warehouse is for...
    • Interactive OLAP-Analytics
    • Complex ACID transactions

    DWH “Schema-on-Write”    | Data Lake “Schema-on-Read”
    Reads are Fast           | Writes are Fast
    Governance and Structure | Flexibility & Agility
  16. 19 Traditional Business Intelligence – Oracle Data Warehouse Deployment Choice
    Data Management: Reservoir, Factory, Warehouse
    • Ideal Database Hardware
    • Smart System Software
    • Full-Stack Integration
    Deployment choices
    • On-Premises: Oracle Exadata
      – Customer Data Center
      – Purchased
      – Customer Managed
    • Oracle Cloud: Oracle Exadata Cloud Service
      – Oracle Data Center
      – Subscription
      – Oracle Managed
    • Customer Cloud: Oracle Exadata Cloud Service
      – Customer Data Center
      – Subscription
      – Oracle Managed
    • Oracle Cloud: Autonomous Database Cloud Service
      – Oracle Data Center
      – Subscription
      – Oracle Managed
  17. 20 Oracle Database Platform
    • Analytical Services: SQL, In-Memory, R, Advanced Analytics, OLAP
    • Data Support: Relational, JSON, XML, Spatial, Graph, Text, Binary
    • Platform Services: Cloud to On-Premise, Clustering, Security, High Availability, Zero Data Loss, Administration
    • Development Services: Node.js, Python, .NET, Java, PHP, Ruby, PL/SQL, C, C++, Perl, ORDS, APEX, SODA
  18. 21 Traditional Business Intelligence – Data Warehousing
    Data Management: Reservoir, Factory, Warehouse
    Oracle Database 12.2 – New Features
    • Better In-Memory capabilities for DWH
      – Data Scans
      – Joins
      – Aggregation
    • New SQL Features
      – Approximate Query processing
      – Faster JSON processing via in-memory
      – Analytic Views
    • Common business logic inside the database
    • New highly-scalable Property Graph analytics
  19. 22 Oracle REST Data Services
    • Relational DB
    • Document Store
    • NoSQL
    • Oracle REST Data Services: Standalone (Jetty), WebLogic, Tomcat and GlassFish
    • REST
  20. 23
    • Included in Oracle Database & Oracle SQL Developer installs
      o Mid-tier Java application
      o ORDS maps HTTP(S) verbs (GET, POST, PUT, DELETE, etc.) to database transactions
      o Returns any results formatted using JSON
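The verb-to-transaction mapping described above can be illustrated with a small stand-in. This is not ORDS code and uses no ORDS API: SQLite replaces the Oracle Database, and `make_db`, `handle` and the `decks` table are hypothetical names chosen for the sketch.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with one example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE decks (id INTEGER PRIMARY KEY, title TEXT)")
    return conn


def handle(conn, verb, deck_id=None, body=None):
    """Translate one HTTP verb into a database statement and return JSON text."""
    if verb == "GET":
        rows = conn.execute(
            "SELECT id, title FROM decks WHERE id = ?", (deck_id,)
        ).fetchall()
        result = [{"id": row[0], "title": row[1]} for row in rows]
    elif verb == "POST":
        cur = conn.execute("INSERT INTO decks (title) VALUES (?)", (body["title"],))
        result = {"id": cur.lastrowid}
    elif verb == "PUT":
        cur = conn.execute(
            "UPDATE decks SET title = ? WHERE id = ?", (body["title"], deck_id)
        )
        result = {"updated": cur.rowcount}
    elif verb == "DELETE":
        cur = conn.execute("DELETE FROM decks WHERE id = ?", (deck_id,))
        result = {"deleted": cur.rowcount}
    else:
        raise ValueError(f"unsupported verb: {verb}")
    conn.commit()  # each request is committed as its own transaction
    return json.dumps(result)


if __name__ == "__main__":
    db = make_db()
    print(handle(db, "POST", body={"title": "Is Data Warehousing dead?"}))
    print(handle(db, "GET", deck_id=1))
```

A real mid-tier would also handle routing, authentication and error responses; the sketch only shows the core idea of one verb, one transaction, one JSON result.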
  21. 24 JSON
    • What is JSON?
      – JavaScript Object Notation (JSON)
        • a lightweight data-interchange format
        • a syntax for storing and exchanging data
        • "self-describing" & easy to understand
        • language independent
    • Why use JSON?
      – the JSON format is text only
        • easily sent to and from a server
        • data format for any programming language
  22. 25 JSON support in Oracle Database
    • Creating Tables to Hold JSON
    • Querying JSON Data
      – IS JSON
      – JSON_EXISTS
      – JSON_VALUE
      – JSON_QUERY
      – JSON_TABLE
      – JSON_TEXTCONTAINS
    • Identifying Columns Containing JSON
    • Loading JSON Files Using External Tables
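To give a feel for what three of these conditions and functions do, here is a rough Python analogue. It is not Oracle SQL and does not reproduce Oracle's exact semantics: the helper names `is_json`, `json_exists` and `json_value` are borrowed from the slide only as labels, and the dotted path (`customer.name`) is a simplified stand-in for a SQL/JSON path expression such as `$.customer.name`.

```python
import json


def is_json(text):
    """Analogue of an IS JSON check: does the text parse as JSON?"""
    try:
        json.loads(text)
        return True
    except (TypeError, ValueError):
        return False


def _walk(document, path):
    """Follow a dotted path through nested objects; return (found, node)."""
    node = document
    for key in path.split("."):
        if isinstance(node, dict) and key in node:
            node = node[key]
        else:
            return False, None
    return True, node


def json_exists(text, path):
    """Analogue of JSON_EXISTS: is there anything at the given path?"""
    return _walk(json.loads(text), path)[0]


def json_value(text, path):
    """Analogue of JSON_VALUE: return a scalar at the path, else None."""
    found, node = _walk(json.loads(text), path)
    if found and not isinstance(node, (dict, list)):
        return node
    return None


if __name__ == "__main__":
    doc = '{"customer": {"name": "Daan", "tags": ["bi"]}}'
    print(is_json(doc), json_exists(doc, "customer.tags"), json_value(doc, "customer.name"))
```

In this sketch `json_value` returns `None` for objects and arrays, since it is meant for scalars only; the slide lists JSON_QUERY separately for extracting such fragments.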
  23. 26 Analytic Views – a new type of view in the Oracle Database
    • Business logic back into the database
      – 3 new database objects
      – aggregations, hierarchies, calculations
    • Easily queried and designed with SQL
    • No persistent storage
      – works on existing tables and views
    • Built-in data visualization via APEX
  24. 27 3 New Database Objects
    • Attribute Dimensions: Maps to Dimension / Attribute data
    • Hierarchies: Organizes levels into aggregations and drill paths
    • Analytic Views: Maps to data objects with Fact / Measure data
    Can be queried with MDX & SQL
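How the three objects relate can be sketched with plain Python data structures. This is a conceptual illustration only, not Oracle syntax: `GEOGRAPHY` plays the role of an attribute dimension, `HIERARCHY_LEVELS` the hierarchy, and `aggregate` the view that rolls fact data up along it. All names and figures are invented for the example.

```python
from collections import defaultdict

# Attribute dimension: one entry per leaf member with its parent attributes.
GEOGRAPHY = {
    "Amsterdam": {"country": "Netherlands", "region": "Europe"},
    "Utrecht": {"country": "Netherlands", "region": "Europe"},
    "Berlin": {"country": "Germany", "region": "Europe"},
}

# Hierarchy: the drill path from the most aggregated level to the leaf level.
HIERARCHY_LEVELS = ["region", "country", "city"]

# Fact data: (city, sales amount) rows.
SALES = [("Amsterdam", 100), ("Utrecht", 50), ("Berlin", 70)]


def aggregate(level):
    """Sum the sales measure at the requested hierarchy level."""
    if level not in HIERARCHY_LEVELS:
        raise ValueError(f"unknown level: {level}")
    totals = defaultdict(int)
    for city, amount in SALES:
        member = city if level == "city" else GEOGRAPHY[city][level]
        totals[member] += amount
    return dict(totals)


if __name__ == "__main__":
    for level in HIERARCHY_LEVELS:
        print(level, aggregate(level))
```

Nothing is stored per level: each aggregate is computed from the existing fact rows on request, which mirrors the "no persistent storage" point made on the previous slide.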
  25. 28

  26. 29 Polyglot Persistence
    “Polyglot Persistence is a fancy term to mean that when storing data, it is best to use multiple data storage technologies, chosen based upon the way data is being used by individual applications or components of a single application. Different kinds of data are best dealt with different data stores. In short, it means picking the right tool for the right use case.” http://www.jamesserra.com/archive/2015/07/what-is-polyglot-persistence/
  27. 31 Polyglot Persistence – Multi Model
    • Integrated access to all data in the different database objects
      – Relational
      – XML
      – JSON
      – Text
      – Graph & Spatial
  28. 32 Polyglot Persistence – Single Model
    • Support for multiple Single Model data stores
    • Integrated access via Oracle Big Data SQL
  29. 36 Oracle Data Warehouse Evolution – “Transforming to Big Data”
    “Data Warehousing is Dead?”
    Data Management: Reservoir, Factory, Warehouse
    Combine the best of both worlds
    • Extend Oracle DWH with Oracle Big Data
    • Combining (new) Big Data with Enterprise Data
    • Relational & Hadoop & NoSQL
      – On-Premises & Cloud
    • Transactional & Social and Web & IoT
    • Analytics & Data Mining & Machine Learning
    “Data Warehousing is not Dead, it’s Evolving!”
  30. 38