Unifying Analytics Across Data Sources with Red Hat JBoss Data Virtualization

How can a business intelligence team evaluate and unify data that exists in many different types of data sources? For Red Hat IT, the answer is our own JBoss Data Virtualization product, which unifies access to these sources behind a single SQL interface that integrates easily with existing reporting software.

In this session you will learn:
* The problems data virtualization solves for Red Hat Business Intelligence, such as relating data between databases, web services, spreadsheets and flat files.
* How Red Hat IT enabled analytics across numerous data sources using JBoss Data Virtualization:
** Cloud friendly architecture
** Automated installation and configuration using Puppet on AWS
** Caching and materialized views
** Supporting multiple clients with different data needs
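
To make the idea of relating a database to a flat file behind one SQL interface concrete, here is a minimal sketch in Python. It is not JBoss Data Virtualization: it uses the standard-library `sqlite3` and `csv` modules and copies the file into memory, whereas a virtualization layer would query the sources in place. The table names, columns and sample values are all invented for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source: open bug counts per product, as CSV text.
BUGS_CSV = "product,open_bugs\nrhel,12\njboss,7\n"


def build_virtual_layer() -> sqlite3.Connection:
    """Expose a relational source and a flat-file source behind one SQL interface."""
    conn = sqlite3.connect(":memory:")
    # Source 1: a relational table (stands in for an operational database).
    conn.execute("CREATE TABLE subscriptions (product TEXT, customers INTEGER)")
    conn.executemany(
        "INSERT INTO subscriptions VALUES (?, ?)",
        [("rhel", 300), ("jboss", 120)],
    )
    # Source 2: a flat file, loaded into a table so SQL clients can reach it.
    conn.execute("CREATE TABLE bugs (product TEXT, open_bugs INTEGER)")
    reader = csv.DictReader(io.StringIO(BUGS_CSV))
    conn.executemany(
        "INSERT INTO bugs VALUES (?, ?)",
        [(row["product"], int(row["open_bugs"])) for row in reader],
    )
    return conn


def products_with_bugs(conn: sqlite3.Connection) -> list:
    """Relate the two sources with an ordinary SQL join."""
    query = """
        SELECT s.product, s.customers, b.open_bugs
        FROM subscriptions AS s
        JOIN bugs AS b ON b.product = s.product
        ORDER BY s.product
    """
    return conn.execute(query).fetchall()
```

Calling `products_with_bugs(build_virtual_layer())` returns one row per product with columns from both sources, which is the kind of result a reporting tool would see through a single SQL endpoint.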

Naveen Malik

June 29, 2016

Transcript

  1. UNIFYING ANALYTICS ACROSS DATA SOURCES WITH RED HAT JBOSS DATA VIRTUALIZATION

    Naveen Malik - Principal Software Engineer
    Burak Serdar - Principal Software Engineer
    Ian Firman - Business Intelligence Architect
    June 29, 2016
    #redhat #rhsummit
  2. GET THIS DECK NOW...
  3. RED HAT IT JOURNEY

    Opportunity, Research, Test, Plan, Execute, Maintain
  4. OUR JOURNEY TODAY

    Opportunity, Research, Test, Plan, Execute, Maintain
    Ian Firman, Burak Serdar, Naveen Malik
  5. WHAT IS JBOSS DATA VIRTUALIZATION?

    JBoss Data Virtualization is a data abstraction solution that sits in front of multiple data sources, allows them to be treated as a single source and accessed by various data consumers and/or applications.
    Heterogeneous Sources, Heterogeneous Clients
  6. OPPORTUNITY: BUSINESS CASE

    Opportunity
  7. OPPORTUNITY

    * Data is coming in fast
  8. OPPORTUNITY

    * Data is coming in fast
    * Diverse data consumers
  9. OPPORTUNITY

    * Diverse data sources
    * Data is coming in fast
    * Diverse data consumers
  10. OPPORTUNITY

    * Diverse data sources
    * Data is coming in fast
    * Diverse data consumers
    * Combining real time and historical data
  11. RED HAT USE CASES: TRANSACTIONAL

    LOW COMPLEXITY (ONE END POINT)
    * Report from Bugzilla
    * Simplify queries
    * Filter sensitive data
    * Extract to Warehouse
  12. RED HAT USE CASES: OPERATIONAL

    MEDIUM COMPLEXITY (MANY END POINTS)
    * "Where's my message?"
    * Choice of reporting tools
  13. RED HAT USE CASES: ANALYTICAL

    HIGH COMPLEXITY (MANY END POINTS, DYNAMIC)
    * Data Science - Discovery, Mining, Advanced Analytics
    * Real-time and Historical
  14. "All problems in computer science can be solved by another level of abstraction"

    paraphrased from David Wheeler
  15. LOGICAL ARCHITECTURE

    Data Sources: Data Warehouse(s), Files
    Data Consumers: BI Reports & Analytics, Mobile, ESB & ETL
  16. LOGICAL ARCHITECTURE

    Data Sources: Data Warehouse(s), Files
    Data Consumers: BI Reports & Analytics, Mobile, ESB & ETL
  17. LOGICAL ARCHITECTURE

    JBoss Data Virtualization: Integrated and abstracted sources, Multiple protocol access
    Data Sources: Data Warehouse(s), Files
    Data Consumers: BI Reports & Analytics, Mobile, ESB & ETL
  18. LOGICAL ARCHITECTURE

    JBoss Data Virtualization: Business Logic and data formatting, Central Security, Integrated and abstracted sources, Multiple protocol access
    Data Sources: Data Warehouse(s), Files
    Data Consumers: BI Reports & Analytics, Mobile, ESB & ETL
  19. LOGICAL ARCHITECTURE

    JBoss Data Virtualization: Business Logic and data formatting, Central Security, Integrated and abstracted sources, Multiple protocol access
    Data Sources: Data Warehouse(s), Files
    Data Consumers: BI Reports & Analytics, Mobile, ESB & ETL
  20. VIRTUAL DATABASES: DEFINITION

    A virtual database (or VDB) is a container for components used to integrate data from multiple data sources, so that they can be accessed in an integrated manner through a single, uniform API.
  21. VDB STRATEGY: SOURCES
  22. VDB STRATEGY: BASE VDB

    * Abstracts the physical source
  23. VDB STRATEGY: VIRTUAL DATA MART

    * Combine Base VDBs for analysis/reporting
    * Business logic/formatting applied
    * Data security applied at this layer
    * Abstracts the physical source
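
The layering on slides 21-23 (sources, then base VDBs, then a virtual data mart) can be sketched with plain SQL views. The example below is an analogy built on Python's standard-library `sqlite3`, not a JDV virtual database definition; every table, view and column name is hypothetical.

```python
import sqlite3


def build_layers() -> sqlite3.Connection:
    """Create sources, a base layer of views, and a data-mart view on top."""
    conn = sqlite3.connect(":memory:")
    # Physical sources (hypothetical), each with its own naming quirks.
    conn.execute("CREATE TABLE crm_acct (acct_id INTEGER, acct_nm TEXT)")
    conn.execute("CREATE TABLE erp_orders (cust INTEGER, amt_cents INTEGER)")
    conn.executemany(
        "INSERT INTO crm_acct VALUES (?, ?)", [(1, "acme"), (2, "globex")]
    )
    conn.executemany(
        "INSERT INTO erp_orders VALUES (?, ?)", [(1, 1250), (1, 750), (2, 500)]
    )
    # Base layer: one view per source, hiding the physical names.
    conn.execute(
        "CREATE VIEW base_accounts AS "
        "SELECT acct_id AS account_id, acct_nm AS name FROM crm_acct"
    )
    conn.execute(
        "CREATE VIEW base_orders AS "
        "SELECT cust AS account_id, amt_cents FROM erp_orders"
    )
    # Data-mart layer: combines base views and applies formatting and
    # business logic (upper-case names, cents converted to currency units).
    conn.execute(
        """
        CREATE VIEW mart_revenue AS
        SELECT UPPER(a.name) AS account,
               SUM(o.amt_cents) / 100.0 AS revenue
        FROM base_accounts AS a
        JOIN base_orders AS o ON o.account_id = a.account_id
        GROUP BY a.account_id, a.name
        """
    )
    return conn


def revenue_report(conn: sqlite3.Connection) -> list:
    """Query only the data-mart layer, as a reporting client would."""
    return conn.execute(
        "SELECT account, revenue FROM mart_revenue ORDER BY account"
    ).fetchall()
```

Because clients only see `mart_revenue`, a rename in `crm_acct` would be absorbed by `base_accounts` without touching any report, which is the decoupling the base layer is meant to provide.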
  24. VDB LIFECYCLE: DEVELOP

    Virtual Database
  25. VDB LIFECYCLE: DEPLOY

    Virtual Database
  26. VDB LIFECYCLE: USE!

    Virtual Database
  27. PLAN: ARCHITECTURE

    Opportunity, Plan
  28. THINGS TO CONSIDER: DESIGN OPPORTUNITIES
  29. THINGS TO CONSIDER: FLEXIBILITY

    WHERE TO DEPLOY?
    * Cloud and on premise
    * Real-time data vs. materialized views
    * Locality to clients
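
The trade-off between real-time data and materialized views comes down to how stale a cached result is allowed to get. The following Python sketch shows one common approach, a time-to-live refresh; it is an illustration under that assumption and does not reflect how JDV implements materialization. The class name and the `clock` parameter are made up for the example.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class MaterializedView(Generic[T]):
    """Caches a query result and re-runs the query once the TTL has elapsed."""

    def __init__(
        self,
        query: Callable[[], T],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._query = query
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._value: Optional[T] = None
        self._loaded_at: Optional[float] = None

    def read(self) -> T:
        """Return the cached result, refreshing it first if it is stale."""
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._value = self._query()  # hits the slow source
            self._loaded_at = now
        return self._value
```

A TTL of zero behaves like a real-time query (every read reaches the source), while a large TTL favours fast reads over freshness.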
  30. OPEN HYBRID CLOUD

    Heterogeneous Sources, Heterogeneous Clients
  31. THINGS TO CONSIDER: SCALABILITY

    HOW EASY TO SCALE
    * Initial starting point
    * Easily scale by adding and removing nodes
    * Clients isolated from infrastructure changes
  32. SIZING: JDV SIZING TOOL

    INPUT: REQUIREMENTS
    * How much data?
    * How is data being accessed?
    OUTPUT: RECOMMENDATION
    * CPU, Storage, Memory, JVM, Architecture
  33. THINGS TO CONSIDER: SECURITY

    IN TRANSIT, AT REST, AUTH?
    * Transport Layer Security (TLS)
    * Disk encryption
    * Authentication
    * Authorization
  34. SECURITY: ROLE BASED ACCESS CONTROL

    JBOSS EAP ROLE MANAGEMENT
    * SAML
    * LDAP
    * Basic Auth
    * Custom
    * more..
  35. SECURITY: DEFINED AT...

    * Object
  36. SECURITY: DEFINED AT...

    * Object
    * Row
  37. SECURITY: DEFINED AT...

    * Object
    * Row
    * Field
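
Slides 35-37 name three levels at which security can be defined: object, row and field. As a rough illustration of what each level means, here is a small Python sketch. The `Policy` class, the masking value and the sample data are all hypothetical; this is not JDV's role or permission syntax.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Row = Dict[str, object]


@dataclass
class Policy:
    """Hypothetical per-role policy covering the three levels on the slides."""

    tables: Set[str]  # object level: which tables the role may read
    row_filter: Callable[[Row], bool] = lambda row: True  # row level
    masked_fields: Set[str] = field(default_factory=set)  # field level


def read_table(policy: Policy, table: str, rows: List[Row]) -> List[Row]:
    """Apply object, row and field checks, in that order."""
    if table not in policy.tables:
        raise PermissionError(f"no access to {table}")
    visible = [row for row in rows if policy.row_filter(row)]
    return [
        {key: ("***" if key in policy.masked_fields else value)
         for key, value in row.items()}
        for row in visible
    ]
```

For example, a policy with `tables={"customers"}`, a filter on `region == "emea"` and `masked_fields={"email"}` lets a role read only EMEA customers and hides their email addresses, while any other table raises `PermissionError`.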
  38. PHYSICAL ARCHITECTURE

    Amazon Web Services and AWS are trademarks of Amazon.com, Inc. or its affiliates in the United States and/or other countries.
  39. EXECUTE: IMPLEMENTATION

    Opportunity, Plan, Execute
  40. MAKING IT HAPPEN! TOOLS

    Virtual Database
  41. MAKING IT HAPPEN! TOOLS

    Virtual Database, jcliff
  42. PUPPET & JCLIFF

    Configuration Snippets, jcliff, ...
  43. PUPPET & JCLIFF

    Configuration Snippets, jcliff, Differences, ...
  44. PUPPET & JCLIFF

    Configuration Snippets, jcliff, Differences, ...
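
The diagram on slides 42-44 pairs configuration snippets with differences, which suggests a desired-state workflow: compare what the snippets ask for with what the server currently has, and act only on the differences. The Python sketch below illustrates that general idea with nested dictionaries. It is an assumption about the pattern, not jcliff's actual algorithm or output format, and the operation names `add` and `write` are invented.

```python
from typing import Any, Dict, List, Tuple

Config = Dict[str, Any]
Change = Tuple[str, str, Any]  # (operation, path, value)


def diff_config(desired: Config, current: Config, prefix: str = "") -> List[Change]:
    """Return the changes needed to bring `current` in line with `desired`.

    Only keys named in `desired` are considered; anything else in
    `current` is left untouched.
    """
    changes: List[Change] = []
    for key, want in desired.items():
        path = f"{prefix}/{key}"
        if key not in current:
            changes.append(("add", path, want))
        elif isinstance(want, dict) and isinstance(current[key], dict):
            # Recurse so that only the differing leaf values are reported.
            changes.extend(diff_config(want, current[key], path))
        elif current[key] != want:
            changes.append(("write", path, want))
    return changes
```

Running the diff twice against an up-to-date configuration yields an empty list, which is what makes this style of configuration repeatable under a tool such as Puppet.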
  45. DATA SOURCES: REDSHIFT RESOURCE

    jcliff::datasource { 'redshift_ds':
      jndi_name   => 'java:/redshift',
      url         => hiera('jdvbi::redshift::url'),
      driver_name => 'RedshiftJDBC4-1.1.1.0001.jar',
      username    => hiera('jdvbi::redshift::username'),
    }
  46. DATA SOURCES: REDSHIFT RESOURCE

    { "datasource" => {
        "redshift_ds" => {
          "jndi-name" => "java:/redshift",
          "driver-name" => "RedshiftJDBC4-1.1.1.0001.jar",
          "enabled" => "true",
          ...

    jcliff::datasource { 'redshift_ds':
      jndi_name   => 'java:/redshift',
      url         => hiera('jdvbi::redshift::url'),
      driver_name => 'RedshiftJDBC4-1.1.1.0001.jar',
      username    => hiera('jdvbi::redshift::username'),
    }
  47. SALESFORCE: DEFINE RESOURCE ADAPTER

    jcliff::teiid_salesforce_ra { 'sfcom':
      jndi_name => 'java:/sf_ds',
      url       => hiera('jdvbi::salesforce::url'),
      username  => hiera('jdvbi::salesforce::username'),
  48. SALESFORCE: DEFINE RESOURCE ADAPTER

    { "resource-adapter" => {
        "sf_ra" => {
          "module" => "org.jboss.teiid.resource-adapter.salesforce:main",
          "transaction-support" => "NoTransaction",
          "connection-definitions" => {
            "sf_ra" => {
              "enabled" => true,
              "jndi-name" => "java:/sf_ds",
              "config-properties" => {

    jcliff::teiid_salesforce_ra { 'sfcom':
      jndi_name => 'java:/sf_ds',
      url       => hiera('jdvbi::salesforce::url'),
      username  => hiera('jdvbi::salesforce::username'),
  49. CONNECTIONS: SECURE DATABASE

    # Add a teiid JDBC transport with TLS
    jcliff::configfile { 'ssl-jdbc.conf':
      content => template('jbossdvbi/ssl-jdbc.conf.erb')
  50. CONNECTIONS: SECURE DATABASE

    {"teiid" => {
        "transport" => {
          "jdbc" => {
            "keystore-key-alias" => "<%=@keystore_alias%>",
            "keystore-key-password" => "<%=@keystore_password%>",
            "keystore-password" => "<%=@keystore_password%>",
            "keystore-type" => "JKS",
            "socket-binding" => "teiid-jdbc",
            "ssl-authentication-mode" => "1-way",

    # Add a teiid JDBC transport with TLS
    jcliff::configfile { 'ssl-jdbc.conf':
      content => template('jbossdvbi/ssl-jdbc.conf.erb')
  51. CONNECTIONS: SECURE CLIENTS

    # Add a socket binding for TLS JDBC
    jcliff::socket_binding { 'teiid-jdbc':
  52. CONNECTIONS: SECURE CLIENTS

    { "standard-sockets" => {
        "socket-binding" => {
          "teiid-jdbc" ...

    # Add a socket binding for TLS JDBC
    jcliff::socket_binding { 'teiid-jdbc':
  53. VIRTUAL DATABASES: DEPLOY VDB

    # Retrieve VDB from staging area
    exec { 'get-vdb':
      command => "wget ..."
    }
    # deploy VDB
  54. VIRTUAL DATABASES: DEPLOY VDB

    { "deployments" => {
        "my.vdb" => {

    # Retrieve VDB from staging area
    exec { 'get-vdb':
      command => "wget ..."
    }
    # deploy VDB
  55. CONCLUSION
  56. VALUE: IS IT WORKING OUT?

    CURRENT DEPLOYMENTS, MORE IN THE FUTURE!
    * Marketing VDB
    * Data Scientists VDB
  57. USEFUL NEW FEATURES: NEW THINGS FROM THE PRODUCT!

    NEW AND EXCITING
    * Dynamic VDB
    * Unified RBAC
    * Redshift Translator
  58. CONCLUSION: PARTING THOUGHTS

    Opportunity, Plan, Execute
  59. CONCLUSION: PARTING THOUGHTS

    Plan, Execute
    * Fast integration. Heterogeneous sources. Decoupling.
  60. CONCLUSION: PARTING THOUGHTS

    Execute
    * Centralize business logic. Centralize security. Flexible architecture.
    * Fast integration. Heterogeneous sources. Decoupling.
  61. CONCLUSION: PARTING THOUGHTS

    * Repeatable. Automated. Supported.
    * Centralize business logic. Centralize security. Flexible architecture.
    * Fast integration. Heterogeneous sources. Decoupling.
  62. QUESTIONS?
  63. LEARN. NETWORK. EXPERIENCE OPEN SOURCE.
  64. APPENDIX: ADDITIONAL MATERIAL

  65. RESOURCES: PRODUCTS

    * Red Hat JBoss Data Virtualization: https://www.redhat.com/en/technologies/jboss-middleware/data-virtualization
    * Red Hat CloudForms: https://www.redhat.com/en/technologies/cloud-computing/cloudforms
    * Ansible: https://www.ansible.com/
  66. RESOURCES: TOOLS

    * JBoss Data Virtualization Sizing Architecture Tool: https://access.redhat.com/labs/jbossdvsat/
    * jcliff: https://github.com/bserdar/jcliff
    * puppet-jcliff: https://github.com/bserdar/puppet-jcliff