Agile Business Intelligence with SAP and Cloud at heise data2day

The BI world has started to spin faster and faster. It's no longer just about reporting metrics from a few enterprise systems for management. Trends and forecasts from all areas of the company now have to be delivered in new forms of presentation such as apps, dashboards or bots, at shorter intervals and to a growing user base.

In ERP and other line-of-business (LOB) systems from SAP, HANA has positioned itself as the new central data technology for addressing these problems. The cloud as a data storage and processing platform is also becoming more widespread. In our talk, we demonstrate how HANA can be efficiently connected to cloud databases and data pipelines through Smart Data Integration (SDI) and its associated Adapter Framework. As an example, we connect Google BigQuery, a scalable cloud database, and Cloud Dataflow, a cost-effective, large-scale data transformation service, to HANA running in a Docker container. The example further illustrates that even open source technologies such as Hadoop, Spark or Parquet still have their place in the cloud world.

Joachim Rosskopf

September 26, 2018

Transcript

  1. 1.

    AGILE BUSINESS INTELLIGENCE WITH SAP AND CLOUD The BI World

    is spinning faster! How can Agile, Cloud and HANA play well together? data2day, Heidelberg, 26.09.18 Joachim Rosskopf, Maren Übelhör
  2. 2.

    zoi.de WHO WE ARE? 2 Joachim Rosskopf The physicist has worked

    for 16 years as a developer, trainer and consultant for software systems in the area of enterprise applications, data processing, big data and data science. Maren Übelhör Maren works as a Data Scientist at Zoi GmbH. The mathematician is an expert in time series analysis and is passionate about data science and machine learning in the big data environment.
  3. 3.

    The Past and Present of Business Intelligence In the first

    decades of enterprise computing data was used to: ▪ Validate, Control & Report ▪ Proof & Due Diligence Therefore BI experts were mainly placed alongside bookkeeping or controlling. Companies had very few data sources at the beginning: ▪ Few digital data artifacts, mainly financial. ▪ Kept in a single system, e.g. a mainframe. Over time digital data and system count increased dramatically! HOST & MAINFRAME SYSTEMS PROCESS AUTOMATION & DIGITALIZATION ERP SYSTEMS INTERNET & E COMMERCE
  4. 4.

    Future Drivers of Business Intelligence Data Nerd NEXT

    GEN ERP & CRM SYSTEMS BUSINESS ANALYTICS & DATA SCIENCE & MACHINE LEARNING SUPER PLATFORMS MOBILE COMPUTING INTERNET OF THINGS Progressing digitization of processes and customer interaction forces enterprises to be more data driven. Data is necessary to achieve: ▪ Customer intimacy ▪ Operational excellence & process experience ▪ Smart & surprising user experience for products and services Unfortunately, right now nobody knows the right questions to ask of the data. A BI expert is more explorer than bookkeeper. Companies have an enormous amount of internal and external data sources: ▪ Rapidly growing volume of various data artifacts. ▪ Data buried in silos with incompatible interfaces, semantics and syntax.
  5. 5.

    Modern Business Intelligence (BI) Playground Traditional Data Warehouse

    Netweaver Business Intelligence or BW / 4 • BI with reports and traditional tools operated interactively. • Main data source is the ERP system with high quality structured data. • Consumption by business users and management. • Next big innovation is in-memory and perhaps realtime. Aggregating business data and presenting it. A lot of users, no experts Business Analytics BigData Ecosystem & Cloud Services File, fuse, refine, process and explore enterprise data. Few expert users, produces data artifacts • Data Lake to store the growing amount of data in grouped, described data sets. • Persistence strategy makes different workloads possible. Thereby it is scalable and cost efficient. • Processing of data happens in special frameworks and is done by data scientists or engineers. • Traditional BI is just one aspect. Hence results of analysis should be fed back to reporting systems. • Interface to collect data from different sources and various systems. • Digital transformation, cloud and IoT are reasons why many of these systems are already cloud based. Many companies already operate virtual private clouds. • Data arrives in packages (batch) or as continuous streams of data. Ingest & Record ICT & eCommerce & IoT & Cloud Services Record data where it arrives and harmonize syntax and semantics. Different aspects will exist concurrently, and interact and exchange!
  6. 6.

    Agile Business Intelligence to address new Challenges Responsibility Autonomy

    Evolution Agile Experiments Incremental Design Quick Learning Unclear requirements make upfront design hard. Teams start small, build an understanding of the problem, and improve incrementally. There is no one right solution. Success in the field and time to market should be the figure of merit. BI teams get access to the data sets they want to operate on as a whole. At best, they operate on a copy so as not to affect other teams. Architectures and tools are such that teams can build up and tear down environments to experiment with different solution strategies. Trend Monitoring: Categorize all customers according to their behaviour or preferences. How does this evolve over time? BI teams have the flexibility to decide on the tools and methods they use for analysis. Hence, they are responsible for ensuring operations. The people and tech trends in enterprise IT change. “Agile” is one of the mega-trends shaping how people solve problems. Technologies and architectures have to adapt to support success.
  7. 7.

    Agile Business Intelligence to address new Challenges Responsibility Autonomy

    Evolution Agile Experiments Incremental Design Quick Learning The people and tech-trends in enterprise IT and Business intelligence change. The focus and requirements are moving. “Agile” is one of the mega-trends shaping how people solve problems. Technologies and Architectures have to adapt to support success.
  8. 8.

    Provide the data in a way, that BI Teams can

    work with it. Responsibility Autonomy Evolution Agile Together with agile tenets: To be able to cycle efficiently through the iterations, one has to stick to the three core principles we defined before. A typical analytical workflow: A large fraction of time in today's enterprise data projects is spent on finding the right data, understanding it, integrating with the sources, and then quickly refining and validating hypotheses. Hypotheses, ideas, innovations Find data Understand and integrate Refine and enrich Validate and build trust Finish models and gain insights Operationalize, Action and Production Many enterprise data projects lose track in the first steps of data acquisition! Or they become dependent on data providers and brittle integrations.
  9. 9.

    Provide the data in a way, that BI Teams can

    work with it. A typical analytical workflow: A large fraction of time in today's enterprise data projects is spent on finding the right data, understanding it, integrating with the sources, and then quickly refining and validating hypotheses. Hypotheses, ideas, innovations Find data Understand and integrate Refine and enrich Validate and build trust Finish models and gain insights Operationalize, Action and Production Many enterprise data projects lose track in the first steps of data acquisition! Or they become dependent on data providers and brittle integrations.
  10. 10.

    Data Services as a Solution to Data Demands of Agile

    BI Teams. Data / Platform Information Capabilities Information Domains Light Transformation Layers Data Services Business Technology Agile Interface Value proposition of Data Services ▪ Data democratization ▪ Standardized, reusable, scalable, decentralized library of central data sets and models ▪ Cross-departmental usage to foster enterprise adoption of big data. ▪ Lower cost and time to production through “lightly transformed” data models. ▪ Remove blockers in the analytical workflow. ▪ (Some organizations call Data Services “Feature stores”)
  11. 11.

    Approach of Data Services based on BigData formats and Cloud

    Services. Light Transformation Layers Data Services Technology Ideas for simple data services ▪ Domain experts together with users decide on important data-aspects of the domain. ▪ Integration and transformation of data is realized with an appropriate technology for the respective domain. ▪ Results and important data artefacts are provided for batch uses in form of data services. ▪ At first data services could be files in a partition scheme in a BigData format. ▪ There is enough metadata to find and trace data. Gather and Harmonize Data Transform to Business Semantics Data / Platform Master Data Transaction Data Semi-Struc. Data Third Party Data Relationships Events & Actions Process Flows Relations Nodes/Edges Events as Denorm. Grains Process Flows Denorm. Grains Suitable Metadata Information Suitable File Format Suitable Partition Scheme
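The idea of "files in a partition scheme in a BigData format" plus "enough metadata to find and trace data" can be sketched in a few lines. This is only an illustration under assumptions: the Hive-style `key=value` directory layout, the Parquet file extension, and all names (`partition_path`, `metadata_record`, the `data-services` root, the metadata fields) are hypothetical and not taken from the talk.

```python
import json
from datetime import date
from pathlib import PurePosixPath


def partition_path(root: str, domain: str, dataset: str, day: date, part: int = 0) -> str:
    """Build a Hive-style partitioned object key for one file of a data service."""
    return str(
        PurePosixPath(root)
        / f"domain={domain}"
        / f"dataset={dataset}"
        / f"year={day.year:04d}"
        / f"month={day.month:02d}"
        / f"day={day.day:02d}"
        / f"part-{part:05d}.parquet"
    )


def metadata_record(path: str, source_system: str, schema_version: int) -> str:
    """Describe a file so that it can be found and traced back to its source."""
    return json.dumps(
        {"path": path, "source_system": source_system, "schema_version": schema_version},
        sort_keys=True,
    )


if __name__ == "__main__":
    key = partition_path("data-services", "sales", "orders", date(2018, 9, 26))
    print(key)
    print(metadata_record(key, "SAP ERP", 1))
```

A layout like this lets query engines prune partitions by domain or date without opening the files, which is one reason such a simple scheme can already serve batch consumers.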
  12. 12.

    Sounds crazier than it actually is. Open Source

    Implementations: Massively parallel, staged processing over distributed storage systems has valid scientific roots. Scientific background There are various open source implementations that provide queries over a distributed file system. Small, random selection! Cloud services: As the pattern is widespread, the major cloud vendors provide all the building blocks and services. Amazon Athena AWS Redshift Spectrum Google BigQuery Azure DataLake Analytics Alibaba MaxCompute
  13. 13.

    Familiarity and Talent The spectrum of people accustomed to cloud

    and Open Source technologies is much wider than specialized vendor knowledge. Extending Business Intelligence in the Cloud Traditional Data Warehouse Netweaver Business Intelligence or BW / 4 Business Analytics BigData Ecosystem & Cloud Services Ingest & Record ICT & eCommerce & IoT & Cloud Services BI in the Cloud Technology Options The data landscape of tomorrow is versatile. No single technology or vendor is able to cover all requirements. Clouds are interesting platforms to access diverse solutions. Optimization of Operation Costs Cost-efficient archival and the separation of storage and compute make cost-efficient architectures possible. Fast Innovation and Experiments Without capital commitment, innovation can be used as soon as it is available. Globally Available Data Cloud infrastructures provide global infrastructure and replication. If services have to be provided world-wide, this is an enormous plus. Reliability The more data driven a company gets, the more important the availability of its BI systems becomes. Typically cloud providers operate at the highest service levels. Cloud and OpenSource technologies extend the BI playground by providing ready-to-use solution options. This is in line with agile principles, as experimentation with new technologies is cheap and easy. In addition to the rapid development of analytical offerings, the innovative pressure of fiercely competing cloud providers also leads to a high degree of scalability and availability.
  14. 14.

    Clouds are Ecosystems Moreover, cloud platforms are more

    and more ecosystems for digital services. Like operating systems in the previous years they act as entry points for vendors, customers and suppliers. Information System ... Customer Supplier Domain Specific Services Public Cloud Services
  15. 15.

    A Typical Information Domain Portfolio of a Medium Company

    Data Domains Customer Master Data Material Master Data Sales Data Product Information Data in SAP Finance Data Customer Relationship Controlling Material Management Manufacturing Execution System Production Planning E-Commerce IoT Aftersales & Support ... ... Data in other Systems For a lot of data and processes which are core to an enterprise and carry a lot of business value, SAP systems are crucial. However, many modern, growing datasets reside outside of SAP. This means a flexible two-way integration is needed.
  16. 16.

    (R)Evolution in SAP's System Architecture of Core Products

    DataBase Abstraction For a long time SAP's products were DB vendor independent. Application Server Containing content as ABAP modules, doing most of the calculations here. Due to performance optimization a lot of intermediate results are persisted. Presentation Layer A fat client called SAP GUI. Not so interesting for us. HANA in memory DB SAP tries to converge all of its products to the HANA platform, with the clustered, in-memory DB as core element. Application Server Application logic is still written in ABAP Presentation Layer Now a web interface, based on a framework called Fiori. Not so interesting for us. HANA Platform Some logic is pushed down to DB Older architecture Modern architecture Core implications of SAP’s architecture shift. ▪ The HANA database lies at the core of the new products. ▪ It offers the unifying possibility to integrate at a single point into the whole SAP ecosystem. Questions: ▪ Viable developer story? ▪ Flexible API available?
  17. 17.

    Demo Time: HANA in Docker in 5

    Minutes. ▪ First step: Start docker container! ▪ Second step: Connect to HANA with Eclipse
  18. 18.

    The HANA Data Integration Story was … is confusing ...

    Source: https://blogs.sap.com/.../hana-smart-data-integration-architecture/ “Prior to HANA SP9 SAP suggested to use different tools to get data into Hana: Data Services (DS), System Landscape Transformation (SLT), Smart Data Access (SDA), Sybase Replication Server (SRS), Hana Cloud Integration – DS (HCI-DS),… to name the most important ones. You used DS for batch transformations of virtually any sources, SLT for realtime replication of a few supported databases with little to no transformations, HCI-DS when it comes to copying database tables into the cloud etc.” “With the HANA Smart Data Integration SDI feature you get all in one package plus any combination, when it comes to loading a single HANA instance.”
  19. 19.

    Smart Data Integration (SDI) in HANA SAP HANA

    Platform Index Server Tables Virtual Tables Data Provisioning Server Data Provisioning Agent Part of HANA Cluster Custom Adapter SDK ODBC SOAP Files Twitter ;) TCP Proto Whatever you want! Autonomous Deploy.
  20. 20.

    Demo Time: SDI Adapter to BigQuery ▪

    First step: Create SDI remote source ▪ Second step: Create virtual table ▪ Third step: Issue SQL statement to load data with full debug story ▪ Fourth step: Watch Dataflow job while running
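The first three demo steps boil down to three SQL statements issued against HANA. The sketch below only renders such statements as strings, following the general shape of HANA's `CREATE REMOTE SOURCE ... ADAPTER ... AT LOCATION AGENT ... CONFIGURATION` and `CREATE VIRTUAL TABLE ... AT` syntax. It is not the code shown in the demo: the source, adapter, agent, schema and table names are hypothetical, the configuration string is adapter-specific and left as a placeholder, and the `"<NULL>"` database and owner parts are an assumption that fits adapters without such a hierarchy.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_remote_source_sql(source: str, adapter: str, agent: str, configuration: str = "") -> str:
    """Step 1: register a remote source served by an adapter on a DP agent."""
    config = configuration.replace("'", "''")  # escape single quotes in the literal
    return (
        f"CREATE REMOTE SOURCE {quote_ident(source)} "
        f"ADAPTER {quote_ident(adapter)} "
        f"AT LOCATION AGENT {quote_ident(agent)} "
        f"CONFIGURATION '{config}'"
    )


def create_virtual_table_sql(schema: str, table: str, source: str, remote_object: str) -> str:
    """Step 2: expose a remote object as a virtual table inside HANA."""
    return (
        f"CREATE VIRTUAL TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f'AT {quote_ident(source)}."<NULL>"."<NULL>".{quote_ident(remote_object)}'
    )


def load_sql(schema: str, target: str, virtual_table: str) -> str:
    """Step 3: copy the remote data into a local table via the virtual table."""
    return (
        f"INSERT INTO {quote_ident(schema)}.{quote_ident(target)} "
        f"SELECT * FROM {quote_ident(schema)}.{quote_ident(virtual_table)}"
    )


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(create_remote_source_sql("BIGQUERY_SRC", "HelloWorldAdapter", "DemoAgent"))
    print(create_virtual_table_sql("DEMO", "VT_ORDERS", "BIGQUERY_SRC", "orders"))
    print(load_sql("DEMO", "ORDERS", "VT_ORDERS"))
```

In a real setup the statements would be executed through a HANA SQL client, and the credential clause and configuration would come from the adapter's documentation.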
  21. 21.

    Customer's Virtual Private Cloud Architecture of a Custom Adapter

    On Premise SAP HANA DP AGENT Hello World Adapter Cloud Storage Job Executor (Dataflow) Metadata Query Stored Procedure HTTP Avro & JSON Payload Hello World Gateway BigQuery State Machine & Jobs Result Serving REST Interface The whole architecture is not optimized for low latency or query performance. Instead, it makes it possible to transfer large datasets reliably and at the same time release compute resources as quickly as possible. Upload
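The "State Machine & Jobs" box of the gateway can be pictured as a small job object whose state only moves forward and whose result is a pointer into cloud storage rather than the data itself. The following is a minimal sketch under assumptions: the state names, the `Job` class, the allowed transitions and the example `gs://` URI are all hypothetical, since the talk does not spell out the gateway's actual states or REST payloads.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class JobState(Enum):
    SUBMITTED = "SUBMITTED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    FAILED = "FAILED"


# Allowed transitions; terminal states have no successors.
ALLOWED: Dict[JobState, Set[JobState]] = {
    JobState.SUBMITTED: {JobState.RUNNING, JobState.FAILED},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED},
    JobState.SUCCEEDED: set(),
    JobState.FAILED: set(),
}


@dataclass
class Job:
    job_id: str
    state: JobState = JobState.SUBMITTED
    result_uri: Optional[str] = None
    history: List[JobState] = field(default_factory=list)

    def advance(self, new_state: JobState, result_uri: Optional[str] = None) -> None:
        """Move to a new state, rejecting transitions the state machine forbids."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self.history.append(self.state)
        self.state = new_state
        if new_state is JobState.SUCCEEDED:
            # The result is served from object storage, not from the gateway.
            self.result_uri = result_uri

    def status(self) -> dict:
        """Payload a status endpoint of the REST interface could return."""
        return {"job_id": self.job_id, "state": self.state.value, "result_uri": self.result_uri}


if __name__ == "__main__":
    job = Job("job-1")
    job.advance(JobState.RUNNING)
    job.advance(JobState.SUCCEEDED, result_uri="gs://example-bucket/jobs/job-1/result.avro")
    print(job.status())
```

Because the adapter only polls a status and then reads the result files, the executor can shut down as soon as the job reaches a terminal state, which matches the stated goal of releasing compute resources early.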
  22. 22.

    Demo Time: SDI Adapter to BigQuery ▪

    Fifth step: Have a look at job dir & results ▪ Sixth step: Query transferred data
  23. 23.

    Other Possibilities of this Architecture Cloud Storage Job

    Executor (Dataflow) HTTP Avro & JSON Payload Gateway State Machine Result Serving REST Interface Customer's Virtual Private Cloud Relying on an OpenSource file format for transfer and schema specification reduces implementation work. Avro SerDe is widely supported by OpenSource and cloud engines. No recoding necessary. Hiding job- and execution-specific steps behind a general state machine interface makes interaction with more complicated tasks possible (e.g. serverless function exec.) Serving results not directly from the service, but from an object store, frees cloud resources early on, scales and is reliable. Executors can be implemented serverless when relying on a combination of cloud services and OpenSource (e.g. Beam/Dataflow, Spark/Glue). Nearly all cloud services, like serverless functions, ETL services, ML services or databases, can read and write to cloud storage and be triggered asynchronously. A mapping of the functionality of the service onto query, upload or stored procedure is possible. This is not limited to Google Cloud, but also works for AWS, Azure and Alibaba.
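Since an Avro schema is itself a JSON document, describing a transferred table takes little code, which is part of why an open format reduces implementation work. The sketch below derives an Avro record schema from a list of columns. The type mapping, the function name `avro_schema`, the namespace and the example columns are assumptions for illustration and not the adapter's actual mapping.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical mapping from a few SQL column types to Avro primitive types.
AVRO_TYPES: Dict[str, str] = {
    "INTEGER": "int",
    "BIGINT": "long",
    "DOUBLE": "double",
    "NVARCHAR": "string",
    "BOOLEAN": "boolean",
}


def avro_schema(name: str, columns: List[Tuple[str, str, bool]],
                namespace: str = "example.transfer") -> dict:
    """Build an Avro record schema from (column, sql_type, nullable) triples."""
    fields = []
    for column, sql_type, nullable in columns:
        avro_type = AVRO_TYPES[sql_type]
        if nullable:
            # Nullable columns become a union with "null" first and a null default.
            fields.append({"name": column, "type": ["null", avro_type], "default": None})
        else:
            fields.append({"name": column, "type": avro_type})
    return {"type": "record", "name": name, "namespace": namespace, "fields": fields}


if __name__ == "__main__":
    schema = avro_schema("Order", [("ID", "BIGINT", False), ("NOTE", "NVARCHAR", True)])
    print(json.dumps(schema, indent=2))
```

The resulting JSON can be handed to any Avro library or cloud engine that reads Avro, so the gateway and the adapter only have to agree on this one document.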
  24. 24.

    Summary ▪ The Business Intelligence world is changing

    ▪ There is no clear line between BI, DataScience and BigData ▪ New drivers and new people require different approaches. ▪ Cloud services provide new benefits for BI. ▪ SAP is and will be central to enterprise information architecture. ▪ HANA is the future foundation of SAP Systems. ▪ There are easy, flexible integration strategies to HANA systems. ▪ Cloud and HANA go very well together. … with a little bit of effort ;-)
  25. 25.

    ZOI: OUR DNA? DIGITAL. ▪ We are growing

    with experienced minds at our locations in Stuttgart and Berlin ▪ We combine new technologies, tools and methods with our strong implementation competence and the challenges of our customers. ▪ We are computer scientists, electrical engineers, mathematicians, physicists, biologists, business economists. ▪ Our technological drive is unbroken: We use part of our working time to try out new technologies. ▪ Zoi is a 100% digital subsidiary of Kärcher. ZOI IS THE ABBREVIATION FOR ZERO ONE INFINITY: OUR DIGITAL DNA
  26. 26.

    THANK YOU FOR THE OPPORTUNITY TO PRESENT OUR

    IDEAS! Agile Business Intelligence with SAP and Cloud Joachim's email: jr@zoi.de Joachim's Twitter: @jrosskopf Maren's email: mue@zoi.de Maren's Twitter: @datamue WE ARE HIRING! meet@zoi.de