
Essent

Marketing OGZ
September 15, 2023

Transcript

  1. Revolutionise the Data Lake using Kafka

    ▸ Sami Alashabi: Associate Manager Data & AI - Accenture; Solutions Architect - Essent
    ▸ Flavius Fernandes: Technical Capability Lead - Essent
  2. Agenda

    ▸ Introduction
    ▸ Problem / Result
    ▸ Architecture
    ▸ Main Concepts
    ▸ Key Learnings
    ▸ Q&A
  3. Problem

    ▸ Need for (near) real-time analytics
    ▸ Tedious operations & monitoring
    ▸ Outdated, unscalable data architecture
    ▸ Limitations with on-premise CDC solution
  4. Result

    ▸ Ability to react faster to customer feedback
    ▸ Reduced costs for monitoring & operations
    ▸ Modern, scalable cloud architecture
    ▸ Centralized Event Bus
  5. Architecture (diagram)

    Components labelled in the diagram: SAP LT Replication Server, SAP Extractor, ODP Framework, ODQ, Read Module, Structure Mapping & Transformation, Write Module, DB Trigger, Logging Table, Application Table, Data stores, Amazon Aurora, Amazon DynamoDB, Apps, Microservices, Real-Time Apps, Source Connectors, Kafka Connect on Amazon ECS (AWS Fargate), AWS Lambda (event source), Transit Gateway, Confluent Cloud, ksqlDB, Schema Registry, S3 Sink, Landing S3, Clean S3, Delta Lake, Databricks, Amazon Glue, Amazon Athena, Snowflake, Analytics
  6. Architecture (diagram, same components as slide 5)
  7. Apache Kafka

    Apache Kafka is an open-source distributed streaming system used for stream processing, real-time data pipelines, and data integration at scale.
    ▸ Brokers
    ▸ Producers
    ▸ Consumers
    ▸ Connectors
    ▸ Topic
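To make the terms on the Apache Kafka slide concrete, here is a toy in-memory model of a broker, a topic, producers, and consumers. It is only a sketch of the idea (an append-only log per topic, with each consumer group tracking its own read offset) and does not use or imitate the real Kafka client API. The names `Broker`, `produce`, `poll`, and the topic `meter-readings` are hypothetical.

```python
from collections import defaultdict


class Broker:
    """Toy stand-in for a Kafka broker: one append-only log per topic."""

    def __init__(self):
        self.logs = defaultdict(list)    # topic name -> list of (key, value)
        self.offsets = defaultdict(int)  # (group, topic) -> next offset to read

    def produce(self, topic, key, value):
        """Producer side: append a record and return its offset in the log."""
        self.logs[topic].append((key, value))
        return len(self.logs[topic]) - 1

    def poll(self, group, topic, max_records=10):
        """Consumer side: read records this group has not seen yet."""
        start = self.offsets[(group, topic)]
        records = self.logs[topic][start:start + max_records]
        # Advance only this group's offset; the log itself is never modified.
        self.offsets[(group, topic)] = start + len(records)
        return records


broker = Broker()
broker.produce("meter-readings", "customer-1", 10)
broker.produce("meter-readings", "customer-2", 25)

# Two consumer groups read the same topic independently of each other.
analytics_batch = broker.poll("analytics", "meter-readings")
billing_batch = broker.poll("billing", "meter-readings")
```

Because records stay in the log after being read, a new consumer group can start from offset 0 and see the full history, which is what lets one topic feed several downstream systems.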
  8. Infrastructure as Code

    IaC allows you to build, change, and manage your infrastructure in a safe, consistent, and repeatable way by defining resource configurations that you can version, reuse, and share.
    ▸ Portability
    ▸ Collaboration
    ▸ Declarative
    ▸ Reusability
    ▸ Consistency
  9. Event Sourcing

    ▸ A state-based system modifies the state of the application in place using Create, Read, Update & Delete (CRUD).
    ▸ An event-based system models the chronological state changes made by applications as an immutable sequence or “log” of events.
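The difference between the two approaches on the Event Sourcing slide can be sketched in a few lines. The example below is an illustration only, not the speakers' implementation: the `Event` type, the account-balance domain, and the event names `deposited` and `withdrawn` are all hypothetical. It shows that an immutable event log lets you recompute the state as of any point in time, which an in-place update cannot do.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: events are immutable once recorded
class Event:
    timestamp: str  # zero-padded "HH:MM", so string comparison orders correctly
    kind: str       # hypothetical event names: "deposited" or "withdrawn"
    amount: int


def apply(balance, event):
    """Return the new state produced by applying one event to the old state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, until=None):
    """Rebuild state by folding over the log, optionally stopping at a time."""
    balance = 0
    for event in events:  # assumes the log is already in chronological order
        if until is not None and event.timestamp > until:
            break
        balance = apply(balance, event)
    return balance


log = [
    Event("13:00", "deposited", 100),
    Event("13:15", "withdrawn", 30),
    Event("14:00", "deposited", 50),
]

current_balance = replay(log)                 # 120
balance_at_1359 = replay(log, until="13:59")  # 70
```

A state-based system would hold only the final value 120; the intermediate value at 13:59 would be lost once the 14:00 update overwrote it.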
  10. Privacy by Design

    Pseudonymisation: replacing any information which could be used to identify an individual with a pseudonym
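One common way to produce such pseudonyms is a keyed hash (HMAC), so the same input always maps to the same pseudonym while the original value cannot be recovered without the key. The slides do not say which technique the speakers used, so treat the sketch below as an assumption. The function names, the field names, and the example key are hypothetical, and a real key would come from a secrets manager rather than source code.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Map an identifying value to a stable pseudonym using HMAC-SHA256."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_record(record: dict, pii_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with only the listed fields pseudonymised."""
    return {
        field: pseudonymise(str(value), secret_key) if field in pii_fields else value
        for field, value in record.items()
    }


# Hypothetical key and record, for illustration only.
key = b"example-key-do-not-use-in-production"
record = {"customer_id": "C-1001", "email": "jane@example.com", "usage_kwh": 412}

safe_record = pseudonymise_record(record, {"customer_id", "email"}, key)
```

Because the mapping is deterministic for a given key, records belonging to the same customer can still be joined downstream without exposing the identifier itself.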
  11. Delta Lake

    Delta Lake is an open-source storage layer designed to run on top of an existing data lake and improve its reliability, security, and performance.
    Features:
    ▸ ACID Transactions
    ▸ Time Travel
    ▸ Schema Evolution/Enforcement
    Engines: Databricks, AWS Glue
  12. Key Learnings

    ▸ Understand Kafka Fundamentals: Learn the core concepts before diving into implementation.
    ▸ Align Use Cases: Identify specific use cases within your organization that can benefit from data streaming.
    ▸ Plan Infrastructure and Scaling: Design a Kafka cluster architecture that suits your performance, availability, and scalability requirements.
    ▸ Community and Resources: Leverage the community and available resources, and build an internal community alongside them.
    ▸ Continuous Improvement: Treat Kafka adoption as an ongoing journey.
  13. THANKS! Any questions?

    ▸ Flavius Fernandes: Technical Capability Lead - Essent
    ▸ Sami Alashabi: Associate Manager Data & AI - Accenture; Solutions Architect - Essent