Microservices, DevOps & SRE

Building Cloud-Native App Series - Part 12 of 15
Microservices Architecture Series
DevOps
SRE - Site Reliability Engineering

Araf Karsh Hamid

June 01, 2022

Transcript

  1. @arafkarsh arafkarsh 8 Years Network & Security 6+ Years Microservices

    Blockchain 8 Years Cloud Computing 8 Years Distributed Computing Architecting & Building Apps a tech presentorial Combination of presentation & tutorial ARAF KARSH HAMID Co-Founder / CTO MetaMagic Global Inc., NJ, USA @arafkarsh arafkarsh 1 Microservice Architecture Series Building Cloud Native Apps User Stories / Agile Architecture & Design Testing Automation DevOps & SRE Part 12 of 15
  2. @arafkarsh arafkarsh Slides are color coded based on the topic

    colors. Design Thinking / Lean Startup / Agile / Stories 1 Architecture & Design 2 Test Automation 3 DevOps SRE 4 2
  3. @arafkarsh arafkarsh What is DevOps? 3 Is DevOps - A

    technology or collection of technologies? Answer: NO Is DevOps - A way of programming? Answer: NO Is DevOps - A Process? Answer: NO Can you become a DevOps Engineer? Answer: NO - it’s not a skill set
  4. @arafkarsh arafkarsh I am confused! Then what is DevOps? 4

    It’s the Destination Let the Journey Begin!!!
  5. @arafkarsh arafkarsh Application Modernization – 3 Transformations 5 Monolithic SOA

    Microservice Physical Server Virtual Machine Cloud Waterfall Agile DevOps Source: IBM: Application Modernization > https://www.youtube.com/watch?v=RJ3UQSxwGFY Architecture Infrastructure Delivery
  6. @arafkarsh arafkarsh Application Modernization – 3 Transformations 6 Monolithic SOA

    Microservice Physical Server Virtual Machine Cloud Waterfall Agile DevOps Source: IBM: Application Modernization > https://www.youtube.com/watch?v=RJ3UQSxwGFY Architecture Infrastructure Delivery Modernization 1 2 3
  7. @arafkarsh arafkarsh Agile Scrum (4-6 Weeks) Developer Journey Monolithic Domain

    Driven Design Event Sourcing and CQRS Waterfall Optional Design Patterns Continuous Integration (CI) 6/12 Months Enterprise Service Bus Relational Database [SQL] / NoSQL Development QA / QC Ops 7 Microservices Domain Driven Design Event Sourcing and CQRS Scrum / Kanban (1-5 Days) Mandatory Design Patterns Infrastructure Design Patterns CI DevOps Event Streaming / Replicated Logs SQL NoSQL CD Container Orchestrator Service Mesh
  8. @arafkarsh arafkarsh Stages of DevOps Delivery Pipeline 8 Source: Sanjeev

    Sharma, IBM, DevOps for Dummies Application Release Management Development Build Package Repository Test Environment Stage Environment Production Environment Application Deployment Automation Cloud Provisioning mvn repository npm repository Docker hub
  9. @arafkarsh arafkarsh Software Specs • Design Thinking / Lean Startup

    • User Stories • Agile - Scrum / Kanban 9 1
  10. @arafkarsh arafkarsh Three Mindsets of Product Development 10 Design Thinking

    Lean Agile Source: Jonny Schneider, Thought Works Explore the Problem Build the right things Build the things right Hypothesis Validation New Business Requirements Product Evolutions Agile MVP
  11. @arafkarsh arafkarsh Agile Values 11 INDIVIDUALS AND INTERACTIONS OVER PROCESSES

    AND TOOLS WORKING SOFTWARE OVER COMPREHENSIVE DOCUMENTATION CUSTOMER COLLABORATION OVER CONTRACT NEGOTIATION RESPONDING TO CHANGE OVER FOLLOWING A PLAN Source: Agile Manifesto - https://www.scrumalliance.org/resources/agile-manifesto
  12. @arafkarsh arafkarsh Scrum 12 4 – 8 People Complete Specs

    Stories Planned for a Sprint Max 8 Hours Max 15 Mins Multiple increments within a Sprint 1 Month Release
  13. @arafkarsh arafkarsh What is Kanban 13 Kanban is a method

    for managing the creation of products with an emphasis on • continual delivery (Daily / Hourly) while • not overburdening the development team. Like Scrum, Kanban is a process designed to help teams work together more effectively. Kanban is a visual management method that was developed by Toyota in the late 1940s. Kanban in Japanese means Card. The Microsoft Xbox One Team does multiple Daily releases using Kanban.
  14. @arafkarsh arafkarsh Three Principles of Kanban 14 Source: https://resources.collab.net/agile-101/what-is-kanban •

    Visualize what you do today (workflow): seeing all the items in context of each other can be very informative • Limit the amount of work in progress (WIP): this helps balance the flow-based approach, so teams don’t start and commit to too much work at once • Enhance flow: when something is finished, the next highest thing from the backlog is pulled into play
  15. @arafkarsh arafkarsh Kanban Board 15 Backlog Work breakdown Work In

    Progress Done Active Done Active Done Track Task blocked due to Dependency. Once the dependent Task is ready the blocked task will be moved to Active State To Do List Max items in WIP must be 1.4x of total Resources A Backlog item is broken down to tasks and each Task should NOT take more than 1-3 days at max. It’s a good practice to keep all the tasks of similar size. Tasks are assigned to respective team members.
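The WIP rule on this board (at most 1.4x the number of team members) can be sketched as a small check. This is only an illustration: the function names are hypothetical, and rounding the limit down to a whole task is an assumption the slide does not state.

```python
def wip_limit(team_size: int, factor_tenths: int = 14) -> int:
    """Maximum number of tasks allowed in Work In Progress.

    The 1.4x factor comes from the slide; it is expressed in tenths so the
    arithmetic stays in integers. Rounding down is an assumption.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * factor_tenths // 10


def can_pull_next_task(in_progress: int, team_size: int) -> bool:
    """A new task may be pulled only while the WIP column is below its limit."""
    return in_progress < wip_limit(team_size)
```

With a team of 5 the limit works out to 7 tasks, so a sixth or seventh task can be pulled but an eighth has to wait until something moves to Done.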
  16. @arafkarsh arafkarsh Similarities between Kanban and Scrum 16 Task Breakdown

    Continuous Improvement Visible Workflow Both Scrum and Kanban support breaking Large Complex work down into smaller tasks that can be completed efficiently. Both place high focus on Continuous Improvement and process optimization, and both support a highly visible (Task) Workflow that is visible to all the stakeholders.
  17. @arafkarsh arafkarsh Release Cycles 17 Kanban Preparation Requirements Design Development

    Testing Release 1 – 4 Days Cycle Scrum 1 Month (Max) Cycle 1 or 2 Weeks Cycle also allowed
  18. @arafkarsh arafkarsh Kanban vs. Scrum 18

    Roles & Responsibilities. Kanban: No prescribed roles. Scrum: Pre-defined roles of Scrum master, Product owner and team member.
    Delivery Timelines. Kanban: Continuous Delivery (Daily/Hourly). Scrum: Time boxed sprints (2-4 Weeks).
    Delegation & Prioritization. Kanban: Work is pulled through the system (single piece flow). Scrum: Work is pulled through the system in batches (the sprint backlog).
    Modifications. Kanban: Changes can be made at any time. Scrum: No changes allowed mid-sprint.
    Measurement of Productivity. Kanban: Cycle time. Scrum: Velocity.
    When to Use? Kanban: More appropriate in operational environments with a high degree of variability in priority. Scrum: More appropriate in situations where work can be prioritized in batches that can be left alone.
    Source: https://leankit.com/learn/kanban/kanban-vs-scrum/
  19. @arafkarsh arafkarsh Benefits of Kanban 19 • Shorter cycle times

    can deliver features faster. • Responsiveness to Change: • When priorities change very frequently, Kanban is ideal. • Balancing demand against throughput guarantees that the most customer-centric features are always being worked on. • Requires fewer organization / room set-up changes to get started • Reducing waste and removing activities that don’t add value to the team/department/organization • Rapid feedback loops improve the chances of more motivated, empowered and higher-performing team members
  20. @arafkarsh arafkarsh User Stories • User Stories • Behavior Driven

    Design • Writing Good Stories • Estimate and Planning • Case Study 20 Theme Epic User Story Sprint
  21. @arafkarsh arafkarsh ShopEasy – eCommerce Portal 21 Theme Epic User

    Story Sprint ShopEasy – eCommerce Application 1. Customer Management 2. Search Product 3. Catalogue 4. Shopping Cart 5. Order Processing 6. Payments 2. Search Product Release 1 1. Global Search Release 2 1. Search by Brand 2. Search by Price Range Release 3 1. Search by Model 2. Search by Rating Stories 1. Global Search 2. Search by Brand 3. Search by Price Range 4. Search by Model 5. Search by Rating
  22. @arafkarsh arafkarsh Epic – Customer 22 As a Consumer I

    want to register on the eCommerce Portal So that I can buy products Role-Feature-Reason Matrix User Story – 1 : Registration BDD Acceptance Criteria – 1: Save User Given The fields First Name, Last Name, DOB, Address, Email Address, Phone No. When User enters values in the fields First Name, Last Name, DOB, Address, Email Address, Phone No. Then If the following fields contain values First Name, Last Name, Address, Email Address and Phone No. AND Age is greater than 18 Save the Data. BDD Acceptance Criteria – 2 : Generate Password Given User Info Available When Email Address is a valid email Then Generate the password AND Send mail with user email address as login id and the URL of the portal AND Send Password in a separate email. AND Store data on mail status as mail sent or failed. BDD Acceptance Criteria – 3 : Resend Mail Given User Registration mail status is available When The Mail status is failed. Then Send the mail again AND store the attempt number.
  23. @arafkarsh arafkarsh User Journey / Story Map & Release Cycles

    23 Browse Products Add to Shopping Cart Select Shipping Address Confirm Order Make Payment Catalogue Shopping Cart Order Payment Customer View Product Search User Journey Search by Price Image Gallery Update Qty Use PayPal R2 Global Search Product Details Add to Cart Delete Item Select Address Confirm Order Pay Credit Card Make Payment R1 Registration Minimum Viable Product Scrum Sprint Cycle Search by Price Image Gallery Update Qty Use PayPal Kanban Cycle: Each of the Story can be released without waiting for other stories to be completed resulting in Shorter Releases as all the stories are independent!
  24. @arafkarsh arafkarsh Architecture & Design • Capability Centric Design •

    Domain Driven Design • Event Sourcing & CQRS • Microservices Architecture 24 2
  25. @arafkarsh arafkarsh Capability Centric Design 25 Business Centric Development •

    Focus on Business Capabilities • Entire team is aligned towards Business Capability. • From Specs to Operations – The team handles the entire spectrum of Software development. • Every vertical will have its own Code Pipeline, Build Pipeline Front-End-Team Back-End-Team Database-Team In a typical Monolithic way, the team is divided based on technology / skill set rather than business functions. This leads to not only bottlenecks but also lack of understanding of the Business Domain. QA Team QA = Quality Assurance PO = Product Owner Vertically sliced Product Team Front-End Back-End Database Business Capability 1 QA PO Ops Front-End Back-End Database Business Capability 2 QA PO Ops Front-End Back-End Database Business Capability - n QA PO Ops
  26. @arafkarsh arafkarsh Event Sourcing Intro 27 Standard CRUD Operations –

    Customer Profile – Aggregate Root Profile Address Title Profile Created Profile Address New Title Title Updated Profile New Address New Title New Address added Derived Profile Address Notes Notes Removed Time T1 T2 T4 T3 Event Sourcing and Derived Aggregate Root Commands 1. Create Profile 2. Update Title 3. Add Address 4. Delete Notes 2 Events 1. Profile Created Event 2. Title Updated Event 3. Address Added Event 4. Notes Deleted Event 3 Profile Address New Title Current State of the Customer Profile 4 Event store Single Source of Truth Greg Young
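The derived aggregate on this slide can be sketched as a replay of stored events. This is a minimal illustration, not the deck's implementation: the event names follow the slide, while the `Event` class, the field names and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact appended to the event store (shape is an assumption)."""
    name: str
    data: dict


def apply(state: dict, event: Event) -> dict:
    """Return the next profile state; the previous state is never mutated."""
    new_state = dict(state)
    if event.name == "ProfileCreated":
        new_state.update(event.data)
    elif event.name == "TitleUpdated":
        new_state["title"] = event.data["title"]
    elif event.name == "AddressAdded":
        new_state["addresses"] = new_state.get("addresses", []) + [event.data["address"]]
    elif event.name == "NotesDeleted":
        new_state.pop("notes", None)
    return new_state


def replay(events: list) -> dict:
    """Derive the current aggregate state by folding over the event log."""
    state: dict = {}
    for event in events:
        state = apply(state, event)
    return state


# Hypothetical event log for one customer profile (T1..T4 on the slide).
EVENT_STORE = [
    Event("ProfileCreated", {"name": "Jane", "title": "Ms", "addresses": ["Old St"], "notes": "vip"}),
    Event("TitleUpdated", {"title": "Dr"}),
    Event("AddressAdded", {"address": "New St"}),
    Event("NotesDeleted", {}),
]

current_profile = replay(EVENT_STORE)
```

Because the log is the single source of truth, replaying only a prefix of `EVENT_STORE` yields the profile as it looked at that earlier point in time.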
  27. @arafkarsh arafkarsh User Journey / CCD / DDD / Event

    Sourcing & CQRS 28 User Journey Bounded Context 1 Bounded Context 2 Bounded Context 3 1. Bounded Contexts 2. Entity 3. Value Objects 4. Aggregate Roots 5. Domain Events 6. Repository 7. Service 8. Factory Process 1 Commands 2 Projections 5 ES Aggregate 4 Events 3 Event Sourcing & CQRS Domain Expert Analyst Architect QA Design Docs Test Cases Code Developers Domain Driven Design Ubiquitous Language Core Domain Sub Domain Generic Domain Vertically sliced Product Team FE BE DB Business Capability 1 QA Team PO FE BE DB Business Capability 2 QA Team PO FE BE DB Business Capability n QA Team PO
  28. @arafkarsh arafkarsh Microservices Characteristics 29 We can scale our operation

    independently, maintain unparalleled system availability, and introduce new services quickly without the need for massive reconfiguration. - Werner Vogels, CTO, Amazon Web Services Modularity ... is to a technological economy what the division of labor is to a manufacturing one. W. Brian Arthur, author of The Nature of Technology The key in making great and growable systems is much more to design how its modules communicate rather than what their internal properties and behaviors should be. Alan Kay, 1998 email to the Squeak-dev list Components via Services Organized around Business Capabilities Products NOT Projects Smart Endpoints & Dumb Pipes Decentralized Governance & Data Management Infrastructure Automation Design for Failure Evolutionary Design
  29. @arafkarsh arafkarsh Shopping Portal 30 /ui /productms /auth /order Gateway

    Virtual Service Deployment / Replica / Pod Nodes Istio Sidecar - Envoy Load Balancer Kubernetes Objects Istio Objects Firewall P M C Istio Control Plane UI Pod N5 v2 Canary v2 User X = Canary Others = Stable A / B Testing using Canary Deployment v1 UI Pod UI Pod UI Pod UI Service N1 N2 N2 Destination Rule Stable / v1 EndPoints Internal Load Balancers Source: https://github.com/meta-magic/kubernetes_workshop Users Product Pod Product Pod Product Pod Product Service MySQL Pod N4 N3 Destination Rule EndPoints Review Pod Review Pod Review Pod Review Service N1 N4 N3 Service Call Kube DNS EndPoints
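The routing rule in this diagram (User X goes to the canary, everyone else stays on stable) can be sketched as a plain function. In the deck this decision is made by the service mesh; the snippet below only mimics the logic, and the header name `end-user`, the user id and the subset names are assumptions.

```python
# Users who should see the canary release (hypothetical id).
CANARY_USERS = {"user-x"}


def choose_subset(headers: dict) -> str:
    """Pick the destination subset for a request based on the calling user."""
    user = headers.get("end-user", "")
    return "canary-v2" if user in CANARY_USERS else "stable-v1"
```

A request carrying `{"end-user": "user-x"}` is sent to the v2 pods, while requests from any other user, or with no user header at all, keep hitting v1.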
  30. @arafkarsh arafkarsh Deployment – Updates and rollbacks, Canary Release D

    ReplicaSet – Self Healing, Scalability, Desired State R Worker Node 1 Master Node (Control Plane) Kubernetes Architecture 31 POD POD itself is a Linux Container, Docker container will run inside the POD. PODs with single or multiple containers (Sidecar Pattern) will share Cgroup, Volumes, Namespaces of the POD. (Cgroup / Namespaces) Scheduler Controller Manager Using yaml or json declare the desired state of the app. State is stored in the Cluster store. Self healing is done by Kubernetes using watch loops if the desired state is changed. POD POD POD BE 1.2 10.1.2.34 BE 1.2 10.1.2.35 BE 1.2 10.1.2.36 BE 15.1.2.100 DNS: a.b.com 1.2 Service Pod IP Address is dynamic, communication should be based on Service which will have routable IP and DNS Name. Labels (BE, 1.2) play a critical role in ReplicaSet, Deployment, & Services etc. Cluster Store etcd Key Value Store Pod Pod Pod Label Selector selects pods based on the Labels. Label Selector Label Selector Label Selector Node Controller End Point Controller Deployment Controller Pod Controller …. Labels Internet Firewall K8s Virtual Cluster Cloud Controller For the cloud providers to manage nodes, services, routes, volumes etc. Kubelet Node Manager Container Runtime Interface Port 10255 gRPC ProtoBuf Kube-Proxy Network Proxy TCP / UDP Forwarding IPTABLES / IPVS Allows multiple implementations of containers from v1.7 RESTful yaml / json $ kubectl …. Port 443 API Server Pod IP ...34 ...35 ...36 EP • Declarative Model • Desired State Key Aspects N1 N2 N3 Namespace 1 N1 N2 N3 Namespace 2 • Pods • ReplicaSet • Deployment • Service • Endpoints • StatefulSet • Namespace • Resource Quota • Limit Range • Persistent Volume Kind Secrets Kind • apiVersion: • kind: • metadata: • spec: Declarative Model • Pod • ReplicaSet • Service • Deployment • Virtual Service • Gateway, SE, DR • Policy, MeshPolicy • RbacConfig • Prometheus, Rule, • ListChecker … @ @ Annotations Names Cluster IP Node Port Load Balancer External Name @ Ingress
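Two ideas from this slide, label selectors and the desired-state model, can be sketched in a few lines. This is a simplified illustration rather than Kubernetes code: the pod dictionaries, the label keys `app` and `version`, and the third (FE) pod are assumptions, and only equality-based matching is shown.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Equality-based selection: every selector key must match the pod's label."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(pods: list, selector: dict) -> list:
    """Return the pods a Service or ReplicaSet with this selector would own."""
    return [pod for pod in pods if matches(selector, pod["labels"])]


def replicas_to_add(desired: int, pods: list, selector: dict) -> int:
    """Positive: pods to create; negative: pods to remove; zero: state is as desired."""
    return desired - len(select_pods(pods, selector))


# Two BE pods as on the slide, plus one hypothetical FE pod.
PODS = [
    {"name": "be-1", "ip": "10.1.2.34", "labels": {"app": "BE", "version": "1.2"}},
    {"name": "be-2", "ip": "10.1.2.35", "labels": {"app": "BE", "version": "1.2"}},
    {"name": "fe-1", "ip": "10.1.2.40", "labels": {"app": "FE", "version": "1.2"}},
]

BACKEND_SELECTOR = {"app": "BE", "version": "1.2"}
```

With a desired count of 3 and only two matching pods running, `replicas_to_add` reports that one more pod is needed, which is the gap a watch loop would close during self healing.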
  31. @arafkarsh arafkarsh Microservices Testing Strategies 33 Ubiquitous Language Domain Expert

    Analyst Developers QA Design Docs Test Cases Code E2E Testing Integration Testing Contract Testing Component Testing Unit Testing Number of Tests Speed Cost Time Mike Cohn’s Testing Pyramid Test Pyramid: https://martinfowler.com/bliki/TestPyramid.html 70% 20% 10%
  32. @arafkarsh arafkarsh Other Testing Strategies or Anti Patterns 34 End

    2 End Testing Integration Testing Unit Testing Inverted Pyramid / Ice Cream Cone Strategy Unit Testing Integration Testing End 2 End Testing Hour Glass Strategy 70% 20% 10% 45% 45% 10%
  33. @arafkarsh arafkarsh Microservices Testing Strategy 35 Unit Testing A unit

    test exercises the smallest piece of testable software in the application to determine whether it behaves as expected. Source: https://martinfowler.com/articles/microservice-testing/#agenda Component Testing A component test limits the scope of the exercised software to a portion of the system under test, manipulating the system through internal code interfaces and using test doubles to isolate the code under test from other components. Integration Testing An integration test verifies the communication paths and interactions between components to detect interface defects. Integration Contract Testing An Integration Contract test is a test at the boundary of an external service verifying that it meets the contract expected by a consuming service. End 2 End Testing An end-to-end test verifies that a system meets external requirements and achieves its goals, testing the entire system, from end to end. Say NO to End 2 End Tests - Mike Wacker, April 22, 2015, Google Testing Blog
  34. @arafkarsh arafkarsh Microservices Testing Scenarios / Tools 36 Testing Tools

    Contract Testing Scope Integration Testing Verifies the communication paths and interactions between components to detect interface defects Contract Testing It is a test at the boundary of an external service verifying that it meets the contract expected by a consuming service. Payment Mock Integration Contract Testing Scope Test Double Mountebank Cart Component Testing Unit Testing Integration Testing Scope Order REST / HTTP or Events / Kafka Item ID, Quantity, Address.. Mock Order Component Testing A component test limits the scope of the exercised software to a portion of the system under test. Order Payment Unit Testing Firewall Integration Testing Scope REST / HTTP Payment Sandbox Component Testing U
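A contract test at the Order / Payment boundary can be sketched as follows. This is a hedged illustration only: the contract fields, the `PaymentStub` test double and its `charge` method are assumptions, and a real setup would typically run the same check against a dedicated tool or the provider's sandbox.

```python
# The response shape the consuming Order service relies on (hypothetical fields).
PAYMENT_CONTRACT = {"payment_id": str, "status": str, "amount": float}


def satisfies_contract(response: dict, contract: dict) -> bool:
    """True when every field the consumer expects is present with the expected type."""
    return all(isinstance(response.get(name), kind) for name, kind in contract.items())


class PaymentStub:
    """Test double standing in for the external Payment service."""

    def charge(self, order_id: str, amount: float) -> dict:
        return {"payment_id": f"pay-{order_id}", "status": "APPROVED", "amount": float(amount)}


def test_payment_meets_order_contract() -> bool:
    """Checks only the boundary: the response shape, not Payment's internal logic."""
    response = PaymentStub().charge("o-1", 25.0)
    return satisfies_contract(response, PAYMENT_CONTRACT)
```

The check deliberately ignores extra fields in the response, so the provider can evolve as long as the fields the consumer depends on stay intact.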
  35. @arafkarsh arafkarsh Shift Right – Chaos Engineering 37 Cloud Chaos

    Monkey Randomly disables production instances Chaos Gorilla Similar to Chaos Monkey, simulates an outage of an entire Amazon availability zone. Doctor Monkey Kubernetes Checks CPU load, Memory usage and removes the instance from the network if its health is bad. Janitor Monkey Kubernetes Searches for unused resources and disposes of them. Conformity Monkey Finds instances that don’t adhere to best practices and shuts them down. Latency Monkey Service Mesh Induces Artificial delays Security Monkey Is an extension of Conformity Monkey. Finds security vulnerabilities and terminates offending instances. Source: https://github.com/Netflix/SimianArmy/wiki Source: http://principlesofchaos.org/ Production Testing – Load / Stress / Performance
  36. @arafkarsh arafkarsh Behavior Driven Development 38 Source: https://dannorth.net/introducing-bdd/ As an

    insurance Broker I want to know who my Gold Customers are So that I sell more Given Customer John Doe exists When he buys insurance ABC for $1000 USD Then He becomes a Gold Customer BDD Construct Role-Feature-Reason Matrix As a Customer I want to withdraw Cash from ATM So that I don’t have to wait in line at the bank Given The account is in Credit AND the Card is Valid AND the dispenser contains Cash BDD Construct Role-Feature-Reason Matrix When The Customer requests Cash Then Ensure that the Account is debited AND Ensure cash is dispensed AND ensure that Card is returned.
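The ATM scenario above maps directly onto a test written in Given / When / Then order. This is a minimal sketch under stated assumptions: the `Account`, `Card` and `ATM` classes and the shape of the result are hypothetical, and the behavior for a refused withdrawal is a simplification the slide does not specify.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int


@dataclass
class Card:
    valid: bool


@dataclass
class ATM:
    cash: int

    def withdraw(self, account: Account, card: Card, amount: int) -> dict:
        """Dispense cash only when card, balance and dispenser all allow it."""
        ok = card.valid and account.balance >= amount and self.cash >= amount
        if ok:
            account.balance -= amount
            self.cash -= amount
        # Returning the card in every case is an assumption of this sketch.
        return {"dispensed": amount if ok else 0, "card_returned": True}


def test_customer_withdraws_cash() -> bool:
    # Given the account is in credit AND the card is valid AND the dispenser contains cash
    account, card, atm = Account(balance=100), Card(valid=True), ATM(cash=500)
    # When the customer requests cash
    result = atm.withdraw(account, card, 20)
    # Then the account is debited AND cash is dispensed AND the card is returned
    return account.balance == 80 and result["dispensed"] == 20 and result["card_returned"]
```

Keeping the Given, When and Then steps visible in the test body preserves the ubiquitous language of the story, so a business stakeholder can still read what is being verified.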
  37. @arafkarsh arafkarsh Features of BDD 39 • Focus on Behavior

    of the System rather than tests. • Collaboration between Business Stakeholders, Analysts, Developers, QA. • Ubiquitous Language • Driven By Business Value • Extends Test Driven Development Source: https://cucumber.io/ Cucumber merges specification and test documentation into one cohesive whole.
  38. @arafkarsh arafkarsh Testing Strategy Summary 40 1. Unit Testing A

    unit test exercises the smallest piece of testable software. 2. Component Testing A component test limits the scope of the exercised software to a portion of the system under test. 3. Contract Testing It is a test at the boundary of an external service verifying that it meets the contract expected by a consuming service 4. Integration Testing It verifies the communication paths and interactions between components to detect interface defects.
  39. @arafkarsh arafkarsh DevOps & SRE • DevOps • SRE •

    Best Practices • Case Studies 41 4 SRE: Site Reliability Engineering
  40. @arafkarsh arafkarsh DevOps o ITIL o Development and Operations –

    Silos o Lean Thinking o CALMS Framework o SpecOps – SDLC o 5 Principles of DevOps 42
  41. @arafkarsh arafkarsh ITIL – Service Life Cycle 43 Source: https://www.flycastpartners.com/itil-service-lifecycle-guide/

    • ITIL is a framework providing best practice guidelines on all aspects of end to end service management. • It covers the complete spectrum of People, Processes, Products and use of Partners (v3). A Service is a means of delivering value to customers by achieving the customer's desired results while working within given constraints. An Incident is defined as any disruption in IT service. A Service Level Agreement is a commitment between a service provider and a client.
  42. @arafkarsh arafkarsh Development & Operations 44 Development Team Agility Operations

    Team Stability Developers Keep throwing releases over the wall and get pushed back by the operations team.
  43. @arafkarsh arafkarsh DevOps History 45 DevOps isn’t simply a process

    or a different approach to development — it’s a culture change. And a major part of a DevOps culture is collaboration. Source: https://www.atlassian.com/devops/what-is-devops/history-of-devops Patrick Debois Andrew C Shafer Coined the Term in 2009 DevOps
  44. @arafkarsh arafkarsh DevOps – Lean thinking 46 Source: Sanjeev Sharma,

    IBM, DevOps for Dummies Systems of Records: Critical Enterprise transactions, and these Apps don’t require frequent changes. Systems of Engagement: With the introduction of Rich Web Apps and Mobile Apps, Systems of Records were augmented by Systems of Engagement. Customers directly engage with these Apps and these Apps require Rapid Releases. DevOps Return on Investment 1. Enhanced Customer Experience 2. Increased Capacity to Innovate 3. Faster time to value
  45. @arafkarsh arafkarsh Management Pipeline Automation Design / Develop SpecOps Workflow

    - SDLC 49 Green Field Brown Field Domain Driven Design Event Sourcing / CQRS Migration Patterns Strangler Fig, CDC… Build Design Develop Test Stage Ops Cloud • Fault Tolerance • Reliability • Scalability • Traffic Routing • Security • Policies • Observability • Unit Testing • Component • Integration • Contract • Package Repositories • Mvn, npm, docker hub • Containers • Orchestration • Serverless • Traffic Routing • Security (mTLS, JWT) • Policies (Network / Security) • Observability Infra Code • Feature Code • Configs Source Code Specs
  46. @arafkarsh arafkarsh Agile Values 51 INDIVIDUALS AND INTERACTIONS OVER PROCESSES

    AND TOOLS WORKING SOFTWARE OVER COMPREHENSIVE DOCUMENTATION CUSTOMER COLLABORATION OVER CONTRACT NEGOTIATION RESPONDING TO CHANGE OVER FOLLOWING A PLAN Source: Agile Manifesto - https://www.scrumalliance.org/resources/agile-manifesto
  47. @arafkarsh arafkarsh 5 DevOps Principles – CALMS Framework 52 Source:

    https://www.atlassian.com/devops/frameworks/calms-framework DevOps isn’t a process, or a different approach to development. It’s a culture change. DevOps culture is collaboration. Build, Test, Deploy, and Provisioning automation are typical starting points for teams. Another major contribution of DevOps is “configuration as code.” Developers strive to create modular, composable applications because they are more reliable and maintainable. CULTURE AUTOMATION LEAN MEASUREMENT SHARING Continuous Improvement with Canary Releases and A/B Testing Continuous Improvement requires Data to measure the changes Sharing responsibility, success, failure goes a long way toward bridging that divide between Dev and Ops. You built it, You run it.
  48. @arafkarsh arafkarsh Implementing CALMS – DevOps Principles 53 Capability Centric

    Design Reduce Organization Silos CULTURE Leverage Tooling & Automation Tests, CI/CD Pipeline & Container Orchestration AUTOMATION Implement Gradual Change Microservices Architecture & Agile: Kanban LEAN Measure Everything Service Mesh: Observability MEASUREMENT Accept Failure as Normal Design for Failure SHARING Source: IBM DevOps Vs. SRE https://www.youtube.com/watch?v=KCzNd3StIoU Google: https://www.youtube.com/watch?v=uTEL8Ff1Zvk
  49. @arafkarsh arafkarsh Agile & DevOps 54 Build Design Develop Test

    Deploy Ops Specs Agile DevOps Go Live Support Specs / Design / Development CI/CD and Tests Automation Operations
  50. @arafkarsh arafkarsh class SRE implements DevOps o SRE o Service

    Levels - SLI / SLO o SRE Concept o SRE Responsibilities 55 Source: https://stackify.com/site-reliability-engineering/ - https://www.redhat.com/en/topics/devops/what-is-sre
  51. @arafkarsh arafkarsh class SRE implements DevOps – CALMS 56 Capability

    Centric Design Reduce Organization Silos CULTURE Leverage Tooling & Automation Tests, CI/CD Pipeline & Container Orchestration AUTOMATION Implement Gradual Change Microservices Architecture & Agile: Kanban LEAN Measure Everything Service Mesh: Observability MEASUREMENT Accept Failure as Normal Design for Failure SHARING ✓ Share Ownership ✓ SLOs & Blameless PM ✓ Canary Deployment, A/B Testing ✓ Automate this year’s Job ✓ Measure toil & reliability
  52. @arafkarsh arafkarsh Service Levels – SLI / SLO 57 SLI

    – Service Level Indicator For Web sites: SLI is a Percentage of requests responded in good health. SLI can be a Performance Indicator: Percentage of search results returned under 50 milliseconds. SLO – Service Level Objective SLO is a goal built around SLI. It is usually a percentage and is tied to a period and it is usually measured in a number of nines. Time periods can be last 24 hours, last week, last 30 days, current quarter etc. uptime Last 30 Days 90% (1 nine of uptime): Meaning you were down for 10% of the period. This means you were down for three days out of the last thirty days. 99% (2 nines of uptime): Meaning 1% or 7.2 hours of downtime over the last thirty days. 99.9% (3 nines of uptime): Meaning 0.1% or 43.2 minutes of downtime. 99.99% (4 nines of uptime): Meaning 0.01% or 4.32 minutes of downtime. 99.999% (5 nines of uptime): Meaning 0.001% or 26 seconds of downtime.
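The downtime figures for each number of nines follow from simple arithmetic over the 30-day window, sketched below. The function names are hypothetical, and passing the SLO as a string so that `Fraction` can parse it exactly is a choice made here to avoid floating-point rounding.

```python
from fractions import Fraction

SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(slo_percent: str, period_days: int = 30) -> Fraction:
    """Downtime budget for an uptime SLO over the period, as an exact fraction of seconds."""
    error_budget = 1 - Fraction(slo_percent) / 100
    return error_budget * period_days * SECONDS_PER_DAY


def sli_percent(good_requests: int, total_requests: int) -> Fraction:
    """SLI as the percentage of requests answered in good health."""
    return Fraction(good_requests, total_requests) * 100


def meets_slo(good_requests: int, total_requests: int, slo_percent: str) -> bool:
    """Compare the measured SLI against the objective."""
    return sli_percent(good_requests, total_requests) >= Fraction(slo_percent)
```

For example, `allowed_downtime_seconds("99.9")` gives 2592 seconds, which is the 43.2 minutes quoted for three nines, and five nines leaves 25.92 seconds, which the slide rounds to 26.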
  53. @arafkarsh arafkarsh SRE – Concept 58 ❑ Bridge the Gap

    between Development & Operations ❑ Developers want to ship features as fast as possible ❑ Operations want stability in Production ❑ Empowers the Software Developers to own the operations of Applications in Production. ❑ Site Reliability Engineers spend 50% of their time in Operations. ❑ An SRE has a deep understanding of the application, the code, how it runs, how it is configured and how it will scale. ❑ They monitor and manage the support apart from the development activities. Source: https://stackify.com/site-reliability-engineering/ - https://www.redhat.com/en/topics/devops/what-is-sre
  54. @arafkarsh arafkarsh SRE – Responsibilities 59 ❑ Proactively monitor and

    review application performance ❑ Handle on-call and emergency support ❑ Ensure software has good logging and diagnostics ❑ Create and maintain operational runbooks ❑ Help triage escalated support tickets ❑ Work on feature requests, defects and other development tasks ❑ Contribute to overall product roadmap Source: https://stackify.com/site-reliability-engineering/ - https://www.redhat.com/en/topics/devops/what-is-sre
  55. @arafkarsh arafkarsh DevOps Best Practices o Shift Left – CI/CD

    Automation o Infrastructure as a Code o Stages of Delivery Pipeline o Observability 60
  56. @arafkarsh arafkarsh Production Environment Continuous Monitoring Fully Automated Continuous Deployment

    Shift Left – Operational Concerns 61 • Operations Concerns move earlier in the software delivery life cycle, towards development. • The Goal is to enable Developers and the QC Team to Develop and Test software that behaves like the Production System in a fully automated way. Development Environment Build Build Build Test Environment Continuous Integration Unit Testing Component Testing Contract Testing Integration Testing Continuous Testing Shift Left moves operations earlier in the development cycle. Stage Environment Acceptance Testing Pull Request / Merge Continuous Delivery GitOps – CI/CD
  57. @arafkarsh arafkarsh Infrastructure as a Code 62 • Infrastructure as

    a Code is a critical capability for DevOps • This helps the organizations to establish a fully automated pipeline for Continuous Delivery. • Infra as a Code is a software defined environment to manage the following: • Network Topologies, Roles, Relationship, Network Policies • Deployment Models, Workloads, Workload Policies & Behaviors. • Autoscaling (up & down) of the workloads
  58. @arafkarsh arafkarsh Stages of DevOps Delivery Pipeline 63 Source: Sanjeev

    Sharma, IBM, DevOps for Dummies Application Release Management Development Build Package Repository Test Environment Stage Environment Production Environment Application Deployment Automation Cloud Provisioning mvn repository npm repository Docker hub
  59. @arafkarsh arafkarsh Pillars of Observability 64 Immutable records of discrete

    events that happen over time Logs/events Numbers describing a particular process or activity measured over intervals of time Metrics Data that shows, for each invocation of each downstream service, which instance was called, which method within that instance was invoked, how the request performed, and what the results were Traces Source: A Beginners guide to Observability by Splunk
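The three pillars can be told apart by the shape of the data each one records, as in the sketch below. It is an in-memory illustration only: the `Telemetry` class and its method names are hypothetical and do not correspond to any particular observability library.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEvent:
    """Log: an immutable record of a discrete event at a point in time."""
    timestamp: float
    message: str


@dataclass(frozen=True)
class Span:
    """Trace: which service and operation were invoked, how long it took, and the outcome."""
    service: str
    operation: str
    duration_s: float
    ok: bool


@dataclass
class Telemetry:
    logs: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)  # Metric: a number aggregated over time
    spans: list = field(default_factory=list)

    def log(self, message: str) -> None:
        self.logs.append(LogEvent(time.time(), message))

    def count(self, name: str) -> None:
        self.metrics[name] = self.metrics.get(name, 0) + 1

    def trace(self, service: str, operation: str, func):
        """Run func and record a span for the call, whether it succeeds or fails."""
        start = time.perf_counter()
        ok = True
        try:
            return func()
        except Exception:
            ok = False
            raise
        finally:
            self.spans.append(Span(service, operation, time.perf_counter() - start, ok))
```

A single request would usually touch all three: one log line describing what happened, one counter increment, and one span per downstream call.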
  60. @arafkarsh arafkarsh Observability in Kubernetes Worker Node 65 eBPF Programs

    Network Flow Log K-Probe Connection Tracker Linux Kernel Prometheus Envoy Proxy Log Collector FluentD Pods Pods Pods Pods Pods Pods Service Pods Pods Pods Pods Pods Pods Service Namespace Pods Pods Pods Pods Pods Pods Service Namespace Observability Tools
  61. @arafkarsh arafkarsh 100s Microservices 1,000s Releases / Day 10,000s Virtual

    Machines 100K+ User actions / Second 81 M Customers Globally 1 B Time series Metrics 10 B Hours of video streaming every quarter Source: Netflix: https://www.youtube.com/watch?v=UTKIT6STSVM 10s OPs Engineers 0 NOC 0 Data Centers So what does Netflix think about DevOps? No DevOps Don’t do a lot of Process / Procedures Freedom for Developers & be Accountable Trust people you Hire No Controls / Silos / Walls / Fences Ownership – You Build it, You Run it. 68
  62. @arafkarsh arafkarsh 50M Paid Subscribers 100M Active Users 60 Countries

    Cross Functional Team Full, End to End ownership of features Autonomous 1000+ Microservices Source: https://microcph.dk/media/1024/conference-microcph-2017.pdf 1000+ Tech Employees 120+ Teams 69
  63. @arafkarsh arafkarsh Benefits of DevOps 70 ✓ Velocity o Agile

    / Kanban, o Capability Centric Design o Domain Driven Design o Event Sourcing & CQRS o Microservices Architecture Code Build Manage Learn Idea ✓ Quality o Test Automation o Build Pipeline Automation o Continuous Integration o Continuous Delivery o Continuous Deployment o Observability People Process Tools
  64. @arafkarsh arafkarsh 71 Design Patterns are solutions to general problems

    that software developers faced during software development. Design Patterns
  65. @arafkarsh arafkarsh 72 Thank you DREAM | AUTOMATE | EMPOWER

    Araf Karsh Hamid : India: +91.999.545.8627 http://www.slideshare.net/arafkarsh https://speakerdeck.com/arafkarsh https://www.linkedin.com/in/arafkarsh/ https://www.youtube.com/user/arafkarsh/playlists http://www.arafkarsh.com/ @arafkarsh arafkarsh
  69. @arafkarsh arafkarsh References 78 Databases: Big Data / Cloud Databases

    1. Google: How to Choose the right database? 2. AWS: Choosing the right Database 3. IBM: NoSQL Vs. SQL 4. A Guide to NoSQL Databases 5. How does NoSQL Databases Work? 6. What is Better? SQL or NoSQL? 7. What is DBaaS? 8. NoSQL Concepts 9. Key Value Databases 10. Document Databases 11. Graph Databases 12. Column Databases 13. Row Vs. Column Oriented Databases 14. Database Indexing Explained 15. MongoDB Indexing 16. AWS: DynamoDB Global Indexing 17. AWS: DynamoDB Local Indexing 18. Google Cloud Spanner 19. AWS: DynamoDB Design Patterns 20. Cloud Provider Database Comparisons 21. CockroachDB: When to use a Cloud DB?
  70. @arafkarsh arafkarsh References 79 Docker / Kubernetes / Istio 1.

    IBM: Virtual Machines and Containers 2. IBM: What is a Hypervisor? 3. IBM: Docker Vs. Kubernetes 4. IBM: Containerization Explained 5. IBM: Kubernetes Explained 6. IBM: Kubernetes Ingress in 5 Minutes 7. Microsoft: How Service Mesh works in Kubernetes 8. IBM: Istio Service Mesh Explained 9. IBM: Kubernetes and OpenShift 10. IBM: Kubernetes Operators 11. 10 Consideration for Kubernetes Deployments Istio – Metrics 1. Istio – Metrics 2. Monitoring Istio Mesh with Grafana 3. Visualize your Istio Service Mesh 4. Security and Monitoring with Istio 5. Observing Services using Prometheus, Grafana, Kiali 6. Istio Cookbook: Kiali Recipe 7. Kubernetes: Open Telemetry 8. Open Telemetry 9. How Prometheus works 10. IBM: Observability vs. Monitoring
  71. @arafkarsh arafkarsh References 80 CI / CD 1. What is

    Continuous Integration? 2. What is Continuous Delivery? 3. CI / CD Pipeline 4. What is CI / CD Pipeline? 5. CI / CD Explained 6. CI / CD Pipeline using Java Example Part 1 7. CI / CD Pipeline using Ansible Part 2 8. Declarative Pipeline vs Scripted Pipeline 9. Complete Jenkins Pipeline Tutorial 10. Common Pipeline Mistakes 11. CI / CD for a Docker Application
  72. @arafkarsh arafkarsh References 81 1. Lewis, James, and Martin Fowler.

    “Microservices: A Definition of This New Architectural Term”, March 25, 2014. 2. Miller, Matt. “Innovate or Die: The Rise of Microservices”. The Wall Street Journal, October 5, 2015. 3. Newman, Sam. Building Microservices. O’Reilly Media, 2015. 4. Alagarasan, Vijay. “Seven Microservices Anti-patterns”, August 24, 2015. 5. Cockcroft, Adrian. “State of the Art in Microservices”, December 4, 2014. 6. Fowler, Martin. “Microservice Prerequisites”, August 28, 2014. 7. Fowler, Martin. “Microservice Tradeoffs”, July 1, 2015. 8. Humble, Jez. “Four Principles of Low-Risk Software Release”, February 16, 2012. 9. Zuul Edge Server, Ketan Gote, May 22, 2017 10. Ribbon, Hystrix using Spring Feign, Ketan Gote, May 22, 2017 11. Eureka Server with Spring Cloud, Ketan Gote, May 22, 2017 12. Apache Kafka, A Distributed Streaming Platform, Ketan Gote, May 20, 2017 13. Functional Reactive Programming, Araf Karsh Hamid, August 7, 2016 14. Enterprise Software Architectures, Araf Karsh Hamid, July 30, 2016 15. Docker and Linux Containers, Araf Karsh Hamid, April 28, 2015
  73. @arafkarsh arafkarsh 82 References Domain Driven Design 16. Oct 27,

    2012 What I have learned about DDD Since the book. By Eric Evans 17. Mar 19, 2013 Domain Driven Design By Eric Evans 18. Jun 02, 2015 Applied DDD in Java EE 7 and Open Source World 19. Aug 23, 2016 Domain Driven Design the Good Parts By Jimmy Bogard 20. Sep 22, 2016 GOTO 2015 – DDD & REST Domain Driven API’s for the Web. By Oliver Gierke 21. Jan 24, 2017 Spring Developer – Developing Micro Services with Aggregates. By Chris Richardson 22. May 17, 2017 DEVOXX – The Art of Discovering Bounded Contexts. By Nick Tune 23. Dec 21, 2019 What is DDD - Eric Evans - DDD Europe 2019. By Eric Evans 24. Oct 2, 2020 - Bounded Contexts - Eric Evans - DDD Europe 2020. By Eric Evans 25. Oct 2, 2020 - DDD By Example - Paul Rayner - DDD Europe 2020. By Paul Rayner
  74. @arafkarsh arafkarsh References 83 Event Sourcing and CQRS 26. Nov

    13, 2014 - GOTO 2014 – Event Sourcing. By Greg Young 27. Mar 22, 2016 - Spring Developer – Building Micro Services with Event Sourcing and CQRS 28. Apr 15, 2016 - YOW! Nights – Event Sourcing. By Martin Fowler 29. May 08, 2017 - When Micro Services Meet Event Sourcing. By Vinicius Gomes 30. July 15, 2015 – Agile is Dead : GoTo 2015 By Dave Thomas 31. Apr 7, 2016 - Agile Project Management with Kanban | Eric Brechner | Talks at Google 32. Sep 27, 2017 - Scrum vs Kanban - Two Agile Teams Go Head-to-Head 33. Feb 17, 2019 - Lean vs Agile vs Design Thinking 34. Dec 17, 2020 - Scrum vs Kanban | Differences & Similarities Between Scrum & Kanban 35. Feb 24, 2021 - Agile Methodology Tutorial for Beginners | Jira Tutorial | Agile Methodology Explained. Agile Methodologies
  75. @arafkarsh arafkarsh References 84 1. Jul 3, 2019 – Understanding

    Kafka 2. Aug 8, 2019 – Understanding RabbitMQ 3. Feb 6, 2020 – An introduction to TDD 4. Aug 14, 2019 – Component Software Testing 5. May 30, 2020 – What is Component Testing? 6. Apr 23, 2013 – Component Test By Martin Fowler 7. Jan 12, 2011 – Contract Testing By Martin Fowler 8. Jan 16, 2018 – Integration Testing By Martin Fowler 9. Testing Strategies in Microservices Architecture 10. Practical Test Pyramid By Ham Vocke
  76. @arafkarsh arafkarsh References 85 SQL Vs NoSQL 36. Jun 29,

    2012 – Google I/O 2012 - SQL vs NoSQL: Battle of the Backends 37. Feb 19, 2013 - Introduction to NoSQL • Martin Fowler • GOTO 2012 38. Jul 25, 2018 - SQL vs NoSQL or MySQL vs MongoDB 39. Oct 30, 2020 - Column vs Row Oriented Databases Explained 40. Dec 9, 2020 - How do NoSQL databases work? Simply Explained!
  77. @arafkarsh arafkarsh References 86 27. MSDN – Microsoft https://msdn.microsoft.com/en-us/library/dn568103.aspx 28.

    Martin Fowler : CQRS – http://martinfowler.com/bliki/CQRS.html 29. Udi Dahan : CQRS – http://www.udidahan.com/2009/12/09/clarified-cqrs/ 30. Greg Young : CQRS - https://www.youtube.com/watch?v=JHGkaShoyNs 31. Bertrand Meyer – CQS - http://en.wikipedia.org/wiki/Bertrand_Meyer 32. CQS : http://en.wikipedia.org/wiki/Command–query_separation 33. CAP Theorem : http://en.wikipedia.org/wiki/CAP_theorem 34. CAP Theorem : http://www.julianbrowne.com/article/viewer/brewers-cap-theorem 35. CAP 12 years how the rules have changed 36. EBay Scalability Best Practices : http://www.infoq.com/articles/ebay-scalability-best-practices 37. Pat Helland (Amazon) : Life beyond distributed transactions 38. Stanford University: Rx https://www.youtube.com/watch?v=y9xudo3C1Cw 39. Princeton University: SAGAS (1987) Hector Garcia Molina / Kenneth Salem 40. Rx Observable : https://dzone.com/articles/using-rx-java-observable
  78. @arafkarsh arafkarsh References – Microservices – Videos 87 41. Martin

    Fowler – Micro Services : https://www.youtube.com/watch?v=2yko4TbC8cI&feature=youtu.be&t=15m53s 42. GOTO 2016 – Microservices at NetFlix Scale: Principles, Tradeoffs & Lessons Learned. By R Meshenberg 43. Mastering Chaos – A NetFlix Guide to Microservices. By Josh Evans 44. GOTO 2015 – Challenges Implementing Micro Services By Fred George 45. GOTO 2016 – From Monolith to Microservices at Zalando. By Rodrigue Schaefer 46. GOTO 2015 – Microservices @ Spotify. By Kevin Goldsmith 47. Modelling Microservices @ Spotify : https://www.youtube.com/watch?v=7XDA044tl8k 48. GOTO 2015 – DDD & Microservices: At last, Some Boundaries By Eric Evans 49. GOTO 2016 – What I wish I had known before Scaling Uber to 1000 Services. By Matt Ranney 50. DDD Europe – Tackling Complexity in the Heart of Software By Eric Evans, April 11, 2016 51. AWS re:Invent 2016 – From Monolithic to Microservices: Evolving Architecture Patterns. By Emerson L, Gilt D. Chiles 52. AWS 2017 – An overview of designing Microservices based Applications on AWS. By Peter Dalbhanjan 53. GOTO Jun, 2017 – Effective Microservices in a Data Centric World. By Randy Shoup. 54. GOTO July, 2017 – The Seven (more) Deadly Sins of Microservices. By Daniel Bryant 55. Sept, 2017 – Airbnb, From Monolith to Microservices: How to scale your Architecture. By Melanie Cebula 56. GOTO Sept, 2017 – Rethinking Microservices with Stateful Streams. By Ben Stopford. 57. GOTO 2017 – Microservices without Servers. By Glynn Bird.
  79. @arafkarsh arafkarsh References – DevOps / SRE (Site Reliability Engineering)

    88 58. Amazon: https://www.youtube.com/watch?v=mBU3AJ3j1rg 59. NetFlix: https://www.youtube.com/watch?v=UTKIT6STSVM 60. DevOps and SRE: https://www.youtube.com/watch?v=uTEL8Ff1Zvk 61. SLI, SLO, SLA : https://www.youtube.com/watch?v=tEylFyxbDLE 62. DevOps and SRE : Risks and Budgets : https://www.youtube.com/watch?v=y2ILKr8kCJU 63. SRE @ Google: https://www.youtube.com/watch?v=d2wn_E1jxn4
  80. @arafkarsh arafkarsh 89 1. Simoorg : LinkedIn’s own failure inducer

    framework. It was designed to be easy to extend and most of the important components are pluggable. 2. Pumba : A chaos testing and network emulation tool for Docker. 3. Chaos Lemur : Self-hostable application to randomly destroy virtual machines in a BOSH-managed environment, as an aid to resilience testing of high-availability systems. 4. Chaos Lambda : Randomly terminate AWS ASG instances during business hours. 5. Blockade : Docker-based utility for testing network failures and partitions in distributed applications. 6. Chaos-http-proxy : Introduces failures into HTTP requests via a proxy server. 7. Monkey-ops : Monkey-Ops is a simple service implemented in Go, which is deployed into an OpenShift V3.X and generates some chaos within it. Monkey-Ops seeks some OpenShift components like Pods or Deployment Configs and randomly terminates them. 8. Chaos Dingo : Chaos Dingo currently supports performing operations on Azure VMs and VMSS deployed to an Azure Resource Manager-based resource group. 9. Tugbot : Testing in Production (TiP) framework for Docker. Testing tools
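    The idea shared by several of the tools above (randomly terminating instances, but only during business hours so that engineers are available to respond) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation of Chaos Lambda or any other listed tool: the names in_business_hours and pick_victims, the 09:00 to 17:00 weekday window, and the per-instance probability are all hypothetical, and the sketch only selects instance names rather than calling any cloud API.

```python
import random
from datetime import datetime, time


def in_business_hours(now, start=time(9, 0), end=time(17, 0)):
    """Return True on weekdays (Mon-Fri) between start (inclusive) and end (exclusive).

    The 09:00-17:00 window is an assumption made for this sketch.
    """
    return now.weekday() < 5 and start <= now.time() < end


def pick_victims(instances, now, probability=0.2, rng=None):
    """Select instances to terminate; selects nothing outside business hours.

    Each instance is chosen independently with the given probability.
    The caller decides how to actually terminate the selected instances.
    """
    rng = rng or random.Random()
    if not in_business_hours(now):
        return []
    return [name for name in instances if rng.random() < probability]


if __name__ == "__main__":
    # Hypothetical instance names, for illustration only.
    fleet = ["web-1", "web-2", "worker-1"]
    print(pick_victims(fleet, datetime.now()))
```

    Keeping the selection logic separate from the termination call makes the schedule and probability easy to test without touching real infrastructure.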