
The Importance of Being Well-Architected

Success in the cloud begins with good architecture choices. This content was delivered as a webinar.

You'll learn how to make good design choices, and the benefits of working with an AWS partner to review your platform against good Well-Architected practices, taking into account operations, security, cost, performance, and reliability.

For a limited time, the video of this talk is available at https://us02web.zoom.us/rec/share/vpwscIHo9nlJZLftsn7FQohwJtj8T6a82nQd__sLnU__jPVvs3ayi3ZOwOeZ-63h
Password: 0h*[email protected]

The Scale Factory

April 29, 2020

Transcript

  1. THE IMPORTANCE OF BEING WELL-ARCHITECTED_ JON TOPPER | @jtopper |

    he/him/his
  2. $ whoami Founder/CEO/CTO The Scale Factory Working in hosting/infrastructure for

    20 years Infrastructure / AWS / DevOps
  3. None
  4. Leading Well-Architected Partner Worldwide >200 Reviews Completed Since April 2018

    Book a Well-Architected review today https://scalefactory.com/services/well-architected/ $5,000 funding available to support improvement work
  5. REVIEWS RUN_

    [Chart: reviews run over time, Mar-2018 to Jan-2020; vertical axis 0 to 180]
  6. THE TEAM_

  7. OUR CLIENTS_

  8. TODAY’S AGENDA_ What is Well-Architected? What is a Well-Architected Review?

    Pillar overview Common review findings
  9. WHAT IS WELL-ARCHITECTED?_

  10. WELL ARCHITECTED ORIGINS_ Catalogue of emergent good practices Observed by

    AWS Field Solutions Architects Codified and shared Platform agnostic*
  11. White Papers Review Tool
  12. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  13. IoT (Internet of Things) High Performance Computing Serverless Applications Lenses

    Machine Learning
  14. USING WELL-ARCHITECTED_ Gap analysis / planning Teaching Team alignment

  15. WHAT IS A WELL-ARCHITECTED REVIEW?_

  16. WELL ARCHITECTED REVIEW_ Foundational questions Up to 4 hours Qualitative

  17. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

    Counts per lens (five per-pillar figures in the order extracted, then the total):
    Well Architected Core: 9 11 9 8 9 (46)
    Serverless Applications: 2 3 2 1 1 (9)
    High Performance Computing: 4 3 3 4 2 (16)
    IoT (Internet of Things): 4 11 6 10 4 (35)
    Machine Learning: 7 3 5 2 4 (21)
  18. QUESTION OPS 1_ How do you determine what your priorities are?

    • Evaluate external customer needs
    • Evaluate internal customer needs
    • Evaluate compliance requirements
    • Evaluate threat landscape
    • Evaluate tradeoffs
    • Manage benefits and risks
    • None of these
  19. USING WELL-ARCHITECTED_ Gap analysis / planning Teaching Team alignment

  20. PILLAR OVERVIEW_

  21. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  22. MAIN AREAS_ Operational priorities Software Delivery Lifecycle Risk reduction Monitoring

    / Telemetry Documentation Continuous Improvement
  23. CONTINUOUS DELIVERY PIPELINE_ Linting Unit Tests Artefact Build SAST Deploy

    to Test Integration Test Performance Test DAST Deploy to UAT Deploy to Live
  24. https://services.google.com/fh/files/misc/state-of-devops-2019.pdf

  25. CENTRAL MONITORING_ User monitoring Application monitoring (RED) Infrastructure monitoring Make

    dashboards available
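The "RED" in application monitoring is commonly read as Rate, Errors, Duration. As a rough sketch of what those three numbers look like for one window of requests, assuming a hypothetical `Request` record and treating HTTP 5xx responses as errors (neither comes from the talk):

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    duration_ms: float
    status: int


def red_summary(requests: List[Request], window_seconds: float) -> Dict[str, float]:
    """Summarise Rate, Errors and Duration for one monitoring window."""
    if not requests or window_seconds <= 0:
        raise ValueError("need at least one request and a positive window")
    durations = sorted(r.duration_ms for r in requests)
    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    # Nearest-rank 95th percentile of request duration.
    rank = max(1, math.ceil(0.95 * len(durations)))
    return {
        "rate_per_s": len(requests) / window_seconds,
        "error_ratio": errors / len(requests),
        "p95_ms": durations[rank - 1],
    }
```

In practice a metrics system would compute these continuously; the point here is only that each service exposes the same three signals, which keeps dashboards comparable across services.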
  26. CENTRAL LOGGING_ Structured logs Add relevant IDs (transaction, user, etc)

    Make dashboards available Represent “events” on dashboards
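One way to get structured logs with relevant IDs attached is to emit one JSON object per log line. The sketch below uses Python's standard `logging` module; the logger name `checkout` and the field names `transaction_id` and `user_id` are illustrative assumptions, not something the slides specify.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy the (hypothetical) correlation IDs when passed via `extra=`.
        for key in ("transaction_id", "user_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("checkout")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


def log_example():
    """Log one event and return it parsed back from JSON."""
    buffer = io.StringIO()
    logger = build_logger(buffer)
    logger.info("payment accepted",
                extra={"transaction_id": "tx-123", "user_id": "u-42"})
    return json.loads(buffer.getvalue())
```

Because every line is machine-parseable and carries the same ID fields, a central log store can filter by transaction or user without regex scraping.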
  27. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  28. "Low performers take weeks to conduct security reviews and complete

    the changes identified. In contrast, elite performers build security in and can conduct security reviews and complete changes in just days." http://services.google.com/fh/files/misc/state-of-devops-2018.pdf
  29. MAIN AREAS_ Identity and access management Detective controls Infrastructure protection

    Data protection Incident response
  30. Somebody Else's Problem

  31. QUESTION SEC 11_ How do you respond to a [security] incident?

    • Identify key personnel and external resources
    • Identify tooling
    • Develop incident response plans
    • Automate containment capability
    • Identify forensic capabilities
    • Pre-provision access
    • Pre-deploy tools
    • Run game days
    • None of these
    [Chart: High Risk 75%; HRI Rank: 2; labels as extracted: WA CI NI 51% WA WA NI 27% 39% 0% 11% 27% 10% 3% 35% NI NI NI (93%)]
  32. QUESTION SEC 8_ How do you classify your data?

    • Define data classification requirements
    • Define data protection controls
    • Implement data identification
    • Automate identification and classification
    • Identify the types of data
    • None of these
    [Chart: High Risk 75%; HRI Rank: 3; labels as extracted: WA CI WA WA 61% 39% 17% 4% 59% 23% NI NI (88%)]
  33. TEAMS NEED TO DO BETTER AT SECURITY_ Poor hygiene around

    patching Limited data classification Mediocre human access control Bad programmatic access control Low adoption of security monitoring tools
  34. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  35. MAIN AREAS_ Foundations Change Management Failure Management

  36. AVAILABILITY DESIGN_ Clustering / Failover Autoscaling Multi-AZ (and Multi-Region) operation

    Caching Asynchronous processing Backpressure / Circuit breakers
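To illustrate the circuit breaker item above, here is a minimal single-threaded sketch: after a run of consecutive failures it stops calling the dependency for a cool-down period, then lets one trial call through. The class name, thresholds, and injectable `clock` parameter are assumptions made for the example, not a reference implementation.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of piling load onto a struggling service.
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast like this is also a crude form of backpressure: callers learn immediately that the dependency is unavailable and can shed or queue work rather than waiting on timeouts.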
  37. PLAN FOR FAILURE_ Build a list of potential failure scenarios

    Understand how your platform will react Game days (tabletop) Mitigate / Document Game days (live failure injection)
  38. None
  39. QUESTION REL 8_ How do you test resilience?

    • Use playbooks for unanticipated failures
    • Conduct root cause analysis and share results
    • Inject failures to test resiliency
    • Conduct game days regularly
    • None of these
    [Chart: High Risk 67%; HRI Rank: 5; labels as extracted: WA CI WA 25% NI (92%) NI 73% 6% 0% 16%]
  40. QUESTION REL 9_ How do you plan for disaster recovery?

    • Define recovery objectives for downtime and data loss
    • Use defined recovery strategies to meet the recovery objectives
    • Test disaster recovery implementation to validate the implementation
    • Manage configuration drift on all changes
    • Automate recovery
    • None of these
    [Chart: High Risk 79%; HRI Rank: 1; labels as extracted: WA CI NI 33% WA WA NI 33% 25% 39% 16% 31% (87%)]
  41. TEAMS ARE BAD AT THINKING ABOUT FAILURE MODES_ Not considering

    business requirements No risk analysis of failure modes Poor documentation Almost no attempt to rehearse outages
  42. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  43. MAIN AREAS_ Choosing the right components Configuring things correctly Reviewing

    these choices regularly Monitoring for performance
  44. Compute: EC2 Instances | Containers on k8s or ECS | Containers on Fargate | Lambda

    Across that range: most to least security effort required; least to most serverless; least to most opinionated; least to most suitable for microservices
  45. Data RDS PostgreSQL RDS MySQL Aurora RedShift DynamoDB MongoDB DocumentDB

    AWS Neptune Neo4J Cassandra Amazon Timestream InfluxDB Relational Key-Value Document Graph Time Series Quantum Ledger Ledger Hyperledger Fabric Ethereum MySQL PostgreSQL
  46. Performance Efficiency Cost Optimisation Operational Excellence Reliability Security

  47. QUESTION COST 9_ How do you evaluate new services?

    • Establish a cost optimisation function
    • Develop a workload review process
    • Review and implement services in an unplanned way
    • Review and analyse this workload regularly
    • Keep up to date with new service releases
    • None of these
    [Chart: High Risk 71%; HRI Rank: 4; labels as extracted: WA CI WA 34% 26% 84% NI NI (79%) NI 43% 63% 1%]
  48. MAIN AREAS_ Governance Monitoring for usage Decommissioning unused resources Matching

    supply/demand Using pricing models
  49. WHAT NEXT?_ Read the white papers: https://aws.amazon.com/architecture/well-architected/ Run your

    own review(s): https://aws.amazon.com/well-architected-tool/ Engage an AWS Well-Architected partner: https://scalefactory.com/services/well-architected/ (funding available)
  50. EVERYONE IS BETTER AT BUILDING PLATFORMS THAN THEY ARE AT

    SECURING OR RUNNING THEM_
  51. TALK TO US ABOUT: CONSULTANCY TRAINING WELL-ARCHITECTED MIGRATION

  52. Leading Well-Architected Partner Worldwide >200 Reviews Completed Since April 2018

    Book a Well-Architected review today https://scalefactory.com/services/well-architected/ $5,000 funding available to support improvement work
  53. BREAKFAST OPS_ Monthly hosted discussion For CTOs and tech decision

    makers
  54. Q&A_

  55. KEEP IN TOUCH_ http://www.scalefactory.com/ @scalefactory [email protected]m