Monoliths to the cloud!

How can you take an existing monolith to the cloud with very minimal effort? In this talk we will explore an architecture that can help you to achieve that while focusing on scalability and resilience.

Luciano Mammino

February 15, 2022
Transcript

  1. Monoliths to the cloud 🚀 Luciano Mammino (@loige). loige.link/mono-cloud, 2022-02-15, twitch.tv/fabio_biondi
  2. Get the slides! 👇 loige.link/mono-cloud
  3. I'm Luciano 👋 Senior Architect @ fourTheorem (Dublin). 📔 Co-Author of Node.js Design Patterns 👉 nodejsdp.link. Let's connect! loige.co (blog), @loige (twitter), loige (twitch), lmammino (github)
  4. We are business focused technologists that deliver. Accelerated Serverless | AI as a Service | Platform Modernisation. We are hiring: do you want to work with us?
  5. I co-host a podcast about AWS with @eoins! awsbites.com. Pleaze, subscribe 😇
  6. Story time
  7. Story time
  8. Story time
  9. Story time
  10. Business Summary: SaaS CMS for legal practices. 1 founder + 1 developer. 💸 Bootstrapped business. 🙌 Good MVP, getting attention in the market. 💪 Started a TRIAL with a big customer.
  11. Current problems: 📈 The company is growing ⚖ but the technology does not scale! 📦 1 monolithic server. 🔥 Frequent failures = 🤬 unhappy customers. 😥 The business is at risk!
  12. Desired State: ⚖ More reliable & scalable infrastructure. 👌 Minimal amount of change required*. (* The team is not skilled with the cloud & containers, so we need to keep cognitive load low.)
  13. 🤔 "I heard that the cloud is great but we don't have the time and the skills to re-architect everything as micro-services!"
  14. 🧐 What can we recommend?
  15. ⛏ Let's start to dig deeper!
  16. Example use cases: A user logs into the application and should be able to see all their previously uploaded legal documents. A user can upload new documents and organize them by providing specific tags (client id, case number, etc.). A user might search for documents containing specific keywords or tags.
  17. Current Architecture*: Virtual Private Server. (* A beautiful monolith ❤)
  18. Can we take the monolith to the cloud and make it resilient & scalable?
  19. Target architecture: Load Balancer, App Server(s), File Storage, Session Storage, Database
  20. 🧐 Let's get more specific...* (* Mostly AWS from here)
  21. Target architecture on AWS: Application Load Balancer, EC2 Instance(s), S3, ElastiCache, RDS
  22. ✋ Let's pause for a second...
  23. 🤔 What is the cloud, really?
  24. @loige
  25. ... a little more involved than that! 😅
  26. ☁ The "cloud" is built... To scale. To be resilient.
  27. OK, but how?! 😒

  28. Region: A physical location around the world (e.g. North Virginia, Ireland or Sydney) where AWS hosts a group of data centers. Regions help to provision infrastructure that is closer to the customers, so that our applications can have low latency and feel responsive.
  29. Availability Zone (AZ): One or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. Data centers in different availability zones are isolated from one another, so a serious outage rarely affects more than one availability zone at the same time.
  30. Availability Zone (AZ): It's good practice to spread redundant infrastructure across different availability zones in a given region to achieve high availability.
  31. VPC: A virtual (private) network provisioned in a given region for a given AWS account. It is logically isolated from other virtual networks in AWS. Every VPC has a range of private IP addresses organised in one or more subnets.
  32. Subnet: A range of IPs in a given VPC and in a given availability zone that can be used to spin up and connect resources within the network. Subnets can be public or private. A public subnet can be used to run instances that can have a public IP assigned to them and can be reachable from outside the VPC itself.
  33. Subnet: It's good practice to keep front-facing servers (or load balancers) in public subnets and keep everything else (backend services, databases, etc.) in private subnets. Traffic between subnets can be enabled through routing tables to allow, for instance, a load balancer in a public subnet to forward traffic to backend instances in a private subnet.
  34. Quick Recap: Region: physical location with data centers. Availability Zone: data center in a region. VPC: a virtual private network in a region. Subnet: range of IPs in a VPC in a given AZ.
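
To make the recap more concrete, here is a minimal sketch of how a VPC address range could be split into one public and one private subnet per availability zone. The VPC range `10.0.0.0/16`, the `/20` subnet size, the AZ names and the `planSubnets` helper are all assumptions picked for illustration; none of them come from the talk.

```javascript
// Convert a dotted IPv4 string to an unsigned 32-bit integer.
function ipToInt (ip) {
  return ip.split('.').reduce((acc, octet) => (acc * 256) + Number(octet), 0)
}

// Convert an unsigned 32-bit integer back to dotted IPv4 notation.
function intToIp (n) {
  return [24, 16, 8, 0].map((shift) => Math.floor(n / 2 ** shift) % 256).join('.')
}

// Split a VPC CIDR into consecutive subnets of the given prefix length,
// assigning one public and one private subnet to each availability zone.
// Assumes the VPC base address is aligned to its prefix.
function planSubnets (vpcCidr, subnetPrefix, azs) {
  const [baseIp, vpcPrefix] = vpcCidr.split('/')
  const size = 2 ** (32 - subnetPrefix) // addresses per subnet
  const available = 2 ** (subnetPrefix - Number(vpcPrefix)) // subnets that fit in the VPC
  if (azs.length * 2 > available) {
    throw new Error('Not enough address space for the requested subnets')
  }
  const base = ipToInt(baseIp)
  const plan = []
  azs.forEach((az, i) => {
    plan.push({ az, type: 'public', cidr: `${intToIp(base + (2 * i) * size)}/${subnetPrefix}` })
    plan.push({ az, type: 'private', cidr: `${intToIp(base + (2 * i + 1) * size)}/${subnetPrefix}` })
  })
  return plan
}

// Hypothetical values: a /16 VPC spread over three AZs.
const plan = planSubnets('10.0.0.0/16', 20, ['eu-west-1a', 'eu-west-1b', 'eu-west-1c'])
console.log(plan)
```

Each AZ gets two non-overlapping ranges, which mirrors the "public for front-facing, private for everything else" practice described above.
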
  35. Diagram: Region, AZ1, AZ2, AZ3, VPC, Subnet, Resource (e.g. EC2 instance)
  36. ✏ TODO List: ☐ Create an AWS account ☐ Select a region ☐ Create and configure a VPC
  37. Our VPC
  38. ✏ TODO List: ☑ Create an AWS account ☑ Select a region ☑ Create and configure a VPC
  39. ✏ TODO List: ☐ Load Balancer ☐ EC2 ☐ S3 ☐ RDS ☐ ElastiCache ☐ Route 53
  40. Application Load Balancer (ALB): The entry point to all the application traffic. Layer 7 Load Balancer (HTTP, HTTPS, WebSocket, gRPC). Highly available: replicated in all our public subnets.
  41. Application Load Balancer (ALB): Scalable: can handle millions of requests per second. Managed service: we don't need to configure the OS or install software patches. Can be integrated with ACM (AWS Certificate Manager) to support HTTPS.
  42. Application Load Balancer (ALB): Target group
  43. Application Load Balancer (ALB): Target group with health checks: /health ✅, /health ❌ 🔥, /health ✅. Unhealthy targets won't get any traffic.
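
As an illustration of the `/health` endpoint the load balancer would poll, here is a minimal sketch using only Node.js core modules. The `buildHealthResponse` and `createHealthServer` names, the shape of the checks object and the port are hypothetical. It relies on the ALB treating a 200 response as healthy (the default success code), so anything else, such as a 503, marks the target as unhealthy.

```javascript
const http = require('http')

// Build the health response from a map of dependency checks.
// Each value is true when that dependency is reachable.
function buildHealthResponse (checks) {
  const healthy = Object.values(checks).every(Boolean)
  return {
    statusCode: healthy ? 200 : 503,
    body: JSON.stringify({ status: healthy ? 'ok' : 'unhealthy', checks })
  }
}

// Create (but do not start) an HTTP server exposing GET /health.
// `getChecks` is a hypothetical function returning the current check results.
function createHealthServer (getChecks) {
  return http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/health') {
      const { statusCode, body } = buildHealthResponse(getChecks())
      res.writeHead(statusCode, { 'Content-Type': 'application/json' })
      res.end(body)
      return
    }
    res.writeHead(404)
    res.end()
  })
}

// Usage (not started here):
// createHealthServer(() => ({ database: true, cache: true })).listen(3000)
```

In a real application the route would more likely live in the existing web framework; the point is only that the endpoint should fail when the instance cannot do useful work.
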
  44. Application Load Balancer (ALB): Targets can be added dynamically. We can scale targets automatically using autoscaling groups, e.g. add or remove instances based on the number of requests in flight or on the average CPU of the current instances.
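
To give an intuition for scaling on average CPU, here is a deliberately simplified proportional rule. It is not the algorithm AWS autoscaling actually uses; the `desiredCapacity` function, the 50% target and the min/max bounds are assumptions for illustration only.

```javascript
// Compute how many instances we would want so that the average CPU
// moves back towards the target, clamped between min and max.
function desiredCapacity ({ current, avgCpu, targetCpu, min, max }) {
  const raw = Math.ceil(current * (avgCpu / targetCpu))
  return Math.min(max, Math.max(min, raw))
}

// Busy fleet: 3 instances at 80% CPU with a 50% target -> scale out.
console.log(desiredCapacity({ current: 3, avgCpu: 80, targetCpu: 50, min: 3, max: 12 }))

// Quiet fleet: 6 instances at 20% CPU -> scale in, but never below the minimum.
console.log(desiredCapacity({ current: 6, avgCpu: 20, targetCpu: 50, min: 3, max: 12 }))
```

Keeping the minimum at one instance per availability zone (3 here) is a choice that also helps resiliency, not just capacity.
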
  45. How does it scale? Being a managed service, scalability is mostly handled out of the box by AWS.
  46. Resiliency: A load balancer can distribute traffic to multiple AZs, so if one of them becomes unavailable it will keep distributing traffic to the remaining ones.
  47. ✏ TODO List: ☑ Load Balancer ☐ EC2 ☐ S3 ☐ RDS ☐ ElastiCache ☐ Route 53
  48. EC2 - Virtual Machine: Virtual machine running all the necessary software for the service (Nginx, Node.js, app code, etc.). Instances need Security Groups (to allow traffic) and IAM Roles (to allow them to access other AWS resources like S3).
  49. EC2 - Virtual Machine: We will need to provision multiple machines dynamically. Challenges: consistency; 🐮 cattle vs 🙀 pet mindset; stateless applications.
  50. Consistency: All our virtual machines have to be the same: we need to build an AMI (Amazon Machine Image). An AMI contains OS, libraries, software and source code. You can use an AMI to start a new instance.
  51. Consistency: While we can build an AMI manually, it's better to use tools to automate the work: Hashicorp Packer, EC2 Image Builder.
  52. 🐮 Cattle vs 🙀 Pet mindset: Once an instance has been launched we shouldn't change it anymore (e.g. update the OS, install new software, update the code, etc.). If we need to change something, we build a new image and deploy new instances. Instances are disposable!
  53. Stateless: We are load balancing traffic, so a user might be served by different instances during their session. A single instance should not store any state (e.g. user sessions, uploaded files, etc.). State should be stored outside instances (ElastiCache, S3, RDS, etc.).
  54. Stateless: Making an application stateless might require a good amount of code change. A shortcut might be to enable sticky sessions in the ALB, but it's not recommended for scalability and resiliency.
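
The idea of keeping session state outside the instances can be sketched as follows. `ExternalSessionStore` is an in-memory stand-in for something like ElastiCache Redis, and `AppInstance` is a hypothetical model of one app server; both are assumptions for illustration. A real Redis client would be asynchronous and reached over the network.

```javascript
const crypto = require('crypto')

// In-memory stand-in for an external store such as ElastiCache Redis.
// Values are serialized, as they would be when sent over the network.
class ExternalSessionStore {
  constructor () {
    this.data = new Map()
  }

  set (id, session) {
    this.data.set(id, JSON.stringify(session))
  }

  get (id) {
    const raw = this.data.get(id)
    return raw ? JSON.parse(raw) : null
  }
}

// An app instance keeps no session state of its own:
// it only holds a reference to the shared store.
class AppInstance {
  constructor (name, store) {
    this.name = name
    this.store = store
  }

  login (userId) {
    const sessionId = crypto.randomBytes(16).toString('hex')
    this.store.set(sessionId, { userId })
    return sessionId
  }

  whoAmI (sessionId) {
    const session = this.store.get(sessionId)
    return session ? session.userId : null
  }
}

const store = new ExternalSessionStore()
const instanceA = new AppInstance('a', store)
const instanceB = new AppInstance('b', store)

// The user logs in on instance A, then the load balancer sends
// the next request to instance B: the session is still found.
const sessionId = instanceA.login('user-42')
console.log(instanceB.whoAmI(sessionId))
```

Because neither instance owns the session, either one can be terminated and replaced without logging users out, which is what makes the "cattle" approach workable.
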
  55. How does it scale? Every instance will be able to handle a certain number of requests per second. We can scale by adding more instances when the traffic grows.
  56. Resiliency: We should have at least 1 instance per availability zone. If an availability zone has an outage, the instances in the healthy availability zones will keep handling requests. We can use an autoscaling group to make sure that unhealthy instances are replaced.
  57. ✏ TODO List: ☑ Load Balancer ☑ EC2 ☐ S3 ☐ RDS ☐ ElastiCache ☐ Route 53
  58. Simple Storage Service (S3): One of the very first AWS services and (probably) the most famous one. Object storage service: allows you to store any amount of data durably. You need to use the SDK to read and write data.
  59. Simple Storage Service (S3): Data can be organised in logical containers called Buckets. Key/value model: inside a bucket you can store data by providing a key and the content.
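
Since S3 only knows about keys, a key naming scheme has to be chosen by the application. The sketch below shows one possible layout for the legal documents use case. The `clients/<clientId>/cases/<caseNumber>/<fileName>` structure and the `buildDocumentKey` helper are assumptions, not something prescribed by the talk or by S3.

```javascript
// Build an S3 object key from document metadata.
// Each part is URI-encoded so that a "/" inside a value
// cannot accidentally create an extra level in the key.
function buildDocumentKey ({ clientId, caseNumber, fileName }) {
  return ['clients', clientId, 'cases', caseNumber, fileName]
    .map((part) => encodeURIComponent(String(part)))
    .join('/')
}

console.log(buildDocumentKey({
  clientId: 'c-001',
  caseNumber: '2022-17',
  fileName: 'contract.pdf'
}))
```

A predictable prefix makes it possible to list all the objects for one client or one case by prefix. Keyword search would still need the database or a dedicated index.
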
  60. Simple Storage Service (S3):

    // AWS SDK for JavaScript (v2)
    const AWS = require('aws-sdk')

    const s3 = new AWS.S3()

    // Bucket name, object key and content to store
    const params = {
      Bucket: 'my-bucket',
      Key: 'my-first-s3-file.txt',
      Body: Buffer.from('Hello, AWS')
    }

    // Upload the object and report the outcome
    s3.upload(params, (err) => {
      if (err) {
        console.error(err)
      } else {
        console.log('Upload successful')
      }
    })

  61. Simple Storage Service (S3): Too much code to change? A first migration could be done by using something like s3fs-fuse to create a "virtual filesystem" that allows you to read/write to S3 seamlessly.
  62. How does it scale? S3 is a managed service which automatically scales to thousands of read/write operations per second.
  63. Resiliency: S3 is provisioned in multiple AZs by default and it makes multiple copies of your data. All of this happens transparently, no special configuration required.
  64. ✏ TODO List: ☑ Load Balancer ☑ EC2 ☑ S3 ☐ RDS ☐ ElastiCache ☐ Route 53
  65. Relational Database Service (RDS): Managed relational database service for MySQL, PostgreSQL, MariaDB, Oracle & SQL Server. Being a managed service, AWS takes care of most common concerns like backups and updates (configurable).
  66. How does it scale? RDS PostgreSQL supports Read Replicas: you can provision additional instances to which you can distribute heavy read-only queries.
  67. Resiliency: RDS PostgreSQL can be configured to work in Multi-AZ mode: this means that there will be one or two standby copies of the database in different AZs. If the primary DB instance or the primary AZ has an outage, one of the standby copies is promoted to become "the primary" instance.
  68. Resiliency: Failover is fast but not instantaneous (60-120 seconds), so we need to plan for possible connectivity failures in our app and show clear error messages to the users.
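
One way to plan for short connectivity failures is to retry database operations with exponential backoff. The sketch below is a generic helper, not an RDS feature: `withRetry`, its default values and the injectable `wait` function are assumptions for illustration.

```javascript
// Resolve after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Run an async operation, retrying with exponential backoff when it fails.
// `wait` is injectable so the delay can be replaced in tests.
async function withRetry (operation, { attempts = 5, baseDelayMs = 500, wait = sleep } = {}) {
  let lastError
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await operation()
    } catch (err) {
      lastError = err
      if (attempt < attempts - 1) {
        // Delays grow as 500ms, 1s, 2s, 4s, ... with the defaults
        await wait(baseDelayMs * 2 ** attempt)
      }
    }
  }
  // All attempts failed: let the caller show a clear error to the user.
  throw lastError
}

// Usage with a hypothetical database client:
// const rows = await withRetry(() => db.query('SELECT 1'))
```

With the defaults the helper waits about 7.5 seconds in total, which is much shorter than a 60-120 second failover. The retry budget would need tuning, and requests that run out of retries should still end with a clear error message.
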
  69. ✏ TODO List: ☑ Load Balancer ☑ EC2 ☑ S3 ☑ RDS ☐ ElastiCache ☐ Route 53
  70. ElastiCache: Managed in-memory caching service supporting Redis and Memcached. Meant for use cases that don't require durability, like data caches, session stores, gaming leaderboards, streaming, and analytics. AWS takes care of maintenance.
  71. How does it scale? A single instance of Redis (with enough memory) can scale to significant amounts of traffic. If you need more, you can run ElastiCache Redis in Cluster Mode and shard your data across multiple Redis instances.
  72. Resiliency: ElastiCache Redis can operate in Multi-AZ mode. Similarly to RDS, in case of failures, there might be some downtime while the new primary is promoted. We need to make sure the app accounts for Redis connection failures.
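
For data that is only cached (as opposed to session data, which has no other source), one way to account for Redis connection failures is to treat the cache as optional. In this sketch `getWithCache`, the `cache` object with async `get`/`set` methods and the `loadFromSource` function are all hypothetical names; a real Redis client has its own API.

```javascript
// Read-through cache that treats the cache as optional: if it is
// unreachable, fall back to the source of truth instead of failing.
async function getWithCache (key, { cache, loadFromSource }) {
  try {
    const cached = await cache.get(key)
    if (cached !== null && cached !== undefined) return cached
  } catch (err) {
    console.warn(`Cache read failed for ${key}: ${err.message}`)
  }

  // Cache miss or cache failure: load from the source (e.g. the database).
  const value = await loadFromSource(key)

  try {
    await cache.set(key, value)
  } catch (err) {
    console.warn(`Cache write failed for ${key}: ${err.message}`)
  }
  return value
}
```

During a failover the application gets slower because every read goes to the source, but it keeps working.
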
  73. ✏ TODO List: ☑ Load Balancer ☑ EC2 ☑ S3 ☑ RDS ☑ ElastiCache ☐ Route 53
  74. Route53: Highly available and scalable cloud DNS service. Can be used to direct traffic on a given domain to our Application Load Balancer.
  75. ✏ TODO List: ☑ Load Balancer ☑ EC2 ☑ S3 ☑ RDS ☑ ElastiCache ☑ Route 53
  76. Infrastructure as Code (IaaC): We could provision everything "manually" from the web console, but... It will be hard to create consistent environments for development and QA. It will be hard to change things incrementally. How would we test and review changes before applying them in production?
  77. Infrastructure as Code (IaaC): It's better to define all the infrastructure using code. There are several tools that can help us with that: CloudFormation, Hashicorp Terraform, Cloud Development Kit (CDK), Pulumi.
  78. Example of CloudFormation template (shown truncated on the slide):

    {
      "AWSTemplateFormatVersion" : "2010-09-09",

      "Description" : "AWS CloudFormation Sample Template EC2InstanceWithSecurityGroupSample: Create an Amazon EC2 instance running the A

      "Parameters" : {
        "KeyName": {
          "Description" : "Name of an existing EC2 KeyPair to enable SSH access to the instance",
          "Type": "AWS::EC2::KeyPair::KeyName",
          "ConstraintDescription" : "must be the name of an existing EC2 KeyPair."
        },

        "InstanceType" : {
          "Description" : "WebServer EC2 instance type",
          "Type" : "String",
          "Default" : "t2.small",
          "AllowedValues" : [ "t1.micro", "t2.nano", "t2.micro", "t2.small", "t2.medium", "t2.large", "m1.small", "m1.medium", "m1.large" ,
          "ConstraintDescription" : "must be a valid EC2 instance type."
        },

        "SSHLocation" : {
          "Description" : "The IP address range that can be used to SSH to the EC2 instances",
          "Type": "String",
          "MinLength": "9",
          "MaxLength": "18",
          "Default": "0.0.0.0/0",
          "AllowedPattern": "(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})/(\\d{1,2})",
          "ConstraintDescription": "must be a valid IP CIDR range of the form x.x.x.x/x."
        }
      },

      "Mappings" : {
        "AWSInstanceType2Arch" : {
          "t1.micro" : { "Arch" : "HVM64" },
          "t2.nano" : { "Arch" : "HVM64" },
          "t2.micro" : { "Arch" : "HVM64" },

  79. Example of a CDK stack (TypeScript, shown truncated on the slide):

    import * as cdk from '@aws-cdk/core'
    import * as ec2 from '@aws-cdk/aws-ec2'

    export class CdkUbuntuEc2Stack extends cdk.Stack {
      constructor (scope: cdk.Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, {
          env: {
            account: process.env.CDK_DEPLOY_ACCOUNT || process.env.CDK_DEFAULT_ACCOUNT,
            region: process.env.CDK_DEPLOY_REGION || process.env.CDK_DEFAULT_REGION
          },
          ...props
        })

        const defaultVpc = ec2.Vpc.fromLookup(this, 'VPC', { isDefault: true })

        const userData = ec2.UserData.forLinux()
        userData.addCommands(
          'apt-get update -y',
          'apt-get install -y git awscli ec2-instance-connect',
          'until git clone https://github.com/aws-quickstart/quickstart-linux-utilities.git; do echo "Retrying"; done',
          'cd /quickstart-linux-utilities',
          'source quickstart-cfn-tools.source',
          'qs_update-os || qs_err',
          'qs_bootstrap_pip || qs_err',
          'qs_aws-cfn-bootstrap || qs_err',
          'mkdir -p /opt/aws/bin',
          'ln -s /usr/local/bin/cfn-* /opt/aws/bin/'
        )

        const machineImage = ec2.MachineImage.fromSSMParameter(
          '/aws/service/canonical/ubuntu/server/focal/stable/current/amd64/hvm/ebs-gp2/ami-id',
          ec2.OperatingSystemType.LINUX,
          userData
        )

        const myVmSecurityGroup = new ec2.SecurityGroup(this, 'myVmSecurityGroup', {

  80. An article about CDK (with examples): loige.link/cdk-article
  81. Switch over
  82. Switch over: 🙀 How do we migrate the data? 😥 How do we switch the traffic to the new infrastructure?
  83. Streamlined data migration: Update the "old" code-base to save every new file also to S3. Copy all the existing files to the S3 bucket (S3 sync).
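
The "save every new file also to S3" step could look roughly like the sketch below. `saveDocument`, `saveToDisk` and `uploadToS3` are hypothetical names, and the choice not to fail the request when the S3 upload fails is an assumption. It leans on the bulk copy of existing files to pick up anything that was missed.

```javascript
// Save an uploaded file to the existing local storage and, in addition,
// to S3. `saveToDisk` and `uploadToS3` are hypothetical async functions.
async function saveDocument (file, { saveToDisk, uploadToS3 }) {
  // The old storage stays the source of truth until the switch over.
  await saveToDisk(file)

  try {
    await uploadToS3(file)
    return { disk: true, s3: true }
  } catch (err) {
    // Do not fail the user request: a later bulk copy can pick the file up.
    console.error(`S3 upload failed for ${file.name}: ${err.message}`)
    return { disk: true, s3: false }
  }
}
```

Running the bulk copy once more right before the switch over would close any gap left by failed uploads.
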
  84. Streamlined data migration: AWS Database Migration Service allows you to replicate all the data from the old database to the new one. It will also keep the 2 databases in sync during the switch over!
  85. Switching traffic: Request a new certificate using AWS Certificate Manager (ACM). It can be validated by email or DNS. Point your DNS to the new Load Balancer in AWS!
  86. WE ARE LIVE! 🎉 Now what?
  87. New challenges 🤨: Observability, Testing, Building & Deployment
  88. New opportunities 😊: We can scale dynamically! As the team grows and the system gets more complicated we can start to think about micro-services. We can start to play with other AWS services (e.g. SQS + Lambda for background task processing).
  89. 💸 Cost
  90. 💸 Cost: calculator.aws/#/estimate?id=1e0779adb67305166c01a583f5d7f61f7c92b029

    Load Balancer                    $  24.24
    EC2 (6 instances)                $  85.87
    ElastiCache Redis (3 instances)  $  78.84
    RDS PostgreSQL (3 instances)     $ 155.48
    S3 (1TB)                         $  32.55
    TOT                              $ 376.98

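
As a quick sanity check of the total, the line items from the slide can be summed with a few lines of code. The variable names are illustrative, and the amounts are summed in whole cents to avoid floating point rounding drift.

```javascript
// Estimates from the slide, in US dollars.
const estimatedCosts = {
  'Load Balancer': 24.24,
  'EC2 (6 instances)': 85.87,
  'ElastiCache Redis (3 instances)': 78.84,
  'RDS PostgreSQL (3 instances)': 155.48,
  'S3 (1TB)': 32.55
}

// Convert each amount to cents before adding them up.
const totalCents = Object.values(estimatedCosts)
  .reduce((sum, usd) => sum + Math.round(usd * 100), 0)

const total = (totalCents / 100).toFixed(2)
console.log(`Total: $${total}`)
```

RDS is the largest line item at roughly 41% of the total, so it is probably the first place to look when optimising.
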
  91. 💸 Cost: Cost estimates are always a bit of a "gamble"... I selected some arbitrary instance sizes (EC2, RDS, ElastiCache). I am not accounting for auto-scaling. I am not accounting for network traffic. Better to look at cost in production and try to optimise when needed. Rule of thumb: try to balance cost with your revenue. Rule of thumb (2): consider the total cost of ownership!
  92. ✏ Bonus: a TODO list for the migration

    ☐ Create an AWS Account
    ☐ Select a tool for IaaC
    ☐ Create and configure a VPC in a region (3 AZs, Public / Private subnets)
    ☐ Create an S3 bucket
    ☐ Update the old codebase to save every new file to S3
    ☐ Copy all the existing files to S3
    ☐ Spin up the database in RDS (Multi-AZ)
    ☐ Migrate the data using Database Migration Service
    ☐ Provision the ElastiCache Redis Cluster (Multi-AZ)
    ☐ Create an AMI for the application
    ☐ Create a security group and an IAM policy for EC2
    ☐ Make the application stateless
    ☐ Create a health check endpoint
    ☐ Create an autoscaling group to spin up the instances
    ☐ Create a certificate in ACM
    ☐ Provision an Application Load Balancer (public subnets)
    ☐ Configure HTTPS, Targets and Health Checks
    ☐ Configure Route53
    ☐ Traffic switch-over through DNS 🤞

    📝 Great guide to cloud migrations: 6 strategies for migrating applications to the cloud
  93. The cloud is a journey not a destination
  94. Thank you! ☝ nodejsdp.link. Special thanks to: @micktwomey, @eoins ❤ Photos by Unsplash. loige.link/mono-cloud