Disaster Recovery using Lenovo ThinkAgile VX and VMware Cloud on AWS
This presentation was given at the VMworld US 2018 conference. It focuses on how you can build a reliable disaster recovery solution using Lenovo ThinkAgile VX and VMware Cloud on AWS with VMware Site Recovery Service.
of DR Expertise
– Reliance on sophisticated, complex technology
– Technology (and data) are deployed to more locations
– Building your own DR solution can be manual and complex
Disaster Recovery
Disaster Recovery (DR) is about preparing for and recovering from a disaster!
amount of data loss measured in time.
• Recovery Time Objective (RTO): The time it takes after a disruption to restore a business process to its service level.
[Figure: timeline showing RPO (Lost Data) and RTO (Lost Time). Objective: cost-effective lowest RPO / RTO]
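As a worked illustration of the two objectives, the sketch below computes the data-loss window and the downtime for one incident and compares them with targets. The timestamps, the 15-minute and 4-hour targets, and the function names are hypothetical values chosen for the example; they do not come from the presentation.

```python
from datetime import datetime, timedelta


def achieved_rpo(last_replica: datetime, disruption: datetime) -> timedelta:
    """Data-loss window: time between the last good replica and the disruption."""
    return disruption - last_replica


def achieved_rto(disruption: datetime, service_restored: datetime) -> timedelta:
    """Downtime: time between the disruption and the service being restored."""
    return service_restored - disruption


# Hypothetical incident timeline (assumed values for illustration only).
last_replica = datetime(2018, 8, 27, 9, 45)
disruption = datetime(2018, 8, 27, 10, 0)
service_restored = datetime(2018, 8, 27, 12, 30)

rpo = achieved_rpo(last_replica, disruption)
rto = achieved_rto(disruption, service_restored)

# Hypothetical targets: lose at most 15 minutes of data, recover within 4 hours.
meets_objectives = rpo <= timedelta(minutes=15) and rto <= timedelta(hours=4)
print(f"RPO achieved: {rpo}, RTO achieved: {rto}, objectives met: {meets_objectives}")
```

Tightening either target generally raises cost, which is why the objective is the lowest RPO / RTO that remains cost effective.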
AWS Cloud powered by VMware Cloud Foundation™
– VMware vSphere®
– VMware vSAN™
– VMware NSX®
– vCenter Server®
• Running on elastic, bare-metal AWS Infrastructure
• 4 to 16 node configuration
– Dual Socket with 18 cores running at 2.3GHz
– 512GB Memory
– Eight NVMe devices for a total of 10TB raw capacity
  - Eight drives are distributed across two disk groups with one cache and three capacity drives per disk group.
– RAID 1 by default, but RAID 5 or RAID 6 possible for higher node counts
[Diagram: AWS cloud stack. Labels: ESXi, vSAN, NSX, AWS Infrastructure, vCenter Server, NSX Manager, Platform Services Controller, VMs]
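To give a feel for how the RAID policy affects capacity, here is a rough sizing sketch. The 10TB-per-node figure is taken from the slide; the efficiency factors (RAID 1 keeps two full copies, RAID 5 uses a 3+1 layout, RAID 6 a 4+2 layout) and the minimum node counts are commonly cited vSAN values that are assumed here, and the function name is hypothetical. Slack space, deduplication, and whether the raw figure includes cache devices are all ignored.

```python
from fractions import Fraction

# Assumed vSAN policy characteristics: (usable fraction of raw, minimum nodes).
# These are commonly cited figures, not numbers from the presentation.
POLICIES = {
    "RAID1": (Fraction(1, 2), 3),  # mirroring, tolerates one failure
    "RAID5": (Fraction(3, 4), 4),  # 3 data + 1 parity
    "RAID6": (Fraction(4, 6), 6),  # 4 data + 2 parity
}


def usable_tb(nodes: int, policy: str, raw_tb_per_node: int = 10) -> float:
    """Estimate usable capacity in TB for a cluster under one storage policy."""
    efficiency, min_nodes = POLICIES[policy]
    if nodes < min_nodes:
        raise ValueError(f"{policy} needs at least {min_nodes} nodes, got {nodes}")
    return float(nodes * Fraction(raw_tb_per_node) * efficiency)


for policy in ("RAID1", "RAID5"):
    print(policy, usable_tb(4, policy), "TB usable on 4 nodes")
print("RAID6", usable_tb(6, "RAID6"), "TB usable on 6 nodes")
```

The node-count check reflects why RAID 5 or RAID 6 only becomes an option on larger clusters.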
Library: Enables effortless distribution and automatic synchronization of content (OVAs, ISOs, etc.)
Integration with AWS services: VMC provides high-bandwidth, low-latency connectivity to AWS services like S3 and EC2
Compliance: ISO 27001, ISO 27017, ISO 27018, SOC 1 (SSAE18 / ISAE 3402), SOC 2, SOC 3, HIPAA, and General Data Protection Regulation (GDPR)
VMware vCenter® Server Hybrid Linked Mode: Single pane of glass monitoring for Hybrid Cloud management
Workload Mobility: Live Migration between On-Premises and VMC using vMotion
Site
– Application-agnostic protection
– Low recovery times with single-click failover and failback
– Highly predictable recovery objectives
– Centralized management of recovery plans
and objectives
• Active-Passive: Active Production Site running Applications; Secondary Site sitting idle until needed for recovery
• Active-Active: Secondary Site running low-priority test/dev workloads, usually powered off as part of the recovery plan
• Bi-Directional: Production Applications operating on both sites; supports protecting virtual machines in both directions
Most Critical, but least frequently used
• Disaster Avoidance (Preventative Failover): Graceful Shutdown of VMs, Full Replication of Data, and ordered startup, ensuring app-consistency and zero data loss
• Upgrade and Patch Testing (Identical Environment): Can use the secondary environment with complete copies of VMs to test new updates or patches
be a part of one or more Protection Groups
– Virtual Machines that are part of the same Protection Group are recovered together
– A Recovery Plan can have one or more Protection Groups
– Flexibility to test or recover an individual app or a group of apps
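The relationship between Protection Groups and Recovery Plans described above can be sketched as a small data model. This is only an illustration of the grouping idea; the class names, VM names, and plan names are hypothetical and this is not the Site Recovery Manager API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProtectionGroup:
    """A set of VMs that are recovered together."""
    name: str
    vms: List[str] = field(default_factory=list)


@dataclass
class RecoveryPlan:
    """A plan that recovers one or more Protection Groups."""
    name: str
    groups: List[ProtectionGroup]

    def __post_init__(self) -> None:
        if not self.groups:
            raise ValueError("a recovery plan needs at least one protection group")

    def vms_to_recover(self) -> List[str]:
        """All VMs covered by the plan, in group order, without duplicates."""
        seen: List[str] = []
        for group in self.groups:
            for vm in group.vms:
                if vm not in seen:
                    seen.append(vm)
        return seen


# Hypothetical application split into two tiers.
web = ProtectionGroup("web-tier", ["web01", "web02"])
db = ProtectionGroup("db-tier", ["db01"])

whole_app = RecoveryPlan("crm-full", [web, db])  # recover the group of apps
db_only = RecoveryPlan("crm-db-only", [db])      # test or recover one tier alone

print(whole_app.vms_to_recover())
print(db_only.vms_to_recover())
```

Placing the same group in more than one plan is what gives the flexibility to test or recover a single application or a whole set.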
innovation, automation of vSAN running on the world's most reliable hardware
– Lower Risk
– Simple and easy installation
– Latest innovations to power business
– Performance and scalability
Appliance
ThinkAgile™ VX Certified Node
XClarity® Management
VX Installer
ThinkAgile™ Advantage Support
Prequalified Components
Lifecycle Managed
Single Point of Contact
With Lenovo ThinkAgile™ VX Appliances and Certified Nodes
• Easy to Order
– No need to check HCL
– Only certified firmware
– No assembly required
• Easy to Install
– ThinkAgile™ VX installer
– Standardized Deployments
– Guaranteed firmware compatibility
• Easy to Manage
– Utilize existing management tools
– Best recipe firmware releases
– Rolling firmware upgrades
[Diagram: Alternate Reality. Labels: Buckets, Customer AWS environment, ThinkAgile™ VX Cluster (ESXi + vSAN) with VMs, Customer Datacenter, AWS Storage Gateway, Manual Deployment and Management]
[Diagram: Lenovo ThinkAgile™ VX. Labels: Manager, Edge Gateway Appliance, vSphere Replication Service, Infrastructure VMs, User VMs, four ESXi + vSAN hosts]
[Diagram: VMware Cloud® on AWS. Labels: Manager, MGW, CGW, vSphere Replication Service, vCenter Server, Platform Services Controller, Infrastructure VMs, User VMs, four ESXi hosts, Customer Datastore]
[Diagram: ThinkAgile™ VX Cluster in the Customer Datacenter (Workload Cluster, Management Cluster, NSX Edge, VMware vCenter®) connected over the Internet through an IPsec VPN to VMware Cloud® on AWS (Internet Gateway, CGW, MGW, NSX, vCenter Server, Platform Services Controller, Site Recovery Manager, vSphere Replication Service)]
Network Configuration
– IPsec Tunnel between NSX Edge (On-Prem) and Management Gateway (VMC)
  - Used to enable access to vCenter, VM Migrations, Content Libraries
– IPsec Tunnel between NSX Edge (On-Prem) and Compute Gateway (VMC)
  - L2VPN used to extend layer 2 networks across the tunnel
  - Used to deploy User Virtual Machines and assign public IP addresses
– Firewall Rules
  - Ability to define Firewall Rules for both the Management and Compute Networks
• Site Recovery Manager Configuration
– Configure the Firewall rules for SRM and vSphere Replication traffic
– Create Site Pair to connect your On-Prem SRM and VMC SRM
– Resource Mapping between the two sites
– Create Replications, Protection Groups, and Recovery Plans
the VMware Compatibility matrices before installing the VMware components
Network Services: All the components should point to the same DNS and NTP servers to avoid configuration drift.
Management Traffic: Isolate the Management or System Traffic from the Virtual Machine Network Traffic.
Database Servers: Use separate Database Server instances for vCenter and Site Recovery Manager.
Network Configuration: Avoid asymmetric network configurations in your Datacenter.
VPN Tunnel Configuration: If your NSX Edge appliance is behind a firewall, you must configure the following firewall rules to forward IPsec VPN protocol traffic:
– Open UDP Port 500 to allow Internet Security Association and Key Management Protocol (ISAKMP) traffic to be forwarded through the firewall
– Set IP protocol ID 50 to allow IPsec Encapsulating Security Payload (ESP) traffic to be forwarded through the firewall
– Set IP protocol ID 51 to allow Authentication Header (AH) traffic to be forwarded through the firewall
Site Recovery Manager (SRM) Configuration: After creating the Site Pair between the SRM instances On-Prem and in VMC, create the resource mapping such that you still have access to the VMC SRM instance and your applications in case of a disaster.
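The three firewall requirements listed under VPN Tunnel Configuration can be kept as a small checklist and compared against whatever the upstream firewall currently allows. The sketch below is a plain-Python illustration; the tuple representation of a rule and the function name are assumptions made for the example, and it does not query any real firewall or NSX API.

```python
# Required pass-through rules from the slide: (match type, number) -> traffic.
REQUIRED_RULES = {
    ("udp-port", 500): "ISAKMP",
    ("ip-protocol", 50): "ESP",
    ("ip-protocol", 51): "AH",
}


def missing_rules(allowed: set) -> list:
    """Return the names of required IPsec traffic types not yet allowed."""
    return sorted(
        name for rule, name in REQUIRED_RULES.items() if rule not in allowed
    )


# Hypothetical firewall state: ISAKMP and ESP are forwarded, AH is not.
current = {("udp-port", 500), ("ip-protocol", 50)}
print("Missing:", missing_rules(current))

# After adding the AH rule, nothing is missing.
current.add(("ip-protocol", 51))
print("Missing:", missing_rules(current))
```

Note that protocol IDs 50 and 51 are IP protocol numbers, not TCP or UDP ports, which is why the checklist keeps the match type alongside the number.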