Slide 1

Slide 1 text

Many Layers of Availability
how to achieve so many nines in practice
99.99% 99.999%
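As a rough illustration of what these targets mean, the sketch below converts an availability percentage into a yearly downtime budget. It assumes a 365-day year, and the function name is illustrative rather than taken from the talk.

```python
# Yearly downtime budget for a given availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return how many minutes per year the service may be unavailable."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for target in (99.99, 99.999):
    budget = downtime_minutes_per_year(target)
    print(f"{target}% allows about {budget:.1f} minutes of downtime per year")
```

Under that assumption, 99.99% leaves roughly 52.6 minutes of downtime per year, and 99.999% roughly 5.3 minutes.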

Slide 2

Slide 2 text

1. Migrate to microservices
2. ???
3. PROFIT

Slide 3

Slide 3 text

Ilya Kaznacheev, PhD
Cloud-Native App Architect
Ex-GDE on Google Cloud
12 years in software engineering

Slide 4

Slide 4 text

So, what about the nines?

Slide 5

Slide 5 text

Application level 1

Slide 6

Slide 6 text

Architecture
• independent modules
• avoid single points of failure

Slide 7

Slide 7 text

Redundancy
• component replication
• concurrent computing
• resource scaling
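One way to see why component replication helps is the textbook formula for components in parallel: the system is down only when every replica is down. The sketch below assumes replicas fail independently, which real deployments only approximate. The function name and the 99% figure are illustrative.

```python
def replicated_availability(single: float, replicas: int) -> float:
    """Availability of N replicas in parallel, assuming independent failures.

    The group is unavailable only when all replicas are unavailable at once.
    """
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return 1 - (1 - single) ** replicas


# Illustrative numbers: a component that is up 99% of the time.
for n in (1, 2, 3):
    print(n, "replica(s):", round(replicated_availability(0.99, n), 6))
```

With these illustrative numbers, two replicas of a 99% component reach about 99.99%, and three reach about 99.9999%. Correlated failures (shared rack, shared deploy, shared bug) reduce the gain.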

Slide 8

Slide 8 text

Load Balancing
• request distribution
• smart balancing
• dynamic scaling
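A minimal sketch of request distribution: round-robin over a set of backends, skipping any that are marked unhealthy. The class and method names are hypothetical. Production balancers add weights, connection counts, and active health probes.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Round-robin over backends, skipping those currently marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def next_backend(self):
        if not self._healthy:
            raise RuntimeError("no healthy backends available")
        # At least one backend is healthy, so this loop terminates.
        for backend in self._ring:
            if backend in self._healthy:
                return backend


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")
print([balancer.next_backend() for _ in range(4)])
```

With "app-2" marked down, requests alternate between the two remaining backends until it is marked up again.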

Slide 9

Slide 9 text

Reducing Dependencies
• interaction over APIs
• eliminating implicit dependencies
• consolidating features

Slide 10

Slide 10 text

Sync vs. Async
• message-based
• event-based
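To make the message-based style concrete, here is a small sketch using an in-process queue as a stand-in for a message broker. The producer enqueues work and moves on, and a worker handles messages on its own schedule. The names and the upper-casing "work" are placeholders, not anything from the talk.

```python
import queue
import threading


def start_worker(inbox, handled):
    """Start a background consumer that processes messages until it sees None."""

    def run():
        while True:
            message = inbox.get()
            try:
                if message is None:  # sentinel: stop consuming
                    break
                handled.append(message.upper())  # placeholder for real work
            finally:
                inbox.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


inbox = queue.Queue()
handled = []
worker = start_worker(inbox, handled)

# The producer does not wait for each message to be processed.
for order in ("order-1", "order-2"):
    inbox.put(order)

inbox.put(None)  # ask the worker to stop
inbox.join()     # wait until everything queued has been handled
worker.join()
print(handled)
```

The availability benefit is that the producer keeps accepting work even if the consumer is slow or briefly down, as long as the queue itself stays available.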

Slide 11

Slide 11 text

application

Slide 12

Slide 12 text

Platform level 2

Slide 13

Slide 13 text

Clusters
• preventing outage
• load distribution
• network replication

Slide 14

Slide 14 text

Orchestration
• lifecycle control
• autoscaling
• anti-affinity

Slide 15

Slide 15 text

Self-Healing
• machine replacement
• health checks (ext)
• health monitoring (int)
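The self-healing loop can be sketched as one reconciliation pass: probe each instance and replace the ones that fail the check. This is a toy model under stated assumptions. `Instance`, `heal`, and the `probe` and `new_name` callables are hypothetical stand-ins for what an orchestrator or instance group manager would provide.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Instance:
    name: str
    healthy: bool = True


def heal(pool, probe, new_name):
    """Replace every instance that fails the probe; return the replaced names."""
    replaced = []
    for index, instance in enumerate(pool):
        if not probe(instance):
            replaced.append(instance.name)
            pool[index] = Instance(new_name())  # fresh instance takes the slot
    return replaced


ids = count(3)
pool = [Instance("vm-1"), Instance("vm-2", healthy=False)]
replaced = heal(pool, probe=lambda i: i.healthy, new_name=lambda: f"vm-{next(ids)}")
print(replaced, [i.name for i in pool])
```

In a real platform this pass would run continuously, and the probe would be an external health check (for example an HTTP endpoint) rather than a field on the object.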

Slide 16

Slide 16 text

application platform

Slide 17

Slide 17 text

Data level 3

Slide 18

Slide 18 text

Redundancy
• database replicas
• msg broker replicas
• backups

Slide 19

Slide 19 text

Caching
• application caching
• database caching
• standalone caching
• content delivery network
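One way application caching can support availability, sketched below, is to serve a stale cached value when the backing store is unreachable. This is an illustrative pattern, not necessarily what the talk had in mind. The class name, `loader`, and `clock` parameters are hypothetical.

```python
import time


class StaleOnErrorCache:
    """Cache-aside with a TTL that falls back to stale data if the loader fails."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, stored_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]  # fresh hit
        try:
            value = self._loader(key)
        except Exception:
            if entry is not None:
                return entry[0]  # source is down: serve the stale value
            raise  # nothing cached, so the failure is surfaced
        self._entries[key] = (value, now)
        return value
```

Serving stale data trades consistency for availability, so it only suits data where a slightly outdated answer is acceptable.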

Slide 20

Slide 20 text

application platform data

Slide 21

Slide 21 text

Infrastructure level 4

Slide 22

Slide 22 text

Compute
• hot migration
• instance groups

Slide 23

Slide 23 text

Networking
• software-defined network
• load balancing

Slide 24

Slide 24 text

Disk
• replicated storage
• network file system

Slide 25

Slide 25 text

application platform data infrastructure

Slide 26

Slide 26 text

Hardware level 5

Slide 27

Slide 27 text

Hardware
• RAID
• rack distribution
• internet channel backup

Slide 28

Slide 28 text

application platform data infrastructure hardware

Slide 29

Slide 29 text

Global Availability level 0

Slide 30

Slide 30 text

Cloud and DC
• multi DC disaster recovery
• cloud multizone regions

Slide 31

Slide 31 text

Global Server Load Balancing
• BGP Anycast
• multiregion load balancing

Slide 32

Slide 32 text

application platform data infrastructure hardware global availability

Slide 33

Slide 33 text

Summing Up
• availability is made up of many levels
• each level must cooperate with other levels
• need to consider all levels when developing HA systems
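A back-of-the-envelope way to see why every level matters: if a request needs all levels to work, and the levels fail independently, the overall availability is the product of the per-level figures. Both conditions are simplifying assumptions, and the numbers below are illustrative, not measurements from the talk.

```python
import math


def serial_availability(levels):
    """Overall availability when every level is required (independent failures)."""
    return math.prod(levels)


# Illustrative: five levels, each at 99.99%.
overall = serial_availability([0.9999] * 5)
print(f"{overall:.6f}")
```

Under these assumptions, five levels at 99.99% each yield about 99.95% overall, so the weakest level caps what the whole system can achieve.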

Slide 34

Slide 34 text

🤗

Slide 35

Slide 35 text

Thank You!