Scaling infrastructure beyond containers

Slide 1

Slide 1 text

@wendigo Mateusz „Serafin” Gajewski • AWS UG Meetup Scaling infrastructure beyond containers

Slide 2

Slide 2 text

Agenda
• Evolution of infrastructure at Allegro
• Why Apache Mesos™?
• Apache Mesos key concepts
• Future of datacenter and cloud computing?

Slide 3

Slide 3 text

History of scaling infrastructure @ Allegro

Slide 4

Slide 4 text

Infrastructure 1.0

Slide 5

Slide 5 text

Job allocation problem

Slide 6

Slide 6 text

Web Scale Resource management 100s dots 100s dots

Slide 7

Slide 7 text

Infrastructure 2.0 1000s dots another 1000s dots

Slide 8

Slide 8 text

Infrastructure 2.1 1000s dots another 1000s dots

Slide 9

Slide 9 text

Challenges
• cloud not used as cloud ;)
• high cost of virtualization
• effective resource utilization
• microservice architecture
• spread of new technologies
• heterogeneous resources
• scalability, fault tolerance & HA
• performance isolation
• data processing at scale

Slide 10

Slide 10 text

Beyond cloud computing

Slide 11

Slide 11 text

Holy Grail of TCO

Slide 12

Slide 12 text

Infrastructure 3.0 A Platform for Fine-Grained Resource Sharing in the Data Center

Slide 13

Slide 13 text

Scheduling

Slide 14

Slide 14 text

Cluster scheduling

Slide 15

Slide 15 text

Mesos architecture

Slide 16

Slide 16 text

Mesos frameworks

Slide 17

Slide 17 text

Offers

Slide 18

Slide 18 text

Execution isolation

Slide 19

Slide 19 text

External Containerizers

Slide 20

Slide 20 text

Mesos HA
• master election/failover with ZooKeeper
• master maintains soft-state
• framework state reconciliation
• slave checkpointing
• slave recovery
• framework checkpointing
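
The "framework state reconciliation" item can be illustrated with a small sketch. It assumes a simplified model in which the framework and the master each hold a map of task id to state. The function name `reconcile` and the sample task ids are hypothetical; only the state names (TASK_RUNNING, TASK_FINISHED, TASK_STAGING, TASK_LOST) come from Mesos. This is not the Mesos scheduler API.

```python
def reconcile(framework_view, master_view):
    """Return the state updates the framework needs to match the master.

    Assumption of this sketch: a task the master does not know about
    is reported back as TASK_LOST.
    """
    updates = {}
    for task_id, state in framework_view.items():
        actual = master_view.get(task_id, "TASK_LOST")
        if actual != state:
            updates[task_id] = actual
    return updates


# Hypothetical views after a master failover.
framework_view = {
    "t1": "TASK_RUNNING",
    "t2": "TASK_RUNNING",
    "t3": "TASK_STAGING",
}
master_view = {
    "t1": "TASK_RUNNING",
    "t2": "TASK_FINISHED",
}

updates = reconcile(framework_view, master_view)
print(updates)
```

Because the master keeps only soft state, a framework that asks again after a failover can converge on the real task states without the master persisting them itself.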

Slide 21

Slide 21 text

Beyond offers
• offer filters (constraints)
• static (pre-startup) reservations
• dynamic (post-startup) reservations
• oversubscription
• persistent volumes
• pluggable allocator scheduling policy (fair, priority based)
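
One reading of the "offer filters (constraints)" item is that a framework accepts an offer only if it has enough resources and carries the attributes a task requires. The sketch below assumes that reading. `Offer`, `Task`, `accepts`, and `place` are hypothetical names for illustration, not Mesos types, and the agent names and attributes are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    """Hypothetical, simplified stand-in for a resource offer."""
    agent: str
    cpus: float
    mem_mb: int
    attributes: dict = field(default_factory=dict)


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int
    constraints: dict = field(default_factory=dict)  # attribute -> required value


def accepts(task, offer):
    """True if the offer covers the task's resources and every constraint."""
    fits = offer.cpus >= task.cpus and offer.mem_mb >= task.mem_mb
    matches = all(offer.attributes.get(k) == v for k, v in task.constraints.items())
    return fits and matches


def place(task, offers):
    """Return the first acceptable offer, or None when all are declined."""
    for offer in offers:
        if accepts(task, offer):
            return offer
    return None


offers = [
    Offer("agent-1", cpus=4.0, mem_mb=8192, attributes={"rack": "a"}),
    Offer("agent-2", cpus=2.0, mem_mb=4096, attributes={"rack": "b", "disk": "ssd"}),
]
task = Task("db", cpus=1.0, mem_mb=2048, constraints={"disk": "ssd"})

chosen = place(task, offers)
print(chosen.agent if chosen else "declined all offers")
```

Here agent-1 has more free resources but lacks the `disk` attribute, so the task lands on agent-2.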

Slide 22

Slide 22 text

Mesos frameworks

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

Mesos recap
• “programming against the datacenter”
• distributed datacenter kernel
• two-level multi-resource scheduler
• scalable, highly-available & fault-tolerant
• performance isolation with containers
• exposes homogeneous resources
• elastic, dynamic partitioning
• high resource utilization
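
The "two-level multi-resource scheduler" item can be sketched as two separate decisions. At level one, an allocator picks which framework receives the next offer. This sketch assumes a dominant-share rule in the spirit of a fair allocation policy. At level two, the framework itself decides whether to accept. The cluster size, framework names, and function names are all hypothetical.

```python
# Hypothetical cluster capacity.
TOTAL = {"cpus": 16.0, "mem": 64.0}


def dominant_share(allocated, total=TOTAL):
    """Largest fraction of any single resource a framework currently holds."""
    return max(allocated[r] / total[r] for r in total)


def next_framework(allocations):
    """Level 1 (allocator): offer to the framework with the smallest dominant share."""
    return min(allocations, key=lambda name: dominant_share(allocations[name]))


def accept(offer, demand):
    """Level 2 (framework): accept only if the offer covers one task's demand."""
    return all(offer[r] >= demand[r] for r in demand)


allocations = {
    "framework-a": {"cpus": 8.0, "mem": 8.0},   # dominant share 0.5 (cpus)
    "framework-b": {"cpus": 2.0, "mem": 16.0},  # dominant share 0.25 (mem)
}
demand = {
    "framework-a": {"cpus": 1.0, "mem": 1.0},
    "framework-b": {"cpus": 1.0, "mem": 4.0},
}
offer = {"cpus": 4.0, "mem": 8.0}

chosen = next_framework(allocations)
if accept(offer, demand[chosen]):
    for r in demand[chosen]:
        allocations[chosen][r] += demand[chosen][r]

print(chosen, allocations[chosen])
```

The allocator never needs to know what a task is. It only tracks shares, while placement logic stays inside each framework.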

Slide 25

Slide 25 text

Future Datacenter

Slide 26

Slide 26 text

IaaC

Slide 27

Slide 27 text

Efficient utilization

Slide 28

Slide 28 text

Google’s Omega (source: “Omega: flexible, scalable schedulers for large compute clusters”)

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

Questions?

Slide 31

Slide 31 text

http://meetup.com/allegro.tech http://allegro.tech @AllegroTechBlog Work with us