Slide 1

Slide 1 text

How to Build a Digital Bank Using AWS
How Monzo processes payments and operates a bank in the cloud
Chris Evans, Platform Team Lead
Suhail Patel, Platform Engineer
© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

We’re part of the Platform Team at Monzo Bank in the UK
• Platform Team Lead: @evnsio
• Senior Platform Engineer: @suhailpatel

Slide 7

Slide 7 text

• Use your card at a store or online
• One of our DCs gets a process message
• Messages get passed through AWS Direct Connect
• Messages go through dozens of microservices running on Kubernetes
• Services use Cassandra for persistent data storage
• Services leverage etcd for distributed locking and coordination
• NSQ is used for unordered queuing and event publishing
• Kafka is used in microservices for ordered queuing
• Prometheus is used for monitoring and alerting

Diagram labels: AWS Cloud, VPC, Auto Scaling groups for K8s Worker Instances, Cassandra Cluster, etcd Cluster, Prometheus Cluster, NSQ Cluster.

Slide 8

Slide 8 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

• Encrypted messages are transferred via AWS Direct Connect
• Pods running in Kubernetes on EC2 receive messages and process them

Diagram labels: Mastercard, Monzo, AWS Cloud, VPC, Auto Scaling group, K8s Worker Instance (×3).

Slide 12

Slide 12 text

• The response is sent back to our servers to relay to Mastercard
• Messages are processed through a myriad of microservices before the response is sent back

Diagram labels: Mastercard, Monzo, AWS Cloud, VPC, Auto Scaling group, K8s Worker Instance (×3).

Slide 13

Slide 13 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 14

Slide 14 text

Diagram labels: AWS Cloud, VPC, K8s Worker Auto Scaling group, K8s Worker Instance (×10).

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

• Engineers trigger a deployment using a local tool called Shipper
• Deployment Service validates code, does static analysis and builds a container image
• Rolling deployments are invoked and completed via Kubernetes

Diagram labels: Kubernetes Cluster.

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

Service Communication

Diagram labels: Kubernetes Worker on EC2 (×2), service.transaction, service.account (×2).

Slide 20

Slide 20 text

Service Discovery

Diagram labels: Kubernetes Worker on EC2 (×3), service.transaction, service.account (×3), service.balance, service.pot, envoy config provider, Kubernetes Master on EC2, K8s API Server.

Slide 21

Slide 21 text

Service Communication

• Service Discovery and Routing
• Retries / Timeouts / Circuit Breaking
• Observability

Diagram labels: Kubernetes Worker on EC2 (×2), service.transaction, service.account (×3), service.balance, service.pot.
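The slide lists circuit breaking alongside retries and timeouts without showing how it works. The following is a minimal Python sketch of the general idea only, not the mechanism Monzo uses. The class name `CircuitBreaker`, its parameters, and the injectable `clock` are all hypothetical, introduced here for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical circuit breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock              # injectable so the logic can be tested
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, call rejected")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(fetch_account, account_id)`, where `fetch_account` is a hypothetical function. While the circuit is open, requests fail fast instead of piling up behind a struggling downstream service.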

Slide 22

Slide 22 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 23

Slide 23 text

Storing Data service.transaction Kubernetes Worker txn_00000123456

Slide 24

Slide 24 text

Storing Data Cassandra Ring service.transaction Kubernetes Worker txn_00000123456 Replication factor: 3 Quorum: Local
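To make the "Replication factor: 3" and "Quorum: Local" labels concrete, here is a short worked calculation in Python. It assumes that "Quorum: Local" refers to Cassandra's LOCAL_QUORUM consistency level, where a quorum is a strict majority of the replicas in the local data centre. The function names are illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must acknowledge a request: a strict majority."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


rf = 3
print(quorum(rf))              # 2
print(tolerated_failures(rf))  # 1
```

With a replication factor of 3, two replicas must respond, so one replica can be down without failing the request. Because a quorum read and a quorum write each touch two of the three replicas (2 + 2 > 3), they always overlap on at least one replica.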

Slide 25

Slide 25 text

Storing Data service.transaction Kubernetes Worker txn_00000123456 Replication factor: 3 Quorum: Local

Slide 26

Slide 26 text

Storing Data service.transaction Kubernetes Worker txn_00000123456 Replication factor: 3 Quorum: One

Slide 27

Slide 27 text

Storing Data service.transaction Kubernetes Worker txn_00000123456 Replication factor: 3 Quorum: Local

Slide 28

Slide 28 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 29

Slide 29 text

Asynchronous Messaging

Many things can happen asynchronously rather than through a direct, blocking RPC. Message queues like NSQ and Kafka provide asynchronous flows with at-least-once message delivery semantics.

Diagram labels: Kubernetes Worker (×3), service.transaction (×2), service.balance, service.pot, service.txn-enrichment, Kafka, NSQ (×2), Auto-scaling Group.
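At-least-once delivery means a consumer may see the same message more than once, so handlers generally need to be idempotent. The sketch below shows one common way to do that, by remembering processed message IDs. It is an illustration only: `IdempotentConsumer`, the message IDs, and the in-memory `seen_ids` set are assumptions made here, and a real service would need durable storage for the deduplication state.

```python
class IdempotentConsumer:
    """Hypothetical consumer that tolerates redelivered messages."""

    def __init__(self, handler):
        self.handler = handler
        self.seen_ids = set()  # illustration only; not durable across restarts

    def consume(self, message_id, payload):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self.seen_ids:
            return False
        self.handler(payload)
        # Record the ID only after the handler succeeds, so a failure in the
        # handler leads to a retry rather than a silently dropped message.
        self.seen_ids.add(message_id)
        return True


balances = {"acc_1": 0}


def apply_credit(payload):
    balances[payload["account"]] += payload["amount"]


consumer = IdempotentConsumer(apply_credit)
message = ("msg_1", {"account": "acc_1", "amount": 500})
consumer.consume(*message)  # first delivery: applied
consumer.consume(*message)  # redelivery: ignored
print(balances["acc_1"])    # 500
```

The second delivery of `msg_1` is skipped, so the balance is credited once even though the queue delivered the message twice.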

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 32

Slide 32 text

Distributed Locking service.transaction txn_00000123456
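The slide shows a service taking a lock keyed on a transaction ID. As a rough illustration of lease-based locking, here is a single-process Python sketch. It is not an etcd client and does not provide distributed guarantees; `LeaseLockTable`, its methods, and the TTL value are hypothetical. A real deployment would rely on the lock service's own leases and atomic compare-and-set operations.

```python
import time
import uuid


class LeaseLockTable:
    """In-memory stand-in for a lock service; not a real etcd client."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock  # injectable so expiry can be tested
        self.leases = {}    # key -> (owner token, expiry time)

    def acquire(self, key, ttl):
        """Return an owner token if the lock was taken, else None."""
        now = self.clock()
        holder = self.leases.get(key)
        if holder is not None and holder[1] > now:
            return None  # another owner holds an unexpired lease
        token = uuid.uuid4().hex
        self.leases[key] = (token, now + ttl)
        return token

    def release(self, key, token):
        """Release only if the caller still owns the lease."""
        holder = self.leases.get(key)
        if holder is not None and holder[0] == token:
            del self.leases[key]
            return True
        return False


locks = LeaseLockTable()
token = locks.acquire("txn_00000123456", ttl=5.0)
if token is not None:
    try:
        pass  # exclusive work on the transaction would go here
    finally:
        locks.release("txn_00000123456", token)
```

The TTL matters because a holder that crashes never releases its lock; the lease expiring lets another worker take over. The owner token stops a slow former holder from releasing a lock that has since passed to someone else.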

Slide 33

Slide 33 text

Distributed Locking

Slide 34

Slide 34 text

• Gateways to payment networks
• Storing the bank’s data in Cassandra
• Distributed locks with etcd
• Asynchronous Processing with NSQ/Kafka
• Our microservice platform on Kubernetes
• Metrics and alerting with Prometheus

Slide 35

Slide 35 text

Prometheus and Thanos

Prometheus is a flexible time-series data store and query engine. Thanos allows us to treat many Prometheus servers as a single one, with unlimited retention.

• RPC request/response cycles
• CPU / memory / network use
• Asynchronous processing
• C* (Cassandra) and distributed locking
• CloudWatch data
• Social media
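To give a feel for what an RPC metric might look like when Prometheus scrapes it, here is a small Python sketch of a labelled counter rendered in the style of the Prometheus text exposition format. It deliberately avoids the official client library to stay self-contained. The metric name `rpc_requests_total` and the label names are assumptions for illustration, not names taken from the talk.

```python
from collections import defaultdict


class LabelledCounter:
    """Tiny stand-in for a Prometheus counter; not the official client library."""

    def __init__(self, name):
        self.name = name
        self.values = defaultdict(int)  # sorted label pairs -> count

    def inc(self, **labels):
        self.values[tuple(sorted(labels.items()))] += 1

    def render(self):
        """Render one sample line per label set, sorted for stable output."""
        lines = []
        for labels, value in sorted(self.values.items()):
            label_text = ",".join(f'{k}="{v}"' for k, v in labels)
            lines.append(f"{self.name}{{{label_text}}} {value}")
        return "\n".join(lines)


requests = LabelledCounter("rpc_requests_total")
requests.inc(service="service.account", code="200")
requests.inc(service="service.account", code="200")
requests.inc(service="service.account", code="500")
print(requests.render())
```

Each distinct combination of labels becomes its own time series, which is what makes it possible to query request counts per service and per response code.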

Slide 36

Slide 36 text

A Global View of Monzo

Diagram labels: Kubernetes Worker on EC2 (×4), service.account (×3), Prometheus Services, Prometheus Infra, Prometheus Cassandra, Thanos Query, Thanos Store.

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

• Use your card at a store or online
• One of our DCs gets a process message
• Messages get passed through AWS Direct Connect
• Over a VPN tunnel terminating in our K8s cluster
• Messages go through dozens of microservices to process the message
• Data will be written to Cassandra to record what’s happened
• Cassandra will replicate the data to multiple nodes
• Some services will use distributed locks for exclusive processing

Diagram labels: AWS Cloud, VPC, Auto Scaling groups for K8s Worker Instances, Cassandra Cluster, etcd Cluster, Prometheus Cluster, NSQ Cluster.

Slide 39

Slide 39 text

• We approve the transaction
• We publish an event about the transaction
• We return the approval message

Diagram labels: AWS Cloud, VPC, Auto Scaling groups for K8s Worker Instances, Cassandra Cluster, etcd Cluster, Prometheus Cluster, NSQ Cluster.

Slide 40

Slide 40 text

• The event is consumed
• A push notification is sent

Diagram labels: AWS Cloud, VPC, Auto Scaling groups for K8s Worker Instances, Cassandra Cluster, etcd Cluster, Prometheus Cluster, NSQ Cluster.

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

AWS Cloud – eu-west-1
Diagram labels: VPC, Auto Scaling groups for K8s Worker Instances (×2), Cassandra Cluster (×2), etcd Cluster (×2).

AWS Cloud – us-east-2 or us-west-1 (TBD)
Diagram labels: VPC, Auto Scaling groups for K8s Worker Instances (×2), Cassandra Cluster (×2), etcd Cluster (×2).

Slide 44

Slide 44 text

Thank you!
Christopher Evans - @evnsio
Suhail Patel - @suhailpatel