Real World Microservices

Christoph Leiter
Vienna Microservices Meetup, April 6th 2017

Agenda
1. Introduction
2. Microservices
3. Creating the infrastructure
4. Implementation of microservices
5. Lessons learned

Introduction

starjack
- A platform for buying ski tickets online

starjack II
- Customers sign up and order their personal keycard
- Tickets of various lift operators can be booked and are available within seconds
- No more standing in line for a ticket

starjack III
- Created with microservices from the ground up
- Everything is a REST interface
- Frontend uses ES6 and React/Redux
- 100% Open Source components
- Development infrastructure
  - Self-hosted GitLab (git, issues, Docker registry)
  - Jenkins: builds are executed on DigitalOcean and pushed to the Docker registry

Microservices

Why Microservices
- Architectural choice with advantages and drawbacks
  - Advantages: scalability, reliability, isolation
  - Drawback: operational complexity
- starjack communicates with many different 3rd party systems
  - Uses multiple protocols like REST and SOAP
- Used to isolate services from each other
  - If one service has problems or is down, not everything is affected
  - Should one service get compromised, an attacker does not get access to all data
- Reduces complexity of individual services
  - It’s easier to manage 12 services with 2k LOC than 1 service with 24k LOC
  - When you change one system you can only break so much

Our Microservices I
- starjack-auth: Holds user data and handles logins
- starjack-dta: Gets tickets from lift operators using the DTA interface from SkiData AG
- starjack-axess: Gets tickets from lift operators using the Axess interface from Axess AG
- starjack-liftoperator: Manages our lift operators and works as a facade for dta and axess
- starjack-order: Verifies and processes customer orders. Creates invoices and sends emails
- starjack-payment: Handles payments with the Mpay24 PSP

Our Microservices II
- starjack-keycards: Orders new keycards from a 3rd party supplier and updates their status once produced
- starjack-weather: Retrieves current weather for lift operators, currently using OpenWeatherMap
- starjack-maps: Used to get map locations for lift operators and for travel duration estimation. Uses Google Maps
- starjack-faq: Used to manage FAQ entries
- starjack-mail: Sends mails using the Mailgun mail service
- As the system grows we will add more services instead of growing one monolithic system without bounds

Our Microservices III

starjack Deployment
- starjack uses AWS as its deployment platform
  - Offers tons of services, everything is fully automatable
  - Very high reliability possible if your architecture supports it
  - Great security properties because you get your own software-defined network
- Everything is deployed in three availability zones
- We use many AWS services: EC2, S3, CloudFront, RDS, ElastiCache, SQS, Route53, CloudWatch, ...

Deployment II
- Logical view on AWS (figure)

Deployment III
- Deployment view on AWS (figure)

Creating the Infrastructure

The Basics
- So, what do we need to get started with a microservices architecture?
- Very good, fully automated infrastructure management is key, see Martin Fowler’s “You need to be this tall to use microservices”
- Fowler says you need:
  - rapid provisioning of servers
  - very good monitoring and logging infrastructure
  - automated deployment
- Bonus points if you can programmatically recreate your infrastructure from scratch

Infrastructure as Code I
- Terraform allows you to specify your whole infrastructure as simple HCL files
- Supports AWS, Google Cloud, Mailgun and dozens of other services
- When you run Terraform it compares your current state with the desired state and applies the needed changes
- Creates a dependency graph between your resources and modifies them in the right order
- No more clicking around in the AWS web console
- Every change is documented and versioned, manual changes will be reverted on the next run
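The compare-and-apply idea can be illustrated with a small sketch. This is not how Terraform is implemented; it is a toy model under stated assumptions: resources are plain dictionaries, and the names `plan`, `current`, `desired` and `depends_on` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan(current, desired, depends_on):
    """Return ordered (action, resource) steps that turn `current` into `desired`.

    current / desired: dict mapping resource name -> configuration dict
    depends_on: dict mapping resource name -> set of names it depends on
    """
    # Build the dependency graph for everything we want to exist.
    graph = {name: set(depends_on.get(name, ())) for name in desired}

    steps = []
    # static_order() yields dependencies before the resources that need them.
    for name in TopologicalSorter(graph).static_order():
        if name not in desired:
            continue  # dependency that is not managed here
        if name not in current:
            steps.append(("create", name))
        elif current[name] != desired[name]:
            # Drift, e.g. a manual change in the web console, gets reverted.
            steps.append(("update", name))

    # Anything that exists but is no longer described gets removed.
    # (Simplification: a real tool would also order deletions by dependency.)
    for name in sorted(set(current) - set(desired)):
        steps.append(("destroy", name))
    return steps


current = {"vpc": {"cidr": "10.0.0.0/16"}, "old-bucket": {}}
desired = {
    "vpc": {"cidr": "10.0.0.0/16"},
    "subnet": {"cidr": "10.0.1.0/24"},
    "instance": {"type": "t2.micro"},
}
depends_on = {"subnet": {"vpc"}, "instance": {"subnet"}}

for action, name in plan(current, desired, depends_on):
    print(action, name)
```

The point of the sketch is only the shape of the workflow: describe the desired state, diff it against reality, and apply the changes in dependency order.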

Infrastructure as Code II

Infrastructure as Code III

Infrastructure as Code IV
- We use Terraform for the whole basic infrastructure
  - VPC, firewall rules
  - EC2 instances, ELB, S3, CloudFront, Route 53, SQS
  - RDS cluster, Redis cluster
  - Mailgun
- Very easy to use and works really well

Server Provisioning
- Now we have our bare EC2 instances running and we need to install some software on them
- Terraform is not a provisioner, so we need another tool to automate that
- We chose Ansible to provision our servers
  - Works over SSH and has no requirements on the clients besides Python
  - We tag our instances by role with Terraform. The inventory file is generated automatically with ec2.py
- Configures EC2 instances, creates databases and users, defines the DigitalOcean Jenkins slave, ...

Service Scheduling
- We use a cluster as our microservice deployment platform. A scheduler is needed to decide where in your cluster your services should run.
- We chose Nomad
  - Easy to get started with compared to other schedulers
  - Just one binary to install
  - Relatively new and not as mature as other solutions
  - Requires three servers, which should be deployed to different availability zones

Service Discovery
- Now that our services are deployed somewhere we need to find them
- That’s the job of a service discovery service
- We chose Consul because it plays well together with Nomad
- Whenever a new service is deployed with Nomad it registers its endpoints in Consul
- A tool called consul-template updates the nginx configuration file as soon as changes happen and reloads nginx
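What the template step boils down to can be sketched in a few lines. This is not consul-template itself (which uses its own template language and talks to Consul directly); it is a hypothetical stand-in where the registered endpoints are passed in as a dictionary. The function names and the example addresses are made up for illustration.

```python
def render_upstreams(services):
    """Render nginx upstream blocks from registered endpoints.

    services: dict mapping service name -> list of (host, port) tuples
    """
    blocks = []
    for name in sorted(services):
        endpoints = sorted(services[name])
        if not endpoints:
            continue  # nginx rejects an upstream block without servers
        lines = [f"upstream {name} {{"]
        for host, port in endpoints:
            lines.append(f"    server {host}:{port};")
        lines.append("}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def update_config(services, previous_config):
    """Return the new config and whether nginx needs a reload."""
    new_config = render_upstreams(services)
    return new_config, new_config != previous_config


registered = {
    "starjack-auth": [("10.0.1.12", 23001), ("10.0.2.7", 24518)],
    "starjack-faq": [("10.0.3.4", 21077)],
}
config, needs_reload = update_config(registered, previous_config="")
print(config)
```

Reloading only when the rendered output actually changed keeps nginx stable while instances come and go.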

Deployment I
- Nomad needs a job description file to know what to schedule where
- Nginx needs to be configured so it knows which endpoints should be routed to which services
- We deploy our services using our small custom YAML DSL which is processed by Ansible

Deployment II

Deployment III
Deployment steps
1. Modify the services DSL file
2. Trigger the deployment with Ansible
3. Ansible creates Nomad job specification files and triggers scheduling
4. Nomad does a rolling update of the services
5. Nomad worker nodes pull new Docker images and start them
6. As new versions are rolled out, Consul and Nginx get updated
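Step 3 above can be pictured as a small translation from a service description to a Nomad job. The sketch below is an assumption in several ways: the talk's DSL is YAML processed by Ansible and its fields are not shown, so the `service` dictionary and its keys are hypothetical, and the registry host is a placeholder. The output keys follow Nomad's JSON job format as far as I know it; check them against the Nomad documentation for your version.

```python
import json


def nomad_job(service):
    """Build a Nomad job (JSON form) for one Docker-based service."""
    name = service["name"]
    return {
        "Job": {
            "ID": name,
            "Name": name,
            "Type": "service",
            "Datacenters": service["datacenters"],
            # Replace one instance at a time, i.e. a rolling update.
            "Update": {"MaxParallel": 1},
            "TaskGroups": [
                {
                    "Name": name,
                    "Count": service["count"],
                    "Tasks": [
                        {
                            "Name": name,
                            "Driver": "docker",
                            "Config": {
                                "image": f'{service["image"]}:{service["version"]}'
                            },
                        }
                    ],
                }
            ],
        }
    }


# Hypothetical DSL entry; deploying a new version means bumping "version".
service = {
    "name": "starjack-faq",
    "image": "registry.example.com/starjack-faq",
    "version": "1.4.2",
    "count": 3,
    "datacenters": ["dc1"],
}
print(json.dumps(nomad_job(service), indent=2))
```

Keeping the per-service description this small is what makes a custom DSL attractive: everything repetitive lives in the generator.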

Infrastructure Components
- In summary, we have these infrastructure components (figure)

Implementation of Microservices

Microservices Details
- Microservices are written in Kotlin and are based on Spring Boot and Hibernate
- Each service has its own git repository
- One common library which services can use
  - Be careful not to introduce unwanted dependencies between services!
  - Treat it as an API and don’t break it
  - Only used for cross-cutting concerns
- Communication between microservices uses REST for synchronous and a queue for asynchronous communication
- Not covered in detail, your implementation will be different anyway :)

Authentication
- User requests need to be authenticated
- We don’t want to query the authentication service for every request
  - Would create a lot of load and a potential bottleneck
- Instead we use JSON Web Tokens (JWT)
  - The authentication service creates a cryptographically signed token for the user using its private key
  - Services have the public key and can check whether the token is legitimate
  - Since there’s no invalidation of a token we use a low TTL and a refresh mechanism
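The token check can be sketched with the standard library alone. One important substitution: the talk describes asymmetric signing (private key in the auth service, public key in the other services), but Python's standard library has no RSA or ECDSA, so this sketch uses HMAC-SHA256 (the JWT `HS256` algorithm) with a shared secret instead. It only illustrates the token structure and the TTL check; the function names are hypothetical, and a real service would use an asymmetric algorithm through a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    """Base64url-encode bytes without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    """Decode base64url text, restoring the stripped padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret, user_id, ttl_seconds, now=None):
    """Create a signed token that expires after ttl_seconds."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": user_id, "exp": int(now) + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + _b64url(signature.digest())


def verify_token(secret, token, now=None):
    """Return the payload if signature and expiry are valid, else None."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signature = _b64url_decode(signature_b64)
        # The algorithm is fixed here; the header's "alg" is never trusted.
        expected = hmac.new(
            secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
        ).digest()
        if not hmac.compare_digest(signature, expected):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        return None  # malformed token
    if payload["exp"] <= now:
        return None  # expired: the client has to use the refresh mechanism
    return payload


secret = b"demo-secret-do-not-use"
token = issue_token(secret, "user-42", ttl_seconds=300)
print(verify_token(secret, token))
```

No call to the authentication service is needed for verification, which is the property the slide is after; the short TTL bounds how long a token that should have been revoked stays usable.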

JWT

Interservice Communication
- If an immediate response is needed, communication is simply done via REST HTTP requests
  - Passes the authentication header if it’s required to identify the user
- For asynchronous communication we use a queue
  - Decouples services
  - Messages don’t get lost if the other service is down
  - Automatic retries
  - Increases reliability of your system
- Coordination between multiple instances of the same type is done with Redis

Logging I
- When you have dozens of services running you need a good centralized logging solution
- AWS offers CloudWatch Logs
  - Docker can natively log to CloudWatch
  - Every log message should be exactly one event (one line)
- We use awslogs as a “remote grep” tool for CloudWatch

Logging II
- Allows searching in multiple log groups
- Time-based restrictions with -s and -e

Logging III
- Logs are useless if nobody looks at them. You need a quick overview and notifications for errors
- We have an AWS Lambda function which subscribes to our application logs
  - Filters logs for events we are interested in
    - Errors and warnings
    - Whitelisted info events
  - Uses the Slack API to push log messages to Slack
    - Different channels based on severity and type
    - Adds Slack notifications for errors
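The filtering part of such a Lambda function might look like the sketch below. CloudWatch Logs delivers subscription events as base64-encoded, gzip-compressed JSON under `event["awslogs"]["data"]`; everything else here is an assumption: the log line format (level as a separate word), the channel names, and the whitelist entry are hypothetical, and the actual call to Slack is left out so the example stays self-contained.

```python
import base64
import gzip
import json

# Hypothetical whitelist of info messages worth seeing in Slack.
WHITELISTED_INFO = ("Order completed",)


def decode_subscription_event(event):
    """Unpack the payload CloudWatch Logs sends to a subscribed Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def channel_for(message):
    """Pick a Slack channel by severity, or None to drop the message."""
    if " ERROR " in message:
        return "#errors"
    if " WARN " in message:
        return "#warnings"
    if " INFO " in message and any(m in message for m in WHITELISTED_INFO):
        return "#info"
    return None


def handler(event, context):
    """Lambda entry point: return (channel, text) pairs to be posted."""
    payload = decode_subscription_event(event)
    notifications = []
    for log_event in payload.get("logEvents", []):
        message = log_event["message"].rstrip()
        channel = channel_for(message)
        if channel is not None:
            text = f"[{payload.get('logGroup', '?')}] {message}"
            notifications.append((channel, text))
    # A real function would now post each entry via the Slack API.
    return notifications
```

Routing by severity into separate channels means the error channel can have notifications enabled while the noisier channels stay muted.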

Logging IV

Monitoring
- We need external monitoring to notify us in case a system is down
- We expose simple status endpoints from our services and let StatusCake monitor those
- If a service is unreachable, StatusCake uses Pushover to send us a push notification

Lessons Learned

Log Everything
- When things fail you need to be able to debug the problem
- It helps to log as much as you can
  - Application logs
  - Web server access/error logs
  - Queue messages
  - Linux system logs
  - Frontend logs
- For application logs think about how you will search for messages later, i.e. include relevant data
- For system logs use blacklists for messages you are not interested in, and get notified for everything you don’t expect

Distributed Systems
- A distributed system might not behave exactly as you’d think, see “Fallacies of Distributed Computing”
- Even when something has a failure likelihood of << 1%, if it runs thousands of times it will eventually go wrong
- Expect that things will go wrong and make your system as robust as possible
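A quick calculation shows why rare failures still matter. This assumes every run fails independently with the same probability, which is a simplification; the numbers are made up for illustration.

```python
def probability_of_at_least_one_failure(failure_probability, runs):
    """Chance that at least one of `runs` independent runs fails."""
    return 1.0 - (1.0 - failure_probability) ** runs


# A call that fails only 0.1% of the time, executed 5000 times.
p = probability_of_at_least_one_failure(0.001, 5000)
print(f"{p:.1%}")
```

With these example numbers the chance of seeing at least one failure is above 99%, so retry and error handling paths will be exercised in production.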

DB Connections
- Usually DBs have a connection limit set
- Once the limit is reached you won’t get a new connection
- Think about how many connections you will have, it might be more than you think
- connections = services × instances × poolsize
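Plugging in numbers makes the formula concrete. The deck lists 11 services; the three instances per service, the pool size of 10, and the connection limit of 200 are hypothetical values chosen for illustration.

```python
def total_connections(services, instances_per_service, pool_size):
    """Worst case: every instance of every service fills its pool."""
    return services * instances_per_service * pool_size


needed = total_connections(services=11, instances_per_service=3, pool_size=10)
limit = 200  # hypothetical database connection limit

print(needed)          # 330
print(needed > limit)  # True: the pools can exhaust the database
```

A rolling update makes this worse for a moment, because old and new instances hold their pools at the same time.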

System Resources
- Really hard to know your services’ memory requirements beforehand
- Services will probably consume more resources than you assume, as they also need to keep their runtime environment in memory
- No memory sharing/deduplication if you use Docker
- Be careful if you enforce hard memory limits

Summary
- The biggest challenge when doing microservices isn’t programming microservices but the infrastructure
- Everything has to be automated. You need:
  - Automatic infrastructure setup: Terraform, CloudFormation, Heat, ...
  - A provisioner: Ansible, Puppet, Chef, ...
  - Automatic builds: Jenkins, GitLab, Travis CI, ...
  - A scheduler: Nomad, Kubernetes, Docker Swarm, ...
  - A discovery service: Consul, etcd, ZooKeeper, ...
  - An easy way to deploy services
  - Centralized logging: CloudWatch, ELK, Graylog, ...
  - Monitoring: StatusCake, Nagios, Sensu, ...

Summary II
- You only need to invest in infrastructure knowledge once
- Makes development and evolution of your system easier
- Dramatically reduces complexity of individual services
- You get rewarded with a distributed, highly reliable system

Kotlin Meetup
- If you’re interested in Kotlin please join our meetup!
- Next meetup is on April 18th.

End
- Thanks for your attention
- Questions?
- Contact me at [email protected]

References
- starjack: https://starjack.at/
- MicroservicePrerequisites: https://martinfowler.com/bliki/MicroservicePrerequisites.html
- Terraform: https://www.terraform.io/
- Ansible: https://www.ansible.com/
- Nomad: https://www.nomadproject.io/
- Consul: https://www.consul.io/
- consul-template: https://github.com/hashicorp/consul-template
- Fallacies of distributed computing: https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
