Slide 1

Slide 1 text

Why Sidecars? Moving the Stack out of the Monolith Greg Poirier Sensu, Inc. Hi friends! My name is Greg Poirier, and I work at Sensu on the open source monitoring platform. I’ve spent the last 15 years building systems for ISPs, *air quotes* “big” data companies, the US government, and SaaS providers. Today, I want to talk about one of my favorite infrastructure design patterns: the sidecar.

Slide 2

Slide 2 text

Wat am Sidecar? At this point, I’d like to think that everyone has heard of a sidecar, but I’ve been proven wrong a number of times, so I will very briefly talk about the sidecar pattern.

Slide 3

Slide 3 text

Last year, Brendan Burns and David Oppenheimer published a paper titled “Design patterns for container-based distributed systems.”

Slide 4

Slide 4 text

“Sidecars extend and enhance the main container.” - Burns, Oppenheimer 2016 Sidecars extend and enhance the main container. In this case, the “main container” is the actual application. But I want to frame our discussion of sidecars a little so that they maybe make a little more sense in the context of a holistic architecture. First, though…

Slide 5

Slide 5 text

Trigger Warning Trigger warning! I use some words in this talk that are gonna make some people angrily @ me on Twitter or something after this, and I want to preface their use with a quick and simple explanation of my intentions.

Slide 6

Slide 6 text

Monolith When I say “monolith” in this talk, I’m not using the word as an insult to your codebase. I’m not talking about a monolith.

Slide 7

Slide 7 text

MONOLITHS ARE TERRIBLE This isn’t a talk telling you why one architecture is better than another. I don’t know your requirements. I don’t know anything about your product, and I’m not religious about architecture or design patterns.

Slide 8

Slide 8 text

Choose Your Own Architecture Building software is like reading a super expensive choose your own adventure book that only ends when you run out of money. So, my suggestion is to make choices that mean you don’t run out of money trying to engineer your way out of a hole you dug yourself.

Slide 9

Slide 9 text

We have all been this happy about the hole we’ve dug ourselves into. At least, I hope that’s the case. I know I have on a number of occasions.

Slide 10

Slide 10 text

Your founding technical team has left post-IPO, and your sprawling organization is now faced with addressing the mountains of technical debt that made you so successful. You:

1) Find more tires (pg. 9,345,160)
2) Reorganize people and code in a way that makes microservices make sense. (pg. 263,400)

But maybe you’re reading along in your choose your own adventure book, and you find yourself at this page.

Slide 11

Slide 11 text

CONGRATULATIONS! You live! You decide to reorg, and you are building out your brave new service-oriented future. After doing so, you start to have conversations about how to break apart the codebase in a way that makes sense.

Slide 12

Slide 12 text

So you take a look at your architecture diagram, and you see immediately, “Ah yes! Of course! I see how this easily decomposes into services. Why don’t we reorganize our teams around this very, very obvious decomposition so that all of our development efforts are aligned with a great new vision.” And you get to work right away on some new architecture diagrams.

Slide 13

Slide 13 text

And then you figure out how to make it look like the Death Star. And that Death Star is made up of reasonable components that look something like:

Slide 14

Slide 14 text

This. Everyone agrees upon a standardized architecture for all new services:

- HTTP for transport
- Rate limiting to provide backpressure to noisy internal users
- A sensible authentication and authorization layer that allows for security and accountability
- Uniform request logging across all services in a structured logging format that helps with request tracing through the new system
- A core service discovery mechanism that makes it easy to find the nearest copy of another service
- A configuration system to deliver durable configuration data to applications
- And a standardized metrics format and storage mechanism

Everyone’s really happy about this new world, but then a few teams get together and start to think, “Hey, wait! We could build some of these things once, and then everyone can benefit from that work!” And now it’s time to make a decision.

Slide 15

Slide 15 text

I just like how in every picture Nick Jonas looks like he’s trying to make a decision. In this one, he’s clearly trying to decide between some shared libraries or reusable external processes that add functionality to each of your organization’s deployed applications! Wait. What was that last sentence? That sounds familiar!

Slide 16

Slide 16 text

Oh my God, Nick Jonas knows what it is!

Slide 17

Slide 17 text

SIDECARS! Everyone gets really excited about sidecars, and your well-staffed organization devotes adequate resources to the building and maintenance of some shared sidecars that help developers rapidly iterate on business logic. At this point, I want to step back, though, because did you notice how I haven’t even said containers yet?

Slide 18

Slide 18 text

Not just for containers The sidecar pattern doesn’t have to be used in The Great Land of Containers. We deploy sidecars in VMs all the time. Recall our architecture diagram:

Slide 19

Slide 19 text

See the HTTP listener and rate limiting? That could be a locally deployed nginx reverse proxy. That is totally a sidecar. Nginx can provide your service with buffering, caching, rate limiting, and load balancing. That’s additional functionality that you didn’t have to implement yourself.

Slide 20

Slide 20 text

Reuse And that’s the key: reuse. We have existing code that does something we need it to do, and by deploying it alongside another application, we add functionality to the application without having to build it ourselves.

Slide 21

Slide 21 text

But shared libraries provide reuse. Our first nerd-anger moment is going to be about shared libraries. It’s true, shared libraries provide code reuse as well, but there are a couple of key differences here. First, sidecars don’t care what language your service is written in. Second, sidecars are runtime dependencies, not build dependencies. Runtime dependencies can be built, tested, and deployed in isolation. So while you may have to deploy your sidecar to the world for a critical update, you won’t have to deploy the whole world. Solid integration and regression testing for your sidecars allows for high confidence that they can be deployed safely in isolation from the services that depend on them. And that’s the key to being successful.

Slide 22

Slide 22 text

Good testing and Clearly defined interfaces You need to make sure that your sidecars adhere to a well-specified interface, and you need to test that interface with every build. Even if it’s Nginx, the way that applications interact with that process needs to be tested.

Slide 23

Slide 23 text

But why can’t those live somewhere else? Our next angry Twitter moment is: “Why does this have to live on the same machine as my service? Why can’t I have a pool of other hosts make this functionality available for every service?” Well, you can go that route, and many companies do so successfully, but I prefer the sidecar approach for its next set of benefits. In order to talk about those benefits, let’s consider these two choices.

Slide 24

Slide 24 text

Let’s work with a concrete example: a configuration service. In our imaginary architecture, configuration could have been a client library provided for applications, but we’ve decided to pull it out of the monolithic stack into a sidecar. If you’ve never used or built a library like this, all it effectively does is call out to an external key-value store like ZooKeeper, Consul, or etcd to get configuration data at runtime. It might even support hot reloading of configuration data by watching those values and restarting or reconfiguring the application when they change.

Slide 25

Slide 25 text

Your durable configuration data has to live somewhere. On the left is what it would look like with a centralized configuration service: the customer and order services talk directly to an external configuration service. Again, this could be the key-value store directly, or we could front it with some kind of service providing a simpler interface. On the right, we’ve introduced a sidecar that acts as a smart proxy to the persistent data store backing the configuration service. Their implementations aren’t significantly different. The sidecar would behave pretty similarly to an external configuration service; you could even implement identical interfaces if you wanted. BUT! There’s a very important, key difference here: the sidecar is local and shares the physical location of the customer and order services, respectively. So instead of requiring network traversal, the two processes can communicate via any means available locally on the host. In our example, we’re going to have them communicate via the filesystem.

Slide 26

Slide 26 text

PUT /config/service/orderService
GET /config/service/orderService

Our mythical configuration service accepts two methods, PUT and GET, against a path. Your HTTP requests to the fictional HTTP configuration service would look like those, with some JSON in the body. Our reimagined configuration service that is built using a sidecar might look a little different.

Slide 27

Slide 27 text

conf put -i ./orderService.json \
  /config/services/orderService

Our new config service will provide a CLI that lets you put the contents of a file into a specified key in our key-value store. The CLI is also what’s running in the sidecar, just with different arguments.

Slide 28

Slide 28 text

conf get -o /config/orderService.json \
  /config/services/orderService

In the sidecar, our conf application is told to get the value of a key and output its contents to a file on disk.

Slide 29

Slide 29 text

How the sidecar and CLI talk to the persistent storage is localized to this one artifact that the maintainers of the configuration service ship to your laptop and to production. Regardless of the interfaces exposed, this affords us considerable flexibility when implementing the configuration service, because now we can do fancier things! For example, we can cache a local copy of the configuration. Or we can make the sidecar poll our key-value store and update the local copy on disk whenever the value changes.

Slide 30

Slide 30 text

How is this different? But wait. How is this any different from a client library talking to our configuration service? Well, what if someone working on a new service puts configuration retrieval too early in startup? Let’s pretend for a second that they’re not using whatever framework has been provided for the lingua franca of your organization. They’ve had to write a service in Go, and everything else there is in Java. They whip together a new service that talks to a centralized configuration service, but they perform that action really early in startup, before any other dependencies are satisfied. Now let’s pretend that right after they talk to the configuration service, they panic trying to check their first runtime dependency. The service restarts, causing it to poll the configuration service again. And again. And again. Once a millisecond or so. And now your configuration service is DOS’d. Even with the rate limiting you’ve put in place, it’s taking up sockets, because the process is leaving sockets around in FIN_WAIT on your config service hosts. Well.

Slide 31

Slide 31 text

If you had used a sidecar. If you’d used our sidecar, the service would just be opening and reading a file on disk. The sidecar is happily doing its thing, polling the key-value store occasionally or waiting on events saying the value has changed. And those are our final two key benefits of sidecars:

Slide 32

Slide 32 text

Locality and Isolation Locality and isolation. Sidecars isolate the interaction between processes and localize it to a single host. Your application can only abuse its own local copy of the sidecar, and requests to the sidecar don’t have to traverse the network. And a local process has many more communication primitives to work with than a remote one. Because every process has its own instance of our fictional configuration service, no single service or process can dominate a pool of shared resources.

Slide 33

Slide 33 text

That’s why sidecars: Reuse Locality Isolation That’s the short of why sidecars. Reuse. Locality. Isolation.

Slide 34

Slide 34 text

And now for the fun! In my proposal for this talk, I promised some practical examples. So I wandered the earth looking for some super sweet sidecars. Some of these were pointed out to me by friends, and I even wrote one myself in like a quick second as an example of a sidecar with a well-defined, tested interface for a configuration provider.

Slide 35

Slide 35 text

github.com/Netflix/Prana Sidecar for your Netflix PaaS based Applications and Services The first sidecar I want to talk about is Prana, from Netflix. I think this is one of the most practical sidecars I found.

Slide 36

Slide 36 text

Prana works by exposing, over an HTTP interface, functionality that would otherwise only be available via Netflix’s rich suite of JVM client libraries. This is exactly the kind of approach to sidecar development that I’m advocating for today. They have a well-defined, well-formed software stack that they want to leverage across multiple programming languages. So they just pulled the stack right out and built a sidecar service for applications that can’t use the client libraries.

Slide 37

Slide 37 text

github.com/bitly/oauth2_proxy A simple OAuth2 proxy that provides authentication against providers like Google. The next sidecar I want to talk about is this OAuth2 proxy. This is really cool, because it’s such a great example of an external process that provides a very specific piece of functionality to your application: in this case, a complete OAuth2 workflow. I don’t know if you’ve ever added OAuth to an application, but it’s sort of an annoying thing to do over and over again.

Slide 38

Slide 38 text

I know that I could do this with a third-party library, but with this sidecar, the experience of implementing the OAuth2 workflow for applications is standardized without having to figure out how to do it for whatever language I’m developing in. I can easily ship a completely uniform OAuth experience across every externally-facing application.

Slide 39

Slide 39 text

https://github.com/JrCs/docker-letsencrypt-nginx-proxy-companion A Let’s Encrypt certificate manager sidecar. This is another totally sweet sidecar that I love. Have you ever wanted to just throw an nginx proxy in front of your project and have it manage its own SSL certificates with Let’s Encrypt? Yeah, well, there you go. Now you can, and it’s totally easy. This does use docker-gen, though, which requires access to the Docker socket, so there’s room for improvement there, but it’s totally workable immediately for experimentation.

Slide 40

Slide 40 text

My face when I found that container.

Slide 41

Slide 41 text

https://linkerd.io/in-depth/deployment/ Linkerd: service-to-service routing in a sidecar. If you haven’t seen the stuff from buoyant.io, I highly encourage you to look at linkerd, linkerd-tcp (which was recently announced), and namerd. These are service-to-service routing sidecars built by some of the team members who helped build Finagle at Twitter. This is a powerful construct, because it allows you to centralize all of your service discovery, load balancing, and failure handling in a single location.

Slide 42

Slide 42 text

https://github.com/lyft/envoy If this piques your interest, I also highly recommend checking out Envoy from the engineering team at Lyft.

Slide 43

Slide 43 text

Speaking of Lyft…

Slide 44

Slide 44 text

github.com/lyft/metadataproxy If you’re on AWS, I highly encourage you to take a look at metadataproxy. You can use metadataproxy for a few different things. I’ve used it for local development of applications that need to talk to the EC2 metadata service. In a container runtime environment where you’re deploying with Docker, you could actually route requests from containers to the metadataproxy with firewall rules and have metadataproxy deliver STS credentials for specific services. If you’re using ECS, this isn’t necessary, because ECS provides per-container IAM roles already, but keep this in mind if you’re building a container platform.

Slide 45

Slide 45 text

Sidecars are cool, ok?

Slide 46

Slide 46 text

Twitter: @grepory github.com/grepory/confcar