» SRE/Platform for Geomagical Labs, part of IKEA
» We do CV/AR for the home

Hi there, I'm Noah Kantrowitz. I'm an SRE at Geomagical Labs. We do computer vision and augmented reality stuff for IKEA.
Let's jump right in, what is Kubernetes? Straight from kubernetes.io, it's "an open source system for automating deployment, scaling, and management of containerized applications." Simple, right?
» CFEngine - 1993 - Desired state configuration
» Puppet/Chef/Salt/Ansible - 2005-2012 - More!
» Terraform - 2014 - Same but for infra
» Kubernetes - 2014 - All of the above

If you talk to the core maintainer team, a common answer you'll hear is "Kubernetes is an API for systems". Just like POSIX is the standard API to access things like file I/O, Kubernetes is the API for your infrastructure. It's not quite the same as POSIX though: rather than being imperative with one function call after another, it's built on top of desired state configuration. This is similar to tools like Puppet and Chef, and CFEngine before them. Ditto for Terraform, which brought this model to cloud infrastructure. This isn't a complete history, that would be its own whole talk, but you can see Kubernetes as the next step in this line of tooling.
This API isn't just a local thing though, it's an HTTP-based REST API. And as a bonus, this API can extend itself, declaring new API types and operations using the API itself.
This in turn leads to extreme modularity. Two tools can both listen to changes on the same API without caring about each other. They don't even have to be written in the same programming language, as long as they both speak HTTP.
The standard answer I hear to this is that Kubernetes is complex and difficult to learn, so you should only use it for the biggest of the big systems. Millions of containers across thousands of servers; there aren't many other games in town for that kind of scale.
Yes! (Probably!)

But I'm not here for that. As the title said, I'm here to be the contrarian who says you can and should probably use Kubernetes even with small projects or teams.
But first, let's talk about some alternatives, because I'm sure many of you immediately thought "no way". Plain old Docker is about the simplest you can get with containers. There are some similar tools now like Podman, but they all have the same issue: they aren't desired state systems, at least not at scale. If you have 10 servers, you would have to docker run on each of them with all the right arguments separately. Great for local development but not lovely for production, even when small.
» Compose? Tricky for remote use, not extensible

Compose adds desired state features to plain Docker, but in turn it's even less remote-API friendly. And beyond that, it's a relatively simplistic system that can't be extended by third parties so if you outgrow any element of it, you have to start over.
» Ansible/Terraform? Intermittent convergence

Ansible and Terraform both have container plugins and are plenty extensible, but their main downfall for me is they aren't continually convergent. Or put another way, they only update when you run them; the rest of the time it's the wild west. This can lead to drift between the desired state and your actual systems. Many of us have had a moment of horror when we run tf plan and see hundreds of lines of diff.
» ECS/Fargate/Cloud Run? Expensive, vendor lock-in

Hosted container run platforms are available from both the main cloud vendors and many third parties. These cover a really wide range, but speaking generally, they are a lot more expensive given the same resources, and many of them have some level of vendor lock-in, usually in the form of proprietary workflow tools.
» Lambda/FaaS? More lock-in and limited architecture

Going further down that path you get functions-as-a-service tools like Lambda. These are similarly expensive and level up the vendor lock-in into your actual code in most cases, as well as limiting architectural choices to things compatible with the platform.
» PaaS and FaaS platforms are high-quality but limited

I don't want to be overly harsh on any of these tools. A common theme I'll come back to again and again in this talk is that while small projects start small, they rarely stay small for long. And while "this is just a prototype, I'll rewrite it before the Real Version" is a lovely dream I've had many times myself, it rarely happens. High-level hosted services can be good, just understand the limitations and level of lock-in before you commit and make sure you're okay with the downsides. If you are okay with those, feel free to skip the rest of this talk with my blessing.
» MVP or a small standalone product
» No ops team, probably just one "full stack" team
» Cost sensitive but not shoestring

Let's lay out a baseline persona to talk about: a developer with a small team, 1-4 people, probably a single web application or maybe 2 or 3 smaller ones. The kind of situation where one of them might think "maybe we should just get a VPS and apt-get install some things?" as a starting point. There's no dedicated ops team, and while they have some budget to work with they are relatively cost sensitive. Many of you have probably been on this team at some point, maybe some of you are right now.
» Huge ecosystem of tools
» Modular design means you can swap components later
» High-level APIs let you code only what you care about
» Avoid the future lift-and-shift

The biggest single reason I think Kubernetes makes sense, even early on, is that it can grow as the project does. Adding more servers and applications is the most direct form of that, but it goes further. You can do things like replace your storage provider with just a few lines of code or swap in a new networking model without a greenfields rewrite. But I also can't overstate the value of the ecosystem tools. Want a monitoring system? Installing prometheus-stack is one command and includes everything you need. Need automatic TLS handling? We've got 5. The modularity of the API makes it easy to drop things in.
Hopefully by now you are at least open to the idea that Kubernetes can be a good idea for small teams and projects. But the next hurdle is to actually do it. And isn't Kubernetes really hard to use?
So finally we come to the really contrarian part: Kubernetes doesn't have to be hard to use and we just do a really bad job explaining it to new folks. Kubernetes is a hugely flexible and powerful tool, it can do many things. But just like your car can go 100 miles per hour, that doesn't mean you should, and it definitely doesn't mean you start out a new learner at that speed.
» Curlbash for systemd or k3d for existing Docker
» Defaults to SQLite for easy single-node
» But supports Postgres and MySQL too
» That VPS server you were going to use? Install k3s first

This isn't a talk about just K3s, so I'm not going to dive deep into it, but it really is the cornerstone of a minimalist Kubernetes setup. It's an all-in-one build that removes a lot of legacy cruft you wouldn't notice and adds support for simpler database options. Is etcd great for when you need a million containers? Absolutely. But for our tiny case it's major overkill. For the simple case of 1-5 servers, absolutely just stick with the defaults: make one of them run the control plane using SQLite as the database and the rest run as workers. You don't need dedicated control plane nodes; computers are pretty fast these days. When you notice things starting to get limited by your resources, then think about it.
What about vendor-hosted Kubernetes like GKE, EKS, and AKS? Also good, but you'll pay a bit more because of their default services taking up some RAM on every server. It's not that much in the grand scheme of things though. This does mean a more complex setup if you aren't already using a cloud vendor, but in general they are cost competitive and an easy place to start if you can.
» Pods, Deployments, Services, Ingresses
» No really, that's it

The real key to making Kubernetes viable for small use cases is to cut down the surface area. Yes, there's a million more features sitting there, waiting to be used. Let them wait. These 4 things are all you need to get started.
» Do it how you would without Kubernetes
» But on Kubernetes
» Does the thing have Docker install instructions? Done
» Is there a community Docker Hub image? Use it
» Find a guide with apt-get install something? Copy that into a Dockerfile and roll with it

We'll dive more into how to actually use those 4 things in a moment, but first, the overarching concept: if you don't know how to do something with Kubernetes, then do it how you would have before, but on Kubernetes. Take the simple approaches, use the nice-looking tutorials. Will it be perfect? Probably not, but it will be in a place you can improve incrementally as you learn more.
Some subject matter experts may already be thinking "that's ridiculous, if you don't use feature X you'll have problem Y later on". And you're probably right. But sometimes these things have to be a journey, and if you expect perfection from a new team on a new project with minimal support and budget, then Kubernetes isn't the problem in this equation. Let people do things the simple way when that's the only place they can start. Even when you as the senior engineer can already see 20 moves ahead and know the traps they will probably fall into, they aren't there yet, and unless you can jump in and help, just pointing out the problems and expecting immediate solutions is rarely productive.
» Workloads - running stuff
» Networking - connecting stuff
» Storage - keeping stuff

As we dive into specifics, I'm going to group them into three big categories. Workloads are how you run your stuff on the servers, networking is so they can talk to each other, and finally storage, because a fully stateless system is frequently not an option so we have to put data somewhere.
» Pod == a running container somewhere
» Yes, Pods have a million more options, but keep it simple for now
» Deployments == run N copies of a Pod
» N is frequently 1, that's okay

Enough abstract stuff, let's talk about what you need to know to actually use Kubernetes! Things run inside a Pod, but to start, every time you hear "Pod", think "container". It's more complicated than that deep down, but as I keep saying, you can just ignore all of that for now. You can create Pods directly, but you almost never want to, because Deployments are a thin wrapper around Pods that simplifies some annoying stuff. A Deployment is one level up from a Pod: it takes not just the container info but also how many copies to run, which can be 1 but can also be more in cases that need it. And importantly, going from 1 to 10 replicas later on is one command.
» apt-get install cron works too!

If you're looking through Kubernetes tutorials you'll probably see mentions of other workload-related things like stateful sets, daemon sets, and jobs. They all do important and special things, and you can almost certainly worry about them later. CronJobs I will give a slight pass to; I won't cover them here, but if I had to line things up in the order you'll probably need to learn them, CronJobs are definitely near the front. But never forget rule one, do it the way you would without Kubernetes, with Kubernetes: you can install boring old cron in a container and run it like a service.
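If you do end up reaching for one, a CronJob isn't much more YAML than a Deployment. A minimal sketch, with a made-up nightly cleanup job as the example workload:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-cleanup
spec:
  schedule: "0 3 * * *"            # standard cron syntax, 3am daily
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: cleanup
            image: mycompany/myapp:v1.2.3
            command: ["python", "cleanup.py"]   # placeholder script name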
» --port 1234 - expose a port
» --env "FOO=bar" - set environment variables
» --replicas 5 - run multiple copies

Most Kubernetes tutorials, and this talk from here on out, talk about Kubernetes objects using their YAML forms. But before I start in on that I want to at least mention kubectl run. You can think of it as an upgraded docker run. You give it info about the container and it does the thing. Even in smaller cases I mostly don't use this as it can get frustrating for iterative development. A common pattern when writing a Deployment is starting with the basics and then adding pieces one by one. With kubectl run that means a lot of delete, up arrow in terminal, edit command, enter. Workable, but I like editing complex things in my text editor much more than in a shell input.
Every Kubernetes YAML starts with these 3 bits of information: the API version and kind, which determine the type of object this YAML is for, and the object's name. API versions are their own complex topic that ties into upgrades and compatibility promises and lots more stuff you don't need yet, so for now just look up what the right value is for the thing you're doing and move forward. The kind is pretty self explanatory, it's what kind of thing this is. And we need names so if we have two of a thing, we can tell them apart.
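As a concrete illustration of those three fields, here is what they look like for the Deployment we're about to build (a minimal sketch; myapp is just a placeholder name):

apiVersion: apps/v1   # look up the right value for each kind of object
kind: Deployment      # what kind of thing this is
metadata:
  name: myapp         # so we can tell two of a thing apart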
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp

Next up, the selector thing. There are technical reasons for this, but to be honest it's mostly a legacy thing that's still around. Just copy these 8 lines over each time and don't worry about it. The label value doesn't have to match the Deployment name, but why make things harder for ourselves?
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: mycompany/myapp:v1.2.3

So we've got a bit of boilerplate, but here's a full, real example. It's a bit redundant to have to put the myapp name 4 times, but this isn't rocket science, any team I've ever worked with can do this. Again, a Deployment specification can easily be hundreds of lines of YAML, setting up a dozen more features, but most of them just aren't important enough that you should stall out your progress until you learn them.
"main.py"] env: - name: PASSWORD value: secret SCaLE Kubernetes Community Day 2023 33 If you want a tiny bit more than the minimum, here's how you override the command built in to the container and same with environment variables. In a lot of cases you don't need these but if you do, again it's not rocket science and anyone can learn this in an afternoon.
» Inside -> Inside - easy, flat network, just need DNS
» Inside -> Outside - outgoing traffic, default allow
» Outside -> Inside - the spicy one

Being able to run software is cool, but usually if we can't talk to it over a network, it's not going to be very useful. In Kubernetes the main thing you need to know about networking is that it works kind of like your LAN at home. There's an internal private network where containers can talk to each other, they can talk to the internet, but for the internet to talk inwards to a container we need to poke a hole in the private network. More on that in a minute.
» Do you care what a CNI is? NOPE!
» Everything is open but dynamic IPs
» Need DNS to help things find each other

The private network inside Kubernetes is flat. Every container can talk to every other container using the same IPs that they see for themselves. Exactly how this flatness is achieved is complicated and has lots of specification documents and plugins and protocols, and you can, again, completely ignore all of that and just accept it as a big flat network, moving on. Like a home network, any time a container starts it gets a dynamic IP address. It's not actually DHCP under the hood, but it has the same effect of making it impossible to hardwire IP addresses in places since they might change later. And so we need to use DNS, but how?
  selector:
    app: myapp
  ports:
  - port: 8000

A simple service definition like this does one thing, it adds an entry to the DNS server used by all the containers so "myapp" will resolve to the IPs of those containers. Then you can use that hostname in your configuration files or whatever else so they can talk to each other. Yay, we have internal networking!
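Spelled out as a complete object, the Service being described would look roughly like this (a sketch; the v1 Service header and the name are the standard bits filled in):

apiVersion: v1
kind: Service
metadata:
  name: myapp            # this becomes the DNS name "myapp"
spec:
  selector:
    app: myapp           # matches the Pod labels from the Deployment
  ports:
  - port: 8000           # the port other containers connect to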
» Load Balancer - any TCP/UDP port, cloud vs. on-prem
» Node Port - works anywhere, weird, avoid

Containers talking to other containers is great but I'm selfish, I want to be able to talk to containers too. And here I have to introduce some complexity. Starting with the simple and most common case, something using HTTP or any protocol based on HTTP, then you use an Ingress. For things that aren't HTTP, first be sad that you have to deal with some weird old protocol, then probably use a Load Balancer mode Service. If you followed my advice to use K3s then you have at least a basic implementation for load balancers, but if you're running on a cloud provider, you'll usually want to integrate with their native support instead, which is out of scope for this talk.
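For completeness, the Load Balancer case is just a Service with one extra line. A minimal sketch, using a made-up non-HTTP port for illustration:

apiVersion: v1
kind: Service
metadata:
  name: myapp-tcp
spec:
  type: LoadBalancer       # ask the cluster for an externally reachable IP
  selector:
    app: myapp
  ports:
  - port: 5432             # any TCP or UDP port, not just HTTP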
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp
            port:
              number: 8000

Our biggest YAML blob yet and I'm sorry for that. Ingresses are an abstraction for an HTTP routing balancer that understands hostnames and paths. It's a plugin based system, but K3s includes one for you, so to start with you don't have to worry about the implementation. You can see at the bottom that it works together with the Service we saw before: the Ingress has the HTTP details and the Service points to the actual containers.
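The whole object around that rule would look roughly like this (a sketch; the networking.k8s.io/v1 header is the current standard one and the name is a placeholder):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp        # the Service from before
            port:
              number: 8000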
Address: 40.155.110.208
...

Once created, the one thing we need to retrieve is the public-facing IP to use to talk to it. There are automated ways to handle this that are lovely and fancy, and we don't need any of that right now because we can grab it using kubectl and throw it into public DNS or a hosts file or whatever else you use.
While we're on the topic of networking, I very often get questions like "do I need a service mesh?". I'm not here to praise or vilify them. Service mesh tools offer some interesting features, the usual big one being automatic internal TLS for container-to-container traffic. If you're in a regulated industry that could be worth the complexity, but otherwise I would firmly put them in the "not yet" bucket. Start small, you can always add one in later without major changes, that's the whole idea.
» Host files - store things in a folder, like we used to
» Cloud volumes - what the vendor wants you to use

Data storage is easily the hardest of the three main topics we're going to cover. If at all possible, just skip it. If your web app just needs a SQL database of some kind and you're already on a cloud provider, they probably offer a hosted option. Similarly, things like S3 or GCS can be used for storing uploaded files or similar. If you have the budget, that's a great way to simplify things. But that's not always an option, so the two others we'll cover are host storage and cloud volumes.
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: data
      volumes:
      - name: data
        hostPath:
          path: /pgdata

Host path mounts connect a folder from the host system into a container. This is simple and effective, and matches our rallying cry of "do it the same way you would before Kubernetes". Have data? Put it in a folder. Problem over. If you have a single server, like the "just get one VPS" example we talked about at the start, then 100% this is fine to start with.
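For context, here is roughly how that fragment sits inside a whole Deployment. This is a sketch that assumes a Postgres container purely as an example; the mydb names, password, and /pgdata folder are placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydb
spec:
  selector:
    matchLabels:
      app: mydb
  template:
    metadata:
      labels:
        app: mydb
    spec:
      containers:
      - name: mydb
        image: postgres:15
        env:
        - name: POSTGRES_PASSWORD            # the postgres image requires this
          value: secret
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data   # where the container wants its data
      volumes:
      - name: data
        hostPath:
          path: /pgdata                         # a plain folder on the host server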
      nodeName: mynode1

$ kubectl get nodes

However, if you have more than one server, you have a problem again, since the container could end up running on any of them by default, and if the data isn't there then you will not go to space today. An easy out here is to override the automatic container allocation for those cases and just pin it to one server. Not a good forever solution, but it can be enough as you are growing from one server to two. For a more flexible solution, either to local or cloud storage, we need to introduce a new player to the game.
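As for where that setting lives: it goes in the Pod template's spec, right next to containers, with mynode1 being whatever name kubectl get nodes reports for the server that holds the data. A fragment of the sketch from above:

  template:
    metadata:
      labels:
        app: mydb
    spec:
      nodeName: mynode1          # pin these Pods to one specific server
      containers:
      - name: mydb
        image: postgres:15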
  accessModes:
  - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 50Gi

And here my claim from the start, that you only need to learn 4 resource types, comes crumbling down. If the simpler approaches for storage aren't enough, you'll need to learn about a 5th thing and some of its friends. A PVC is a request for a storage volume of some kind. It defines what kind of storage it should be and how big, and then something will try to create it, hopefully. K3s includes a provisioner for "local-path" type storage, which is similar to the host path solution we saw before but automatically understands which server the data is on.
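Assembled into a complete object it looks like this, a sketch using the standard PersistentVolumeClaim header and the claim name that the next example points at:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-storage
spec:
  accessModes:
  - ReadWriteOnce                # one node can mount it read-write
  storageClassName: local-path   # K3s's built-in local storage provisioner
  resources:
    requests:
      storage: 50Gi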
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: myapp-storage

Using a claim is overall similar to the host path approach too, we just swap out the volume information to point at the claim instead.
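Put side by side, the swap really is just the volumes entry changing. A small sketch with the old hostPath left in as a comment:

      volumes:
      - name: data
        # before: a folder on one specific server
        # hostPath:
        #   path: /pgdata
        # after: a claim that a provisioner backs with real storage
        persistentVolumeClaim:
          claimName: myapp-storage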
» Take the problem and push it somewhere else
» Vendors can own their plugin
» Cloud controllers?

PVCs are also how you set up a cloud volume generally, but to do that we need to learn about one more thing, CSI plugins. In the old days, Kubernetes had a list of volume providers it understood and that was it, all you could use. That sucked for everyone so another plugin system to the rescue. CSI plugins teach Kubernetes about how to use new types of storage. With that we also usually bundle a bit of code that knows how to take the request info from a PVC and talk to an API somewhere to make it real. Installing these is usually easy, one or two commands, but it will vary for each vendor so if you need it, search for "name of vendor csi plugin" and follow their instructions.
If you aren't running on a cloud system, or even if you are but you just don't feel like dealing with their storage systems, you can run your own version instead. K3s doesn't include one directly, but the same team works on one called Longhorn and it's what they recommend. And it's fine, perfectly good to get you started. There are a lot of other options here though, and as I said before, storage is hard. If you need one 10 gigabyte volume for a database, don't worry, just use whatever is easy. By the time you get up into the terabytes, maybe compare the strengths and weaknesses of each. Remember, the whole point is you can swap out components later without rebuilding the entire system.
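If you do install something like Longhorn, the change usually amounts to the storage class name on the claim. A sketch, assuming the class is called longhorn, which is Longhorn's usual default; check what your install actually registers:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myapp-storage
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn     # swapped from local-path; nothing else changes
  resources:
    requests:
      storage: 50Gi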
» kubectl get <type> - list things
» kubectl describe <type> - show details
» kubectl delete <type> - what it sounds like

We briefly touched on kubectl run, but it warrants a little more time. Kubectl is the command line tool for interacting with Kubernetes. Like everything else we've covered, it has approximately 8 million features and of those we need maybe 4.
The single most important and most used kubectl command is apply. It handles both creating and updating things. You can give it either a path to a single YAML file or a folder with many YAML files. Not very complicated, but a workhorse of any Kubernetes cluster.
» kubectl get pods - list all pods
» kubectl get service myapp - list a single service
» kubectl describe service myapp - details on one
» kubectl delete pod myapp-5d5d5fc579-6kl82
» kubectl delete -f myapp.yaml

The other commands we care about all take the form of verb, type, and maybe name. Get is the list command, mostly useful for checking quick overview data or finding the names of pods, as they are randomized. Describe gets you detailed data on one thing; we saw that before with ingresses, but it works on all types. Delete deletes things, and notably you can use dash f like with apply to clean up everything from a YAML file.
And so we've covered the basic trifecta you need to use Kubernetes as a small team. But obviously there's far more that I didn't cover. I'll repeat my claim that starting small and doing things poorly is okay and you shouldn't feel like you have to do everything all at once. But I would like to give you at least a few directions to go in once you are ready for them.
» Secrets management (Secrets, sealed-secrets)
» Access control (RBAC)
» Monitoring and alerting (Prometheus and Grafana)

We started from a baseline of "just get a VPS and install some software" and we've inherited no shortage of limitations from that. The most obvious being that anything past a prototype probably needs more than one server. Easy enough to pay for a few more instances and point K3s at them. But now we've got a few things to worry about: it's not enough to just have two servers, we need certain containers to be spread over both of them so if one goes down, the others are still there. You've probably got things like database passwords and API keys to store, and while "they live in a YAML file on my laptop" is a place to start, it will get in the way more as your team grows. Access permissions and monitoring are less important when you have one application and everyone works on it, but eventually that won't be the case. The things I've listed here are all directions you can go in once you reach those points. Not today, not tomorrow, but they will be there when you need them.
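As a small taste of the first item, the plain-text PASSWORD from earlier would eventually move into a Secret. A minimal sketch, with myapp-secrets as a placeholder name:

apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
stringData:            # stringData takes plain strings, no base64 needed
  PASSWORD: secret

and the container's env entry would switch from a literal value to a reference:

        env:
        - name: PASSWORD
          valueFrom:
            secretKeyRef:
              name: myapp-secrets
              key: PASSWORD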
And what about all the other fancy things? How will you know when you should look into them? It's a hard question to answer, as are all unknown unknowns. The whole thrust of this talk has been that a small team doesn't need to boil the ocean to get started. But once you get things going, it would probably be a good idea for one or two of your team to spend at least a little time going through some more in-depth tutorials or videos or books, learn all the more complex bits at least a little, and start to introduce them into your setup gradually. There's no one order or list of things to learn that will be optimal for everyone. Look for features which solve problems you've had, or, vice versa, when you have a problem, go see if there's a flag or config option or tool which would prevent it next time; there likely is.
So let's summarize a bit. Kubernetes is a solid choice for small teams if you start small and build up your complexity as your project needs it and your team can support it. Grow into the tools rather than expecting perfection on day one. Embrace the modular nature and don't be afraid to swap components as you need more from them. And most of all, treat Kubernetes as something you will just keep learning about as you go rather than a thing to master before you begin. Put all that together and you've got a solid place for any team to begin.
This is not at all related to the thrust of the talk but I'm hiring an MLOps engineer and I would be remiss to not take this opportunity to plug myself. Geomagical.com has more info.
» Just hype? No
» Systems as APIs
» POSIX
» Salt/Ansible/Chef/Puppet
» Kubernetes
» Modular systems
» Promise theory, convergent systems
» "Should I use Kubernetes?"
» "Only for big systems" - The standard answer
» It can be small!
» Why not X?
» Docker? Docker on its own is not convergent
» Compose? Low ecosystem, minimal remote API