Context • How to configure Nagios? • Who is in the memcache pool? • Which nodes provide the X service?

For some context, at HashiCorp we manage an internal infrastructure as well as helping others do the same. One of the challenges was things like adding a node to a load balancer, adding new nodes to Nagios when they joined the cluster, or updating our pool of memcache nodes. Even things like asking: where is our API server?
Server, Puppetmaster • Fabric, MCollective • Want: Simple, Robust, Automatic, Fast

To address these problems, we started looking at the tools that exist, and what we found is that there are none! To clarify, when I say tool, I'm really talking about a simple CLI tool that to some degree works out of the box. There are plenty of solutions to our problems, and many different tools in the space: ZooKeeper and friends, the config management tools, and very simple tools like Fabric. However, none of them satisfied us. We wanted something simple, robust, automatic, and fast. On top of that, it had to be something that works well with immutable infrastructure.
that is lightweight, highly available, and fault tolerant. http://serfdom.io

So to satisfy all of those requirements we built Serf. Yes, this does sound like buzzword soup, and I'm sorry about that.
To help clarify a bit, let's talk about what Serf does and how it works. Serf provides three basic features. Membership: who is in my cluster. Failure detection: get fast feedback about dead nodes. And user events, which allow arbitrary application messages to be sent. Serf itself is not doing anything crazy, and is built around a gossip protocol. The specific algorithms come from a paper called "SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol". We make use of the gossip layer along with Lamport clocks to provide the custom user event layer.
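To make those three features a bit more concrete, here is a minimal sketch using the embeddable Go library (github.com/hashicorp/serf/serf). The node name and join address are placeholders, and exact field names may differ between Serf versions.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/serf/serf"
)

func main() {
	// Create a Serf instance with default settings; the node name is a placeholder.
	conf := serf.DefaultConfig()
	conf.NodeName = "node-1"

	s, err := serf.Create(conf)
	if err != nil {
		log.Fatal(err)
	}
	defer s.Shutdown()

	// Membership: join the cluster through any node we already know about.
	if _, err := s.Join([]string{"10.0.0.5:7946"}, false); err != nil {
		log.Fatal(err)
	}

	// Failure detection: the gossip layer keeps each member's status up to
	// date, so dead nodes show up as "failed" instead of "alive".
	for _, m := range s.Members() {
		fmt.Printf("%s %s:%d %s\n", m.Name, m.Addr, m.Port, m.Status)
	}
}
```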
What Serf Is • Broadcast mechanism • Transport layer • Embeddable library

We've talked about features, but those are not necessarily illuminating as to what Serf actually is or does. You can think of it as providing a very basic service discovery mechanism, as event triggering, or as a network that provides broadcast and serves as a transport layer for higher-level applications. It can also be used as an embedded library in other Go apps.
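Used as an embedded library, the "transport layer" view amounts to handing Serf an event channel and deciding what to do with whatever the gossip network delivers. A rough sketch, with the channel size and log lines chosen arbitrarily:

```go
package main

import (
	"log"

	"github.com/hashicorp/serf/serf"
)

func main() {
	// Serf delivers membership changes and user events on this channel.
	eventCh := make(chan serf.Event, 64)

	conf := serf.DefaultConfig()
	conf.EventCh = eventCh

	s, err := serf.Create(conf)
	if err != nil {
		log.Fatal(err)
	}
	defer s.Shutdown()

	// Treat the cluster as a broadcast network: higher-level behavior is
	// whatever this loop decides to do with each event.
	for e := range eventCh {
		switch ev := e.(type) {
		case serf.MemberEvent:
			log.Printf("%s: %d member(s) affected", ev.EventType(), len(ev.Members))
		case serf.UserEvent:
			log.Printf("user event %q with payload %q", ev.Name, ev.Payload)
		}
	}
}
```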
• Advanced Service Discovery • Dynamic roles • Health checks

And to be super clear, we are going to talk about what Serf is NOT, as there has been some confusion about this. It is NOT a configuration storage system, but it CAN be used to notify about a change to a config value! It is NOT a deployment system, but it can be used to signal that a deploy is happening. And it is NOT a full-blown service discovery mechanism with dynamic roles and application-level health checks. I just want it to be clear that none of these features ship out of the box, but they can all be built on top of Serf.
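As an illustration of the "built on top" point, signaling a deploy could be as small as broadcasting a user event and letting each node's handler decide what to do with it. A hypothetical helper, assuming an already-running Serf instance; the "deploy" event name is a convention of this sketch, not something Serf defines:

```go
package deploysignal

import "github.com/hashicorp/serf/serf"

// NotifyDeploy broadcasts a "deploy" user event carrying the version to
// roll out. Serf only delivers the message; whatever handles the event on
// each node is responsible for actually performing the deploy.
func NotifyDeploy(s *serf.Serf, version string) error {
	// coalesce=true collapses rapid-fire deploy events into the latest one.
	return s.UserEvent("deploy", []byte(version), true)
}
```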
Serf v0.2 shipping this week • Security • Protocol versioning • More scalable • Bug fixes

So all that said, where are we with Serf? It is still a very, very young tool, as we just released last week. However, we are getting very solid interest from the community, and a much improved version 0.2 ships this week. This version is really a big step: it adds security to make Serf usable on untrusted networks, protocol versioning so that we can make changes moving forward, scalability improvements, and many bug fixes.
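The security piece comes down to a shared symmetric key for the gossip traffic. A sketch of how that might be wired through the underlying memberlist configuration; the 16-byte key below is an obvious placeholder, and the exact configuration surface may differ between Serf versions:

```go
package main

import (
	"log"

	"github.com/hashicorp/serf/serf"
)

func main() {
	conf := serf.DefaultConfig()

	// Every node must share the same 16-byte key; gossip messages that are
	// not encrypted with it are rejected, which is what makes Serf usable
	// on untrusted networks. This key is a placeholder, not a real secret.
	conf.MemberlistConfig.SecretKey = []byte("0123456789abcdef")

	s, err := serf.Create(conf)
	if err != nil {
		log.Fatal(err)
	}
	defer s.Shutdown()
}
```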
• ./scripts/start_cluster.sh • 1000+ node clusters tested

If you want to try Serf out, the easiest path is to use our HAProxy + Apache demo, which is inside the repo. It uses CloudFormation, so you just upload a single JSON file to get started. It will start a single HAProxy instance, and you can configure how many web nodes to spin up. You should see all the nodes get added to HAProxy, and then you can go around killing random nodes and watch them get removed. Alternatively, there is a quick-and-dirty script that will start a hundred or so Serf agents on a loopback network locally. We've had some users on IRC report that they've already spun up clusters of over a thousand nodes without issue, so those are promising signs.
System • "serf plugin install haproxy"

Although Serf 0.2 ships in the short term, we have big plans for the road to 1.0. First off, we want to have an API for the Serf agents sooner rather than later. This will enable programmatic access instead of using the CLI we have (which does use a non-exposed API). Once that is in place, we can lay the groundwork for a plugin system. The goal of the plugin system is to use Serf as a low-level transport and build higher-level features around it. For example, it would be great to just "plugin install haproxy", set up a few configuration options, and bam, automatically have Serf manage adding and removing nodes from various HAProxy pools.
• Avoid "reinventing the wheel"

Beyond plugins, we also hope to build higher-level tools ourselves, as well as to see the community build them. The goal with Serf and its API and plugin system is to enable composability and reusability. There is no need to constantly reinvent the wheel. If we can build a community around Serf, then we can focus our energies on building ever more interesting applications on top, instead of having every company waste time building a "works well enough" service discovery framework.
Mailing list • Get involved on GitHub! • http://github.com/hashicorp/serf

If you'd like to learn more about Serf, I highly recommend the website. We also have an IRC channel and a mailing list you can contact if you have any questions. Lastly, if you are excited about hacking on Serf, it is completely open source, and we would love any contributions to the project.