Slide 1

Slide 1 text

5 YEARS OF CLOJURE
PIERRE-YVES RITSCHARD (@PYR)

Slide 2

Slide 2 text

HALLO
@pyr: Three-line Bio
CTO & Co-founder at Exoscale
Distributed systems and monitoring enthusiast
Open-Source developer: Clojure libraries, OpenBSD, Riemann, Collectd, and more.

Slide 3

Slide 3 text

5 YEARS OF CLOJURE
Building better infrastructure with parentheses

Slide 4

Slide 4 text

EXOSCALE
Infrastructure as a service
Zones in Frankfurt, Vienna, Zürich, Geneva

Slide 5

Slide 5 text

EXOSCALE

Slide 6

Slide 6 text

EXOSCALE
provider "exoscale" {
  api_key    = "${var.exoscale_api_key}"
  secret_key = "${var.exoscale_secret_key}"
}
resource "exoscale_instance" "web" {
  template  = "Ubuntu 17.04"
  disk_size = "50g"
  profile   = "medium"
  ssh_key   = "production"
}

Slide 7

Slide 7 text

I THOUGHT THIS WAS A CLOJURE CONFERENCE!

Slide 8

Slide 8 text

WHAT'S IN A CLOUD PROVIDER
Datacenter operations
Software development

Slide 9

Slide 9 text

SOFTWARE AT EXOSCALE
Virtual machine instance orchestrator
Object storage controller
Network controller (SDN)
Customer management
Metering system
Billing
Web portal

Slide 10

Slide 10 text

ISN'T ALL OF THIS BASH, PERL, AND YAML?

Slide 11

Slide 11 text

CLOJURE NOT AN OBVIOUS CHOICE
The JVM had/has bad press with infrastructure folk

Slide 12

Slide 12 text

CLOJURE AT EXOSCALE: A TIMELINE

Slide 13

Slide 13 text

2012: THE EARLY DAYS

Slide 14

Slide 14 text

WE STARTED WITH
3 people
A bit of time
A product idea

Slide 15

Slide 15 text

A DIFFERENT CLOUD PROVIDER
Not yet another virtual datacenter product
Integration with automation tooling
Integration in language-specific libraries
Focus on horizontally-scalable applications
Local storage
Security groups

Slide 16

Slide 16 text

THINGS THAT DIDN'T EXIST IN 2012
Ansible
Terraform
Docker

Slide 17

Slide 17 text

THINGS THAT DIDN'T EXIST IN 2012
Television
Wifi

Slide 18

Slide 18 text

OUR MINIMAL STACK
Apache Cloudstack
Puppet
Good old MySQL
A third-party customer management tool
Python + AngularJS
Riemann

Slide 19

Slide 19 text

OUR MINIMAL STACK

Slide 20

Slide 20 text

RIEMANN
The common saying back then was "monitoring sucks"
The push-based model was a great fit for our use case
Riemann was in a rough state back then
A great opportunity to contribute

Slide 21

Slide 21 text

2013: GOING LIVE

Slide 22

Slide 22 text

BACKEND DEVELOPERS DOING FRONTEND

Slide 23

Slide 23 text

THINGS OUR EARLY ADOPTERS ENJOYED
Vagrant support
Security groups instead of firewalling
A public IP per instance

Slide 24

Slide 24 text

IMPROVING RELEASE AUTOMATION

Slide 25

Slide 25 text

WARP

Slide 26

Slide 26 text

WARP

Slide 27

Slide 27 text

WARP
Open Source
TLS client certificate-based authentication
IRC support
Haskell Go agent
Prefigured our inclination for Clojure at the orchestration layer

Slide 28

Slide 28 text

TROUBLE KICKS IN
Late payments
Bitcoin mining on free credit

Slide 29

Slide 29 text

SOLVING ABUSE
Need to pull data from a bunch of places
Standard FSM type of problem

Slide 30

Slide 30 text

A NEW FAVORITE: CORE.MATCH
(match [state new-state unpaid-invoices?]
  [:ok       :warning  _    ] :warn!
  [:ok       :critical _    ] :suspend!
  [:warning  :critical _    ] :suspend!
  [:warning  :ok       _    ] :active!
  [:critical :ok       false] :active!
  [:critical :warning  false] :active!
  [_         _         _    ] nil)
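The transition table above can be sketched without core.match, so that it runs on clojure.core alone. This is a swapped-in technique (`case` on a vector), not the code from the slide; the function name `abuse-action` is hypothetical, and the meaning of the keywords is assumed from the slide.

```clojure
;; A minimal sketch, assuming the same states and actions as the slide.
;; `abuse-action` is a hypothetical name; the slide uses core.match instead.
(defn abuse-action
  "Returns the action for a state transition, or nil when nothing is to be done."
  [state new-state unpaid-invoices?]
  (case [state new-state]
    [:ok :warning]       :warn!
    [:ok :critical]      :suspend!
    [:warning :critical] :suspend!
    [:warning :ok]       :active!
    ;; Leaving :critical only reactivates the account when no invoice is unpaid.
    ([:critical :ok] [:critical :warning])
    (when (false? unpaid-invoices?) :active!)
    ;; Default clause: every other transition is a no-op.
    nil))

(abuse-action :ok :critical false)  ; a healthy account going critical
(abuse-action :critical :ok true)   ; still has unpaid invoices
```

Compared with core.match, `case` has no wildcard, so the third column has to be handled by hand; that is the part core.match keeps readable as the table grows.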

Slide 31

Slide 31 text

SOME THINGS WE LEARNED
Running Clojure processes in good old cron is perfect
Logback's logging context is a huge plus

Slide 32

Slide 32 text

2014: THE YEAR OF STORAGE

Slide 33

Slide 33 text

OBJECT STORAGE
The obvious choice for our crowd
Architecturally simpler than distributed block storage
A good complement to our local storage backed instances

Slide 34

Slide 34 text

OBJECT STORAGE NEEDS
S3 is the sole player in that field: we need API compatibility
The only alternative at the time was bad HTTP extensions

Slide 35

Slide 35 text

OBJECT STORAGE IN THE WILD
Ceph
Riak-CS
Swift
Costly vendor-backed solutions

Slide 36

Slide 36 text

WRITING AN OBJECT STORE
We focused on how to store large objects
Tempted by a description of the (non-open-source) approach by DataStax on top of Cassandra

Slide 37

Slide 37 text

CHOOSING CASSANDRA
Great library support, thanks @mpenet!
Simple for us to operate
Very few moving parts
Our implementation could remain fully stateless

Slide 38

Slide 38 text

WE WERE (ALMOST) YOUNG AND (WAY TOO) NAIVE
How hard could it be?

Slide 39

Slide 39 text

WHAT WE DIDN'T ANTICIPATE
It's not all about actual data storage
The S3 API is a beast
The S3 API is underspecified
The S3 API is not versioned
The S3 API client landscape is a mess

Slide 40

Slide 40 text

A QUICK DIGRESSION: S3 REQUESTS
Operation: put object foo in bucket bar:
PUT /foo
Host: bar.sos-ch-dk-2.exo.io
Authorization: AWS ....
<...>

Slide 41

Slide 41 text

A QUICK DIGRESSION: S3 REQUESTS
Operation: update acl for object foo in bucket bar:
PUT /foo?acl
Host: bar.sos-ch-dk-2.exo.io
Authorization: AWS ....
X-Amz-ACL: bucket-owner-full-control

Slide 42

Slide 42 text

A QUICK DIGRESSION: S3 REQUESTS
Operation: copy object bim from bucket bam to object foo in bucket bar:
PUT /foo
Host: bar.sos-ch-dk-2.exo.io
Authorization: AWS ....
X-Amz-Copy-Source: /bam/bim
X-Amz-Copy-Source-If-Unmodified-Since: ARE YOU KIDDING ME?
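The three requests above share the same method and path, so a server has to inspect the query string and headers to know which operation is meant. A minimal sketch of that dispatch follows, assuming a Ring-style request map with lowercased header names; `put-operation` and the returned keywords are hypothetical names, not the object store's actual code.

```clojure
;; A minimal sketch, assuming a Ring-style request map.
;; `put-operation` and the operation keywords are hypothetical.
(defn put-operation
  "Decides which S3 operation a `PUT /key` request stands for."
  [{:keys [query-string headers]}]
  (cond
    ;; PUT /foo?acl updates the ACL of an existing object.
    (= "acl" query-string)                  :put-object-acl
    ;; A copy-source header turns the upload into a server-side copy.
    (contains? headers "x-amz-copy-source") :copy-object
    ;; Otherwise the request body is the new object.
    :else                                   :put-object))

(put-operation {:query-string nil
                :headers {"x-amz-copy-source" "/bam/bim"}})
```

A real implementation has many more cases than these three (multipart uploads, tagging, and so on), which is the point the slides make about the size of the API.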

Slide 43

Slide 43 text

BY THE WAY
Storing terabytes of data on off-the-shelf hardware doesn't come easy either
Input and output payloads of arbitrary lengths aren't easy
Compojure, Ring, and the usual suspects are out

Slide 44

Slide 44 text

SOME THINGS WE LEARNED
This was our largest application to date
Component didn't exist
We built a hacky equivalent based on plain maps
Maintenance of the application starts becoming an issue
With maps, malformed data can be threaded through for a while before it is noticed

Slide 45

Slide 45 text

2015: SCALING UP

Slide 46

Slide 46 text

THINGS ARE RUNNING SMOOTHLY
Load on the platform is increasing
We have a lot of event generating systems
Tons of logs
Tons of metrics

Slide 47

Slide 47 text

WE CAN'T DO EVERYTHING WITH CRON
So we install a Kafka cluster

Slide 48

Slide 48 text

WHY KAFKA?
Partition-isolated consistency
Disaggregating memory

Slide 49

Slide 49 text

WHY KAFKA?

Slide 50

Slide 50 text

A FIRST CANDIDATE: BANDWIDTH METERING
Traffic accounting on hypervisors, with a small C agent
30 second aggregates sent over to Kafka
A Clojure Kafka consumer on the other end
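The consumer side of such a pipeline can be sketched as a plain reduction over the aggregates. The sketch below works on an in-memory vector rather than a Kafka topic; the record shape (`:instance`, `:bytes`) and the name `usage-by-instance` are assumptions made for illustration, not the format the agent actually sends.

```clojure
;; A minimal sketch, assuming each 30-second aggregate is a map with
;; hypothetical :instance and :bytes keys.
(def samples
  [{:instance "web-1" :bytes 1200}
   {:instance "db-1"  :bytes 300}
   {:instance "web-1" :bytes 800}
   {:instance "db-1"  :bytes 0}])

(defn usage-by-instance
  "Sums transferred bytes per instance, ignoring empty aggregates."
  [records]
  (transduce
   ;; Drop aggregates that carry no traffic.
   (filter #(pos? (:bytes %)))
   ;; `completing` supplies the completion arity that `transduce` calls.
   (completing
    (fn [acc record]
      (update acc (:instance record) (fnil + 0) (:bytes record))))
   {}
   records))

(usage-by-instance samples)
;; => {"web-1" 2000, "db-1" 300}
```

Because the transformation is independent of its input source, the same filtering and summing logic could be fed from records polled out of a consumer instead of a vector.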

Slide 51

Slide 51 text

KEY TAKEAWAY
Non-glue Clojure code is around 150 loc
Altogether around 500 lines
It seems as though Clojure was written to write Kafka consumers

Slide 52

Slide 52 text

THIS HAMMER NEEDS NEW NAILS
We have a recurring issue with DNS updates and need more flexibility building zones

Slide 53

Slide 53 text

AN EXPERIMENT: BLOG POST DRIVEN DEVELOPMENT

Slide 54

Slide 54 text


Slide 55

Slide 55 text

LOG COMPACTION

Slide 56

Slide 56 text

LOG COMPACTION

Slide 57

Slide 57 text

KALZONE: DYNAMIC DNS WITH KAFKA
Works great across a large number of clients
Great foundation for more infrastructure inventory solutions
Kafka log compaction is a huge plus
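Log compaction means Kafka retains at least the latest value for each key, and a record with a null value acts as a tombstone that removes the key. The effect on a DNS-style log can be sketched in plain Clojure as below; the `compact` function and the sample records are illustrative assumptions and do not reflect how Kalzone itself is written.

```clojure
;; A minimal sketch of the semantics of a compacted topic, assuming the
;; log is a sequence of [key value] pairs. `compact` is a hypothetical name.
(defn compact
  "Keeps only the latest value per key; a nil value deletes the key."
  [log]
  (reduce (fn [state [k v]]
            (if (nil? v)
              (dissoc state k)     ; tombstone: forget the record
              (assoc state k v)))  ; newer value replaces the older one
          {}
          log))

(def dns-log
  [["www.example.com" "198.51.100.10"]
   ["api.example.com" "198.51.100.20"]
   ["www.example.com" "198.51.100.11"]   ; update
   ["api.example.com" nil]])             ; deletion

(compact dns-log)
;; => {"www.example.com" "198.51.100.11"}
```

A consumer that replays the compacted topic from the beginning ends up with the current zone contents, which is what makes it usable as an inventory.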

Slide 58

Slide 58 text

2016: FAST GROWTH

Slide 59

Slide 59 text

SECURED FUNDING IN LATE 2015

Slide 60

Slide 60 text

USE OF PROCEEDS
People
A new datacenter

Slide 61

Slide 61 text

SELLING ON THE WEB
We simplify our online funnel
A drip process

Slide 62

Slide 62 text

DRIP PROCESS
core.match to the rescue again
Yet another reason to write a cron

Slide 63

Slide 63 text

BILLING ISSUES
The cron-based approach to billing is showing its limits
Hard to keep it running hourly because each run takes too long

Slide 64

Slide 64 text

AT A CROSSROADS

Slide 65

Slide 65 text

AT A CROSSROADS

Slide 66

Slide 66 text

AT A CROSSROADS

Slide 67

Slide 67 text

KAFKA TO THE RESCUE
A full rewrite of our billing stack
Sub 1k loc

Slide 68

Slide 68 text

KEY TAKEAWAYS
Incredible reliability
The system can weather temporary failures with no billing impact
Transducers fit in perfectly with Kafka
We wrote a few of our own

Slide 69

Slide 69 text

2017: TOO MUCH DATA

Slide 70

Slide 70 text

SUDDEN S3 PICKUP IN USAGE
Our initial implementation limits the throughput
Tail latencies go through the roof
Cassandra is just not great at doing dense nodes
We knew this going in
We hit the wall hard

Slide 71

Slide 71 text

WE NEED A NUMBER OF NEW API CAPABILITIES
V4 signatures are becoming the norm for S3
Better ACL support is needed
The docker registry exercises all weird properties of the API

Slide 72

Slide 72 text

WE FIND A GOOD PAPER
Ambry attacks the same problem space
The paper lays out a great strategy

Slide 73

Slide 73 text

LET'S WRITE A DISTRIBUTED SYSTEM FROM SCRATCH
What could go wrong?

Slide 74

Slide 74 text

BETTING ON CORE.ASYNC
To better understand netty internals we settle on writing our own facade
This brings less baggage than aleph
A storage agent in C
Zookeeper for agent discovery
We keep Cassandra for metadata storage

Slide 75

Slide 75 text

NEW THINGS
Component
Spec
A larger reagent frontend app

Slide 76

Slide 76 text

UI

Slide 77

Slide 77 text

KEY LEARNINGS
Component is our go-to daemon structuring tool
Netty is hard
Reconciling byte buffer manipulation with the immutable Clojure world can be tricky
Transducers were a life saver against memory leaks
  Test on sequences
  Run against core.async channels
Spec helps a lot with reliability and maintenance
We still don't do enough generative testing
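The "test on sequences" point can be illustrated with a small custom transducer. The sketch below is a hypothetical example (`deltas` is not one of the transducers mentioned in the talk): it turns successive counter readings into differences and is exercised on plain vectors.

```clojure
;; A minimal sketch of a stateful transducer; `deltas` is a hypothetical
;; example, not taken from the talk.
(defn deltas
  "Transducer emitting the difference between consecutive readings."
  []
  (fn [rf]
    (let [prev (volatile! nil)]          ; last reading seen so far
      (fn
        ([] (rf))                        ; init
        ([result] (rf result))           ; completion
        ([result input]                  ; step
         (let [p @prev]
           (vreset! prev input)
           (if (nil? p)
             result                      ; first reading: nothing to emit yet
             (rf result (- input p)))))))))

;; Tested on an ordinary vector, no channel or network needed.
(into [] (deltas) [100 150 150 400])
;; => [50 0 250]

;; Transducers compose, here dropping intervals without change.
(into [] (comp (deltas) (filter pos?)) [100 150 150 400])
;; => [50 250]
```

The same transducer value can then be attached to a core.async channel, since `clojure.core.async/chan` accepts a transducer as its second argument; that part is left out here to keep the sketch free of dependencies.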

Slide 78

Slide 78 text

2018: WORLD DOMINATION!

Slide 79

Slide 79 text

OUR CURRENT STATE

Slide 80

Slide 80 text

GOOD CORE LIBRARIES
Unilog
Kinsky
Net
Reporter
Raven
Uncaught
Signal

Slide 81

Slide 81 text

WHAT WE'RE MISSING
A good daemon template
Some governance around our libraries
A Clojure for systems development

Slide 82

Slide 82 text

BUILDING ON KUBERNETES
We previously bet on Mesos
Recent changes make running Clojure apps on Kubernetes nice and easy
Upcoming library for configuration of Kubernetes applications
Upcoming library to build Kubernetes controllers in Clojure

Slide 83

Slide 83 text

AN API GATEWAY
The front door to our infrastructure
Leverages all our work around asynchronous networking
A great way to put spec to work
Will give us great capabilities to do smart RBAC

Slide 84

Slide 84 text

FRONTEND
We use it for internal tooling already
It's time to switch our main console
Re-frame gives us great confidence in making the jump

Slide 85

Slide 85 text

LOOKING BACK

Slide 86

Slide 86 text

WHAT WE DON'T DO IN CLOJURE
SQL-backed APIs
Low-level development

Slide 87

Slide 87 text

THE USUAL QUESTIONS
Community
Hiring

Slide 88

Slide 88 text

THANKS
We need help building all of this!