Slide 1

Slide 1 text

A Little Clojure at Netflix Dave Ray @darevay Software Engineer, Netflix October 2013

Slide 2

Slide 2 text

Agenda ● Netflix Culture ● Why Clojure? ● Our Path ● The Bad ● The Ugly ● The Good

Slide 3

Slide 3 text

Freedom and Responsibility ... “We hire smart people, give them hard problems and get out of their way. We strive to increase the freedom of our employees as we grow, enabling them to move quickly as the industry evolves. With that freedom comes increased responsibility. High performers thrive in that environment and make great choices for Netflix.” http://jobs.netflix.com/who-we-are.html

Slide 4

Slide 4 text

… Best Tool For The Job

Slide 5

Slide 5 text

Our Team ● Netflix Social Infrastructure ○ Social data storage/analysis, stateless web services ○ Support Social APIs used by many Netflix UIs/devices ● 3 engineers ● 1 supportive manager ● 1 medium-size-ish existing Java codebase

Slide 6

Slide 6 text

(about :clojure) ● Modern LISP ● JVM ● Functional ● Dynamic ● Opinionated (“You’re doing it wrong”)

Slide 7

Slide 7 text

Data
  1, 3.14, 1/2                       ; Numbers
  "A string"
  this                               ; A symbol, used to name things
  :a-keyword                         ; Used for enumerations and map keys
  {:key1 "value1" :key2 "value2"}    ; HashMap
  [:a :vector 1 2 3]                 ; Like java.util.ArrayList
  '(this is :a :list 1 2 3)
● All data objects are immutable by default
● Data is composable
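To make the immutability point concrete, here is a minimal sketch using only clojure.core. The names user and updated are illustrative assumptions, not something from the talk.

```clojure
;; Illustrative names only; nothing here comes from Netflix code.
(def user {:name "dave" :tags [:clojure]})

;; assoc and update-in return new values; the original map is untouched.
(def updated
  (-> user
      (assoc :team "social")
      (update-in [:tags] conj :jvm)))

(println user)     ; the original map, unchanged
(println updated)  ; a new map with :team added and :jvm appended to :tags
```

Because nothing is mutated, maps and vectors can be nested and passed around freely, which is what the "composable" bullet is getting at.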

Slide 8

Slide 8 text

Naming things
  ; def names a global thing
  (def pi 3.14)
  ; Use let to bind values to local names
  (let [r 3.0
        c (* 2.0 Math/PI r)
        a (* Math/PI (* r r))]
    {:radius r :circumference c :area a})
  ;=> {:radius 3.0, :circumference 18.84955592153876, :area 28.274333882308138}

Slide 9

Slide 9 text

Functions
  ; Use fn to create an anonymous function
  (fn [x] (* x x))
  ; Use defn to define a named function
  (defn square [x] (* x x))
  ; Call a function
  (square 3) ;=> 9
  ; Pass a function as an argument
  (map square [1 2 3]) ;=> (1 4 9)

Slide 10

Slide 10 text

Macros
● Clojure code is data
● A macro is a function that takes code and returns new code
● Invoked by the compiler
  ; A macro can control evaluation, e.g. a short-circuiting "and" expression
  (and (even? a) (odd? b))
  -> (if (even? a) (if (odd? b) true false) false)
  ; A macro can also reduce boilerplate
  (rx/fn [a] (* 2 a))
  -> (reify rx.Func1 (call [this a] (* 2 a)))
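The "takes code and returns new code" idea can be tried at a REPL. The sketch below defines my-and, a hypothetical two-argument stand-in for clojure.core/and that mirrors the simplified expansion on the slide; the real and macro is variadic and expands differently.

```clojure
;; my-and is a hypothetical, simplified stand-in for clojure.core/and.
(defmacro my-and [a b]
  `(if ~a (if ~b true false) false))

;; The macro receives code as data and returns new code.
(println (macroexpand-1 '(my-and (even? 2) (odd? 3))))

;; Because the expansion is an if, the second form is never evaluated
;; when the first one is false.
(def evaluated (atom []))
(defn note [label value]
  (swap! evaluated conj label)
  value)

(def result (my-and (note :first false) (note :second true)))
(println result @evaluated)
```

A plain function could not do this, since its arguments would both be evaluated before the call.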

Slide 11

Slide 11 text

Choosing Clojure Factors ● Opinionated ● Abstraction ● Data transformation ● Interactive development ● Java interop Non-factors ● Concurrency, STM, agents, etc.

Slide 12

Slide 12 text

Our Path to Clojure ● > 1 year of toe dipping ● External Tools ● Diagnostic services ● Greenfield production service ● Real production code

Slide 13

Slide 13 text

Joyspring - “Netflix REPL”
More sane/powerful than sh+curl
  (def s (subscriber/subscriber+ 12345))
  ;=> { ... user data from subscriber service ...}
  ; Check if they're in A/B test 4567
  (ab/allocation+ s 4567)
  ;=> nil
  ; Forcibly allocate them to test 4567 cell 2
  (ab/allocate+ s 4567 2)
  ; Get social info
  (social/profile+ s :netflix)
  ;=> { ... social connection status ...}
Separate tool, free from constraints of Netflix platform

Slide 14

Slide 14 text

Diagnostic Services ● Non-critical, "WTF is going on?!?" services. ● One-off maintenance jobs like database migrations ● Dip toes into Netflix platform from Clojure ● Dip toes into Netflix build infrastructure (Ant, Ivy, and Jenkins, oh my!)

Slide 15

Slide 15 text

Non-Critical Greenfield Service ● What’s a greenfield?! ● Pure Clojure implementation of small, low risk service, in production ● Learn about: app structure, DI issues, testing, build, deploy

Slide 16

Slide 16 text

Production - Bite the Bullet ● New production features in Clojure ● This is where it gets scary ● More people, more code

Slide 17

Slide 17 text

Java, meet Clojure ● It’s easy! ● Add clojure.jar ● Put .clj files on classpath (like in src/main/resources) ● Done ● Yeah, but...

Slide 18

Slide 18 text

Java, meet Clojure What to keep, what to rewrite? ● Take it easy ● No need to throw out working code ● Clojure has good Java interop ● When you need to write new Java, write Clojure instead ● Tastefully add abstractions as you go

Slide 19

Slide 19 text

Java, meet Clojure
● Escape Java as fast as you can!
● Collect args, call a single entry point
  ; Some Clojure code
  (ns com.netflix.mine)
  (defn func-to-call [x] (* 2 x))
  // Invoke it from Java
  final Var require = RT.var("clojure.core", "require");
  require.invoke(Symbol.intern("com.netflix.mine"));
  final Var funcToCall = RT.var("com.netflix.mine", "func-to-call");
  assertEquals(198L, funcToCall.invoke(99L));
Be careful with caching. It breaks the interactive model.

Slide 20

Slide 20 text

Java, meet Clojure … but ● Where do tests go? ○ JUnit? ○ clojure.test? ● Data structures? ○ Keep existing objects? ○ Map from Clojure maps and back? ● Classes? What about classes? ● Dependency Injection

Slide 21

Slide 21 text

Where do tests go? ● We write tests in clojure.test ○ Still a lot of Java so we have some helper macros for Mockito ● Custom JUnit4 test runner that finds and runs Clojure tests ○ Run tests from Eclipse or wherever ○ Tests magically appear in Jenkins ● There are many options here. Our approach is pragmatic: it bows to the need to play nicely with Netflix build infrastructure

Slide 22

Slide 22 text

Data Structures? ● But I typed in all these dumb POJOs already! ● Again, we’ve been pragmatic ● For existing code, for the most part, stick with existing Java objects ● For new code, use plain maps and simple Clojure types ● Occasional conversion functions where “old meets new”
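A conversion function at the "old meets new" boundary might look like the sketch below. java.net.URI is used purely as a stand-in for an existing Java object, and uri->map and map->uri are hypothetical names; the talk does not show the team's actual converters.

```clojure
(import 'java.net.URI)

;; Old -> new: turn the Java object into a plain Clojure map.
(defn uri->map [^URI u]
  {:scheme (.getScheme u)
   :host   (.getHost u)
   :path   (.getPath u)})

;; New -> old: rebuild the Java object when existing code needs it.
;; Uses the URI(scheme, host, path, fragment) constructor.
(defn map->uri [{:keys [scheme host path]}]
  (URI. ^String scheme ^String host ^String path nil))

;; New code works on the map with ordinary Clojure functions...
(def next-user
  (-> (uri->map (URI. "https://example.com/user/42"))
      (assoc :path "/user/43")))

;; ...and converts back only at the boundary.
(println (str (map->uri next-user)))
```

Keeping the converters small and explicit limits how far the Java types leak into new code.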

Slide 23

Slide 23 text

Dependency Injection? ● AKA Passing parameters around ● IMHO DI doesn’t magically go away in a dynamic language ○ I still need to get the Cassandra Keyspace object to functions that use it ○ It’s still DI even if the dependency is a simple function instead of an object implementing an interface ● We currently take the “big context map” approach. We have other ideas we’d like to try
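One way to read the "big context map" approach is sketched below. Every name here (make-context, :lookup-user, greeting) is a hypothetical illustration; the talk does not show the real context, which presumably carries things like the Cassandra Keyspace object.

```clojure
;; The context is just a map of dependencies, built once at startup.
(defn make-context [lookup-user]
  {:lookup-user lookup-user})

;; Functions that need a dependency take the context as a parameter.
(defn greeting [ctx user-id]
  (let [user ((:lookup-user ctx) user-id)]
    (str "Hello, " (:name user) "!")))

;; In a test, a stub dependency is injected without redefining anything.
(def test-ctx
  (make-context (fn [id] {:id id :name "dave"})))

(println (greeting test-ctx 12345))
```

The dependency here is a plain function rather than an object implementing an interface, which is the case the slide calls out as still being DI.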

Slide 24

Slide 24 text

The Bad

Slide 25

Slide 25 text

(Lack of) type system EOM

Slide 26

Slide 26 text

Refactoring ● One of the few things Java (tooling) is good at, but still isn’t perfect ● … but I have broken prod by moving a Java class ref'd by property files ● 1/10th the code that does 10x as much. Maybe it's not so bad? ● Still a pain point, especially if test coverage is low...

Slide 27

Slide 27 text

Working with Others ● Clojure code is dense, especially someone else’s Clojure code ● Given an undocumented function with arbitrary args, what does it accept/produce? ● Need more discipline about documentation, pre/post-conditions, schemas
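As a small illustration of the documentation and pre/post-condition bullet, the sketch below uses the built-in :pre and :post condition map of defn. The function percent-watched and its record shape are assumptions made up for this example.

```clojure
;; percent-watched and its input shape are hypothetical.
(defn percent-watched
  "Fraction of a video watched, given a viewing record map with
  :start, :end and :duration."
  [{:keys [start end duration] :as record}]
  {:pre  [(map? record) (number? start) (number? end) (pos? duration)]
   :post [(number? %) (<= 0 % 1)]}
  (/ (- end start) duration))

(println (percent-watched {:start 0 :end 45 :duration 60}))

;; A record with the wrong shape is rejected at the function boundary.
(println
  (try
    (percent-watched {:video 1234 :percent-watched 0.5})
    (catch AssertionError _ :rejected)))
```

The docstring tells a reader what the function accepts, and the conditions turn a silent shape mismatch into an immediate AssertionError (as long as assertions are enabled, which is the default).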

Slide 28

Slide 28 text

Existing Java Libraries ● Annotation fetish makes life difficult ● Singleton fetish makes life difficult

Slide 29

Slide 29 text

http://perevodik.net/en/posts/39/

Slide 30

Slide 30 text

java.lang.ClassCastException: java.lang.Long cannot be cast to clojure.lang.IFn
  at joyspring.main$eval2203.invoke(NO_SOURCE_FILE:1)
  at clojure.lang.Compiler.eval(Compiler.java:6619)
  at clojure.lang.Compiler.eval(Compiler.java:6582)
  at clojure.core$eval.invoke(core.clj:2852)
  at clojure.main$repl$read_eval_print__6588$fn__6591.invoke(main.clj:259)
  at clojure.main$repl$read_eval_print__6588.invoke(main.clj:259)
  at clojure.main$repl$fn__6597.invoke(main.clj:277)
  at clojure.main$repl.doInvoke(main.clj:277)
  at clojure.lang.RestFn.invoke(RestFn.java:1096)
  at clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__1610.invoke(interruptible_eval.clj:56)
  at clojure.lang.AFn.applyToHelper(AFn.java:159)
  at clojure.lang.AFn.applyTo(AFn.java:151)
  at clojure.core$apply.invoke(core.clj:617)
  at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1788)
  at clojure.lang.RestFn.invoke(RestFn.java:425)
  at clojure.tools.nrepl.middleware.interruptible_eval$evaluate.invoke(interruptible_eva 41)
  at clojure.tools.nrepl.middleware.interruptible_eval$interruptible_eval$fn__1651$fn__1 invoke(interruptible_eval.clj:171)
  at clojure.core$comp$fn__4154.invoke(core.clj:2330)
  at clojure.tools.nrepl.middleware.interruptible_eval$run_next$fn__1644.invoke(interruptible_eval.clj:138)
  at clojure.lang.AFn.run(AFn.java:24)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
  at java.lang.Thread.run(Thread.java:680)
OMG, the Stacktraces!?!?!?!?

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

The Ugly

Slide 33

Slide 33 text

Mocking
  (defn viewing-history [user]
    ... returns a seq of viewing record maps ...
    [{:video 1234 :start 0 :end 60 :duration 60} ...])
  (defn videos-watched [user]
    (->> user
         viewing-history
         (filter (fn [{:keys [start end duration]}]
                   (> (/ (- end start) duration) 0.75)))))
  (deftest test-videos-watched
    (let [mock-viewing-history [{:video ...} {:video ...} ...]]
      (with-redefs [viewing-history (constantly mock-viewing-history)]
        ... call videos-watched and test some stuff ...)))
  ; ... now we make a change to viewing-history …
  (defn viewing-history [user]
    ... returns a seq of maps ...
    [{:video 1234 :percent-watched 0.5} ...])
  ; re-run videos-watched test. Success! WAT?!?
Mocking "freezes" data and interactions. brrrr….
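The frozen-mock problem can be reproduced in a few lines. The sketch below fills in the elided parts of the slide with made-up data, so the record values and the names old-shape-mock, mocked-result and real-outcome are assumptions.

```clojure
;; The "new" viewing-history: records no longer carry :start/:end/:duration.
(defn viewing-history [user]
  [{:video 1234 :percent-watched 0.5}])

;; Still written against the old record shape.
(defn videos-watched [user]
  (->> (viewing-history user)
       (filter (fn [{:keys [start end duration]}]
                 (> (/ (- end start) duration) 0.75)))))

;; The mock still returns the old shape, so the mocked check keeps passing.
(def old-shape-mock
  [{:video 1 :start 0 :end 60 :duration 60}
   {:video 2 :start 0 :end 10 :duration 60}])

(def mocked-result
  (with-redefs [viewing-history (constantly old-shape-mock)]
    ;; doall forces the lazy seq while the redefinition is still active
    (doall (videos-watched :some-user))))

(println (map :video mocked-result))

;; Against the real function, the same code fails at runtime.
(def real-outcome
  (try
    (doall (videos-watched :some-user))
    (catch Exception _ :failed)))

(println real-outcome)
```

The mocked run succeeds while the unmocked run fails, which is the gap that the type, schema, and integration-test slides that follow try to close.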

Slide 34

Slide 34 text

Thawing Mocks - clojure.core.typed
A la carte static type checking for Clojure. Powerful, but invasive.
https://github.com/clojure/core.typed
  (require '[clojure.core.typed :refer [ann]])
  ; annotate a constant
  (ann pi Double)
  (def pi "Pi, more or less" 3.14)
  ; annotate a function
  (ann area [Double -> Double])
  (defn area [r] (* pi (* r r)))

Slide 35

Slide 35 text

Thawing Mocks - clojure.core.contracts and Prismatic Schema
Structured, composable runtime assertions
clojure.core.contracts: https://github.com/clojure/core.contracts
  (use 'clojure.core.contracts)
  (def area
    (with-constraints
      (fn [r] (* pi (* r r)))
      (contract area-contract [r] [number? => (number? %)])))
Prismatic Schema: https://github.com/prismatic/schema
  (require '[schema.core :as s])
  (s/defn area :- Double
    [r :- Double]
    (* pi (* r r)))

Slide 36

Slide 36 text

So, Integration Tests ● Exercise module boundaries ○ This is where stuff breaks ● Requires ○ Dedicated test/staging environment ○ Occasional diagnostic endpoints for setup ● Write them in Clojure! ○ clojure.test + Joyspring

Slide 37

Slide 37 text

So, Integration Tests ● Slower? ○ (but interactive development) ● Brittle? ○ False negatives due to state ○ False negatives due to normal failures ● Occasional Hacks ○ Services tuned for production use. Differs from integration test patterns ○ Caches!

Slide 38

Slide 38 text

Integration Test Failure Fail
  (deftest test-link-visitor-to-facebook-id-failure
    (testing "A failure while linking fb id to customer leaves user in not_connected state"
      (fixture/ensure-test-user-disconnected)
      (sabot/inject (at hystrix.linkVisitorToFacebookId eval
                        (throw (RejectedExecutionException. "test-link-visitor-to-facebook-id-failure")))
        =>
        (try+
          (fixture/ensure-test-user-facebook-connected)
          (throw (Exception. "Request unexpectedly succeeded."))
          (catch (comp #{503} :status) e
            (let [s (subscriber/subscriber+ (:customer-id fixture/netflix-user))]
              ; make sure status is restored in subscriber and fb id is removed
              (is (= "not_connected" (get-in s [:social :connection-status])))))))))
Sabot is a Clojure library for injecting specific, fine-grained failures into a request. Magic here!

Slide 39

Slide 39 text

The Good Not Pictured

Slide 40

Slide 40 text

Interactive Workflow ● Write integration test ● Start server ● Connect REPL ● Edit, reload, test ○ You can unit test in here too!

Slide 41

Slide 41 text

Web REPL http://blog.jayfields.com/2012/06/clojure-production-web-repl.html ● Explore instance state ● Quickly sanity check function behavior ● Easy back-of-envelope performance characteristics in real world conditions (region!)

Slide 42

Slide 42 text

Abstraction ● Clojure is extremely expressive ● Build a language for your problem

Slide 43

Slide 43 text

Abstraction ● Not about lines of code. About clearly expressing intent ● Take it easy ○ Programmers love to wrap, especially Java ● You'll get it wrong at least once ● Wrapping or abstracting is language design, i.e. hard

Slide 44

Slide 44 text

Abstracting Hystrix
● https://github.com/Netflix/Hystrix
● Resilience via thread pools, circuit breakers, fallbacks
  // Define a command in Java
  public class GetUserCommand extends HystrixCommand<User> {
      private final HttpClient client;
      private final long id;
      public GetUserCommand(HttpClient client, long id) {
          this.client = client;
          this.id = id;
      }
      @Override
      protected User run() {
          return client.get("/user/" + id, User.class);
      }
      @Override
      protected User getFallback() {
          return User.missing(id);
      }
  }
  // ... and use it ....
  new GetUserCommand(client, id).execute();

Slide 45

Slide 45 text

Abstracting Hystrix
This is what we’re doing, just defining a function:
  (defn get-user [client id]
    (.get client id User))
So why does the Hystrix command have to look so different?
  (require '[com.netflix.hystrix.core :as hystrix])
  (hystrix/defcommand get-user
    {:hystrix/fallback-fn (fn [client id] (User/missing id))}
    [client id]
    (.get client id User))
  ; ... and use it. OMG, just a fn call!!
  (get-user client 12345)
No boilerplate here...

Slide 46

Slide 46 text

Abstracting Pig
● http://pig.apache.org/
● Map-reduce platform/language
● Already a (bad) abstraction!
  REGISTER 'pp-example.py' USING jython AS func;
  users = LOAD 'users.tsv' AS (user_id:long, image50x50:chararray, image145x145:chararray, image400x400:chararray);
  user_images = FOREACH users GENERATE TOBAG(image50x50, image145x145, image400x400) AS image_bag:bag{t:tuple(image:chararray)};
  user_images_flat = FOREACH user_images GENERATE FLATTEN(image_bag);
  domains = FOREACH user_images_flat GENERATE func.extractDomain(image_bag::image);
  distinct_domains = DISTINCT domains;
  STORE distinct_domains INTO 'domains';
pp-example.py:
  import re
  @outputSchema('image:chararray')
  def extractDomain(image):
      match = re.search('(https?://.*?)/', image)
      return match.group(0)

Slide 47

Slide 47 text

Abstracting Pig - Pigpen
  (require '[pigpen.pig :as pig] '[pigpen.exec :as exec])
  (defn users []
    (pig/load-clj "users.tsv"))
  (defn domains []
    (->> (users)
         (pig/mapcat (juxt :image50x50 :image145x145 :image400x400))
         (pig/map (fn [url] (if url (second (re-find #"(https?://.*?)/" url)))))
         (pig/distinct)))
  (defn domains-script [f]
    (->> (domains)
         (pig/store-pig "domains")
         (exec/write-script f)))
  ; What's this? A test?
  (deftest test-domains
    (with-redefs [users (constantly [ ... mock data ... ])]
      (is (= (exec/debug (domains))
             ["http://facebook.com" "https://facebook.com" nil]))))
● Clojure is pretty awesome at data manipulation
● Also a real language
● Also, we already know it
● Use it!
Compile this Clojure code to Pig!

Slide 48

Slide 48 text

Functions over Macros
Consider combining 3 RxJava Observables. An Observable is an asynchronous sequence.
https://github.com/Netflix/RxJava
Initially, we use raw Java interop to implement the “Func3” interface. This is tedious.
  ; Raw RxJava
  (Observable/zip (reify rx.util.functions.Func3
                    (call [this a b c] (+ a b c)))
                  stream-1 stream-2 stream-3)

Slide 49

Slide 49 text

Functions over Macros
Consider combining 3 RxJava Observables. An Observable is an asynchronous sequence.
https://github.com/Netflix/RxJava
I know! Macros are great for eliminating boilerplate! Enter the rx/fn macro.
  ; With a macro
  (Observable/zip (rx/fn [a b c] (+ a b c))
                  stream-1 stream-2 stream-3)

Slide 50

Slide 50 text

Functions over Macros
Consider combining 3 RxJava Observables. An Observable is an asynchronous sequence.
https://github.com/Netflix/RxJava
But a function can do better, allowing composition with existing Clojure functions.
  ; With a function
  (Observable/zip (rx/fn* +) stream-1 stream-2 stream-3)
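The macro-versus-function trade-off can be tried without RxJava on the classpath. In the sketch below, java.util.function.Function (Java 8+) stands in for an Rx Func interface, and jfn, jfn* and call are hypothetical names that only mirror rx/fn and rx/fn*; this is not the RxJava Clojure binding itself.

```clojure
;; Macro version: removes the reify boilerplate, but only accepts an
;; inline argument vector and body.
(defmacro jfn [args & body]
  `(reify java.util.function.Function
     (~'apply [this# ~@args] ~@body)))

;; Function version: wraps any existing one-argument Clojure function.
(defn jfn* [f]
  (reify java.util.function.Function
    (apply [_ x] (f x))))

;; Type-hinted helper so the interop call is not reflective.
(defn call [^java.util.function.Function f x]
  (.apply f x))

(println (call (jfn [x] (* 2 x)) 21))

;; The function version composes with comp, partial, and friends.
(println (call (jfn* (comp inc inc)) 40))
(println (call (jfn* (partial + 10)) 32))
```

The macro needs its body written out at the call site, while the function accepts whatever value an expression produces, which is why it plays better with the rest of the language.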

Slide 51

Slide 51 text

Separate representation/behavior
  // Consider a typical method call
  putColumn(keyspace, 1234, "name", "dave", 90);
● No representation of the operation
  // This can be expressed more fluently/builder-y
  keyspace.put()
      .withRow(1234)
      .withColumn("name")
      .withValue("dave")
      .withTtl(90)
      .execute();
● Closer, but without a lot of work still isn’t manipulable, introspectable, reusable, etc.

Slide 52

Slide 52 text

Separate representation/behavior
● Clojure supports “super”-builder pattern out of the box, aka DATA
  ; Define ops as map and execute
  (let [op {:keyspace ks :type :put :row 12345 :column "name" :value "dave" :ttl 90}]
    (execute op))
  ; Helpers to make it “fluent”
  (-> (keyspace ks) put (row 12345) (column "name" "dave" 90) execute)
  ; Anyone can define new helpers
  ; Here’s a partial op
  (defn user-op [id]
    (-> (keyspace default-ks) (row id)))
  (-> (user-op 12345) put (column "name" "dave" 90) execute)
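The helpers on this slide are not defined in the talk, so here is one possible way they could be written, as a runnable sketch. The in-memory atom standing in for a Cassandra keyspace, and the bodies of keyspace, put, row, column and execute, are all assumptions.

```clojure
;; Hypothetical in-memory "keyspace": an atom of {row {column value}}.
(def default-ks (atom {}))

;; Each helper only builds up a map; none of them performs any I/O.
(defn keyspace [ks] {:keyspace ks})
(defn put [op] (assoc op :type :put))
(defn row [op id] (assoc op :row id))
(defn column [op col value ttl]
  (assoc op :column col :value value :ttl ttl))

;; execute is the only place with behavior. This toy store ignores :ttl.
(defn execute [{ks :keyspace :keys [type row column value]}]
  (case type
    :put (swap! ks assoc-in [row column] value)))

(defn user-op [id]
  (-> (keyspace default-ks) (row id)))

(def op (-> (user-op 12345) put (column "name" "dave" 90)))

;; Because the op is plain data, it can be inspected or changed before it runs.
(println (dissoc op :keyspace))
(execute (assoc op :value "Dave"))
(println @default-ks)
```

Since the operation is an ordinary map until execute sees it, it can be logged, compared in tests, or reused as a template, which the fluent Java builder does not offer without extra work.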

Slide 53

Slide 53 text

Emulate Existing Idioms ● Abstraction is language design ● Abstractions that emulate existing idioms in the host language (Clojure) will ○ Look better ○ Be easier to understand without research ○ Play better with existing features ● In Hystrix, defcommand is structurally identical to defn ○ Easy to switch. Easy to understand. ● Pigpen (mostly) has semantics identical to Clojure data pipelines

Slide 54

Slide 54 text

Emulate Existing Idioms - rx/let-o
● clojure.core/let makes wiring together data transforms easy
● Not so in Rx, especially for expressions with “forks”
● Enter rx/let-o to take care of the details
  (rx/let-o [?user    (get-user-o 123)
             ?friends (rx/mapcat (fn [u] (map get-friends-o (:friends u))) ?user)
             ?ab      (rx/mapcat get-ab ?user)]
    (rx/merge ?user ?friends ?ab))

Slide 55

Slide 55 text

Conclusion ● Take your time ● Clojure can produce big gains in productivity and satisfaction ● Yes, it's scary ● Mental shift required ○ Clojure examples to Clojure “in the large” ○ People are still figuring this out