Clojure at Netflix

Dave Ray
October 02, 2013

A talk for Craftsman Guild on my team's use of Clojure at Netflix. Describes good, bad, and ugly lessons learned from going from a pure-Java codebase to Clojure in production.

Transcript

  1. A Little Clojure at Netflix
    Dave Ray, @darevay
    Software Engineer, Netflix
    October 2013
  2. Agenda
    • Netflix Culture
    • Why Clojure?
    • Our Path
    • The Bad
    • The Ugly
    • The Good
  3. Freedom and Responsibility ...
    “We hire smart people, give them hard problems and get out of their way. We strive to increase the freedom of our employees as we grow, enabling them to move quickly as the industry evolves. With that freedom comes increased responsibility. High performers thrive in that environment and make great choices for Netflix.”
    http://jobs.netflix.com/who-we-are.html
  4. … Best Tool For The Job

  5. Our Team
    • Netflix Social Infrastructure
      ◦ Social data storage/analysis, stateless web services
      ◦ Support Social APIs used by many Netflix UIs/devices
    • 3 engineers
    • 1 supportive manager
    • 1 medium-size-ish existing Java codebase
  6. (about :clojure)
    • Modern LISP
    • JVM
    • Functional
    • Dynamic
    • Opinionated
    You’re doing it wrong
  7. Data
    1, 3.14, 1/2                     ; Numbers
    "A string"
    this                             ; A symbol, used to name things
    :a-keyword                       ; Used for enumerations and map keys
    {:key1 "value1" :key2 "value2"}  ; HashMap
    [:a :vector 1 2 3]               ; Like java.util.ArrayList
    '(this is :a :list 1 2 3)
    • All data objects are immutable by default
    • Data is composable.
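The two bullets on the Data slide (immutable by default, composable) can be sketched as follows. This is a minimal illustration using only clojure.core; the names `user`, `updated`, and `users` are hypothetical and do not come from the talk.

```clojure
;; Hypothetical data; the names are for illustration only.
(def user {:name "dave" :tags [:a :b]})

;; assoc and update return new values; `user` itself never changes.
(def updated (-> user
                 (assoc :name "Dave")
                 (update :tags conj :c)))

(println user)    ; the original map is untouched
(println updated) ; a new map that shares structure with the old one

;; Data composes: maps nest in vectors, vectors in maps, and so on.
(def users [user updated])
(println (mapv :name users))
```

Because nothing is mutated in place, both versions of the map can be passed around freely without defensive copies.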
  8. Naming things
    ; def names a global thing
    (def pi 3.14)
    ; Use let to bind values to local names
    (let [r 3.0
          c (* 2.0 Math/PI r)
          a (* Math/PI (* r r))]
      {:radius r :circumference c :area a})
    ;=> {:radius 3.0, :circumference 18.84955592153876, :area 28.274333882308138}
  9. Functions
    ; Use fn to create an anonymous function
    (fn [x] (* x x))
    ; Use defn to define a named function
    (defn square [x] (* x x))
    ; Call a function
    (square 3) ;=> 9
    ; Pass a function as an argument (map returns a lazy seq)
    (map square [1 2 3]) ;=> (1 4 9)
  10. Macros
    ; A macro can control evaluation, e.g. short-circuiting and expression
    (and (even? a) (odd? b))
    -> (if (even? a)
         (if (odd? b) true false)
         false)
    • Clojure code is data
    • A macro is a function that takes code and returns new code
    • Invoked by the compiler
    ; A macro can also reduce boilerplate
    (rx/fn [a] (* 2 a))
    -> (reify rx.Func1
         (call [this a] (* 2 a)))
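The idea that a macro takes code and returns new code can be sketched with a hand-written, two-argument version of `and`. The name `my-and` is hypothetical; this is not how clojure.core defines `and`, only a small illustration of the same short-circuiting behavior.

```clojure
;; A hypothetical two-argument `and`, written as a macro so the second
;; expression is only evaluated when the first one is truthy.
(defmacro my-and [a b]
  `(let [a# ~a]
     (if a# ~b a#)))

;; The compiler calls the macro with unevaluated code (plain data) and
;; compiles whatever code the macro returns. macroexpand-1 shows it.
(println (macroexpand-1 '(my-and (even? 2) (odd? 3))))

;; Short-circuiting: the second form never runs, so nothing is thrown.
(println (my-and false (throw (Exception. "never evaluated"))))
```

A plain function could not do this, because its arguments would be evaluated before the call.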
  11. Choosing Clojure
    Factors
    • Opinionated
    • Abstraction
    • Data transformation
    • Interactive development
    • Java interop
    Non-factors
    • Concurrency, STM, agents, etc.
  12. Our Path to Clojure
    • > 1 year of toe dipping
    • External Tools
    • Diagnostic services
    • Greenfield production service
    • Real production code
  13. Joyspring - “Netflix REPL”
    More sane/powerful than sh+curl
    (def s (subscriber/subscriber+ 12345))
    ;=> { ... user data from subscriber service ...}
    ; Check if they're in A/B test 4567
    (ab/allocation+ s 4567)
    ;=> nil
    ; Forcibly allocate them to test 4567 cell 2
    (ab/allocate+ s 4567 2)
    ; Get social info
    (social/profile+ s :netflix)
    ;=> { ... social connection status ...}
    Separate tool, free from constraints of Netflix platform
  14. Diagnostic Services
    • Non-critical, "WTF is going on?!?" services.
    • One-off maintenance jobs like database migrations
    • Dip toes into Netflix platform from Clojure
    • Dip toes into Netflix build infrastructure (Ant, Ivy, and Jenkins, oh my!)
  15. Non-Critical Greenfield Service
    • What’s a greenfield?!
    • Pure Clojure implementation of small, low risk service, in production
    • Learn about: app structure, DI issues, testing, build, deploy
  16. Production - Bite the Bullet
    • New production features in Clojure
    • This is where it gets scary
    • More people, more code
  17. Java, meet Clojure
    • It’s easy!
    • Add clojure.jar
    • Put .clj files on classpath (like in src/main/resources)
    • Done
    • Yeah, but...
  18. Java, meet Clojure
    What to keep, what to rewrite?
    • Take it easy
    • No need to throw out working code
    • Clojure has good Java interop
    • When you need to write new Java, write Clojure instead
    • Tastefully add abstractions as you go
  19. Java, meet Clojure
    • Escape Java as fast as you can!
    • Collect args, call a single entry point
    Be careful with caching. Breaks interactive model.
    ; Some Clojure code
    (ns com.netflix.mine)
    (defn func-to-call [x] (* 2 x))
    // Invoke it from Java
    final Var require = RT.var("clojure.core", "require");
    require.invoke(Symbol.intern("com.netflix.mine"));
    final Var funcToCall = RT.var("com.netflix.mine", "func-to-call");
    assertEquals(198L, funcToCall.invoke(99L));
  20. Java, meet Clojure … but
    • Where do tests go?
      ◦ JUnit?
      ◦ clojure.test?
    • Data structures?
      ◦ Keep existing objects?
      ◦ Map from Clojure maps and back?
    • Classes? What about classes?
    • Dependency Injection
  21. Where do tests go?
    • We write tests in clojure.test
      ◦ Still a lot of Java so we have some helper macros for Mockito
    • Custom JUnit4 test runner that finds and runs Clojure tests
      ◦ Run tests from Eclipse or wherever
      ◦ Tests magically appear in Jenkins
    • There are many options here. Our approach is pragmatic: it plays nicely within the Netflix build infrastructure
  22. Data Structures?
    • But I typed in all these dumb POJOs already!
    • Again, we’ve been pragmatic
    • For existing code, for the most part, stick with existing Java objects
    • For new code, use plain maps and simple Clojure types
    • Occasional conversion functions where “old meets new”
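A conversion function of the kind mentioned in the last bullet might look like the sketch below. It assumes the "old" side is a java.util.Map with string keys; the helper names `java-map->clj` and `clj->java-map` are hypothetical, and real Java objects on a team's codebase would need their own field-by-field conversions.

```clojure
(import 'java.util.HashMap)

;; Hypothetical helpers; the names are not from the talk.
(defn java-map->clj
  "Copy a java.util.Map with string keys into an immutable Clojure map
  with keyword keys."
  [^java.util.Map m]
  (into {} (map (fn [[k v]] [(keyword k) v])) m))

(defn clj->java-map
  "Copy a Clojure map with keyword keys into a mutable java.util.HashMap
  with string keys."
  [m]
  (let [out (HashMap.)]
    (doseq [[k v] m]
      (.put out (name k) v))
    out))

;; Stand-in for an object handed over by existing Java code.
(def legacy (doto (HashMap.) (.put "name" "dave") (.put "ttl" 90)))

(println (java-map->clj legacy))
(println (clj->java-map {:name "dave" :ttl 90}))
```

Keeping the conversion at the boundary lets new code work with plain maps while existing Java code keeps its own types.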
  23. Dependency Injection?
    • AKA Passing parameters around
    • IMHO DI doesn’t magically go away in a dynamic language
      ◦ I still need to get the Cassandra Keyspace object to functions that use it
      ◦ It’s still DI even if the dependency is a simple function instead of an object implementing an interface
    • We currently take the “big context map” approach. We have other ideas we’d like to try
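One way to read the “big context map” approach is sketched below. Every name here (`make-context`, `profile-name`, `:fetch-profile`, `:default-ttl`) is hypothetical, and a plain function stands in for a real dependency such as the Cassandra Keyspace object; the talk does not show the team's actual code.

```clojure
;; A sketch of the "big context map" approach, under the assumptions
;; stated above. Dependencies live in one map that is passed around.
(defn make-context
  "Build the map of dependencies once, at startup (or per test)."
  [{:keys [fetch-profile]}]
  {:fetch-profile fetch-profile
   :default-ttl   90})

(defn profile-name
  "Business logic takes the context as its first argument and pulls
  out only the dependencies it needs."
  [{:keys [fetch-profile]} user-id]
  (:name (fetch-profile user-id)))

;; In a test, inject a stub function instead of a real client.
(def test-ctx
  (make-context {:fetch-profile (fn [id] {:id id :name "dave"})}))

(println (profile-name test-ctx 12345))
```

The dependency is still injected; it simply arrives as a map entry rather than through a container or an annotation.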
  24. The Bad

  25. (Lack of) type system EOM

  26. Refactoring
    • One of the few things Java (tooling) is good at, but still isn’t perfect
    • … but I have broken prod by moving a Java class ref'd by property files
    • 1/10th the code that does 10x as much. Maybe it's not so bad?
    • Still a pain point, especially if test coverage is low...
  27. Working with Others
    • Clojure code is dense, especially someone else’s Clojure code
    • Given an undocumented function with arbitrary args, what does it accept/produce?
    • Need more discipline about documentation, pre/post-conditions, schemas
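The documentation and pre/post-condition discipline mentioned above can be sketched with Clojure's built-in `:pre` and `:post` support in `defn`. The function `percent-watched` and its record shape are assumptions made for illustration, loosely modeled on the viewing records used elsewhere in the talk.

```clojure
;; A docstring plus :pre/:post conditions on a hypothetical function.
;; Clojure checks these assertions at run time on every call.
(defn percent-watched
  "Given a viewing record with numeric :start and :end and a positive
  :duration, return the fraction watched as a number between 0 and 1."
  [{:keys [start end duration]}]
  {:pre  [(number? start) (number? end) (pos? duration)]
   :post [(<= 0 % 1)]}
  (/ (- end start) duration))

(println (percent-watched {:start 0 :end 45 :duration 60}))

;; A record with the wrong shape fails fast with an AssertionError
;; instead of causing a confusing error further downstream.
(println (try
           (percent-watched {:video 1234 :percent-watched 0.5})
           (catch AssertionError e :rejected)))
```

The conditions double as executable documentation of what the function accepts and produces.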
  28. Existing Java Libraries
    • Annotation fetish makes life difficult
    • Singleton fetish makes life difficult
  29. http://perevodik.net/en/posts/39/

  30. OMG, the Stacktraces!?!?!?!?
    java.lang.ClassCastException: java.lang.Long cannot be cast to clojure.lang.IFn
      at joyspring.main$eval2203.invoke(NO_SOURCE_FILE:1)
      at clojure.lang.Compiler.eval(Compiler.java:6619)
      at clojure.lang.Compiler.eval(Compiler.java:6582)
      at clojure.core$eval.invoke(core.clj:2852)
      at clojure.main$repl$read_eval_print__6588$fn__6591.invoke(main.clj:259)
      at clojure.main$repl$read_eval_print__6588.invoke(main.clj:259)
      at clojure.main$repl$fn__6597.invoke(main.clj:277)
      at clojure.main$repl.doInvoke(main.clj:277)
      at clojure.lang.RestFn.invoke(RestFn.java:1096)
      at clojure.tools.nrepl.middleware.interruptible_eval$evaluate$fn__1610.invoke(interruptible_eval.clj:56)
      at clojure.lang.AFn.applyToHelper(AFn.java:159)
      at clojure.lang.AFn.applyTo(AFn.java:151)
      at clojure.core$apply.invoke(core.clj:617)
      at clojure.core$with_bindings_STAR_.doInvoke(core.clj:1788)
      at clojure.lang.RestFn.invoke(RestFn.java:425)
      at clojure.tools.nrepl.middleware.interruptible_eval$evaluate.invoke(interruptible_eva 41)
      at clojure.tools.nrepl.middleware.interruptible_eval$interruptible_eval$fn__1651$fn__1 invoke(interruptible_eval.clj:171)
      at clojure.core$comp$fn__4154.invoke(core.clj:2330)
      at clojure.tools.nrepl.middleware.interruptible_eval$run_next$fn__1644.invoke(interruptible_eval.clj:138)
      at clojure.lang.AFn.run(AFn.java:24)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
      at java.lang.Thread.run(Thread.java:680)
  31. None
  32. The Ugly

  33. Mocking
    (defn viewing-history [user]
      ... returns a seq of viewing record maps ...
      [{:video 1234 :start 0 :end 60 :duration 60} ...])
    (defn videos-watched [user]
      (->> user
           viewing-history
           (filter (fn [{:keys [start end duration]}]
                     (> (/ (- end start) duration) 0.75)))))
    (deftest test-videos-watched
      (let [mock-viewing-history [{:video ...} {:video ...} ...]]
        (with-redefs [viewing-history (constantly mock-viewing-history)]
          ... call videos-watched and test some stuff ...)))
    ; ... now we make a change to viewing-history …
    (defn viewing-history [user]
      ... returns a seq of maps ...
      [{:video 1234 :percent-watched 0.5} ...])
    ; re-run videos-watched test. Success! WAT?!?
    Mocking "freezes" data and interactions. brrrr….
  34. Thawing Mocks
    clojure.core.typed
    A la carte static type checking for Clojure
    Powerful, but invasive
    https://github.com/clojure/core.typed
    (require '[clojure.core.typed :refer [ann]])
    (ann pi Double) ; annotate a constant
    (def pi "Pi, more or less" 3.14)
    ; annotate a function
    (ann area [Double -> Double])
    (defn area [r] (* pi (* r r)))
  35. Thawing Mocks
    Structured, composable runtime assertions
    Prismatic Schema
    https://github.com/prismatic/schema
    (require '[schema.core :as s])
    (s/defn area :- Double
      [r :- Double]
      (* pi (* r r)))
    clojure.core.contracts
    https://github.com/clojure/core.contracts
    (use 'clojure.core.contracts)
    (def area
      (with-constraints
        (fn [r] (* pi (* r r)))
        (contract area-contract [r] [number? => (number? %)])))
  36. So, Integration Tests
    • Exercise module boundaries
      ◦ This is where stuff breaks
    • Requires
      ◦ Dedicated test/staging environment
      ◦ Occasional diagnostic endpoints for setup
    • Write them in Clojure!
      ◦ clojure.test + Joyspring
  37. So, Integration Tests
    • Slower?
      ◦ (but interactive development)
    • Brittle?
      ◦ False negatives due to state
      ◦ False negatives due to normal failures
    • Occasional Hacks
      ◦ Services tuned for production use. Differs from integration test patterns
      ◦ Caches!
  38. Integration Test Failure
    Fail
    (deftest test-link-visitor-to-facebook-id-failure
      (testing "A failure while linking fb id to customer leaves user in not_connected state"
        (fixture/ensure-test-user-disconnected)
        (sabot/inject (at hystrix.linkVisitorToFacebookId eval
                          (throw (RejectedExecutionException. "test-link-visitor-to-facebook-id-failure")))
          =>
          (try+
            (fixture/ensure-test-user-facebook-connected)
            (throw (Exception. "Request unexpectedly succeeded."))
            (catch (comp #{503} :status) e
              (let [s (subscriber/subscriber+ (:customer-id fixture/netflix-user))]
                ; make sure status is restored in subscriber and fb id is removed
                (is (= "not_connected" (get-in s [:social :connection-status])))))))))
    Sabot is a Clojure library for injecting specific, fine-grained failures into a request
    Magic here!
  39. The Good Not Pictured

  40. Interactive Workflow
    • Write integration test
    • Start server
    • Connect REPL
    • Edit, reload, test
      ◦ You can unit test in here too!
  41. Web REPL
    http://blog.jayfields.com/2012/06/clojure-production-web-repl.html
    • Explore instance state
    • Quickly sanity check function behavior
    • Easy back-of-envelope performance characteristics in real world conditions (region!)
  42. Abstraction
    • Clojure is extremely expressive
    • Build a language for your problem
  43. Abstraction
    • Not about lines of code. About clearly expressing intent
    • Take it easy
      ◦ Programmers love to wrap, especially Java
    • You'll get it wrong at least once
    • Wrapping or abstracting is language design, i.e. hard
  44. Abstracting Hystrix
    • https://github.com/Netflix/Hystrix
    • Resilience via thread pools, circuit breakers, fallbacks
    // Define a command in Java
    public class GetUserCommand extends HystrixCommand<User> {
      private final HttpClient client;
      private final long id;
      public GetUserCommand(HttpClient client, long id) {
        this.client = client;
        this.id = id;
      }
      @Override
      protected User run() {
        return client.get("/user/" + id, User.class);
      }
      @Override
      protected User getFallback() {
        return User.missing(id);
      }
    }
    // ... and use it ....
    new GetUserCommand(client, id).execute();
  45. Abstracting Hystrix
    (defn get-user [client id]
      (.get client id User))
    (require '[com.netflix.hystrix.core :as hystrix])
    (hystrix/defcommand get-user
      {:hystrix/fallback-fn (fn [client id] (User/missing id))}
      [client id]
      (.get client id User))
    ; ... and use it. OMG, just a fn call!!
    (get-user client 12345)
    This is what we’re doing, just defining a function
    So why does the Hystrix command have to look so different?
    No boilerplate here...
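How a macro can keep the shape of `defn` while adding behavior is sketched below. This is NOT the Hystrix Clojure binding and has no thread pools or circuit breakers: `defn-with-fallback` is a hypothetical macro that only wraps the body in a try/catch and calls a fallback function, and `client` is assumed to be a plain function from id to user.

```clojure
;; NOT hystrix/defcommand: a hypothetical macro that only mimics its
;; shape (a defn plus a fallback function). The argument vector must
;; contain plain symbols, since it is reused in the fallback call.
(defmacro defn-with-fallback
  [fname fallback-fn args & body]
  `(defn ~fname ~args
     (try
       ~@body
       (catch Exception e#
         ;; On failure, call the fallback with the same arguments.
         (~fallback-fn ~@args)))))

;; Here `client` is just a function from id to user, for illustration.
(defn-with-fallback get-user
  (fn [client id] {:id id :missing? true})
  [client id]
  (client id))

;; Callers see an ordinary function either way.
(println (get-user (fn [id] {:id id :name "dave"}) 12345))
(println (get-user (fn [_] (throw (ex-info "service down" {}))) 12345))
```

Because the definition reads like `defn` and the result is an ordinary function, switching a function to the wrapped form does not change any call site.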
  46. Abstracting Pig
    • http://pig.apache.org/
    • Map-reduce platform/language
    • Already a (bad) abstraction!
    REGISTER 'pp-example.py' USING jython AS func;
    users = LOAD 'users.tsv' as (user_id:long, image50x50:chararray, image145x145:chararray, image400x400:chararray);
    user_images = FOREACH users GENERATE TOBAG(image50x50, image145x145, image400x400) AS image_bag:bag{t:tuple(image: chararray)};
    user_images_flat = FOREACH user_images GENERATE FLATTEN(image_bag);
    domains = FOREACH user_images_flat GENERATE func.extractDomain(image_bag::image);
    distinct_images = DISTINCT user_images;
    store distinct_images into 'domains';
    import re
    @outputSchema('image:chararray')
    def extractDomain(image):
        match = re.search('(https?://.*?)/', image)
        return match.group(0);
  47. Abstracting Pig - Pigpen
    (require '[pigpen.pig :as pig]
             '[pigpen.exec :as exec])
    (defn users []
      (pig/load-clj "users.tsv"))
    (defn domains []
      (->> (users)
           (pig/mapcat (juxt :image50x50 :image145x145 :image400x400))
           (pig/map (fn [url] (if url (second (re-find #"(https?://.*?)/" url)))))
           (pig/distinct)))
    (defn domains-script [f]
      (->> (domains)
           (pig/store-pig "domains")
           (exec/write-script f)))
    ; What's this? A test?
    (deftest test-domains
      (with-redefs [users (constantly [ ... mock data ... ])]
        (is (= (exec/debug (domains))
               ["http://facebook.com" "https://facebook.com" nil]))))
    • Clojure is pretty awesome at data manipulation.
    • Also a real language.
    • Also, we already know it
    • Use it!
    Compile this Clojure code to Pig!
  48. Functions over Macros
    Consider combining 3 RxJava Observables
    An Observable is an asynchronous sequence
    https://github.com/Netflix/RxJava
    ; Raw RxJava
    (Observable/zip (reify rx.util.functions.Func3
                      (call [this a b c] (+ a b c)))
                    stream-1 stream-2 stream-3)
    Initially, we use raw Java interop to implement the “Func3” interface. This is tedious.
  49. Functions over Macros
    Consider combining 3 RxJava Observables
    An Observable is an asynchronous sequence
    https://github.com/Netflix/RxJava
    ; With a macro
    (Observable/zip (rx/fn [a b c] (+ a b c))
                    stream-1 stream-2 stream-3)
    I know! Macros are great for eliminating boilerplate! Enter the rx/fn macro
  50. Functions over Macros
    Consider combining 3 RxJava Observables
    An Observable is an asynchronous sequence
    https://github.com/Netflix/RxJava
    ; With a function
    (Observable/zip (rx/fn* +) stream-1 stream-2 stream-3)
    But a function can do better, allowing composition with existing Clojure functions
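The point about functions composing better than macros can be sketched without RxJava on the classpath. In the example below, `java.util.function.BiFunction` (Java 8 or later) stands in for Rx's FuncN interfaces, and `bifn*` is a hypothetical analogue of `rx/fn*`, not its actual implementation.

```clojure
(import 'java.util.function.BiFunction)

;; Hypothetical stand-in for rx/fn*: adapt a Clojure function to a
;; Java functional interface using an ordinary function, not a macro.
(defn bifn*
  "Wrap any two-argument Clojure function as a BiFunction."
  [f]
  (reify BiFunction
    (apply [_ a b] (f a b))))

;; Because bifn* is a function, it accepts existing functions and
;; anything built with comp, partial, and friends.
(def add     (bifn* +))
(def add-str (bifn* (comp str +)))

(println (.apply ^BiFunction add 1 2))
(println (.apply ^BiFunction add-str 1 2))
```

A macro version would need the argument list and body written out at every call site, so an existing function like `+` could not be passed in directly.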
  51. Separate representation/behavior
    // Consider a typical method call
    putColumn(keyspace, 1234, "name", "dave", 90);
    • No representation of the operation
    // This can be expressed more fluently/builder-y
    keyspace.put()
        .withRow(1234)
        .withColumn("name")
        .withValue("dave")
        .withTtl(90)
        .execute();
    • Closer, but without a lot of work still isn’t manipulable, introspectable, reusable etc
  52. Separate representation/behavior
    • Clojure supports “super”-builder pattern out of the box, aka DATA
    ; Define ops as map and execute
    (let [op {:keyspace ks
              :type     :put
              :row      12345
              :column   "name"
              :value    "dave"
              :ttl      90}]
      (execute op))
    ; Helpers to make it “fluent”
    (-> (keyspace ks)
        put
        (row 12345)
        (column "name" "dave" 90)
        execute)
    ; Anyone can define new helpers
    ; Here’s a partial op
    (defn user-op [id]
      (-> (keyspace default-ks)
          (row id)))
    (-> (user-op 12345)
        put
        (column "name" "dave" 90)
        execute)
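The slide names the helpers (`keyspace`, `put`, `row`, `column`, `execute`) but does not show their definitions. The sketch below is one possible in-memory reading: an atom stands in for a real keyspace, and every implementation here is an assumption made for illustration.

```clojure
;; An in-memory sketch of the helpers named on the slide. The atom
;; stands in for a real keyspace; all bodies below are assumptions.
(def store (atom {}))

(defn keyspace [ks] {:keyspace ks})
(defn put [op] (assoc op :type :put))
(defn row [op id] (assoc op :row id))
(defn column [op col value ttl]
  (assoc op :column col :value value :ttl ttl))

(defn execute
  "Interpret an operation map. Only :put is handled in this sketch."
  [{:keys [keyspace type row column value] :as op}]
  (case type
    :put (do (swap! store assoc-in [keyspace row column] value)
             op)))

;; The operation is plain data, so it can be printed, stored, or
;; modified before anything is executed.
(def op (-> (keyspace :users) put (row 12345) (column "name" "dave" 90)))
(println op)

(execute op)
(println @store)
```

Each helper is just a function from map to map, which is why anyone can add new helpers or build partial operations without touching `execute`.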
  53. Emulate Existing Idioms
    • Abstraction is language design
    • Abstractions that emulate existing idioms in the host language (Clojure) will
      ◦ Look better
      ◦ Be easier to understand without research
      ◦ Play better with existing features
    • In Hystrix, defcommand is structurally identical to defn
      ◦ Easy to switch. Easy to understand.
    • Pigpen (mostly) has semantics identical to Clojure data pipelines
  54. Emulate Existing Idioms - rx/let-o
    • clojure.core/let makes wiring together data transforms easy
    • Not so in Rx, especially for expressions with “forks”
    • Enter rx/let-o to take care of the details
    (rx/let-o [?user    (get-user-o 123)
               ?friends (rx/mapcat (fn [u] (map get-friends-o (:friends u))) ?user)
               ?ab      (rx/mapcat get-ab ?user)]
      (rx/merge ?user ?friends ?ab))
  55. Conclusion
    • Take your time
    • Clojure can produce big gains in productivity and satisfaction
    • Yes, it's scary
    • Mental shift required
      ◦ Clojure examples to Clojure “in the large”
      ◦ People are still figuring this out