Practical core.async

Max Penet
October 01, 2014

Alia + Jet on core.async

Transcript

  1. Me?
    • Max Penet, @mpenet on github/twitter
    • recently moved to Lund, Sweden
    • Clojure user since the early days, ~2010
    • working with Clojure professionally for the last 4 years
    • interested in functional programming languages and distributed systems; likes short project names and copper pans
  2. async is not a silver bullet, but ...
    • makes smarter use of resources
    • fits nicely when dealing with streams
    • complex data enrichment
  3. Tools of the trade (for this pres.)
    • Cassandra 2.1+
    • Jetty9
    • core.async as our Clojure facade to this world
  4. The old way
    • promises
    • lamina
    • plain old callbacks (often sugared via monadic constructs or macros)
    • Java FooHandler objects
  5. enter core.async
    • thread management solution
    • reduces complexity
    and also
    • a great disguise for the async interfaces of Java APIs
  6. core.async crash course
    • channels are "queues"/streams of things and can be buffered in different ways
    • take! or <! gets the first element from a channel, put! or >! appends a new one (kinda)
    • go blocks
    • all the rest: alts! and friends
    • transducers, and the list goes on...
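
The crash course above can be tried at a REPL. What follows is a minimal sketch that only assumes core.async is on the classpath; the var names (`c`, `sum-ch`, `fast`, `slow`, `inc-ch`) are made up for illustration. It uses the blocking variants `>!!`, `<!!` and `alts!!` at the top level, since the parking variants are only valid inside go blocks.

```clojure
(require '[clojure.core.async :as async
           :refer [chan go <! <!! >!! alts!! close!]])

;; A buffered channel: up to 2 puts succeed without a waiting taker.
(def c (chan 2))
(>!! c :a)
(>!! c :b)
(def first-two [(<!! c) (<!! c)])   ; values come out in FIFO order

;; A go block parks (instead of blocking a thread) on <!.
;; It returns a channel that receives the value of its body.
(def sum-ch
  (go (let [x (<! c)
            y (<! c)]
        (+ x y))))
(>!! c 1)
(>!! c 2)
(def sum (<!! sum-ch))

;; alts!! takes from whichever channel is ready first and also
;; returns the channel the value came from.
(def slow (async/timeout 1000))
(def fast (chan 1))
(>!! fast :ready)
(def alts-result (alts!! [slow fast]))

;; A channel with a transducer transforms values as they flow through.
(def inc-ch (chan 1 (map inc)))
(>!! inc-ch 41)
(def transformed (<!! inc-ch))

;; Taking from a closed, drained channel yields nil.
(close! c)
(def after-close (<!! c))
```

Because nil means "closed", nil itself can never be put on a channel.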
  7. But why Jetty9?
    • Jetty9 IO is a major overhaul
    • old blocking IO is history
    • rewritten core for modularity, performance, efficiency, async capabilities
  8. Some of the new features
    • complete WebSocket support, client & server
    • SPDY, HTTP2
    • async HTTP client
    • not-awful async support (if you messed with jetty6- continuations & other APIs, you know what I mean)
    • already super stable (looking at you, http-kit)
  9. Jet's approach
    • full ring compatibility
    • all async is exposed via core.async; it should be simple and reuse the same patterns all over the library
    • don't trade away performance: be a minimalistic layer over Jetty9
  10. what's in the box
    • HTTP Server
    • HTTP Client
    • WebSocket Client
    • WebSocket Server
  11. HTTP Server
    • ring compatible adapter
    • ring extension for async responses
    • (async) chunked response
    • full compatibility with most jetty adapters, down to the options
  12. Async responses
    (require '[clojure.core.async :as async])

    (defn async-handler [request]
      (let [ch (async/chan)]
        (async/go
          (async/<! (async/timeout 1000))
          (async/>! ch {:body "foo"
                        :headers {"Content-Type" "foo"}
                        :status 202}))
        ch))

    (qbits.jet.server/run-jetty {:ring-handler async-handler})
  13. Chunked responses
    (require '[clojure.core.async :as async])

    (defn handler [request]
      (let [ch (async/chan 1)]
        (async/go
          (dotimes [i 5]
            (async/<! (async/timeout 300))
            (async/>! ch (str i "\n")))
          (async/close! ch)) ;; important!
        {:body ch
         :headers {"Content-Type" "prout"}
         :status 201}))

    (qbits.jet.server/run-jetty {:ring-handler handler :port ...})
  14. you can mix things
    ex: sync handler with async chunked response
    It's all values in the end; the server knows how to deal with them accordingly
  15. 2 patterns emerge
    • channel as a promise (unrealized single value)
    • channel as a stream
    • close! indicates termination: releasing resources, flushing buffers and so on
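
The two patterns can be shown with plain core.async and no server involved. This is only a sketch: `deferred` and `stream-of` are hypothetical helper names, not part of Jet or Alia.

```clojure
(require '[clojure.core.async :as async])

;; Pattern 1: channel as a promise. `deferred` is a hypothetical helper
;; that runs f on another thread and delivers its single result.
(defn deferred [f]
  (let [ch (async/chan 1)]
    (async/thread
      (async/>!! ch (f))
      (async/close! ch))
    ch))

(def answer (async/<!! (deferred #(* 6 7))))

;; Pattern 2: channel as a stream. The producer closes the channel when
;; it is done, which is how consumers learn that no more values will come.
(defn stream-of [coll]
  (let [ch (async/chan 1)]
    (async/go
      (loop [xs (seq coll)]
        (if xs
          (do (async/>! ch (first xs))
              (recur (next xs)))
          (async/close! ch))))
    ch))

;; async/into keeps taking until the source closes, then yields the result.
(def chunks (async/<!! (async/into [] (stream-of ["a" "b" "c"]))))
```

Without the `close!` in `stream-of`, `async/into` would never deliver its result, which is why the chunked response slide marks the close as important.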
  16. WebSocket is a bit more complex
    • bidirectional (recv, send)
    • we must listen to control messages
    • we want to keep ring middleware compatibility (mostly)
  17. How?
    • WebSocket handlers receive a ring request map, so they are compatible with middleware, routing libs etc.
    • WebSocket handlers are separate from "normal" HTTP handlers
    • the request map is extended with 3 channels
  18. WebSocket request map channels
    • :ctrl will receive status messages such as:
      - [::error e]
      - [::close reason]
    • :in will receive content sent by this connected client
    • :out will allow you to push content to this connected client and close the socket
  19. code!
    (use 'qbits.jet.server)
    (require '[clojure.core.async :as async])

    ;; Simple ping/pong server, will wait for PING, reply PONG and close connection
    (run-jetty
     {:port 8013
      :websocket-handler
      (fn [{:keys [in out ctrl] :as request-map}]
        (async/go
          (when (= "PING" (async/<! in))
            (async/>! out "PONG")
            (async/close! out))))})
  20. That was the easy part
    In practice you want to listen to control messages, make sure to close your resources on disconnects/failures, maybe set up flow control/backpressure and so on.
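
A handler that follows this advice might look roughly like the sketch below. It runs against three plain core.async channels shaped like the request map's `:in`, `:out` and `:ctrl`, so it can be tried without a server. The `[:close reason]` message shape and the `echo-handler` name are stand-ins for illustration, not Jet's actual keywords or API.

```clojure
(require '[clojure.core.async :as async])

;; Echoes every message from `in` to `out` until a control message
;; arrives or `in` is closed, then releases `out`.
(defn echo-handler [{:keys [in out ctrl]}]
  (async/go
    (loop [echoed 0]
      ;; :priority true makes alts! check ctrl before in
      (let [[value port] (async/alts! [ctrl in] :priority true)]
        (if (and (= port in) (some? value))
          (do (async/>! out (str "echo: " value)) ; parks when out is full
              (recur (inc echoed)))
          ;; control message, or `in` was closed: clean up and stop
          (do (async/close! out)
              echoed))))))

;; Driving the handler by hand with bounded channels.
(def conn {:in (async/chan 8) :out (async/chan 8) :ctrl (async/chan 1)})
(def done (echo-handler conn))

(async/>!! (:in conn) "a")
(def reply (async/<!! (:out conn)))

(async/>!! (:ctrl conn) [:close "client went away"])
(def echoed-count (async/<!! done))
(def out-after-close (async/<!! (:out conn)))
```

The bounded `:out` buffer gives a simple form of backpressure: when the peer reads slowly, the handler parks on `>!` instead of queueing without limit.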
  21. The WebSocket Client
    A JVM-based client; we're not talking about cljs here. A companion cljs client would be welcome (PR!), since the API could be identical.
  22. If you understood the server part, nothing new to learn for the client
    The only differences are:
    • partial request map, obviously
    • the root function name is different
    Underlying type and implementation are shared
  23. code!
    (use 'qbits.jet.client.websocket)
    (require '[clojure.core.async :as async])

    ;; Simple PING client to our server, sends PING, waits for PONG and
    ;; closes the connection
    (connect! "ws://localhost:8013/"
              (fn [{:keys [in out ctrl]}]
                (async/go
                  (async/>! out "PING")
                  (when (= "PONG" (async/<! in))
                    (async/close! out)))))
  24. This is a common pattern too
    Mixing pure "data" channels with "control" channels is fairly common. No need to learn new abstractions: it's all the same type of values and they can be dealt with the same way.
  25. A more realistic example
    (fn [{:keys [in out ctrl] :as request}]
      (let [msg-ch (msgs/connect!)]
        (async/go
          (loop []
            (let [[value channel] (async/alts! [ctrl msg-ch])]
              (cond
                ;; receiving
                (= channel msg-ch)
                (do (async/>! out (json/generate-string value))
                    (recur))

                ;; closing
                (and (= channel ctrl)
                     (#{:qbits.jet.websocket/error :qbits.jet.websocket/close}
                      (first value)))
                (do (async/close! msg-ch)
                    (async/close! entry-ch) ; entry-ch is not shown on this slide
                    (async/close! out))))))))
  26. Browser model
    Well, think of a browser session/profile really
    • shared cookies
    • shared auth
    • connection pooling, keep-alives, timeouts etc
  27. What does it mean in practice
    HttpClient acts as a central configuration point for network parameters (such as idle timeouts) and HTTP parameters (such as whether to follow redirects). HttpClient transparently pools connections to servers, but allows direct control of connections for cases where this is needed. HttpClient also acts as a central configuration point for cookies, via getCookieStore().
  28. code!
    (require '[qbits.jet.client.http :as http])
    (require '[clojure.core.async :refer [<!!]]) ; <!! is used on the next slide

    ;; create a client
    (def cl (http/client {...}))

    ;; returns a chan
    (http/get cl "http://graph.facebook.com/zuck")
    user> #<ManyToManyChannel clojure.core.async.impl.channels.ManyToManyChannel@731db933>
  29. more code!
    ;; block for the response
    (<!! (http/get cl "http://graph.facebook.com/zuck"))
    user> {:status 200,
           :headers {"content-type" "text/javascript; charset=UTF-8",
                     "access-control-allow-origin" "*",
                     "content-length" "173",
                     "date" "Wed, 06 Aug 2014 15:51:02 GMT",
                     "cache-control" "private, no-cache, no-store, must-revalidate"},
           :body #<ManyToManyChannel clojure.core.async.impl.channels.ManyToManyChannel@7ca698b0>}
  30. :body is a channel too!
    • chunked responses
    • it stays open as long as the client sends data
  31. sugar please
    http/get, http/post, http/put, http/delete and all the other top-level functions are just sugar over an http/request function that takes a client instance + a map. Just like clj-http...
  32. Alia: Cassandra client
    https://github.com/mpenet/alia
    • follows the same principles as Jet
    • leverages Datastax's java-driver, async core based on netty
    • used in production by some (very big) names for quite some time now
  33. The Basics
    (require '[qbits.alia :as alia])
    (require '[qbits.hayt :refer [select where columns]]) ; query DSL used below

    (def cluster (alia/cluster {:contact-points ["localhost"]}))
    (def session (alia/connect cluster))

    (alia/execute session "SELECT * from foo;")

    (alia/execute session (select :users
                                  (where {:name :foo})
                                  (columns :bar "baz")))

    (def prepared-statement
      (alia/prepare session "select * from users where user_name=?;"))

    (alia/execute session prepared-statement {:values ["frodo"]})
  34. The Async bits
    • channels as deferreds for simple async queries
    • same API as the blocking call, only the return value changes
  35. Code!
    ;; the wrong way
    (take! (execute-chan session "select * from users;")
           (fn [rows-or-exception]
             (do-something rows-or-exception)))

    ;; the eating-your-tail, useless way: blocking on a single async task...
    (def rows-or-exception
      (<!! (execute-chan session "select * from users;")))
  36. compute all values
    • very often you just want to compute a set of things in parallel and gather the results
    • pmap & future
    • your own thread pool
    • core.async makes this super easy
  37. What most people do the first time
    (go
      (let [foo (<! (execute-chan session "select * from foo;"))
            bar (<! (execute-chan session "select * from bar;"))
            baz (<! (execute-chan session "select * from baz;"))]
        (concat foo bar baz)))

    This works, but we could do better... In this case Alia runs each query asynchronously, but in the order they appear in the code: the go block parks until query A is realized before moving on to B, and so on. In this example no result set depends on another, so we can improve this.
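
The difference can be seen without a Cassandra cluster. In this sketch `fake-query` is a hypothetical stand-in for `execute-chan` that yields its rows after a delay. The simplest improvement is to start every query before taking any result.

```clojure
(require '[clojure.core.async :as async :refer [go <! <!!]])

;; Hypothetical stand-in for execute-chan: returns a channel that
;; yields `rows` after `ms` milliseconds.
(defn fake-query [rows ms]
  (go (<! (async/timeout ms))
      rows))

;; Sequential: each query is only started once the previous one is done.
(defn sequential []
  (go (let [foo (<! (fake-query [:foo] 100))
            bar (<! (fake-query [:bar] 100))
            baz (<! (fake-query [:baz] 100))]
        (concat foo bar baz))))

;; Concurrent: start all three first, then take the results in order.
(defn concurrent []
  (let [foo-ch (fake-query [:foo] 100)
        bar-ch (fake-query [:bar] 100)
        baz-ch (fake-query [:baz] 100)]
    (go (concat (<! foo-ch) (<! bar-ch) (<! baz-ch)))))
```

Both return the same rows. With these delays the sequential version should take roughly the sum of the three waits and the concurrent one roughly the longest single wait.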
  38. one way to do it
    (go
      (loop [;; the list of queries remaining
             queries [(alia/execute-chan session (select :foo))
                      (alia/execute-chan session (select :bar))
                      (alia/execute-chan session (select :baz))]
             ;; where we'll store our results
             query-results '()]
        ;; If we are done with our queries return them, it's the exit clause
        (if (empty? queries)
          query-results
          ;; else wait for one query to complete (first completed first served)
          (let [[result channel] (alts! queries)]
            (println "Received result: " result " from channel: " channel)
            (recur
             ;; we remove the channel that just completed from our
             ;; pending queries collection
             (remove #{channel} queries)
             ;; and finally we add the result of this query to our
             ;; bag of results
             (conj query-results result))))))
  39. ?
    • all queries are executed in parallel
    • first come first served
    • results are gathered as they are realized
  40. This is a very common pattern
    • you can do the same with any core.async "deferred"
    • tons of little utility functions come to mind, we can abstract all this stuff
  41. chan containing seq of all results
    (defn realize-deferreds [channels]
      (async/go
        (loop [chans channels
               results []]
          (if (empty? chans)
            results
            (let [[value ch] (async/alts! chans)]
              (recur (remove #{ch} chans)
                     (conj results value)))))))

    (realize-deferreds [(alia/execute-chan ...)
                        (alia/execute-chan ...)
                        (alia/execute-chan ...)])
  42. better, a chan that is fed results
    (defn deferreds-chan [channels]
      (let [ch (async/chan (count channels))]
        (async/go
          (loop [chans channels]
            (if (empty? chans)
              (async/close! ch)
              (let [[x ch'] (async/alts! chans)]
                (async/>! ch x)
                (recur (remove #{ch'} chans))))))
        ch))
  43. transducer support
    (defn deferreds-chan [channels xform]
      (let [ch (async/chan (count channels) xform)]
        (async/go
          (loop [chans channels]
            (if (empty? chans)
              (async/close! ch)
              (let [[x ch'] (async/alts! chans)]
                (async/>! ch x)
                (recur (remove #{ch'} chans))))))
        ch))
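
For comparison, core.async ships building blocks that cover much of the same ground. The sketch below is not the helper from the slides: it combines `async/merge`, `async/pipe` and a transducer channel, with a hypothetical `deferred` stand-in for `execute-chan`.

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical stand-in for a "deferred": yields one value, then closes.
(defn deferred [v ms]
  (async/go (async/<! (async/timeout ms)) v))

(def deferreds [(deferred 1 30) (deferred 2 10) (deferred 3 20)])

;; async/merge feeds values from all sources into one channel and closes
;; it once every source is closed; piping into a transducer channel
;; transforms the values on the way.
(def results-ch
  (async/pipe (async/merge deferreds)
              (async/chan (count deferreds) (map inc))))

;; Results arrive in completion order, not in submission order.
(def results (async/<!! (async/into [] results-ch)))
```

One difference from the slide's helper: `merge` keeps reading each source until it closes, so it also works for multi-value streams, while `deferreds-chan` takes exactly one value per channel.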
  44. go blocks
    • for complex scenarios, abuse them
    • a couple of helpers such as the previously mentioned ones + let & <! in go blocks work wonders
    TLDR: just create fns that return channels and compose over them
  45. execute-chan-buffered
    • Cassandra supports streaming of rows
    • it's all asynchronous under the hood
    • queries over large datasets are not memory hogs if handled properly
    • fetch-size is configurable
    • fetch-size + core.async buffer size combo!
  46. code!
    (execute-chan-buffered session "select * from items;")

    (execute-chan-buffered session "select * from items;" {:fetch-size 5})

    (execute-chan-buffered session "select * from items;"
                           {:fetch-size 5
                            :channel (async/chan 10 ...)})
  47. fully configurable
    You can have a custom fetch-size and channel, let Alia guess from a fetch size set at query level or cluster level, or just default to (chan), meaning you will need a consumer if you don't want the channel feeding to halt.
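
The "feeding halts without a consumer" behaviour is plain channel backpressure and can be observed without Alia. In this sketch a hypothetical producer plays the role of the row feeder and pushes 20 rows into a channel that buffers 5; `produced` counts the puts that have completed.

```clojure
(require '[clojure.core.async :as async])

(def produced (atom 0))

;; Hypothetical row source: pushes 20 "rows" into a channel that buffers 5.
(def rows-ch
  (let [ch (async/chan 5)]
    (async/go
      (loop [i 0]
        (if (< i 20)
          (do (async/>! ch i)          ; parks once the buffer is full
              (swap! produced inc)
              (recur (inc i)))
          (async/close! ch))))
    ch))

;; Nobody is consuming yet, so the producer stalls after filling the buffer.
(Thread/sleep 200)
(def produced-before-consuming @produced)

;; Draining the channel lets the producer run to completion.
(def rows (async/<!! (async/into [] rows-ch)))
```

The buffer size bounds how far the producer can run ahead of the consumer, which is presumably why pairing it with the driver's fetch size is useful.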
  48. Out of scope but…
    • fancy query DSL https://github.com/mpenet/hayt
    • lazy sequence over queries+predicates
    • basically supports everything Datastax's java-driver has to offer
    • hides the ugly from java (type coercion etc...)
  49. Public projects using Alia and/or Jet
    • http://pithos.io/ Robust, scalable object store compatible with the industry-standard S3 API
    • https://github.com/pyr/cyanite Cassandra backed carbon daemon and metric web service
    • https://github.com/MastodonC/kixi.hecuba A data platform built with Clojure, ClojureScript, core.async, Om, Cassandra and other technologies
  50. things to be aware of
    • go blocks run on a fixed size thread pool
    • don't do blocking IO in go blocks!
    • put! internal queue size
    • exception handling
    • difficult to mix blocking and non-blocking, you often lose the advantage of one/both; Alia+Jet are a nice fit in this respect
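
Two of these points lend themselves to a short sketch: keeping blocking work out of go blocks with `async/thread`, and passing exceptions along as values (the "rows-or-exception" idea from the Alia slides). `blocking-lookup`, `lookup-chan` and `throw-if-error` are hypothetical names used only for illustration.

```clojure
(require '[clojure.core.async :as async])

;; Hypothetical blocking call, standing in for JDBC, file IO and the like.
(defn blocking-lookup [id]
  (Thread/sleep 50)
  (if (neg? id)
    (throw (ex-info "not found" {:id id}))
    {:id id}))

;; async/thread runs the body on a separate thread instead of the go
;; pool and returns a channel with the result. The exception is caught
;; and passed along as a value; otherwise it would never reach the consumer.
(defn lookup-chan [id]
  (async/thread
    (try (blocking-lookup id)
         (catch Exception e e))))

;; Rethrow on the consumer side.
(defn throw-if-error [x]
  (if (instance? Throwable x) (throw x) x))

(def ok (async/<!! (async/go (throw-if-error (async/<! (lookup-chan 1))))))
(def failed (async/<!! (lookup-chan -1)))
```

Calling `blocking-lookup` directly inside a go block would tie up one of the pool's few threads for the duration of the sleep, which is the problem the second bullet warns about.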