Onyx - Overview

Anuj
November 26, 2016

Onyx - Overview slides as presented at IN/Clojure

Transcript

  1. Distributed Computation Engines
     • Cascalog
       ◦ Data processing on Hadoop
       ◦ Support for Clojure and Java
     • Apache Spark
       ◦ Large-scale data processing and interactive analysis
       ◦ Streaming support
       ◦ Built in Scala; good support for Scala/Python
       ◦ Clojure DSLs: Flambo, Sparkling
     • Apache Flink
       ◦ Batch and stream data processing
       ◦ Streaming dataflow engine
       ◦ No production support for Clojure (as of now)
         ▪ Word count samples do exist* (* https://github.com/mjsax/flink-external/tree/master/flink-clojure)
  2. Apache Spark Scenario
     • Dataflow execution engine
       ◦ Multiple apps for interactive and streaming workloads
       ◦ Batch jobs are scheduled
       ◦ Any change requires the Spark app to be resubmitted
     • Long-running apps
       ◦ Deploy once for multiple jobs
       ◦ Fixed resource allocation (not using dynamic allocation yet)
     • Clojure is enforced
       ◦ Glue for Scala/Java plug-ins
  3. Why Clojure?
     • Philosophy
       ◦ Clojure makes some really smart design decisions
     • Concurrency
     • Interop
     • Macros
       ◦ Code is data
     • REPL
       ◦ Immediate feedback
     http://programmers.stackexchange.com/questions/179096/whats-so-great-about-clojure
     https://adambard.com/blog/ten-reasons-to-use-clojure/
  4. Why Onyx?
     • Written in Clojure, for Clojure
       ◦ 70% of our codebase is in Clojure
     • Architecture
       ◦ Masterless
       ◦ Fault tolerant
       ◦ Cloud scale
       ◦ Distributed computation
     • Implementation
       ◦ Programs as immutable data structures
         ▪ Very close to our way of defining dataflow models
       ◦ Decouples the behavioral specification from the specific execution
     • Unified API for both batch and stream processing
       ◦ As good as it sounds
  5. Components
     • ZooKeeper
       ◦ Only external dependency
       ◦ Dev mode can start a standalone ZooKeeper via Curator (see the sketch after this slide)
     • Peer
       ◦ Only entity in Onyx
       ◦ All Peers are considered equal
       ◦ There is no master Peer (masterless design)
         ▪ No single coordinating process
         ▪ No entity to orchestrate the cluster
       ◦ Works on at most one job at a time
       ◦ Virtual Peer
         ▪ A single Peer process runs on a single physical machine
         ▪ Works on at most one task at a time
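For a feel of how these pieces fit together, here is a minimal dev-mode sketch in the style of the onyx-starter project; the port, peer count and tenancy id below are illustrative values, not taken from the talk:

     (require '[onyx.api])

     (def tenancy-id (str (java.util.UUID/randomUUID)))

     (def env-config
       {:onyx/tenancy-id tenancy-id
        :zookeeper/address "127.0.0.1:2188"
        :zookeeper/server? true          ; in-memory ZooKeeper, dev/test only
        :zookeeper.server/port 2188})

     (def peer-config
       {:onyx/tenancy-id tenancy-id
        :zookeeper/address "127.0.0.1:2188"
        :onyx.peer/job-scheduler :onyx.job-scheduler/balanced
        :onyx.messaging/impl :aeron
        :onyx.messaging/bind-addr "127.0.0.1"
        :onyx.messaging/peer-port 40200})

     (def env (onyx.api/start-env env-config))                  ; standalone ZooKeeper via Curator
     (def peer-group (onyx.api/start-peer-group peer-config))
     (def v-peers (onyx.api/start-peers 7 peer-group))           ; 7 virtual peers, each on at most one task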
  6. Job Scheduling
     • All Peers contend to pick up the job
     • Job scheduling strategies
       ◦ Greedy
       ◦ Balanced (round robin)
       ◦ Percentage
     • Task scheduling strategies
       ◦ Balanced
       ◦ Percentage
       ◦ Colocation
         ▪ Assigns a job's tasks to peers on a single machine: low latency, minimal network traffic
     • Tags can be used to assign behavior to peers (see the sketch after this slide)
       ◦ Tag a peer as a database peer so all database tasks go to that peer
       ◦ Assign CPU-intensive tasks to a set of peers with high CPU
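A sketch of where these choices live, reusing the tenancy id from the previous sketch; the :database tag and the :write-rows/:read-rows task names are invented for illustration:

     ;; The job scheduler and tags are peer-level settings.
     (def peer-config
       {:onyx/tenancy-id tenancy-id
        :zookeeper/address "127.0.0.1:2188"
        :onyx.peer/job-scheduler :onyx.job-scheduler/balanced   ; or /greedy, /percentage
        :onyx.peer/tags [:database]})                           ; this peer may run :database-tagged tasks

     ;; The task scheduler is chosen per job; a catalog entry can demand a tagged peer.
     (def job
       {:workflow [[:read-rows :write-rows]]
        :catalog [{:onyx/name :write-rows
                   :onyx/type :output
                   :onyx/medium :core.async
                   :onyx/required-tags [:database]
                   :onyx/batch-size 20}]
        :task-scheduler :onyx.task-scheduler/balanced})         ; or /percentage, /colocated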
  7. Messaging
     • Peers communicate with other Peers directly via channels
     • The messaging layer is pluggable
     • Aeron is the default messaging implementation (config sketch after this slide)
       ◦ High throughput and low latency
       ◦ Subscription (connection multiplexing)
         ▪ Aeron subscribers perform deserialization
         ▪ May become CPU-bound
         ▪ Multiple subscribers per node
       ◦ Connection short-circuiting
         ▪ Co-located virtual Peers bypass Aeron
         ▪ Direct communication without any network or serialization overhead
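These Aeron choices surface as peer-config keys; a sketch with illustrative values, noting that the short-circuit flag is an assumption and may be named differently depending on the Onyx version:

     {:onyx.messaging/impl :aeron                   ; messaging layer is pluggable; Aeron is the default
      :onyx.messaging/bind-addr "10.0.0.5"          ; IP this peer binds to
      :onyx.messaging/peer-port 40200               ; port used for peer-to-peer traffic
      :onyx.messaging/allow-short-circuit? true}    ; co-located virtual peers bypass Aeron entirely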
  8. Coordination
     • An immutable, append-only log that is asynchronously replicated to each peer's replica
     • Log entries are functions (pure, deterministic, idempotent) with args that are used to update the replica (toy model after this slide)
     • Each peer sees each event in the cluster in exactly the same order
     • Since each peer has its own independent pointer into the log, peers never block each other
     • The replica contains the structural information of the cluster known to the Peer
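A toy model of this idea (not Onyx's actual log-entry format or replica shape): each entry names a pure function plus args, and every peer folds the same entries, in the same order, into its local replica:

     (defmulti apply-entry (fn [_replica entry] (:fn entry)))

     (defmethod apply-entry :add-peer [replica {:keys [args]}]
       (update replica :peers conj (:id args)))

     (defmethod apply-entry :remove-peer [replica {:keys [args]}]
       (update replica :peers disj (:id args)))

     ;; Replaying the log deterministically rebuilds the replica.
     (reduce apply-entry
             {:peers #{}}
             [{:fn :add-peer    :args {:id :p1}}
              {:fn :add-peer    :args {:id :p2}}
              {:fn :remove-peer :args {:id :p1}}])
     ;; => {:peers #{:p2}}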
  9. Garbage Collection
     • When a Peer comes online, it first goes to the fixed origin address and then reads the events forward
     • The gc process moves the fixed origin marker so that the latest state becomes the new start (see the sketch after this slide)
       ◦ Step 1: Builds its own replica by pretending to be a peer and reading every entry up to the last one
       ◦ Step 2: Stores that replica at the origin address so peers can now start from it
       ◦ Step 3: Instructs the origin to atomically point to the new start
     • Peers that fall behind during this process may crash and will eventually restart from the new origin address
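This compaction is exposed through the public API; a minimal sketch, assuming onyx.api/gc accepts the peer configuration used in the earlier sketches:

     ;; Compact the coordination log and reset the origin, as described above.
     (onyx.api/gc peer-config)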
  10. Onyx Cluster
     • 3-Phase Cluster Join Strategy
       Peers 1-4 form a ring. Peer 5 wants to join.
  11. Onyx Cluster
     • 3-Phase Cluster Join Strategy
     • Peer 5 initiates the first phase of the join protocol
     • Peer 1 prepares to accept Peer 5 into the ring by adding a watch on it
  12. Onyx Cluster
     • 3-Phase Cluster Join Strategy
     • Peer 5 initiates the second phase of the join protocol
     • Peer 5 adds a watch on Peer 4, the peer it was told to watch
  13. Onyx Cluster
     • 3-Phase Cluster Join Strategy
       Peer 5 has been fully stitched into the cluster with the ring intact.
  14. Onyx Cluster
     • Peer Failure Detection Strategy
       Peer 1 will signal Peer 5’s death, but Peer 5 never got the chance to signal Peer 4’s death.
  15. Onyx Cluster
     • Peer Failure Detection Strategy
       Peer 1 signals Peer 5’s death and closes the ring by adding a watch on Peer 4.
  16. Onyx Cluster
     • Peer Failure Detection Strategy
       Peer 1 signals Peer 4’s death and further closes the ring by adding a watch on Peer 3.
  17. Terminology
     • Function
       ◦ A Clojure function that receives and emits segments
     • Segments
       ◦ The data (maps) that Onyx lets functions emit to one another
     • Workflow
       ◦ Articulates the paths that data flows through the cluster at runtime (a DAG; see the sketch after this slide)
     • Catalog
       ◦ Describes all inputs, outputs and functions in a Workflow
     • Flow Conditions
       ◦ Applied on a segment-by-segment basis
       ◦ Dataflow direction is determined by user-defined predicate functions
     Disclaimer: Images used are only for demonstration purposes and are not endorsed by Onyx. The images are not being used for any commercial purpose.
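Since all of these are plain data, here is a sketch of how a Workflow and a Flow Condition for the slide-19 catalog might look; the :short-word? predicate is invented for illustration:

     ;; A workflow is a vector of [from to] edges forming a DAG.
     (def workflow
       [[:in :split-by-spaces]
        [:split-by-spaces :mixed-case]
        [:mixed-case :loud]
        [:mixed-case :question]
        [:loud :loud-output]
        [:question :question-output]])

     ;; A flow condition routes segments by a predicate, segment by segment.
     (def flow-conditions
       [{:flow/from :split-by-spaces
         :flow/to [:mixed-case]
         :flow/predicate :onyx-starter.core/short-word?}])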
  18. Terminology
     • Lifecycle and Hooks
       ◦ Describes the lifetime of a task
       ◦ Hooks can be injected at critical points during a task
       ◦ Carries a context map
     • Sentinel
       ◦ Signals the end of a stream or a switch between streaming and batch mode (:done, sketched after this slide)
     • Task
       ◦ Smallest unit of work
       ◦ Associated with only one Job
     • Job
       ◦ A collection of Workflow, Catalog, Flow Conditions, Lifecycles and execution parameters
       ◦ Every task is associated with exactly one job
     Disclaimer: Images used are only for demonstration purposes and are not endorsed by Onyx. The images are not being used for any commercial purpose.
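A sketch of the sentinel in use with the core.async input from the other slides; the channel here is a stand-in for the one injected by the :in lifecycle:

     (require '[clojure.core.async :refer [chan >!! close!]])

     (def input-chan (chan 10000))
     (>!! input-chan {:word "hello"})
     (>!! input-chan {:word "onyx"})
     (>!! input-chan :done)   ; the sentinel: tells Onyx the stream has ended
     (close! input-chan)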
  19. Catalog

     {:onyx/name :in
      :onyx/tenancy-ident :core.async/read-from-chan
      :onyx/type :input
      :onyx/medium :core.async
      :onyx/max-peers 1
      :onyx/batch-size batch-size
      :onyx/doc "Reads segments from a core.async channel"}

     {:onyx/name :split-by-spaces
      :onyx/fn :onyx-starter.core/split-by-spaces
      :onyx/type :function
      :onyx/batch-size batch-size}

     {:onyx/name :mixed-case
      :onyx/fn :onyx-starter.core/mixed-case
      :onyx/type :function
      :onyx/batch-size batch-size}

     {:onyx/name :loud
      :onyx/fn :onyx-starter.core/loud
      :onyx/type :function
      :onyx/batch-size batch-size}

     {:onyx/name :question
      :onyx/fn :onyx-starter.core/question
      :onyx/type :function
      :onyx/batch-size batch-size}

     {:onyx/name :loud-output
      :onyx/tenancy-ident :core.async/write-to-chan
      :onyx/type :output
      :onyx/medium :core.async
      :onyx/max-peers 1
      :onyx/batch-size batch-size
      :onyx/doc "Writes segments to a core.async channel"}

     {:onyx/name :question-output
      :onyx/tenancy-ident :core.async/write-to-chan
      :onyx/type :output
      :onyx/medium :core.async
      :onyx/max-peers 1
      :onyx/batch-size batch-size
      :onyx/doc "Writes segments to a core.async channel"}
  20. Function

     (defn mixed-case-impl [s]
       (->> (cycle [(memfn toUpperCase) (memfn toLowerCase)])
            (map #(%2 (str %1)) s)
            (apply str)))

     (defn mixed-case [segment]
       {:word (mixed-case-impl (:word segment))})

     • Takes segments as parameters
     • Emits one or more segments as output
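At the REPL this function should behave roughly like:

     (mixed-case {:word "hello"})
     ;; => {:word "HeLlO"}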
  21. Lifecycle and Hooks
     • The Task Lifecycle API supports hooks to start and stop stateful entities

     {:onyx/name :in
      :onyx/tenancy-ident :core.async/read-from-chan
      :onyx/type :input
      :onyx/medium :core.async
      :onyx/max-peers 1
      :onyx/batch-size batch-size
      :onyx/doc "Reads segments from a core.async channel"}

     {:lifecycle/task :in
      :core.async/id (java.util.UUID/randomUUID)
      :lifecycle/calls :onyx-starter.lifecycles.sample-lifecycle/in-calls}

     {:lifecycle/task :in
      :lifecycle/calls :onyx.plugin.core-async/reader-calls}

     • Lifecycle calls are related to lifecycles
       ◦ Consist of a map of functions that are used when resolving lifecycle entries to their corresponding functions

     (def get-input-channel
       (memoize (fn [id] (chan 10000))))

     (defn inject-in-ch [event lifecycle]
       {:core.async/chan (get-input-channel (:core.async/id lifecycle))})

     (def in-calls
       {:lifecycle/before-task-start inject-in-ch})

     (def reader-calls
       {:lifecycle/before-task-start inject-reader
        :lifecycle/after-task-stop log-retry-count})
  22. Job
     • Environment configuration
       ◦ :onyx/tenancy-id : provides strong, multi-tenant isolation of peers
       ◦ :zookeeper/server? : used to start up a local, in-memory ZooKeeper (test only)
       ◦ :zookeeper.server/port : port to use for the local in-memory ZooKeeper
       ◦ :zookeeper/address : addresses of the ZooKeeper servers to use for coordination
     • Peer configuration
       ◦ :onyx/tenancy-id : provides strong, multi-tenant isolation of peers
       ◦ :zookeeper/address : addresses of the ZooKeeper servers to use for coordination
       ◦ :onyx.peer/job-scheduler : coordinates which jobs peers are allowed to volunteer to execute
       ◦ :onyx.messaging/impl : messaging protocol to use for peer-to-peer communication
       ◦ :onyx.messaging/bind-addr : IP address to bind the peer to for messaging
       ◦ :onyx.messaging/peer-port : port that peers should use to communicate
     • Submit Job: (onyx.api/submit-job config job)
       ◦ Takes the config (peer configuration) and a job built from the workflow, catalog, lifecycles and flow conditions (see the sketch after this slide)
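Putting it together, a sketch of submitting and waiting on a job, assuming the peer-config from the earlier sketches, that submit-job returns a map with :job-id, and that onyx.api/await-job-completion is available in this Onyx version:

     (def job
       {:workflow workflow
        :catalog catalog
        :lifecycles lifecycles
        :flow-conditions flow-conditions
        :task-scheduler :onyx.task-scheduler/balanced})

     (def job-id (:job-id (onyx.api/submit-job peer-config job)))
     (onyx.api/await-job-completion peer-config job-id)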