Building distributed systems with OSS

Mateusz Gajewski

October 03, 2015

Transcript

  1. Building massively distributed systems with OSS. Mateusz ‘Serafin’ Gajewski, allegro.tech meeting v8, 2015
  2. None
  3. Web Scale source: reactivemanifest.org

  4. Distributed top-down: architecture, computing, messaging, databases (No/New SQL), data processing, file systems, resource management, infrastructure.
  5. Distributed toolbox: dynamic flow control, rate limiting, exponential back-offs, automatic failover, hinted-handoffs, data scrubbing, CRDTs, backpressure, circuit breakers, bulkheads, vector clocks, two-phase commit, consensus algorithms, gossip protocols, leader election, distributed coordination, eventual consistency, data replication, OCC, MVCC...
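One item from this toolbox, exponential back-off, is easy to sketch. The snippet below is a minimal illustration, not taken from the talk: the names `backoff_delays` and `call_with_retries`, the default values, and the choice of "full jitter" randomization are all assumptions made for the example.

```python
import random


def backoff_delays(base=0.1, cap=5.0, attempts=6, rng=None):
    """Yield one randomized delay (in seconds) per retry attempt."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        # The ceiling doubles with every attempt but never exceeds `cap`.
        ceiling = min(cap, base * 2 ** attempt)
        # "Full jitter": pick uniformly in [0, ceiling] so that many clients
        # retrying at the same moment do not stay synchronized.
        yield rng.uniform(0.0, ceiling)


def call_with_retries(operation, sleep, **backoff_kwargs):
    """Run `operation` until it succeeds or the retry budget is spent."""
    last_error = None
    for delay in backoff_delays(**backoff_kwargs):
        try:
            return operation()
        except Exception as error:  # sketch only: catch narrower errors in real code
            last_error = error
            sleep(delay)  # `sleep` is injected so callers can pass time.sleep
    raise RuntimeError("retry budget exhausted") from last_error
```

In production code `sleep` would typically be `time.sleep`; passing it in as a parameter keeps the sketch easy to test without waiting.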
  6. Thesis: Building distributed & correct systems is very hard. Proof through: Jepsen :)
  7. Thesis: Most of our problems/needs can be addressed using existing Open Source Software. Proof through: a lot of companies, e.g. Allegro ;)
  8. Just four OSS examples with concepts behind them

  9. Apache Cassandra · 2008

  10. Architecture

  11. SSTable

  12. Read/write path

  13. Will it scale?

  14. Yes it will!

  15. Apache Kafka · 2011

  16. Architecture

  17. Partition structure source: kafka.apache.org

  18. Will it scale?

  19. Apache Spark · 2009

  20. Components source: spark.apache.org

  21. RDD abstraction

  22. Architecture source: spark.apache.org

  23. Does it scale? source: databricks.com

  24. Apache Mesos · 2009

  25. Mesos architecture source: mesos.apache.org

  26. Offers source: mesos.apache.org

  27. Mesos ecosystem source: mesosphere.com

  28. Does it scale?

  29. All you need is... Scalable system = Cassandra as data storage + Spark as data processing engine + Mesos as resource scheduler + Kafka as core messaging.
  30. Good news: we use it all!

  31. but... OSS cons & pros for your consideration

  32. OSS cons • immature (not production-ready), • bugs, • poor or misleading documentation, • learning curve, • few or no experts on the market, • slow adoption rate, • dependencies on other OSS, • (sometimes) lack of support
  33. OSS pros • “there is OSS for that” ;) • licensing, • sources, • speeds up time-to-market, • helps recruiting
  34. OSS tips • stay up-to-date, • don’t trust docs - deep dive instead, • engage with community, • remove OSS barriers - contribute back, • release your software - share, • grow experts in your company - educate, • evaluate-hold-adopt cycle - experiment, • know your hardware & OS - tune, • be patient ;)
  35. Q/A?

  36. Thank you!

  37. Key facts • partitioned, nested, sorted map, • AP system (with tunable C), • masterless architecture (p2p) with gossip protocol, • multi-DC (a)synchronous replication, • consistent hashing (with virtual nodes), • supports CQL (query language similar to SQL), • modeled after Dynamo, BigTable.
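The "consistent hashing (with virtual nodes)" bullet in these key facts, which appear to describe Cassandra, can be illustrated with a small ring. This is a sketch under stated assumptions rather than Cassandra's actual implementation: the `HashRing` class and its methods are hypothetical names, SHA-256 stands in for the real partitioner's hash, and replica placement is simplified to "next distinct nodes clockwise".

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring; each physical node owns many tokens."""

    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._tokens = []  # sorted token positions on the ring
        self._owner = {}   # token -> node name

    @staticmethod
    def _hash(key):
        # Stand-in hash (assumption): first 8 bytes of SHA-256 as an integer.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Each virtual node is one extra token for the same physical node.
        for i in range(self.vnodes):
            token = self._hash(f"{node}#{i}")
            if token in self._owner:
                continue  # extremely unlikely collision: keep the first owner
            self._owner[token] = node
            bisect.insort(self._tokens, token)

    def remove_node(self, node):
        self._tokens = [t for t in self._tokens if self._owner[t] != node]
        self._owner = {t: n for t, n in self._owner.items() if n != node}

    def replicas(self, key, n=1):
        """Walk clockwise from the key's position, collecting n distinct nodes."""
        if not self._tokens:
            return []
        start = bisect.bisect_right(self._tokens, self._hash(key))
        found = []
        for step in range(len(self._tokens)):
            token = self._tokens[(start + step) % len(self._tokens)]
            node = self._owner[token]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found
```

The point of the ring is that adding a node only takes over the key ranges adjacent to its new tokens; keys that change owner move to the new node, and everything else stays where it was. Virtual nodes spread those ranges across the ring so the load is shared more evenly.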
  38. Key facts • general purpose, distributed data-processing engine, • extends Map/Reduce & Dryad data flow programming models, • fault tolerance via RDDs, • supports iterative algorithms, map/reduce, stream processing, relational queries & hybrid models, • partial DAG execution
  39. Key facts • distributed, fault tolerant resource scheduler, • provides performance isolation, • leader election with ZooKeeper, • master maintains soft-state.
  40. Key facts • partitioned, immutable, linearizable append-only log, • CA system (can lose data during partition), • (a)synchronous replication (tunable), • at-least-once delivery semantics, • ZooKeeper for partition leader election, • ISR (in-sync-replicas set) concept, • relies heavily on OS caches.
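Two of the bullets above, the partitioned append-only log and at-least-once delivery, fit together in a way that a short in-memory model can show. The sketch below is an illustration only and does not use the Kafka client API: `PartitionedLog` and `Consumer` are hypothetical classes, CRC32 is an assumed stand-in for the real partitioner, and replication, ISR and ZooKeeper are left out entirely.

```python
import zlib


class PartitionedLog:
    """In-memory model of a topic: a fixed set of append-only partitions."""

    def __init__(self, partitions=3):
        self._partitions = [[] for _ in range(partitions)]

    def append(self, key, value):
        """Append to the partition chosen by key; return (partition, offset)."""
        # Same key -> same partition, so per-key ordering is preserved.
        partition = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        self._partitions[partition].append((key, value))
        return partition, len(self._partitions[partition]) - 1

    def read(self, partition, offset, max_records=10):
        """Return up to max_records starting at offset; records never change."""
        return self._partitions[partition][offset:offset + max_records]


class Consumer:
    """Reads one partition and commits its offset only after processing."""

    def __init__(self, log, partition):
        self.log = log
        self.partition = partition
        self.committed = 0  # next offset to read after a restart

    def poll(self, handler, max_records=10):
        batch = self.log.read(self.partition, self.committed, max_records)
        for record in batch:
            handler(record)  # may raise, simulating a crash mid-batch
        # The commit happens only after the whole batch was handled. A crash
        # before this line means the batch is delivered again: at-least-once.
        self.committed += len(batch)
        return len(batch)
```

Because a failed batch is replayed from the last committed offset, handlers may see the same record more than once and should be idempotent.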