
Neha Narula on The Scalable Commutativity Rule

Moore's law is over, or at least we will no longer make programs faster by running them on faster processors; instead, we have to parallelize our code to use more of them. Reasoning about concurrent code is difficult, and it is also very hard to tell whether a design has latent scalability bottlenecks until you can actually run it on many cores. And what if the problem is in your interface rather than just the implementation?

This paper presents a simple, elegant rule: whenever interface operations commute, they can be implemented in a way that scales.

The authors apply this idea to Linux and use the rule to build a new operating system, sv6. The paper also comes with a tool, COMMUTER, which helps developers evaluate their interfaces and find opportunities for scaling.

This is a very powerful idea that probably has applications in other areas, such as distributed systems. In this talk I'll present the paper and speculate a bit about where else this research could be useful.

Papers_We_Love

April 01, 2015

Transcript

  1. The Scalable Commutativity Rule
    by Austin Clements, Frans Kaashoek, Nickolai Zeldovich, Robert Morris, and Eddie Kohler
    Papers We Love NYC, April 1, 2015
  2. Neha Narula
    Ph.D. candidate at MIT
    – Working on high performance concurrency control in databases and distributed systems
    – How do we get high performance and strong consistency?
    Formerly @Google, http://nehanaru.la, @neha
  3. Current Software Development
    • Benchmark, re-design, test
    • Hard to know what problems might arise in the future
    • The real bottlenecks might be in the interface design, not just the implementation
  4. What Scales on Today’s Multicores?
    • Cache coherence: the MESI protocol
    • Reads do not conflict with each other; a read and a write, or two writes, to the same cache line do
    • Conflict-free is a good proxy for scalability
    Two operations are scalable if they are conflict-free.
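The conflict-free test on slide 4 can be written down as a small check over the sets of cache lines that each operation reads and writes. This is only an illustrative sketch: the `Access` type and the line names are assumptions made for the example, not anything from the paper or from COMMUTER.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """Cache lines touched by one operation (hypothetical model)."""
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflict_free(a: Access, b: Access) -> bool:
    """True if no line written by one operation is read or written by the other."""
    a_writes_hit_b = a.writes & (b.reads | b.writes)
    b_writes_hit_a = b.writes & a.reads  # write/write overlap is covered above
    return not a_writes_hit_b and not b_writes_hit_a


# Two readers of the same line do not conflict; a reader and a writer do.
reader = Access(reads=frozenset({"counter"}))
writer = Access(writes=frozenset({"counter"}))
other_writer = Access(writes=frozenset({"other"}))
```

Here `conflict_free(reader, reader)` holds, `conflict_free(reader, writer)` does not, and two writers only stay conflict-free when they write different lines.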
  5. The Scalable Commutativity Rule
    Whenever interface operations commute, they can be implemented in a way that scales.
    Diagram labels: Commutes; Scalable implementation exists; creat with lowest fd? (creat -> 3, creat -> 4)
  6. The Scalable Commutativity Rule
    Whenever interface operations commute, they can be implemented in a way that scales.
    Diagram labels: Commutes; Scalable implementation exists; creat with lowest fd
  7. The Scalable Commutativity Rule
    Whenever interface operations commute, they can be implemented in a way that scales.
    Diagram labels: Commutes; Scalable implementation exists; creat with lowest fd; creat with any fd? (creat -> 13, creat -> 47)
  8. The Scalable Commutativity Rule
    Whenever interface operations commute, they can be implemented in a way that scales.
    Diagram labels: Commutes; Scalable implementation exists; creat with lowest fd; creat with any fd; rule
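The creat example on slides 5 to 8 can be made concrete by treating each specification as a predicate over histories. The sketch below is a toy model under stated assumptions: a history is a list of hypothetical `(thread, returned_fd)` pairs, and `legal_lowest` and `legal_any` are names invented for the illustration.

```python
def legal_lowest(history, used):
    """Legal only if every creat returns the lowest unused descriptor."""
    used = set(used)
    for _thread, fd in history:
        lowest = min(n for n in range(len(used) + 1) if n not in used)
        if fd != lowest:
            return False
        used.add(fd)
    return True


def legal_any(history, used):
    """Legal if every creat returns some non-negative unused descriptor."""
    used = set(used)
    for _thread, fd in history:
        if fd < 0 or fd in used:
            return False
        used.add(fd)
    return True


open_fds = {0, 1, 2}

# "Lowest fd": swapping the two calls makes the history illegal.
lowest_order = [("T1", 3), ("T2", 4)]
lowest_swapped = [("T2", 4), ("T1", 3)]

# "Any fd": both orders are legal, so the calls commute.
any_order = [("T1", 13), ("T2", 47)]
any_swapped = [("T2", 47), ("T1", 13)]
```

With the lowest-fd specification the swapped history is rejected, because the first call would have had to return 3. With the any-fd specification both orders are accepted, which is what leaves room for an implementation in which the two calls never communicate.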
  9. Intuition Behind Rule
    When operations commute
    – The results are independent of order
    – Communication is unnecessary
    – And without communication, no conflicts
  10. Example: Reference Counter
    Threads: T1, T2, T3, T4, T5
    Operations: iszero() F, iszero() F, dec() 2, dec() 1, dec() 0
    • R1 commutes; conflict-free implementation: shared counter
    • R2 does not commute because dec() returns counter value
    Regions shown: R1, R2
  11. Example: Reference Counter
    Threads: T1, T2, T3, T4, T5
    Operations: iszero() F, iszero() F, dec() ok, dec() ok, dec() ok
    • R1 commutes; conflict-free implementation: shared counter
    • R2 does not commute because dec() returns counter value
    • R2’ does commute; conflict-free implementation: per-core counter
    • R3 depends on state: initial value > 3 versus initial value ≤ 3
    Regions shown: R1, R2’, R3
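The per-core counter mentioned for region R2’ on slide 11 can be sketched as follows. This is a minimal single-process illustration, not sv6's implementation: the class name, the `core` argument, and the slot layout are assumptions, and Python does not model cache lines, so the "one slot per core" structure only stands in for per-core memory.

```python
from itertools import permutations


class PerCoreCounter:
    """Toy reference counter with one delta slot per core (hypothetical)."""

    def __init__(self, initial, cores):
        self.initial = initial
        self.deltas = [0] * cores  # slot i is written only by core i

    def dec(self, core):
        # Touches only this core's slot and returns nothing, so the caller
        # cannot observe the order in which other cores decremented.
        self.deltas[core] -= 1

    def iszero(self):
        # Reads every slot, so it can conflict with a concurrent dec().
        return self.initial + sum(self.deltas) == 0


# Run three dec() calls, one per core, in every possible order.
outcomes = set()
for order in permutations(range(3)):
    counter = PerCoreCounter(initial=3, cores=3)
    for core in order:
        counter.dec(core)
    outcomes.add((tuple(counter.deltas), counter.iszero()))
```

Every ordering ends in the same state, so `outcomes` holds a single entry. If `dec()` returned the new counter value instead, each caller would see a result that depends on the order, which is why R2 on slide 10 does not commute.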
  12. Histories and Specifications
    A history H is a sequence of invocations and responses on threads.
    A specification ζ defines an interface. ζ is the set of legal histories given the allowed behavior of the interface.
  13. Reordering
    A reordering H’ is a permutation of H that preserves the order of operations within each individual thread (H|t = H’|t for all t).
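The reordering definition on slide 13 translates directly into a check on per-thread projections. The sketch below assumes a history is a list of hypothetical `(thread, event)` pairs; the function names are made up for the example.

```python
def project(history, thread):
    """H|t: the events of one thread, in the order they appear in H."""
    return [event for t, event in history if t == thread]


def is_reordering(h, h_prime):
    """True if h_prime permutes h while keeping each thread's order intact."""
    threads = {t for t, _ in h} | {t for t, _ in h_prime}
    # Equal projections for every thread also imply the same multiset of events.
    return all(project(h, t) == project(h_prime, t) for t in threads)


h = [("T1", "a"), ("T2", "b"), ("T1", "c")]
swapped_across_threads = [("T2", "b"), ("T1", "a"), ("T1", "c")]
swapped_within_thread = [("T1", "c"), ("T2", "b"), ("T1", "a")]
```

Moving T2's event ahead of T1's is a valid reordering, while swapping T1's own two events is not, because it changes H|T1.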
  14. Commutativity
    A region Y of a legal history XY SIM-commutes if every reordering Y’ of Y also yields a legal history XY’ and every legal extension Z of XY is also a legal extension of XY’. (And this must be true for every prefix of every reordering of Y.)
  15. The Formal Rule
    Let ζ be a specification with a reference implementation M. Consider a history XY where Y SIM-commutes in XY and M can generate XY. There exists a correct implementation of ζ whose execution of XY is conflict-free in the commutative region Y.
  16. Commuter
    • Input: Symbolic Model
    • Analyzer computes commutativity conditions
    • Testgen computes test cases
    • Mtrace detects conflicts
  17. Example: rename()
    rename(a, b) and rename(c, d) commute if:
    • Both source files exist and all names are different
    • Neither source file exists
    • a xor c exists, and it is not the other rename's destination
    • One call is a self-rename of an existing file and a ≠ c
    • a and c are hard links to the same inode, a ≠ c, and b = d
    • Both calls are self-renames
    Important to have discriminating commutativity conditions
    • ∀ states, rename almost never commutes
    • More commutative cases ⇒ more opportunities to scale
    • Captures more operations applications usually do
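Some of the rename() conditions on slide 17 can be checked by brute force against a toy model. The sketch below is an assumption-laden simplification: the file system is a single flat directory represented as a dict from name to inode number, `rename` is a hypothetical model rather than the POSIX call, and `commutes` only compares return values and final state for the two orders, which is weaker than the SIM-commutativity definition on slide 14.

```python
def rename(fs, src, dst):
    """Toy rename on a flat directory; mutates fs and returns a status string."""
    if src not in fs:
        return "ENOENT"
    if src == dst:
        return "ok"  # self-rename leaves the directory unchanged
    fs[dst] = fs.pop(src)  # replaces dst if it already exists
    return "ok"


def commutes(state, op1, op2):
    """Compare both orders: same result for each call and same final state."""
    fs12 = dict(state)
    r1_first = rename(fs12, *op1)
    r2_second = rename(fs12, *op2)

    fs21 = dict(state)
    r2_first = rename(fs21, *op2)
    r1_second = rename(fs21, *op1)

    return r1_first == r1_second and r2_first == r2_second and fs12 == fs21


# Both sources exist and all names differ.
distinct = commutes({"a": 1, "c": 2}, ("a", "b"), ("c", "d"))
# Neither source exists.
missing = commutes({}, ("a", "b"), ("c", "d"))
# a and c are hard links to the same inode, a != c, and b == d.
hard_links = commutes({"a": 1, "c": 1}, ("a", "b"), ("c", "b"))
# One rename feeds the other: the second call's result depends on the order.
chained = commutes({"a": 1}, ("a", "b"), ("b", "c"))
```

In this model the first three cases commute and the chained case does not, since rename(b, c) fails with ENOENT when it runs first and succeeds when it runs second.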
  18. Commuter Finds Non-scalable Cases in Linux
    • Directory-wide locking
    • File descriptor reference counts
    • Address space-wide locking
  19. sv6: A Scalable OS
    • POSIX-like operating system
    • File system and virtual memory system follow commutativity rule
    • Implementation using standard parallel programming techniques, but guided by Commuter
  20. Remaining 1%: Idempotent Updates
    • Two lseeks of same FD to the same offset
    • Two pwrites of same data to same offset
  21. Refining POSIX with the Rule
    • Lowest FD versus any FD
    • stat versus xstat
    • Unordered sockets
    • Delayed munmap
    • fork+exec versus posix_spawn
  22. What Can We Learn?
    • Embrace non-determinism
    • Decompose compound operations
    • Permit weak ordering
    • Release resources asynchronously
  23. Limitations of the Rule
    • Rule says a scalable implementation exists.
    – It might not have the best raw performance
    – You might need different scalable implementations for different regions
    – How do I find this implementation?
    • The non-scalable non-commutativity rule
    • Synchronized clocks
  24. Distributed Systems and Databases
    • Reads still don’t conflict, but no cache coherence for invalidations
    • Rule should still apply to message passing systems
    • Commutative concurrency control