A Note On Distributed Computing

Papers We Love, Boston, #1

Christopher Meiklejohn

August 28, 2014

Transcript

  1.

    A Note On Distributed Computing (1994)
    Jim Waldo, Geoff Wyant, Ann Wollrath, Sam Kendall
    Sun Microsystems Laboratories, Inc.
    Papers We Love, Boston, August 28th, 2014

    Waldo et al (Sun) A Note On Distributed Computing Papers We Love, Boston 1 / 32
  2.

    Overview

    1. Background
       - Timeline
       - Remote Procedure Call
       - Distributed Objects
    2. Paper
       - Introduction
       - The Vision of Unified Objects
       - Deja Vu All Over Again
       - Local and Distributed Computing
       - The Myth of "Quality of Service"
       - Lessons from NFS
       - Taking the Difference Seriously
       - A Middle Ground
    3. Review
       - Where we are now
    4. References
  3.

    About Me

    - Senior Engineer at Basho Technologies, Inc.
    - Researcher with the SyncFree Consortium
    - Former systems administrator

    Talk

    - History surrounding the talk and context
    - Motivation for thinking about distributed computing
  4.

    Timeline I

    - 1974: RFC 674 "Procedure Call Protocol Documents, Version 2"
    - 1975: RFC 684 "A Commentary on Procedure Calling as a Network Protocol"
    - 1976: RFC 707 "A High-Level Framework for Network-Based Resource Sharing"
    - 1984: "Implementing Remote Procedure Calls" [3]
    - 1987: Distribution and Abstract Types in Emerald [4]
    - 1988: Distributed Programming in Argus [10]
    - 1988: RFC 1057 "Remote Procedure Call Protocol Specification Version 2"
  5.

    Timeline II

    - 1991: CORBA 1.0
    - 1996: "A Distributed Object Model for the Java System" [14]
    - 1997: CORBA 2.0
    - 1999 - Present: EJB, XML-RPC, SOAP, REST
  6.

    Remote Procedure Call

    - General term for executing a subroutine in a different address space without writing the remote execution code
    - First popular implementation: Sun RPC, used for Network File System (NFS)
    - Examples of later RPC-like mechanisms: Java RMI, Modula-3 network objects, XML-RPC, SOAP, CORBA, Avro, Thrift, Protocol Buffers, Finagle, etc.
  7.

    RFC 684

    "Rather, we take exception to PCP’s underlying premise: that the procedure calling discipline is the starting point for building multi-computer systems."

    - Commentary on RFC 674, which introduces the Procedure Call Protocol (PCP), Version 2
    - Procedure calling is usually a primitive operation
    - Local vs. remote calls have different cost profiles
    - Message passing is a better model than procedure calling
    - Situations where the concept is weak:
      - Difficult to recover from malfunctions or errors (rollback vs. exception)
      - Difficult to sequence operations
      - Synchronous; geared towards one-to-one call-return
      - Backpressure and load-shedding become harder (priority servicing)
  8.

    RFC 707

    "Because of this cost differential, the applications programmer must exercise discretion in his use of remote resources, even though the mechanics of their use will have been greatly simplified by the RTE. Like virtual memory, the procedure call model offers great convenience, and therefore power, in exchange for reasonable alertness to the possibilities of abuse."

    - Generalizes the request/response mechanism that services such as TELNET and FTP use to a procedure call mechanism that is usable by machines, not humans.
    - Makes similar control flow argument.
  9.

    Distributed Objects

    - Replicated objects
      - State-machine replication [9] [11]
      - ISIS, Virtual Synchrony [1] [2]
    - Live objects
      - Distinct identities
      - Encapsulate their own state
      - Examples: CORBA, DCOM
  10.

    Common Object Request Broker Architecture (CORBA)

    - Allows programs to communicate between different languages and different machines
    - Interface definition language (IDL) to specify interfaces
    - Mappings from IDL to languages, e.g. Java, C++
    - Benefits:
      - Language-independence
      - OS-independence
      - Data typing
      - Data transfer
  11.

    It’s Just a Mapping Problem (2003)

    “The goal is to merge middleware abstractions directly into the realm of the programming language, minimizing the impedance mismatch between the programming language world and the middleware world. For example, mappings make request invocations on distributed objects and services appear as normal programming-language function calls, and they map distributed system exceptions into native programming language exception-handling mechanisms.” [12]

    - Argues transparency is how quality is measured by developers.
    - References current work as a case of transparencies masking failures.
  12.

    Unified View of Objects

    "It is the thesis of this note that this unified view of objects is mistaken." [7]

    - Local computing vs. distributed computing
    - It is perilous to ignore the differences!
      - Historically: Emerald, Argus, etc.
      - Contemporary: CORBA, DCOM, etc.
  13.

    Objects

    "It is the thesis of this note that this unified view of objects is mistaken." [7]

    - Interfaces defined in an Interface Definition Language (IDL)
    - Extension of the RPC mechanism to the object-oriented paradigm
    - Abstracts how to perform the operation
    - We’ve learned already that local and remote calls are not the same
  14.

    The Promise

    - Write the application and design the correct interfaces
    - Tune performance by relocation of objects
    - Test with “real bullets” [7]
    - The assumptions:
      - The correct interfaces will naturally merge
      - Correctness is not based in location, only on interface
  15.

    Three False Principles

    - "there is a single natural object-oriented design for a given application, regardless of the context in which that application will be deployed"
    - "failure and performance issues are tied to the implementation of the components of an application, and consideration of these issues should be left out of an initial design"
    - "the interface of an object is independent of the context in which that object is used"
  16.

    Language Approach

    "The hard problems in distributed computing are not the problems of getting things on and off the wire." [7]

    - “Every 10 years” [7]
    - Failure stems from the inherent differences between local and distributed computing
    - Problems:
      - Latency
      - Coordination
      - Partial failure
      - Memory access
      - Concurrency
  17.

    Latency

    - Most obvious difference
    - Ignoring latency directly impacts performance
    - "Rely on steadily increasing speed of underlying hardware" [7]
    - Performance analysis and relocation of objects is not always possible
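To make the latency point concrete, here is a minimal back-of-the-envelope sketch. The two cost constants are illustrative assumptions, not measurements, and the function names are hypothetical. It compares an interface that issues one call per item (which is what a location-transparent object model invites) with one that ships the whole batch in a single round trip.

```python
# Illustrative cost model. Both constants are assumptions chosen only to
# show the shape of the problem; they are not measured values.
LOCAL_CALL_S = 1e-8   # assumed cost of an in-process call
ROUND_TRIP_S = 1e-3   # assumed cost of one network round trip


def chatty_cost(n_items: int, per_call_s: float) -> float:
    """Total time when every item costs one call of per_call_s seconds."""
    return n_items * per_call_s


def batched_cost(n_items: int, round_trip_s: float, per_item_s: float) -> float:
    """Total time for one round trip plus local work for each item."""
    return round_trip_s + n_items * per_item_s


n = 10_000
local_total = chatty_cost(n, LOCAL_CALL_S)                    # object is local
remote_chatty = chatty_cost(n, ROUND_TRIP_S)                  # same interface, object moved away
remote_batched = batched_cost(n, ROUND_TRIP_S, LOCAL_CALL_S)  # interface redesigned for distribution

print(f"local:          {local_total:.6f} s")
print(f"remote, chatty: {remote_chatty:.6f} s")
print(f"remote, batch:  {remote_batched:.6f} s")
```

Under these assumed numbers, relocating the object without changing the interface is several orders of magnitude slower, while the batched interface stays close to one round trip. The point of the sketch is that the fix is an interface change, not an implementation detail.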
  18.

    Memory Access

    - Pointers are not valid across address spaces
    - Distributed shared memory is one approach
    - Replacement with references, or marshalling, as approaches
    - Permit vs. enforce
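A small sketch of why marshalling changes semantics, assuming Python's standard `pickle` module as a stand-in for a wire format. The helper names (`append_item`, `call_via_marshalling`) are hypothetical. A local call receives a reference and can mutate the caller's object; a marshalled call works on a copy in what would be another address space, so the caller never sees the mutation.

```python
import pickle


def append_item(items: list, value: int) -> None:
    """The 'method' being invoked: mutates its argument in place."""
    items.append(value)


def call_locally(items: list, value: int) -> None:
    """Local invocation: the callee shares the caller's object."""
    append_item(items, value)


def call_via_marshalling(items: list, value: int) -> list:
    """Simulated remote invocation: the argument is serialized and rebuilt.

    The round trip through bytes stands in for crossing an address space;
    no real network is involved.
    """
    wire_bytes = pickle.dumps(items)
    remote_copy = pickle.loads(wire_bytes)
    append_item(remote_copy, value)
    return remote_copy


local_items = [1, 2]
call_locally(local_items, 3)

sent_items = [1, 2]
received_items = call_via_marshalling(sent_items, 3)

print(local_items)     # the caller sees the mutation
print(sent_items)      # the caller's list is unchanged
print(received_items)  # only the "remote" copy was mutated
```

Code written against the local behaviour silently breaks when the same call is marshalled, which is one reading of the "permit vs. enforce" bullet: the system either allows reference semantics that fail remotely or forces everyone onto copy or handle semantics.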
  19.

    Partial Failure and Concurrency

    - Fundamental difference between local and distributed computing
    - Local computing:
      - Failures are total
      - Failures are detectable
      - Return of control
    - Distributed computing:
      - Components can fail: networks, machines, etc.
      - Failures are partial
      - No global state
      - Failure of link indistinguishable from processor failure
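The indistinguishability claim can be illustrated with a deterministic simulation. Everything here is hypothetical scaffolding (`CounterServer`, `SimulatedLink`, the `fault` strings), not a real networking API. The client observes the same timeout whether the request was lost before the server ran it or the reply was lost afterwards, yet the server's state differs.

```python
from typing import Optional, Tuple


class CounterServer:
    """Stand-in for a remote component holding some state."""

    def __init__(self) -> None:
        self.value = 0

    def increment(self) -> int:
        self.value += 1
        return self.value


class SimulatedLink:
    """Delivers calls to the server, optionally injecting one kind of fault."""

    def __init__(self, server: CounterServer, fault: Optional[str] = None) -> None:
        self.server = server
        self.fault = fault  # None, "drop_request", or "drop_reply"

    def call_increment(self) -> int:
        if self.fault == "drop_request":
            # The request never reaches the server.
            raise TimeoutError("no reply")
        result = self.server.increment()
        if self.fault == "drop_reply":
            # The server did the work, but the answer is lost.
            raise TimeoutError("no reply")
        return result


def observe(fault: Optional[str]) -> Tuple[str, int]:
    """Return what the client saw and what actually happened on the server."""
    server = CounterServer()
    link = SimulatedLink(server, fault)
    try:
        link.call_increment()
        client_view = "ok"
    except TimeoutError as exc:
        client_view = f"timeout: {exc}"
    return client_view, server.value


print(observe(None))
print(observe("drop_request"))
print(observe("drop_reply"))
```

In both fault cases the client's view is identical, so no amount of local error handling can tell the caller whether the operation took effect. A real deployment adds crashed processes and slow machines to the list of causes that look the same.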
  20.

    Partial Failure and Concurrency

    "The question is not “can you make remote method invocation look like local method invocation?” but rather “what is the price of making remote method invocation identical to local method invocation?”" [7]

    - Core problems in distributed systems: ensuring consistent state
      - Consensus, agreement
      - Indeterminacy
    - Two possible paths to a unified object model:
      - Treat all objects as local
      - Treat all objects as remote
    - Concurrency shares the same problems
  21.

    Partial Failure and Concurrency

    "This approach would also defeat the overall purpose of unifying the object models. The real reason for attempting such a unification is to make distributed computing more like local computing and thus make distributed computing easier. This second approach to unifying the models makes local computing as complex as distributed computing." [7]
  22.

    The Myth of "Quality of Service"

    - "Quality of service" provided by implementation of an object
    - Queue example:
      - Time-out on enqueue; operation is retried
      - Partial failure observed on timed-out operations
      - Results in multiple enqueue operations of same value
      - Should be avoiding duplicate objects
    - Guarantees don’t necessarily hold over composition
    - Deletion while value exists example:
      - Without locks, races and partial writes can occur
      - Causality is difficult [5] [8]
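The queue example can be sketched as follows. This is a simulation under stated assumptions: `RemoteQueue`, `send_with_retry`, and the `lost_replies` parameter are hypothetical names, and the request-ID deduplication shown in `enqueue_once` is one common remedy chosen for illustration, not something the slide or the paper prescribes.

```python
from typing import Callable, List, Set


class RemoteQueue:
    """Stand-in for a queue living in another process."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self.seen_ids: Set[str] = set()

    def enqueue(self, value: str) -> None:
        """Naive operation: every delivered request appends a value."""
        self.items.append(value)

    def enqueue_once(self, request_id: str, value: str) -> None:
        """Idempotent variant: a repeated request ID is ignored."""
        if request_id in self.seen_ids:
            return
        self.seen_ids.add(request_id)
        self.items.append(value)


def send_with_retry(operation: Callable[[], None], lost_replies: int,
                    max_attempts: int = 3) -> int:
    """Run operation until a reply 'arrives'; return the attempts used.

    The first lost_replies replies are dropped after the server has
    already executed the operation, so the client times out and retries.
    """
    for attempt in range(max_attempts):
        operation()                  # the server executes the request
        if attempt >= lost_replies:  # this reply makes it back
            return attempt + 1
    raise TimeoutError("gave up after retries")


naive_queue = RemoteQueue()
send_with_retry(lambda: naive_queue.enqueue("job"), lost_replies=1)

safe_queue = RemoteQueue()
send_with_retry(lambda: safe_queue.enqueue_once("req-1", "job"), lost_replies=1)

print(naive_queue.items)  # the value was enqueued twice
print(safe_queue.items)   # the retry was recognised and dropped
```

Note that the remedy changes the interface: the caller now has to supply a request ID. That is the slide's point, since robustness under partial failure cannot be bolted on purely as a quality-of-implementation matter behind an unchanged local-style interface.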
  23.

    Lessons from NFS

    - Network File System; Sun Microsystems’s distributed file system
    - Prior to NFS, errors were catastrophic; introduced partial failure
    - Stateless protocol, implemented in UDP
    - Soft mounting
      - Introduced new failure statuses
      - Problematic for programs that don’t use new error codes
    - Hard mounting
      - Application hangs / blocks until system available
      - Most used
  24.

    Taking the Difference Seriously

    - Local vs. remote separation does not imply different interfaces
    - Best to split objects by allowance for indeterminacy
    - Possibly compile interfaces into local vs. remote
    - Still requires knowledge of local vs. remote
    - Keeps the difference visible
  25.

    A Middle Ground

    - "local-remote" objects
    - Potentially increased latency, but local memory access.
    - Simplified or optimized IDL
    - NUMA is an example today
  26.

    Convenience Over Correctness (2008)

    "We have a general-purpose imperative programming-language hammer, so we treat distributed computing as just another nail to bend to fit the programming models." [13]

    - Impedance mismatch with Interface Definition Languages (IDL)
      - Base types are easy to map; more complex types are less so
    - Concerns over scalability
      - RPC mechanism lacks metadata for caching
    - Representational State Transfer (REST) is a good mechanism
      - Building frameworks on top of this is a repeat of the problem
  27.

    Distributed Languages Today

    - Erlang
      - Has RPC, but prefers asynchronous message passing
      - Potentially problematic local-vs-remote process semantics
    - Bloom [6]
      - Logic-based disorderly programming model with messages
      - Enforces asynchronous message passing when doing remote communication
      - Analysis tools to provide coordination where needed
    - Erlang-inspired
      - Cloud Haskell (distributed-process)
      - Akka
  28.

    In Conclusion

    "Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?" [13]
  29.

    References I

    [1] K. Birman and R. Cooper. The ISIS project: Real experience with a fault tolerant programming system. SIGOPS Oper. Syst. Rev., 25(2):103–107, Apr. 1991.

    [2] K. Birman and T. Joseph. Exploiting virtual synchrony in distributed systems. SIGOPS Oper. Syst. Rev., 21(5):123–138, Nov. 1987.

    [3] A. D. Birrell and B. J. Nelson. Implementing remote procedure calls. ACM Trans. Comput. Syst., 2(1):39–59, Feb. 1984.

    [4] A. Black, N. Hutchinson, E. Jul, H. Levy, and L. Carter. Distribution and abstract types in Emerald. IEEE Trans. Softw. Eng., 13(1):65–76, Jan. 1987.
  30.

    References II

    [5] B. Charron-Bost. Concerning the size of logical clocks in distributed systems. Inf. Process. Lett., 39(1):11–16, July 1991.

    [6] N. Conway, W. R. Marczak, P. Alvaro, J. M. Hellerstein, and D. Maier. Logic and lattices for distributed programming. In Proceedings of the Third ACM Symposium on Cloud Computing, SoCC ’12, pages 1:1–1:14, New York, NY, USA, 2012. ACM.

    [7] S. C. Kendall, J. Waldo, A. Wollrath, and G. Wyant. A note on distributed computing. Technical report, Mountain View, CA, USA, 1994.

    [8] L. Lamport. Time, clocks, and the ordering of events in a distributed system. Commun. ACM, 21(7):558–565, July 1978.
  31.

    References III

    [9] L. Lamport. Using time instead of timeout for fault-tolerant distributed systems. ACM Trans. Program. Lang. Syst., 6(2):254–280, Apr. 1984.

    [10] B. Liskov. Distributed programming in Argus. Commun. ACM, 31(3):300–312, Mar. 1988.

    [11] F. B. Schneider. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Comput. Surv., 22(4):299–319, Dec. 1990.

    [12] S. Vinoski. It’s just a mapping problem. IEEE Internet Computing, 7(3):88–90, May 2003.
  32.

    References IV

    [13] S. Vinoski. Convenience over correctness. IEEE Internet Computing, 12(4):89–92, July 2008.

    [14] A. Wollrath, R. Riggs, and J. Waldo. A distributed object model for the Java system. USENIX Computing Systems, 9, 1996.