Towards a Solution to the Red Wedding Problem

HotEdge '18
Boston, Massachusetts, USA


Christopher Meiklejohn

July 10, 2018


  1. Christopher Meiklejohn, Heather C. Miller, Zeeshan Lakhani

    Université catholique de Louvain, Instituto Superior Técnico, Northeastern University, Comcast Cable
  2. WHY EDGE COMPUTING?

    - Replicate compute to reduce user-perceived latency.
    - Replicate storage to further reduce user-perceived latency.
    - Push compute and storage replicas closer to the client to further reduce latency.
    - Alleviate load on the “origin” server, enabling further scalability.
  3. “FIRST GEN” COMMERCIAL EDGE (2000+)

    Content Delivery Networks (CDNs)
      - Interpose on the request/response cycle and modify it
      - Header-based and location-based rewrites are common
    Examples
      - Akamai: sandboxed JavaScript rewrite environment
      - Fastly: VCL-based sandboxed rewrite environment
    Limited execution environment
      - Assumes stateless applications
      - Cannot access CDN content stored at the node
    [Diagram: origin, cdn-1, cdn-2. CDN nodes interpose on request/response, performing “rewrites” based on cookies, location, etc.]
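The kind of stateless rewrite described on this slide can be sketched as follows. This is a minimal illustration only, not the Akamai or Fastly API: the `Request` type, the `X-Geo-Country` header, and the `beta` cookie are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


@dataclass(frozen=True)
class Request:
    """Hypothetical request model; real CDNs expose their own request objects."""
    path: str
    headers: Dict[str, str] = field(default_factory=dict)
    cookies: Dict[str, str] = field(default_factory=dict)


def rewrite(request: Request) -> Request:
    """Stateless rewrite: the result depends only on the incoming request."""
    path = request.path
    # Location-based rewrite, keyed on an assumed geolocation header.
    country = request.headers.get("X-Geo-Country", "US")
    if country != "US":
        path = f"/{country.lower()}{path}"
    # Cookie-based rewrite: users who opted in are sent to a beta variant.
    if request.cookies.get("beta") == "1":
        path = f"/beta{path}"
    return replace(request, path=path)


french_beta_user = Request("/index.html", {"X-Geo-Country": "FR"}, {"beta": "1"})
print(rewrite(french_beta_user).path)
```

Because the function keeps no state between calls, it fits the limited execution environment above, but it also cannot absorb writes, which is the gap the later slides address.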
  4. “SECOND GEN” COMMERCIAL EDGE (2015+)

    “Serverless” solution using functions
      - Migrate functions from DCs (10s) to PoPs (100s)
      - Interpose on origin requests and requests at the PoP
    Examples
      - Amazon’s Lambda + Lambda@Edge
      - Microsoft’s Azure Functions and Functions at IoT Edge
      - Google Cloud Functions
    Stateless execution environment
      - Events are triggered by requests
      - Short-lived containers with no state storage
    [Diagram: us-west-1, cf-1, cf-2. Functions interpose on request/response from client to function, and on request/response with the origin.]
  5. THE RED WEDDING PROBLEM

    Write spikes are unhandled by the edge
      - Read caches can be maintained at the edge
      - Writes must be routed to the origin server
    Enabling low-latency operations
      - Writes must be handled at the edge to reduce user-perceived latency
    Enabling greater scalability
      - Writes must be handled at the edge, with batching leveraged to reduce load on the origin server
    Common with collaborative applications
      - Reddit during FC Barcelona matches
      - Game of Thrones shared Wiki during episodes
    [Diagram: origin, cdn-1, cdn-2. Users must wait for an origin round trip for write operations. The origin server is a bottleneck for write operations coming from PoPs.]
  6. THE RED WEDDING PROBLEM: CHALLENGES

    State storage
      - How should state be persisted at the edge?
      - What constraints do current commercial offerings present for state storage?
    Arbitrating concurrency
      - How do we arbitrate concurrent modifications?
      - How do we minimize conflicts?
      - How do we maximize batching?
    Application logic
      - Clients don’t talk directly to the database, so how do we get application code to the edge?
      - How do we handle authentication and authorization?
    [Diagram: origin, cdn-1, cdn-2. How do we resolve conflicting writes arriving at the origin? Application logic, authentication of users, and state storage must all be handled at the edge.]
  7. SOLVING THE RED WEDDING PROBLEM

    Containerized application code (Lambda)
      - Runs authorization, authentication, and the part of the application that writes to the database
    Containerized database (Lambda)
      - Runs a database inside the instance at the edge that is initialized on startup and accepts local writes
    Convergent data structures (CRDTs)
      - Automatic conflict resolution; allows removal of redundant information and batch synchronization with peers
    Users interact with services running at the edge
      - Writes are accepted at the edge, giving a low-latency response
      - Updates are batched to the central origin periodically
      - Relieves pressure on the origin server
    [Diagram: us-west-1, us-east-1, cf-1, cf-2. Users talk to their local PoP. Replicas in the same DC communicate with each other. Updates are batched back to the local DC and then to the origin. Lambda instances are scaled for increased demand.]
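To make the CRDT point concrete, here is a minimal sketch of a state-based grow-only counter (G-Counter). The slides do not say which CRDTs the prototype uses, so this particular data type, the `GCounter` class, and the use of the diagram labels `cf-1` and `cf-2` as replica ids are assumptions made for illustration.

```python
from typing import Dict


class GCounter:
    """State-based grow-only counter: one slot per replica, merged by per-slot max."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.slots: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot be decremented")
        # A replica only ever writes to its own slot, so local writes never conflict.
        self.slots[self.replica_id] = self.slots.get(self.replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.slots.values())

    def merge(self, other: "GCounter") -> None:
        # Per-slot max is commutative, associative, and idempotent.
        for replica_id, count in other.slots.items():
            self.slots[replica_id] = max(self.slots.get(replica_id, 0), count)


origin = GCounter("origin")
cf1, cf2 = GCounter("cf-1"), GCounter("cf-2")

# A write spike is absorbed locally at each edge replica.
for _ in range(1000):
    cf1.increment()
for _ in range(500):
    cf2.increment()

# The origin receives one batched state per replica instead of 1500 writes.
origin.merge(cf1)
origin.merge(cf2)
origin.merge(cf1)  # a duplicated or re-sent batch is harmless
print(origin.value())  # 1500
```

The merged state has the same size no matter how many increments occurred at the edge, which is one reading of "removal of redundant information": the origin sees two small messages rather than every individual write.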
  8. PROTOTYPE ON LAMBDA

    Lambda-based prototype implementation
      - Redis-like key-value store built in Erlang, embedded in an Amazon Lambda function
    Communicate through an AMQP broker in the DC
      - Periodically synchronize state by sending the state of the data store through AMQP
      - On initialization, instances “bootstrap” from other running instances
    [Diagram: us-west-1, us-east-1.]
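The synchronization and bootstrap behavior listed above can be sketched as follows. The prototype itself is written in Erlang and uses a real AMQP broker; this Python sketch replaces the broker with a hypothetical in-memory `InMemoryBroker`, and it assumes every key holds a set merged by union, which the slides do not specify.

```python
from collections import defaultdict
from typing import Dict, List, Set

State = Dict[str, Set[str]]


class InMemoryBroker:
    """Hypothetical stand-in for the AMQP broker: fans each state out to every other subscriber."""

    def __init__(self) -> None:
        self.queues: Dict[str, List[State]] = {}

    def subscribe(self, name: str) -> None:
        self.queues[name] = []

    def publish(self, sender: str, state: State) -> None:
        for name, queue in self.queues.items():
            if name != sender:
                # Copy the sets so receivers never share mutable state with the sender.
                queue.append({key: set(members) for key, members in state.items()})

    def drain(self, name: str) -> List[State]:
        messages, self.queues[name] = self.queues[name], []
        return messages


class Instance:
    """One function invocation holding an embedded key-value store (assumed set-valued)."""

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        self.store: State = defaultdict(set)
        broker.subscribe(name)

    def sadd(self, key: str, member: str) -> None:
        self.store[key].add(member)

    def sync(self) -> None:
        """One periodic round: merge the states received so far, then publish our own."""
        for state in self.broker.drain(self.name):
            for key, members in state.items():
                self.store[key] |= members
        self.broker.publish(self.name, self.store)


broker = InMemoryBroker()
a, b = Instance("a", broker), Instance("b", broker)
a.sadd("editors:page-1", "alice")
b.sadd("editors:page-1", "bob")
for instance in (a, b, a):
    instance.sync()

# A freshly started instance has no storage and bootstraps from a running peer's next publish.
c = Instance("c", broker)
a.sync()
c.sync()
print(sorted(c.store["editors:page-1"]))  # ['alice', 'bob']
```

In this sketch a new instance learns the current state only when some running peer next publishes, so bootstrap latency is bounded by the synchronization period; a real deployment might request state explicitly instead.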
  9. PROTOTYPE ON LAMBDA: CHALLENGES

    Lambda environment
      - Invocations cannot directly communicate with one another, requiring an external message broker
      - Invocations do not know what other invocations are running at any given time
      - No storage; must bootstrap from another invocation or the origin on startup
    Lambda@Edge environment
      - No message broker is available in the edge environment, so a round trip to the datacenter is required
      - Only certain language runtimes are supported
    [Diagram: us-west-1, us-east-1. External message broker. Must bootstrap from other active invocations.]
  10. MORE DETAILS

    Preliminary results in the paper
      - Lambda invocation time, point-to-point messaging, Lambda runtime overhead
    Related Work
      - ExCamera: parallel video processing on Lambda; also had to work around the inter-replica communication problem
      - CloudPath: solution for path computing where various data storage and capabilities can be deployed across a hierarchy of nodes; assumes significant data deployment infrastructure in the DC
    Amazon Lambda
      - Discussions with Amazon around what changes would enable easier construction of these applications
      - Message broker, inter-invocation communication, etc.
  11. CONCLUSION

    Desired feedback
      - Is this problem interesting? How general is the problem?
      - Is there a need for stateful applications at the edge?
    Controversial points
      - Are CRDTs general enough to support the types of applications that would benefit from this?
      - Can enough application code operate at the edge to make this feasible?
    Open issues
      - Performance and durability of write operations at the edge
      - Consistency models: strong vs. causal vs. session guarantees
    Problematic circumstances
      - Global invariants (e.g., a bank transfer that must keep the balance non-negative) require coordination
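The last point can be illustrated with a small sketch. It assumes the balance is modeled as a PN-Counter (a counter CRDT supporting increments and decrements), which is a choice made for this example and not something the slides prescribe; the `PNCounter` class and the amounts are hypothetical.

```python
from typing import Dict


class PNCounter:
    """State-based counter with separate increment and decrement slots per replica."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.incs: Dict[str, int] = {}
        self.decs: Dict[str, int] = {}

    def value(self) -> int:
        return sum(self.incs.values()) - sum(self.decs.values())

    def deposit(self, amount: int) -> None:
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + amount

    def withdraw(self, amount: int) -> bool:
        # The non-negative check can only consult what this replica currently knows.
        if amount > self.value():
            return False
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + amount
        return True

    def merge(self, other: "PNCounter") -> None:
        for mine, theirs in ((self.incs, other.incs), (self.decs, other.decs)):
            for replica_id, count in theirs.items():
                mine[replica_id] = max(mine.get(replica_id, 0), count)


origin = PNCounter("origin")
origin.deposit(100)

cf1, cf2 = PNCounter("cf-1"), PNCounter("cf-2")
cf1.merge(origin)
cf2.merge(origin)

# Each edge replica sees a balance of 100, so each withdrawal passes its local check.
assert cf1.withdraw(80)
assert cf2.withdraw(80)

origin.merge(cf1)
origin.merge(cf2)
print(origin.value())  # -60: the replicas converge, but the invariant is violated
```

Convergence alone does not protect the invariant: both withdrawals were valid against local state, and only coordination between replicas before accepting them would have rejected one.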