
Performance and Fault Tolerance for the Netflix API - August 2012


Description

Presented at East Bay Java User Group on August 22nd 2012 (http://www.meetup.com/eastbayjug/events/74544872/)

The Netflix API receives over a billion requests a day, which translates into multiple billions of calls to underlying systems in the Netflix service-oriented architecture. These requests come from more than 800 different devices, ranging from gaming consoles like the PS3, Xbox and Wii to set-top boxes, TVs and mobile devices such as Android and iOS.

This presentation describes how the Netflix API supports those devices and achieves fault tolerance in a distributed architecture while depending on dozens of systems which can fail at any time. It also explains how a new system design allows each device to optimize API calls to their unique needs and leverage concurrency on the server-side to improve their performance.

(Some slides have been modified and notes included for readability and understanding of content without accompanying speech.)

The only difference from the QCon Sao Paulo slides (https://speakerdeck.com/u/benjchristensen/p/performance-and-fault-tolerance-for-the-netflix-api-qcon-sao-paulo) is a US screenshot instead of a Brazilian one.

Ben Christensen

August 22, 2012

Transcript

  1. Performance and Fault Tolerance
    for the Netflix API
    Ben Christensen
    Software Engineer – API Platform at Netflix
    @benjchristensen
    http://www.linkedin.com/in/benjchristensen
    http://techblog.netflix.com/
    1


  2. 2
    Netflix has over 27 million video streaming customers in 47 countries across North & South America, the United Kingdom and Ireland, who get unlimited access to movies and TV shows from over
    800 different devices for $7.99 USD/month (about the same converted price in each country's local currency).
    In June 2012 Netflix customers streamed over 1 billion hours of content.


  3. Discovery Streaming
    3
    Streaming devices talk to 2 major edge services: the first is the Netflix API, which provides functionality for discovering and browsing content, while the second handles the playback of
    video streams.


  4. Netflix API Streaming
    4
    This presentation focuses on the “Discovery” portion of traffic that the Netflix API handles.


  5. 5
    The Netflix API powers the “Discovery” user experience on the 800+ devices up until a user hits the play button, at which point the “Streaming” edge service takes over.


  6. Open API Netflix Devices
    API Request Volume by Audience
    6
    Traffic to the Netflix API is predominantly focused on serving the discovery UIs of Netflix streaming devices. This means it is primarily an internal API used by Netflix development teams.


  7. Netflix API
    Dependency A
    Dependency D
    Dependency G
    Dependency J
    Dependency M
    Dependency P
    Dependency B
    Dependency E
    Dependency H
    Dependency K
    Dependency N
    Dependency Q
    Dependency C
    Dependency F
    Dependency I
    Dependency L
    Dependency O
    Dependency R
    7
    The Netflix API serves all streaming devices and acts as the broker between backend Netflix systems and the user interfaces running on the 800+ devices that support Netflix streaming.
    More than 1 billion incoming calls per day are received, which in turn fan out to several billion outgoing calls (averaging a ratio of 1:6) to dozens of underlying subsystems, with peaks of over
    200k dependency requests per second.


  8. Netflix API
    Dependency A
    Dependency D
    Dependency G
    Dependency J
    Dependency M
    Dependency P
    Dependency B
    Dependency E
    Dependency H
    Dependency K
    Dependency N
    Dependency Q
    Dependency C
    Dependency F
    Dependency I
    Dependency L
    Dependency O
    Dependency R
    8
    The first half of the presentation discusses the resilience engineering implemented to handle failure and latency at the integration points with the various dependencies.


  9. Dozens of dependencies.
    One going bad takes everything down.
    99.99%^30 = 99.7% uptime
    0.3% of 1 billion = 3,000,000 failures
    2+ hours downtime/month
    even if all dependencies have excellent uptime.
    Reality is generally worse.
    9
    Even when all dependencies are performing well, the aggregate impact of just 0.01% downtime on each of dozens of services equates to potentially hours of downtime a month if the system is not
    engineered for resilience.
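    A quick back-of-the-envelope check of the arithmetic above, as a minimal Java sketch (not from the deck):
    public class UptimeMath {
        public static void main(String[] args) {
            double perDependency = 0.9999;                    // 99.99% uptime per dependency
            double aggregate = Math.pow(perDependency, 30);   // ~0.997 across 30 serial dependencies
            double downtimeHours = (1 - aggregate) * 30 * 24; // ~2.2 hours of downtime per month
            System.out.printf("aggregate uptime: %.4f (%.1f h/month down)%n", aggregate, downtimeHours);
        }
    }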


  10. 10


  11. 11


  12. 12
    Latency is far worse for system resilience than failure. Failures naturally “fail fast” and shed load, whereas latency backs up queues, threads and system resources; if isolation techniques
    are not used, it can cause an entire system to fail.


  13. "Timeout guard" daemon prio=10 tid=0x00002aaacd5e5000 nid=0x3aac runnable [0x00002aaac388f000] java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    - locked <0x000000055c7e8bd8> (a java.net.SocksSocketImpl)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
    at java.net.Socket.connect(Socket.java:579)
    at java.net.Socket.connect(Socket.java:528)
    at java.net.Socket.<init>(Socket.java:425)
    at java.net.Socket.<init>(Socket.java:280)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(DefaultProtocolSocketFactory.java:80)
    at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$1.doit(ControllerThreadSocketFactory.java:91)
    at org.apache.commons.httpclient.protocol.ControllerThreadSocketFactory$SocketTask.run(ControllerThreadSocketFactory.java:158)
    at java.lang.Thread.run(Thread.java:722)
    [Sat Jun 30 04:01:37 2012] [error] proxy: HTTP: disabled connection for (127.0.0.1)
    > 80% of requests rejected
    Median Latency
    13
    This is an example of what a system looks like when high latency occurs without load shedding and isolation. Backend latency spiked (from <100ms to >1,000ms at the median and >10,000ms at
    the 90th percentile) and saturated all available resources, resulting in the HTTP layer rejecting over 80% of requests.


  14. No single dependency should
    take down the entire app.
    Fallback.
    Fail silent.
    Fail fast.
    14
    High volume, high availability applications are required to build fault and latency tolerance into their architecture. Infrastructure is an aspect of resilience engineering but it cannot
    be relied upon by itself - software must be resilient.


  15. 15
    Netflix uses a combination of aggressive network timeouts, tryable semaphores and thread pools to isolate dependencies and limit the impact of both failure and latency.


  16. Tryable semaphores for “trusted” clients and fallbacks
    Separate threads for “untrusted” clients
    Aggressive timeouts on threads and network calls
    to “give up and move on”
    Circuit breakers as the “release valve”
    16


  17. 17
    With isolation techniques, the application container is segmented according to how it uses its underlying dependencies instead of using a single shared resource pool to communicate
    with all of them.


  18. 18
    A single failing dependency is no longer permitted to take more resources than it was allocated, so its impact can be controlled.


  19. 19
    In this case the backend service has become latent and saturated all available threads allocated to it, so further requests to it are rejected (the orange line) instead of blocking or using up all
    available system threads.


  20. 20


  21. 30 rps x 0.2 seconds = 6 + breathing room = 10 threads
    Thread-pool Queue size: 5-10 (0 doesn't work but get close to it)
    Thread-pool Size + Queue Size
    Queuing is Not Free
    21
    Requests in the queue block user threads and thus must be considered part of the resources allocated to a dependency.
    Setting a queue size of 100 is equivalent to saying 100 incoming requests can block while waiting for this dependency. There is typically no good reason for a queue size higher than
    5-10.
    Bursting should be handled through batching, and throughput should be accommodated by a large enough thread pool. It is better to increase the thread-pool size rather than the queue, as
    commands executing in the thread pool make forward progress whereas items in the queue do not.
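    As a minimal sketch, this sizing maps onto the JDK's ThreadPoolExecutor roughly as follows (the wiring is illustrative, not Netflix's actual configuration):
    import java.util.concurrent.*;

    public class DependencyPool {
        // 30 rps x 0.2s latency = 6 concurrent requests, plus breathing room => 10 threads
        static final ThreadPoolExecutor POOL = new ThreadPoolExecutor(
                10, 10, 1, TimeUnit.MINUTES,
                new LinkedBlockingQueue<Runnable>(5),   // small queue: each slot is a blocked caller
                new ThreadPoolExecutor.AbortPolicy());  // reject and shed load when saturated
    }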


  22. Cost of Thread @ ~60rps
    mean - median - 90th - 99th (time in ms)
    Time for thread to execute Time user thread waited
    22
    The Netflix API has ~30 thread pools with 5-20 threads in each. A common question and concern is what impact this has on performance.
    Here is a sample of a dependency circuit over 24 hours from the Netflix API production cluster, with a rate of 60rps per server.
    Each execution occurs in a separate thread, with mean, median, 90th and 99th percentile latencies shown in the first 4 legend values. The second group of 4 values is for the user thread
    waiting on the dependency thread and shows the total time including queuing, scheduling, execution and waiting for the return value from the Future.
    The calling-thread median, 90th and 99th percentiles are the last 3 legend values.
    This example was chosen since it is relatively high volume and low latency, so the cost of a separate thread is potentially more of a concern than if the backend network latency were 100ms
    or higher.


  23. Cost of Thread @ ~60rps
    mean - median - 90th - 99th (time in ms)
    Time for thread to execute Time user thread waited
    Cost: 0ms
    23
    At the median (and lower) there is no cost to having a separate thread.


  24. Cost of Thread @ ~60rps
    mean - median - 90th - 99th (time in ms)
    Time for thread to execute Time user thread waited
    Cost: 3ms
    24
    At the 90th percentile there is a cost of 3ms for having a separate thread.


  25. Cost of Thread @ ~60rps
    mean - median - 90th - 99th (time in ms)
    Time for thread to execute Time user thread waited
    Cost: 9ms
    25
    At the 99th percentile there is a cost of 9ms for having a separate thread. Note however that the increase in cost is far smaller than the increase in execution time of the separate thread,
    which jumped from 2ms to 28ms, whereas the cost jumped from 0ms to 9ms.
    This overhead at the 90th percentile and higher has been deemed acceptable for circuits such as these given the resilience benefits achieved.
    For circuits that wrap very low latency requests (such as those primarily hitting in-memory caches) the overhead can be too high, and in those cases we choose tryable semaphores,
    which do not allow for timeouts but provide most of the resilience benefits without the overhead. In general, though, the overhead is small enough that we prefer the isolation benefits of a
    separate thread.


  26. Cost of Thread @ ~75rps
    mean - median - 90th - 99th (time in ms)
    Time user thread waited
    Time for thread to execute
    26
    This is a second sample of a dependency circuit over 24 hours from the Netflix API production cluster, with a rate of 75rps per server.
    As with the first example, this was chosen since it is relatively high volume and low latency, so the cost of a separate thread is potentially more of a concern than if the backend network
    latency were 100ms or higher.
    Each execution occurs in a separate thread, with mean, median, 90th and 99th percentile latencies shown in the first 4 legend values. The second group of 4 values is for the user thread
    waiting on the dependency thread and shows the total time including queuing, scheduling, execution and waiting for the return value from the Future.
    The calling-thread median, 90th and 99th percentiles are the last 3 legend values.


  27. Cost of Thread @ ~75rps
    mean - median - 90th - 99th (time in ms)
    Time user thread waited
    Time for thread to execute
    Cost: 0ms
    27
    At the median (and lower) there is no cost to having a separate thread.


  28. Cost of Thread @ ~75rps
    mean - median - 90th - 99th (time in ms)
    Time user thread waited
    Time for thread to execute
    Cost: 2ms
    28
    At the 90th percentile there is a cost of 2ms for having a separate thread.


  29. Cost of Thread @ ~75rps
    mean - median - 90th - 99th (time in ms)
    Time user thread waited
    Time for thread to execute
    Cost: 2ms
    29
    At the 99th percentile there is a cost of 2ms for having a separate thread.


  30. Semaphores
    Effectively No Cost
    ~5000rps per instance
    30
    Semaphore isolation, on the other hand, is used for dependencies that are very high-volume, in-memory lookups and never result in a synchronous network request. The cost is practically
    zero (an atomic compare-and-set counter implements the tryable semaphore).
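    A minimal sketch of such semaphore isolation using the JDK's Semaphore (names and limits are hypothetical):
    import java.util.concurrent.Semaphore;

    public class SemaphoreIsolatedLookup {
        private final Semaphore permits = new Semaphore(100); // cap concurrent executions for this dependency

        public String get(String key) {
            if (!permits.tryAcquire()) { // non-blocking "tryable" acquire: no queuing, no timeout needed
                return fallback();       // saturated: shed load immediately
            }
            try {
                return lookupInMemory(key); // must never make a synchronous network call
            } finally {
                permits.release();
            }
        }

        private String lookupInMemory(String key) { return key; } // stand-in for the real in-memory lookup
        private String fallback() { return null; }                // stand-in fallback
    }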


  31. Netflix DependencyCommand Implementation
    31


  32. Netflix DependencyCommand Implementation
    (1) Construct DependencyCommand Object
    On each dependency invocation its DependencyCommand object will be constructed with the arguments necessary to make the call to the server.
    For example:
    DependencyCommand command = new DependencyCommand(arg1, arg2)
    (2) Execution Synchronously or Asynchronously
    Execution of the command can then be performed synchronously or asynchronously:
    K value = command.execute()
    Future<K> value = command.queue()
    The synchronous call execute() invokes queue().get() unless the command is specified to not run in a thread.
    (3) Is Circuit Open?
    Upon execution of the command it first checks with the circuit-breaker to ask "is the circuit open?".
    If the circuit is open (tripped) then the command will not be executed and flow routed to (8) DependencyCommand.getFallback().
    If the circuit is closed then the command will be executed and flow continue to (5) DependencyCommand.run().
    (4) Is Thread Pool/Queue Full?
    If the thread-pool and queue associated with the command is full then the execution will be rejected and immediately routed through fallback (8).
    If the command does not run within a thread then this logic will be skipped.
    (5) DependencyCommand.run()
    The concrete implementation run() method is executed.
    (5a) Command Timeout
    The run() method occurs within a thread with a timeout and if it takes too long the thread will throw a TimeoutException. In that case the response is routed
    through fallback (8) and the eventual run() method response is discarded.
    If the command does not run within a thread then this logic will not be applicable.
    32
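    A compressed sketch of the flow above (the circuit-breaker interactions of steps (3) and (7) are omitted; names follow the slides but the code is illustrative, not Netflix's implementation):
    import java.util.concurrent.*;

    public abstract class DependencyCommand<K> {
        private static final ExecutorService POOL = Executors.newFixedThreadPool(10);
        private static final long TIMEOUT_MS = 200;

        protected abstract K run() throws Exception; // (5) the wrapped dependency call
        protected abstract K getFallback();          // (8) cache, stub or empty response

        public Future<K> queue() {         // (2) asynchronous execution
            return POOL.submit(this::run); // (4) throws RejectedExecutionException when pool/queue is full
        }

        public K execute() {               // (2) synchronous execution invokes queue().get()
            try {
                return queue().get(TIMEOUT_MS, TimeUnit.MILLISECONDS); // (5a) timeout guard
            } catch (Exception e) {        // rejection, timeout or run() failure
                return getFallback();      // (8) degrade gracefully instead of propagating
            }
        }
    }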


  33. Netflix DependencyCommand Implementation
    (6) Is Command Successful?
    Application flow is routed based on the response from the run() method.
    (6a) Successful Response
    If no exceptions are thrown and a response is returned (including a null value) then it proceeds to return the response after some logging and a performance
    check.
    (6b) Failed Response
    When a response throws an exception it will mark it as "failed" which will contribute to potentially tripping the circuit open and it will route application flow to (8)
    DependencyCommand.getFallback().
    (7) Calculate Circuit Health
    Successes, failures, rejections and timeouts are all reported to the circuit breaker to maintain a rolling set of counters which calculate statistics.
    These stats are then used to determine when the circuit should "trip" and become open at which point subsequent requests are short-circuited until a period of
    time passes and requests are permitted again after health checks succeed.
    (8) DependencyCommand.getFallback()
    The fallback is performed whenever a command execution fails (an exception is thrown by (5) DependencyCommand.run()) or when it is (3) short-circuited
    because the circuit is open.
    The intent of the fallback is to provide a generic response without any network dependency from an in-memory cache or other static logic.
    (8a) Fallback Not Implemented
    If DependencyCommand.getFallback() is not implemented then an exception will be thrown and the caller left to deal with it.
    (8b) Fallback Successful
    If the fallback returns a response then it will be returned to the caller.
    (8c) Fallback Failed
    If DependencyCommand.getFallback() fails and throws an exception then the caller is left to deal with it.
    It is considered poor practice to have a fallback implementation that can fail. A fallback should be implemented such that it is not performing any logic that
    could fail. Semaphores are wrapped around fallback execution to protect against software bugs that do not comply with this principle, particularly if the fallback itself
    tries to perform a network call that can be latent.
    (9) Return Successful Response
    If (6a) occurred the successful response will be returned to the caller regardless of whether it was latent or not.
    33
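    A simplified illustration of the circuit-breaker behavior described in (3) and (7) (thresholds are invented for the sketch; the real implementation uses rolling counters like those shown on the dashboard slides):
    import java.util.concurrent.atomic.AtomicInteger;

    public class CircuitBreaker {
        private final AtomicInteger requests = new AtomicInteger();
        private final AtomicInteger failures = new AtomicInteger();
        private volatile long openedAt = 0; // 0 means the circuit is closed

        public boolean allowRequest() {
            if (openedAt == 0) return true; // closed: traffic flows normally
            // open: short-circuit, but after a cool-off permit a trial ("half-open") request
            return System.currentTimeMillis() - openedAt > 5000;
        }

        public void markSuccess() {
            requests.incrementAndGet();
            if (openedAt != 0) { // trial request succeeded: close the circuit and reset counters
                openedAt = 0;
                requests.set(0);
                failures.set(0);
            }
        }

        public void markFailure() {
            requests.incrementAndGet();
            // trip open once a meaningful sample shows a majority of failures
            if (requests.get() >= 20 && failures.incrementAndGet() * 2 > requests.get()) {
                openedAt = System.currentTimeMillis();
            }
        }
    }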


  34. Netflix DependencyCommand Implementation
    Fallbacks
    Cache
    Eventual Consistency
    Stubbed Data
    Empty Response
    34
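    For instance, a command wrapping a per-user bookmark lookup might fall back to stubbed data (a hypothetical example building on the DependencyCommand sketch above):
    public class BookmarkCommand extends DependencyCommand<Integer> {
        private final long videoId;

        public BookmarkCommand(long videoId) { this.videoId = videoId; }

        @Override protected Integer run() throws Exception {
            return fetchBookmarkFromService(videoId); // network call that may fail or be latent
        }

        @Override protected Integer getFallback() {
            return 0; // stubbed data: no network dependency; playback simply starts from the beginning
        }

        private int fetchBookmarkFromService(long id) throws Exception {
            throw new UnsupportedOperationException("stand-in for the real service call");
        }
    }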


  35. Netflix DependencyCommand Implementation
    35


  36. So, how does it work in the real world?
    36


  37. Visualizing Circuits in Near-Realtime
    (latency is single-digit seconds, generally 1-2)
    37
    This is an example of our monitoring system which provides low-latency (1-2 seconds typically) visibility into the traffic and health of all DependencyCommand circuits across a cluster.


  38. last minute latency percentiles
    Request rate
    2 minutes of request rate to
    show relative changes in traffic
    circle color and size represent
    health and traffic volume
    hosts reporting from cluster
    Error percentage of
    last 10 seconds
    Circuit-breaker
    status
    Rolling 10 second counters
    with 1 second granularity
    Failures/Exceptions
    Thread-pool Rejections
    Thread timeouts
    Successes
    Short-circuited (rejected)
    38
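    One way such rolling counters can be implemented, as a rough sketch (bucket recycling simplified; not the actual dashboard code):
    public class RollingCounter {
        private final long[] counts = new long[10];       // rolling 10-second window
        private final long[] bucketSecond = new long[10]; // which second each bucket currently holds

        public synchronized void increment() {
            long second = System.currentTimeMillis() / 1000;
            int i = (int) (second % 10);     // 1-second granularity
            if (bucketSecond[i] != second) { // bucket is stale: recycle it
                bucketSecond[i] = second;
                counts[i] = 0;
            }
            counts[i]++;
        }

        public synchronized long sum() {     // total events over the last 10 seconds
            long now = System.currentTimeMillis() / 1000;
            long total = 0;
            for (int i = 0; i < 10; i++) {
                if (now - bucketSecond[i] < 10) total += counts[i];
            }
            return total;
        }
    }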


  39. 39
    This view of the dashboard was captured during a latency monkey simulation to test resilience against latency (http://techblog.netflix.com/2011/07/netflix-simian-army.html) and shows how
    several of the DependencyCommands degraded in health, showing timeouts, thread-pool rejections, short-circuiting and failures.
    The DependencyCommands of dependencies not affected by latency were unaffected.
    During this test no users were prevented from using Netflix on any device. Instead, fallbacks and graceful degradation occurred, and as soon as the latency was removed all systems returned
    to health within seconds.


  40. 40
    This was another latency monkey simulation that affected a single DependencyCommand.


  41. Peak at 100M+ incoming requests (30k+/second)
    Success drops off, Timeouts and Short Circuiting shed load
    Latency spikes from ~30ms median to first 2000+ then 10000+ ms
    41
    These graphs show the full duration of a latency monkey simulation (and look similar to real production events) in which latency occurred and the DependencyCommand timed out, short-
    circuited the requests and returned fallbacks.


  42. 42


  43. Fallback.
    Fail silent.
    Fail fast.
    Shed load.
    43


  44. Netflix API
    Dependency A
    Dependency D
    Dependency G
    Dependency J
    Dependency M
    Dependency P
    Dependency B
    Dependency E
    Dependency H
    Dependency K
    Dependency N
    Dependency Q
    Dependency C
    Dependency F
    Dependency I
    Dependency L
    Dependency O
    Dependency R
    44
    The second half of the presentation discusses architectural changes that enable optimizing the API for each Netflix device, as opposed to a generic one-size-fits-all API that treats all devices the
    same.


  45. Single Network Request from Clients
    (use LAN instead of WAN)
    landing page requires
    ~dozen API requests
    Netflix API
    Device
    Server
    45
    The one-size-fits-all API results in chatty clients, some requiring roughly a dozen requests to render a page.


  46. Single Network Request from Clients
    (use LAN instead of WAN)
    some clients are limited in the number of
    concurrent network connections
    46


  47. Single Network Request from Clients
    (use LAN instead of WAN)
    network latency makes this even worse
    (mobile, home, wifi, geographic distance, etc)
    47


  48. Single Network Request from Clients
    (use LAN instead of WAN)
    push call pattern to server ...
    Netflix API
    Device
    Server
    48
    The client should make a single request and push the 'chatty' part to the server, where low-latency networks and multi-core servers can perform the work far more efficiently.


  49. Single Network Request from Clients
    (use LAN instead of WAN)
    ... and eliminate redundant calls
    Netflix API
    Device
    Server
    49


  50. 50
    With dozens of classes of devices to support, it wasn’t feasible for the API team to create custom endpoints for each device; otherwise a single team would be the bottleneck for all client
    teams, and it would be an explosion of complexity for a single team to try to manage. Also, the subject matter expertise of what each device needs does not reside with the API team.
    Instead, the API team provides a platform and allows each client team to build their own custom endpoints that are optimized for the device they are targeting.


  51. Send Only The Bytes That Matter
    (optimize responses for each client)
    part of client now on server
    Netflix API
    Client Client
    Device
    Server
    51
    The client now extends over the network barrier and runs a portion of itself on the server. The client sends requests over HTTP to its other half running in the server, which can then access a
    Java API at a very granular level to retrieve exactly what it needs and return an optimized response suited to the device's exact requirements and user experience.


  52. Send Only The Bytes That Matter
    (optimize responses for each client)
    client retrieves and delivers exactly what their
    device needs in its optimal format
    Netflix API
    Client Client
    Device
    Server
    52


  53. Send Only The Bytes That Matter
    (optimize responses for each client)
    interface is now a Java API that client
    interacts with at a granular level
    Netflix API
    Service Layer
    Client Client
    Device
    Server
    53


  54. Netflix API
    Service Layer
    Client Client
    Device
    Server
    Leverage Concurrency
    (but abstract away its complexity)
    54


  55. Leverage Concurrency
    (but abstract away its complexity)
    no synchronized, volatile, locks, Futures or
    Atomic*/Concurrent* classes in client-server code
    Netflix API
    Service Layer
    Client Client
    Device
    Server
    55
    Concurrency is abstracted away behind an asynchronous API, and data is retrieved, transformed and composed using higher-order functions (such as map, mapMany, merge, zip, take, toList,
    etc). Groovy is used for its closure support, which lends itself well to the functional programming style.


  56. Functional Reactive Programming
    composable asynchronous functions
    Fully asynchronous API - Clients can’t block
    def video1Call = api.getVideos(api.getUser(), 123456, 7891234);
    def video2Call = api.getVideos(api.getUser(), 6789543);
    // higher-order functions used to compose asynchronous calls together
    wx.merge(video1Call, video2Call).subscribe([
        onNext: { video ->
            // called for each 'video' from the merge
            response.getWriter().println("{id: " + video.id + ", title: '" + video.title + "'}");
        },
        onError: { exception ->
            response.getWriter().println("{errorMessage: '" + exception.getMessage() + "'}");
        }
    ])
    Service calls are all asynchronous
    Functional programming with higher-order functions
    56


  57. 57


  58. Bursts to Single Dependency
    Duplicate Requests
    58


  59. Request Collapsing
    batch don’t burst
    59
    The DependencyCommand resilience layer is leveraged for concurrency optimizations such as request collapsing (automated batching), which bundles bursts of calls to the same
    service into batches without the client code needing to understand or manually optimize for batching.
    This is particularly important when client code becomes highly concurrent and data is requested in multiple code paths, sometimes written by different engineers. Request
    collapsing automatically captures and batches the calls together. The collapsing functionality also supports sharded architectures, so a batch of requests can be split into sub-batches if
    the client-server relationship requires requests to be routed to a sharded backend.
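    A rough sketch of the collapsing idea, flushing on the short time window mentioned on a later slide (names, types and the CompletableFuture-based wiring are hypothetical):
    import java.util.*;
    import java.util.concurrent.*;

    public class RatingCollapser {
        private Map<Long, CompletableFuture<Double>> pending = new HashMap<>();
        private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

        public RatingCollapser() {
            timer.scheduleAtFixedRate(this::flush, 10, 10, TimeUnit.MILLISECONDS); // 10ms batch window
        }

        public synchronized CompletableFuture<Double> getRating(long videoId) {
            // concurrent callers within the same window share one pending Future per id
            return pending.computeIfAbsent(videoId, id -> new CompletableFuture<>());
        }

        private void flush() {
            Map<Long, CompletableFuture<Double>> batch;
            synchronized (this) {
                if (pending.isEmpty()) return;
                batch = pending;
                pending = new HashMap<>();
            }
            Map<Long, Double> results = fetchRatingsBatch(batch.keySet()); // one network call for N requests
            batch.forEach((id, f) -> f.complete(results.get(id)));
        }

        private Map<Long, Double> fetchRatingsBatch(Set<Long> ids) { // stand-in for the real batched call
            Map<Long, Double> m = new HashMap<>();
            for (Long id : ids) m.put(id, 0.0);
            return m;
        }
    }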


  60. Request Collapsing
    batch don’t burst
    100:1 collapsing ratio (batch size of ~100)
    60
    This graph shows an extreme example of a dependency where we collapse requests at a ratio of 100:1.


  61. Request Collapsing
    batch don’t burst
    100:1 collapsing ratio (batch size of ~100)
    4000 rps instead of 400,000 rps
    61
    This is the same graph but on a power scale instead of linear, so the blue line (actual network requests) is visible.


  62. 62
    When multiple calls to the same backend occur concurrently or within a short time-window (10ms for example) ...


  63. Multiple network
    calls collapsed
    into one
    63
    ... they are collapsed into a single batched request.


  64. Request Scoped Caching
    short-lived and concurrency aware
    64
    Another use of the DependencyCommand layer is to allow client code to perform requests without concern about duplicate network calls due to concurrency.
    The Future is atomically cached using “putIfAbsent” in a request scope shared via the ThreadLocal of each thread, so clients can request data in multiple code paths without inefficiency concerns.
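    A minimal sketch of that concurrency-aware caching pattern (the classic memoizing-Future idiom; names are illustrative and the per-request ThreadLocal scoping is omitted):
    import java.util.concurrent.*;

    public class RequestCache {
        private final ConcurrentHashMap<String, Future<Object>> cache = new ConcurrentHashMap<>();

        @SuppressWarnings("unchecked")
        public <T> Future<T> getOrExecute(String key, Callable<T> work, ExecutorService pool) {
            Future<Object> existing = cache.get(key);
            if (existing == null) {
                FutureTask<Object> task = new FutureTask<Object>(() -> work.call());
                // putIfAbsent is atomic: concurrent callers for the same key share a single
                // execution; only the winner's task is stored and actually run
                existing = cache.putIfAbsent(key, task);
                if (existing == null) {
                    pool.execute(task);
                    existing = task;
                }
            }
            return (Future<T>) existing;
        }
    }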


  65. Request Caching
    stateless
    65
    Some examples of request caching de-duplicating backend calls. For some commands the impact is reasonably high, while for most it is a small percentage or none at all, but overall it provided a
    measurable drop in network calls, and in some client-code use cases it significantly improved latency by eliminating unnecessary network calls.


  66. 66
    Within a single user request when multiple duplicate calls are executed ...


  67. Extra network call de-duped
    67
    ... they are de-duped through concurrency-aware request-scoped caches.


  68. Optimize for each device. Leverage the server.
    Netflix API
    Device
    Server
    68
    The Netflix API is becoming a platform that empowers user-interface teams to build their own API endpoints that are optimized for their client applications and
    devices.


  69. /ps3/home
    Dependency F
    10 Threads
    Dependency G
    10 Threads
    Dependency H
    10 Threads
    Dependency I
    5 Threads
    Dependency J
    8 Threads
    Dependency A
    10 Threads
    Dependency B
    8 Threads
    Dependency C
    10 Threads
    Dependency D
    15 Threads
    Dependency E
    5 Threads
    Dependency K
    15 Threads
    Dependency L
    4 Threads
    Dependency M
    5 Threads
    Dependency N
    10 Threads
    Dependency O
    10 Threads
    Dependency P
    10 Threads
    Dependency Q
    8 Threads
    Dependency R
    10 Threads
    Dependency S
    8 Threads
    Dependency T
    10 Threads
    /android/home
    /tv/home
    Functional Reactive Dynamic Endpoints
    Asynchronous Java API
    69


  70. Fault Tolerance in a High Volume, Distributed System
    http://techblog.netflix.com/2012/02/fault-tolerance-in-high-volume.html
    Making the Netflix API More Resilient
    http://techblog.netflix.com/2011/12/making-netflix-api-more-resilient.html
    Embracing the Differences : Inside the Netflix API Redesign
    http://techblog.netflix.com/2012/07/embracing-differences-inside-netflix.html
    Ben Christensen
    @benjchristensen
    http://www.linkedin.com/in/benjchristensen
    Netflix is Hiring
    http://jobs.netflix.com
    70
