Back on your Feet

BACK ON YOUR FEET
Or how I learned to love state

HELLO!
Claudio Ortolina
Head of Elixir @ Erlang Solutions Ltd.
[email protected] @cloud8421

ABOUT STATE
Data that changes over time

Useful
Difficult to model
Difficult to maintain
Can be lost (and needs to be recovered)

IT REQUIRES THINKING

OUT OF THE TAR PIT
B. Moseley, P. Marks, 2006

OUR USE CASE: GIGS NEAR ME
For the concert connoisseur

HOW IT WORKS
➤ Given my location (defined by lat/lng), get a list of relevant gigs
➤ For each one of them, get the artists involved
➤ For each artist, get their latest release

EXAMPLE
GET /gigs/47.6183116/-122.20037839999999

DATA MODEL
LOCATION has many EVENTS
EVENT has many ARTISTS
ARTIST has one RELEASE

DATA MODEL
defmodule Gig.Location do
  defstruct coords: {0, 0}, metro_area: nil, event_ids: []
end

defmodule Gig.Event do
  defstruct id: nil, name: nil, artists: [], venue: nil, starts_at: :not_available
end

DATA MODEL
defmodule Gig.Artist do
  defstruct id: nil, mbid: nil, name: nil
end

defmodule Gig.Release do
  defstruct id: nil, title: nil, type: "Album", release_date: :not_available
end

PAIN POINTS
➤ Cannot query APIs in real time (too expensive, N+1 API calls)
➤ Both APIs are rate-limited
➤ Need to cache data
➤ Results update over time without us knowing anything about it (making polling necessary)

ITERATION ONE

ONE LOCATION, ONE PROCESS
➤ For each location, start a new process
➤ We use Registry to track them
➤ Each process fetches and refreshes its own data
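
The bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from the talk: the module name GigSketch.LocationServer, the registry name GigSketch.Registry, the 30-minute lifetime and the stubbed fetch_events/1 are all assumptions.

```elixir
defmodule GigSketch.LocationServer do
  # Hypothetical module: one GenServer per location, named through a Registry.
  use GenServer

  @registry GigSketch.Registry
  # Assumed lifetime; the slides do not give a value.
  @expire_after :timer.minutes(30)

  def start_link({lat, lng} = coords) when is_number(lat) and is_number(lng) do
    GenServer.start_link(__MODULE__, coords, name: via(coords))
  end

  def get(coords), do: GenServer.call(via(coords), :get)

  defp via(coords), do: {:via, Registry, {@registry, coords}}

  @impl true
  def init(coords) do
    # Schedule self-termination, then trigger the first fetch asynchronously.
    Process.send_after(self(), :expire, @expire_after)
    GenServer.cast(self(), :fetch_events)
    {:ok, %{coords: coords, events: [], status: :started}}
  end

  @impl true
  def handle_cast(:fetch_events, state) do
    # Stand-in for the real API call, which the slides do not show.
    {:noreply, %{state | events: fetch_events(state.coords), status: :fetched}}
  end

  @impl true
  def handle_call(:get, _from, state), do: {:reply, state, state}

  @impl true
  def handle_info(:expire, state), do: {:stop, :normal, state}

  defp fetch_events(_coords), do: []
end
```

The registry has to be started first, for example with Registry.start_link(keys: :unique, name: GigSketch.Registry), after which GigSketch.LocationServer.get({47.6, -122.2}) reaches the process for that location.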

PROCESS LIFECYCLE
Timeline: INIT → FETCH EVENTS (cast) → FETCH RELEASES (cast) → TERMINATE (send_after :expire)

PROS
➤ Basic isolation (an isolated process crash doesn't affect others)
➤ Scales predictably (memory usage)
➤ Easy expire (self-terminate the process)

CONS
➤ A repeated failure of a single process can take down the application tree
➤ Events are duplicated among processes

ITERATION TWO

EXTRACT DATA STORAGE
➤ Events and releases moved to shared ETS tables
➤ Process keeps location data + list of event ids
➤ Requires periodic cleanup of storage (in case of crashes, data may get stale)
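
A shared ETS table for events could look something like this. The module name GigSketch.Store and the table name are hypothetical; only the :ets calls themselves are standard.

```elixir
defmodule GigSketch.Store do
  # Hypothetical wrapper around a shared, public ETS table of events.
  @events :gig_sketch_events

  # Call this from a long-lived process: the caller owns the table,
  # and the table is deleted when its owner exits.
  def init do
    :ets.new(@events, [
      :named_table,
      :set,
      :public,
      read_concurrency: true,
      write_concurrency: true
    ])

    :ok
  end

  # Any process can write, keyed by event id.
  def put_event(%{id: id} = event) do
    true = :ets.insert(@events, {id, event})
    :ok
  end

  # Any process can read without going through a GenServer.
  def get_event(id) do
    case :ets.lookup(@events, id) do
      [{^id, event}] -> {:ok, event}
      [] -> :error
    end
  end
end
```

With this shape, the location process only needs to keep the list of event ids and can look each one up with GigSketch.Store.get_event/1.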

PROCESS LIFECYCLE
Timeline: INIT → TERMINATE (send_after :expire)
ETS: FETCH EVENTS (cast), FETCH RELEASES (cast)

PROS
➤ More efficient memory usage
➤ Concurrent reads and writes
➤ Data survives everything except a node crash

CONS
➤ A process crash loses the relationship between location and events

ITERATION THREE

MORE EXTRACTIONS
➤ Move locations to ETS
➤ Don’t go through the process for any reads
➤ The process is only responsible for refresh and expire

PROCESS LIFECYCLE
Timeline: INIT → TERMINATE (send_after :expire)
ETS: FETCH EVENTS (cast), FETCH RELEASES (cast), STORE LOCATION (cast)

PROS
➤ Fast concurrent lookup for everything
➤ Survives refresh crashes (worst case scenario is stale data)

CONS
➤ Stale data requires explicit information about its nature (e.g. 6 hours old)
➤ Requires sweep
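
One way to make staleness explicit and to sweep old entries, sketched under assumptions: the module name GigSketch.TimestampedStore, the table name and the choice of seconds as the unit are illustrative and not taken from the talk.

```elixir
defmodule GigSketch.TimestampedStore do
  # Hypothetical store that records when each entry was written.
  @table :gig_sketch_locations

  def init do
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    :ok
  end

  # `now` is injectable to keep the sketch easy to test.
  def put(key, value, now \\ System.system_time(:second)) do
    true = :ets.insert(@table, {key, value, now})
    :ok
  end

  # Returns the value together with its age in seconds,
  # so the API layer can tell the consumer how old the data is.
  def get(key, now \\ System.system_time(:second)) do
    case :ets.lookup(@table, key) do
      [{^key, value, stored_at}] -> {:ok, value, now - stored_at}
      [] -> :error
    end
  end

  # Sweep: delete entries older than `max_age` seconds.
  # Returns the number of deleted entries.
  def sweep(max_age, now \\ System.system_time(:second)) do
    cutoff = now - max_age
    :ets.select_delete(@table, [{{:_, :_, :"$1"}, [{:<, :"$1", cutoff}], [true]}])
  end
end
```

A periodic process could call sweep/2 on a timer, while readers use the age returned by get/2 to label data as stale.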

ITERATION FOUR

GOING DISTRIBUTED
➤ Discrete pieces of data linked by references (event ids, MusicBrainz ids)
➤ If any reference points to non-existent data, we can trigger a refresh and expose the inconsistent state to the API consumer, so that the user has the right expectations
➤ For sharding on normal distribution, we can replace ETS with shards (or equivalent)
➤ Another option is using a classic, external datastore (which allows horizontal scaling)
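
The idea of detecting dangling references and exposing them can be sketched as a pure function. GigSketch.Resolver and the :complete / :partial return shapes are hypothetical; the lookup function stands in for whatever storage is used.

```elixir
defmodule GigSketch.Resolver do
  # Hypothetical helper: resolve event ids through a lookup function
  # and report which references point to missing data.
  def resolve(event_ids, lookup) when is_list(event_ids) and is_function(lookup, 1) do
    {found, missing} =
      Enum.reduce(event_ids, {[], []}, fn id, {found, missing} ->
        case lookup.(id) do
          {:ok, event} -> {[event | found], missing}
          :error -> {found, [id | missing]}
        end
      end)

    case missing do
      [] -> {:complete, Enum.reverse(found)}
      _ -> {:partial, Enum.reverse(found), Enum.reverse(missing)}
    end
  end
end
```

A caller that receives {:partial, found, missing} can schedule a refresh for the missing ids and still return the found events, marked as incomplete, to the API consumer.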

RATE LIMIT
➤ Interaction with external APIs can be modelled with a queue which fetches while respecting a rate limit
➤ API client should also use a rate limiter to avoid being blocked
➤ Load testing with rate limit is key
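
The decision at the heart of a rate limiter can be sketched as a small data structure. This is a fixed-window limiter chosen for brevity, not necessarily what the talk used; GigSketch.RateLimiter and its fields are hypothetical. A queue process would call check/2 before each fetch and sleep or reschedule when told to wait.

```elixir
defmodule GigSketch.RateLimiter do
  # Hypothetical fixed-window limiter: at most `limit` requests per `window_ms`.
  defstruct limit: 5, window_ms: 1_000, window_start: 0, count: 0

  def new(limit, window_ms), do: %__MODULE__{limit: limit, window_ms: window_ms}

  # Returns {:ok, limiter} when the request is allowed,
  # or {:wait, ms, limiter} with the time left until the window resets.
  def check(%__MODULE__{} = rl, now_ms) do
    rl =
      if now_ms - rl.window_start >= rl.window_ms do
        # The previous window is over: start a new one.
        %{rl | window_start: now_ms, count: 0}
      else
        rl
      end

    if rl.count < rl.limit do
      {:ok, %{rl | count: rl.count + 1}}
    else
      {:wait, rl.window_start + rl.window_ms - now_ms, rl}
    end
  end
end
```

Passing the current time in explicitly keeps the limiter deterministic, which helps when load testing against a known rate limit.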

SUMMARY

FOCUS ON STATUSES
Exposing status (started, fetching, fetched) forces clients to handle all corner cases
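
As an illustration, a response renderer could carry the status alongside the data. The module GigSketch.Status and the payload shape are assumptions; only the three status names come from the slide.

```elixir
defmodule GigSketch.Status do
  # Hypothetical mapping from internal state to an API payload.
  # The status is always present, so clients must decide what to do in each case.
  def render(%{status: :started}), do: %{status: "started", gigs: []}
  def render(%{status: :fetching, events: events}), do: %{status: "fetching", gigs: events}
  def render(%{status: :fetched, events: events}), do: %{status: "fetched", gigs: events}
end
```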

ASSUME DATA INCONSISTENCY
Sooner or later something will crash. Focus on writing code that recovers as efficiently as possible

HANDLE DISTRIBUTION RESPONSIBLY
Powerful, but may introduce more inconsistent scenarios

KEEP A WIDE ARSENAL OF TOOLS
Queues, rate limiters and backoffs are only a few examples. Resilient design requires planning.
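
Of the tools named above, a backoff is the smallest to sketch. The version below is a capped exponential backoff; the module name GigSketch.Backoff and the default values are assumptions for illustration.

```elixir
defmodule GigSketch.Backoff do
  # Hypothetical capped exponential backoff.
  # attempt 0 waits base_ms, each further attempt doubles the delay, up to max_ms.
  def delay(attempt, base_ms \\ 500, max_ms \\ 30_000)
      when is_integer(attempt) and attempt >= 0 do
    min(base_ms * Integer.pow(2, attempt), max_ms)
  end
end
```

A refreshing process could pass the result to Process.send_after/3 when scheduling a retry after a failed fetch.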

THANKS!
Any questions?
