Shipping a Replacement Architecture in Elixir

Chris Bell
February 11, 2018

Talk given at EMPEX LA, 2018.

Describes the journey at Frame.io of shipping a new Elixir-powered set of services to replace our aging Ruby on Rails and Node.js stack.

#elixir #phoenix

Transcript

  1. I’m Chris and I <3 Elixir
     • 3 years of writing production Elixir apps
     • EMPEX NYC Organizer
     • ElixirTalk Co-host (with @desmondmonster)
  2. • Powers Review & Collaboration for Video teams
     • Used by Vice, Turner, Buzzfeed, NYTimes, NASA
     • 450,000+ customers
     • Founded in 2014, based in NYC
  3. API ISSUES PRE-MIGRATION
     • Ruby 1.9.3, Rails 3.2
     • No tests… at all… for anything
     • Custom ORM into DynamoDB
     • Metaprogramming everywhere
     • No logical separation of concerns (authorization, persistence, domain logic)
     • No metrics or visibility into performance
  4. DATABASE ISSUES PRE-MIGRATION
     • Everything is a string… even nulls.
     • Foreign keys stored as a JSON-encoded list of strings on the model. No atomic updates = lost data.
     • DynamoDB cost $$$ to support our workload
     • No ability to paginate anything = up to 45s response times and large payloads (> 1 MB of JSON)
  5. WHY ELIXIR?
     • Easy-ish ramp-up for our existing Ruby / Python developers
     • Highly concurrent: uses resources much more efficiently
     • Built on a mature VM (the BEAM) and an established language (Erlang & OTP)
     • Language attributes promote explicitness: immutability, pattern matching, multiple function heads
  6. LEGACY ARCHITECTURE
     (Diagram) Client (iOS / Web / Adobe) → API (Ruby on Rails) → DynamoDB, with a websocket connection to a Real-time Service (Node.js), plus a Support Tool (Node.js), Push Notifications, and an SES Email Digest Service (Node.js)
  7. LEGACY ARCHITECTURE: WHAT WE REPLACED
     (Same diagram as above, highlighting the pieces we replaced)
  8. WHAT WE SHIPPED
     • Elixir-powered API, notifications system, real-time service, and support tool
     • Migrated all of our data from DynamoDB to Postgres
     • Dockerized all of the above and rebuilt our tooling and deploy process from the ground up
  9. UPDATED ARCHITECTURE
     (Diagram) Client (iOS / Web / Adobe) talks to the Munger API (Phoenix) and the V2 API (Phoenix), with a websocket connection to the Real-time Service (Elixir / Phoenix); a Support Tool (Phoenix), Push Notifications, and an SES Email Service sit alongside; all share the Core Business Logic, backed by Postgres and Memcached, inside a single Umbrella App
  10. WHAT WE SHIPPED: RESULTS
      • ~40 EC2 instances down to ~5 (running on ECS)
      • API 95th percentile: ~30ms @ ~120rps
      • Database cost reduced by 91%
      • Full visibility into all parts of the system (via statsd & Datadog)
      • Modular, documented, maintainable codebase
  11. A WHIRLWIND TOUR THROUGH THE SYSTEM
      1. The Intermediary API
      2. Umbrella App Structure
      3. Event System
      4. Moving millions of records
  12. THE INTERMEDIARY API
      • Consumes the new API, spits out old schemas, and maintains the legacy contract (by stringifying everything; see the sketch below)
      • Allowed us to ship our new stack sooner with fewer implications for our different clients
      • Complexity is high, but it was designed to be thrown away (in ~6 months’ time)
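A minimal sketch of the stringify-everything idea behind the intermediary API, assuming a hypothetical Munger.LegacyEncoder module (the deck doesn't show the real code): it walks a new-API response and coerces every scalar value, including nil, into the string form the legacy clients expect.

```elixir
defmodule Munger.LegacyEncoder do
  @moduledoc "Hypothetical example: re-encodes new API responses for the legacy contract."

  # Recursively convert every scalar value in a response into a string.
  def stringify(%{} = map), do: Map.new(map, fn {k, v} -> {k, stringify(v)} end)
  def stringify(list) when is_list(list), do: Enum.map(list, &stringify/1)
  def stringify(nil), do: "null"
  def stringify(value) when is_binary(value), do: value
  def stringify(value), do: to_string(value)
end

# Munger.LegacyEncoder.stringify(%{"id" => 42, "archived_at" => nil})
# #=> %{"id" => "42", "archived_at" => "null"}
```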
  13. A WHIRLWIND TOUR THROUGH THE SYSTEM
      1. The Intermediary API
      2. Umbrella App Structure
      3. Event System
      4. Moving millions of records
  14. UMBRELLA APP STRUCTURE
      • We use a single ‘monorepo’ to contain all our separate applications, structured as an umbrella (see the sketch below)
      • Total of 11 apps right now
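For context, this is roughly what an umbrella root looks like; the project module name below is made up, and the child apps under apps/ are the ones named on the next slide.

```elixir
# Hypothetical umbrella root mix.exs: each child app (core, api, support_tool,
# munger, ...) lives under apps/ and is its own OTP application with its own
# mix.exs and dependencies.
defmodule Frameio.Umbrella.MixProject do
  use Mix.Project

  def project do
    [
      apps_path: "apps",                      # all 11 apps live under apps/
      start_permanent: Mix.env() == :prod,
      deps: []                                # umbrella-wide deps only
    ]
  end
end
```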
  15. UMBRELLA APP STRUCTURE
      (Diagram of the apps grouped into Phoenix apps, business logic, and shared components: Core, API, Support Tool, Munger, DB, Cron, Dynasaur, Monitoring, Middleware, Auth, Email)
  16. UMBRELLA APP STRUCTURE
      (Same app-grouping diagram as above)
  17. UMBRELLA APP STRUCTURE
      • Apps are built and deployed as separate Docker containers in CircleCI via Distillery (see the sketch below)
      • Each build & deploy takes ~5 minutes (run in parallel)
      • Blue / green deploys via ECS
      • All auto-scaled via CPU / memory threshold alarms
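A rough sketch of how separate releases per umbrella app can be declared in Distillery's rel/config.exs; the release names and options are illustrative, not Frame.io's actual configuration. CI then packages each release into its own container image for ECS.

```elixir
use Mix.Releases.Config,
  default_release: :api,
  default_environment: Mix.env()

environment :prod do
  set include_erts: true
  set include_src: false
end

# One release per deployable umbrella app; each is built in CI and baked
# into its own Docker image.
release :api do
  set version: current_version(:api)
  set applications: [:api]
end

release :support_tool do
  set version: current_version(:support_tool)
  set applications: [:support_tool]
end
```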
  18. UMBRELLA APP STRUCTURE
      (Same app-grouping diagram as above)
  19. UMBRELLA APP STRUCTURE: CORE
      • Core houses all of our business logic, services, Ecto schemas, access policies, deferred logic, and more
      • Broken into two contexts: Accounts & Projects
      • The API & Support Tool use Core to fetch data and execute requests; they are effectively dumb HTTP wrappers (see the sketch below)
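A minimal sketch of the "dumb HTTP wrapper" split, using hypothetical module and function names (Core.Projects.fetch_project/2 is assumed, not taken from the deck): the Phoenix controller only translates HTTP into a Core call and renders the result, while authorization and domain logic stay in Core.

```elixir
defmodule API.ProjectController do
  use Phoenix.Controller

  # All business rules (policies, persistence, events) live in the Core app;
  # the controller just maps HTTP onto a Core function and renders the result.
  def show(conn, %{"id" => id}) do
    case Core.Projects.fetch_project(conn.assigns.current_user, id) do
      {:ok, project} -> json(conn, project)
      {:error, :unauthorized} -> send_resp(conn, 403, "Forbidden")
      {:error, :not_found} -> send_resp(conn, 404, "Not Found")
    end
  end
end
```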
  20. A WHIRLWIND TOUR THROUGH THE SYSTEM
      1. The Intermediary API
      2. Umbrella App Structure
      3. Event System
      4. Moving millions of records
  21. EVENT SYSTEM: WHAT IS IT?
      • All changes through our system are broadcast through a single, local event bus
      • Provides a powerful hook to build deferred functionality on top of (like notifications, analytics tracking, etc.)
      • Implemented using GenStage and protocols
  22. EVENT SYSTEM
      (Diagram) An AssetCreated event flows through the Event Broadcaster out to Audits, Analytics, Notifications, and Usage Cache consumers
      • The broadcaster notifies all consumers concurrently (see the sketch below)
      • Events are always typed as structs
      • Consumers are implemented as DynamicSupervisors
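A stripped-down sketch of how such an event bus can be wired up with GenStage; the module names (Events.Broadcaster, Events.AssetCreated, Consumers.Notifications) are placeholders, and the real system's DynamicSupervisor wiring is omitted. GenStage's BroadcastDispatcher is what lets every consumer see every event concurrently.

```elixir
defmodule Events.AssetCreated do
  # Events are always typed as structs.
  defstruct [:asset_id, :project_id, :user_id]
end

defmodule Events.Broadcaster do
  use GenStage

  def start_link(_), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  # Called from business logic after a change has been committed.
  def broadcast(event), do: GenStage.cast(__MODULE__, {:broadcast, event})

  def init(:ok), do: {:producer, :ok, dispatcher: GenStage.BroadcastDispatcher}

  def handle_cast({:broadcast, event}, state), do: {:noreply, [event], state}
  def handle_demand(_demand, state), do: {:noreply, [], state}
end

defmodule Consumers.Notifications do
  use GenStage

  def start_link(_), do: GenStage.start_link(__MODULE__, :ok)

  # Every consumer (audits, analytics, notifications, usage cache) subscribes
  # to the broadcaster and receives each event.
  def init(:ok), do: {:consumer, :ok, subscribe_to: [Events.Broadcaster]}

  def handle_events(events, _from, state) do
    Enum.each(events, fn %Events.AssetCreated{} = event ->
      # e.g. enqueue a push notification about the new asset
      IO.inspect(event, label: "notify")
    end)

    {:noreply, [], state}
  end
end
```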
  23. A WHIRLWIND TOUR THROUGH THE SYSTEM
      1. The Intermediary API
      2. Umbrella App Structure
      3. Event System
      4. Moving millions of records
  24. MOVING MILLIONS OF RECORDS
      • Moved all our records from DynamoDB into Postgres
      • Migrated through Flow tasks that streamed data from our tables and converted it into the appropriate Postgres schemas (see the sketch below)
      • Largest table was ~9m records, each record > 100kb (lots of JSON)
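A condensed sketch of what one such Flow task can look like. Dynasaur.stream_table/1, Translate.to_new/1, Core.Repo, and Core.Assets.Asset are hypothetical names standing in for the real schema and translation code: stream items out of a Dynamo table, translate each old item into new attributes, and insert it into Postgres, with the stage count as the parallelism knob that got tuned per job.

```elixir
defmodule Migrate.Assets do
  # Hypothetical migration task: Dynamo table -> translate -> Postgres.
  def run do
    "assets"
    |> Dynasaur.stream_table()                          # lazily scan the Dynamo table
    |> Flow.from_enumerable(stages: 8, max_demand: 100) # tune parallelism per job
    |> Flow.map(&Translate.to_new/1)                    # old item -> new attribute map
    |> Flow.each(fn attrs ->
      Core.Repo.insert!(struct(Core.Assets.Asset, attrs))
    end)
    |> Flow.run()
  end
end
```

In practice the writes would likely be batched (e.g. with Repo.insert_all) and retried on failure; this just illustrates the stream–translate–insert shape of the jobs.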
  25. MOVING MILLIONS OF RECORDS: HOW IT WORKS
      1. Define Dynamo schema
      2. Stream from table
      3. Translate old to new
  26. MOVING MILLIONS OF RECORDS: HOW IT WORKS
      (Same three-step diagram, continued)
  27. MOVING MILLIONS OF RECORDS: HOW IT WORKS
      (Same three-step diagram, continued)
  28. MOVING MILLIONS OF RECORDS: HOW IT WORKS
      • Each table migration job runs in its own isolated Docker container via the ECS run-task API
      • We monitored errors in our jobs and constantly refined and tweaked the parallelism for each job
      • Ran weekly migrations and manually checked the migrated data in our QA environment
  29. CHALLENGES DURING THE MIGRATION
      • Bugs, and replicating old bugs
      • Team ramp-up: 3 new developers learning Elixir while trying to ship a thing is hard. Pro tip: establish patterns.
      • Understanding the performance characteristics of a new system and a new database
      • Estimation of complexity: we went 6 weeks over our planned delivery date
  30. TAKEAWAY #3
      “Good code isn’t about getting it right the first time. Good code is just legacy code that doesn’t get in the way.” (@tef_ebooks)