August 20, 2016

2016 - Trisha Kothari - Data in a dynamic system: Strategies for backwards compatibility

Description
There are several unanswered questions in deploying huge schema or logic changes: How do you modify systems with zero downtime or service interruption? How do you optimize online data migrations to allow for fallbacks? Any changes in schema or code in dynamic systems may cause existing users to experience downtime. The talk focuses on strategies to ensure backwards compatibility and prevent breaking data integrity.

Abstract
In an ideal scenario, feature development is easy: replace the old code with new code, and you’re done. This is, in fact, true for a system in a state of inertia. In a dynamic system, however, with constantly moving pieces of business logic, it is a hard problem. Several questions remain unanswered when deploying large schema or logic changes: How do you make code and schema changes with zero downtime or service interruption? How do you optimize online data migrations to allow for fallbacks? Any change to schema or code in a dynamic system may cause existing users to experience downtime. The talk focuses on strategies to ensure backwards compatibility and prevent breaking data integrity.

Bio
Trisha works as a Software Engineer at Affirm, a take on modern banking started by Max Levchin. At Affirm, Trisha has worked on several projects including the creation of the underlying financial system, architecture of systems for underwriting data processing, and several other product features. She graduated from the University of Pennsylvania studying Computer Science.

PyBay

Transcript

Data in a dynamic system: Strategies for backwards compatibility
Breaking user experience is not a new problem
Who am I?
What is a loan? Movement of money in a ledger
Double entry accounting system
id  balance    amount  Mate id
1   cash       100     2
2   principal  -100    1
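
The two rows on this slide can be read as one movement of money recorded twice, with opposite signs, each row pointing at its mate. The sketch below is one way to model that in Python; the `Entry` class, `record_movement`, and `is_balanced` are hypothetical names for illustration and are not taken from the talk or from Affirm's system.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Entry:
    id: int
    balance: str   # which balance the entry touches, e.g. "cash"
    amount: int    # signed amount
    mate_id: int   # id of the paired entry on the other side


def record_movement(
    ledger: List[Entry], positive: str, negative: str, amount: int
) -> Tuple[Entry, Entry]:
    """Append two mirrored entries that reference each other."""
    first_id = len(ledger) + 1
    second_id = first_id + 1
    a = Entry(first_id, positive, amount, second_id)
    b = Entry(second_id, negative, -amount, first_id)
    ledger.extend([a, b])
    return a, b


def is_balanced(ledger: List[Entry]) -> bool:
    """Each entry must be cancelled exactly by the mate it points at."""
    by_id = {e.id: e for e in ledger}
    return all(
        e.mate_id in by_id
        and by_id[e.mate_id].mate_id == e.id
        and by_id[e.mate_id].amount == -e.amount
        for e in ledger
    )


# Reproduces the two rows shown on the slide.
ledger: List[Entry] = []
record_movement(ledger, "cash", "principal", 100)
```

Because every movement is written as a mirrored pair, a check like `is_balanced` gives a cheap integrity test to run before and after any migration of the ledger.
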
Bank v1 Bank v2
Why is data hard in a dynamic system?
Dynamic systems
• Service level changes
• Changes to data at rest
DATA INTEGRITY IS OF PARAMOUNT IMPORTANCE!
Deployment strategies must ensure backwards compatibility
Changes to service level code
• Conditionals
  ◦ if <Condition1>: Treatment1() else: Treatment2()
  ◦ Messy code :(
• API versioning
• Deploying new dependencies first
  ◦ Optional arguments
  ◦ Make sure results can be consumed by the caller
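
The "deploying new dependencies first" bullets can be sketched as follows: the callee gains an optional argument whose default preserves the old behaviour, and its result only adds keys, so callers that have not been redeployed keep working. All function names and the payment arithmetic here are hypothetical, chosen only to make the pattern concrete.

```python
from typing import Optional


def compute_payment_v1(principal: int, months: int) -> dict:
    """The dependency as it looked before the change."""
    return {"monthly": principal // months}


def compute_payment(principal: int, months: int, fee: Optional[int] = None) -> dict:
    """New version, deployed before any caller is changed.

    `fee` is optional and defaults to the old behaviour. Existing keys of
    the result keep their meaning; new keys are only ever added.
    """
    result = {"monthly": principal // months}
    if fee is not None:
        result["monthly"] += fee // months
        result["fee"] = fee
    return result


def old_caller() -> int:
    # Not yet redeployed: unaware of `fee`, reads only "monthly".
    return compute_payment(1200, 12)["monthly"]


def new_caller() -> int:
    # Deployed later: opts in to the new argument.
    return compute_payment(1200, 12, fee=120)["monthly"]
```

Compared with a conditional at every call site, this keeps the branching in one place, inside the dependency.
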
Why is dealing with data at rest hard?
Data is big! Data is dumb!
How do you get backwards compatibility?
Data versioning
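
The slide gives only the name of the technique, so the following is one common reading of it rather than the speaker's implementation: each stored record carries a version tag, and readers upgrade old records step by step when they load them. The record shapes (`amount` in dollars for v1, `amount_cents` for v2) and every name in the block are assumptions made for illustration.

```python
from typing import Callable, Dict

CURRENT_VERSION = 2


def _v1_to_v2(record: dict) -> dict:
    # Hypothetical change: v1 stored dollars as a float, v2 stores integer cents.
    return {"version": 2, "amount_cents": round(record["amount"] * 100)}


# One upgrade function per source version, applied in sequence.
UPGRADES: Dict[int, Callable[[dict], dict]] = {1: _v1_to_v2}


def read_record(record: dict) -> dict:
    """Return a copy of `record` upgraded to CURRENT_VERSION."""
    record = dict(record)
    # Records written before versioning existed carry no tag; treat them as v1.
    record["version"] = record.get("version", 1)
    while record["version"] < CURRENT_VERSION:
        record = UPGRADES[record["version"]](record)
    return record
```

With this shape, old rows never have to be rewritten in place for new code to read them, which leaves room for a fallback to the previous release.
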
Write-both migrations: Writer, Old data, Reader
Write-both migrations: Writer, Old data, Reader, New data
Write-both migrations: Writer, Old data, Reader, New data, Migration!! #hearandnow @trisha_kothari
Write-both migrations: Writer, New data, Reader
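
The four slides above list only the boxes of a diagram. A plausible reading, sketched below under that assumption, is: write to the old store only, then write to both stores while still reading the old one, then backfill rows that predate the double writes, then cut reads and writes over to the new store. The `Store` class, its in-memory dictionaries, and the name-splitting format change are all hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class Phase(Enum):
    OLD_ONLY = 1     # writer -> old store, reader <- old store
    WRITE_BOTH = 2   # writer -> old and new stores, reader <- old store
    NEW_ONLY = 3     # writer -> new store, reader <- new store


class Store:
    def __init__(self) -> None:
        self.phase = Phase.OLD_ONLY
        self.old: Dict[int, str] = {}             # old format: "first last"
        self.new: Dict[int, Dict[str, str]] = {}  # new format: split fields

    @staticmethod
    def _to_new(value: str) -> Dict[str, str]:
        first, _, last = value.partition(" ")
        return {"first": first, "last": last}

    def write(self, key: int, value: str) -> None:
        if self.phase in (Phase.OLD_ONLY, Phase.WRITE_BOTH):
            self.old[key] = value
        if self.phase in (Phase.WRITE_BOTH, Phase.NEW_ONLY):
            self.new[key] = self._to_new(value)

    def read(self, key: int) -> Optional[str]:
        if self.phase is Phase.NEW_ONLY:
            row = self.new.get(key)
            return None if row is None else f"{row['first']} {row['last']}"
        return self.old.get(key)

    def backfill(self) -> int:
        """Copy rows written before WRITE_BOTH into the new store."""
        missing = [k for k in self.old if k not in self.new]
        for k in missing:
            self.new[k] = self._to_new(self.old[k])
        return len(missing)


store = Store()
store.write(1, "Ada Lovelace")   # only the old store has this row
store.phase = Phase.WRITE_BOTH
store.write(2, "Grace Hopper")   # lands in both stores
migrated = store.backfill()      # migration step: copies key 1 across
store.phase = Phase.NEW_ONLY     # cut over once the stores agree
```

Until the final cut-over the old store stays complete, so switching `phase` back is enough to fall back. The sketch ignores concurrency between the backfill and live writes, which a real migration would have to handle.
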
Complex Data migrations
luigi
• Open sourced in late 2012
• Awesome for batch jobs
• Not for replacing Hive or Pig
• Spotify, Foursquare, Stripe, Affirm, hotels.com, etc
Why Luigi for data migration?
Quick note on schema migrations
Alembic
Database migration tool for SQLAlchemy
Autogenerate:
• Easy!
• Gotcha: Renaming a column ⇒ Removal and addition of new column
• Another gotcha: “Multiple heads not supported”
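
Why the rename gotcha matters can be shown without Alembic at all. The sketch below uses only the standard-library `sqlite3` module and a hypothetical `users` table: a migration that adds a fresh column in place of the old one starts with no data, while a true rename keeps the existing values. In Alembic itself, the usual remedy is to hand-edit the generated migration to call `op.alter_column` with `new_column_name` instead of the generated drop and add.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with one populated row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
    return conn


# What a "remove + add" migration amounts to: the new column starts empty.
# The old column is left in place here rather than dropped, to keep the
# sketch portable across SQLite versions; its data would be gone either way.
as_generated = make_db()
as_generated.execute("ALTER TABLE users ADD COLUMN full_name TEXT")
lost = as_generated.execute(
    "SELECT full_name FROM users WHERE id = 1"
).fetchone()[0]

# A hand-written rename keeps the existing values (needs SQLite 3.25+).
as_renamed = make_db()
as_renamed.execute("ALTER TABLE users RENAME COLUMN name TO full_name")
kept = as_renamed.execute(
    "SELECT full_name FROM users WHERE id = 1"
).fetchone()[0]
```

Here `lost` is `None` and `kept` is `'Ada'`, which is the difference between an autogenerated migration applied blindly and one reviewed by hand.
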
Paying it forward: lessons learned
Evolution of database normalizations
Thank you! @trishak42 [email protected]