Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
A perfect Storm for legacy migration
Search
ryan lemmer
October 21, 2013
Programming
0
1.6k
A perfect Storm for legacy migration
EuroClojure 2013 - Berlin
ryan lemmer
October 21, 2013
Tweet
Share
More Decks by ryan lemmer
See All by ryan lemmer
Modern Haskell: making sense of the type system
ryanlemmer
1
600
Distributed Computation: dealing with Time and Failure in the wild
ryanlemmer
0
850
Other Decks in Programming
See All in Programming
猫と暮らすネットワークカメラ生活🐈 ~Vision frameworkでペットを愛でよう~ / iOSDC Japan 2025
yutailang0119
0
220
10年もののAPIサーバーにおけるCI/CDの改善の奮闘
mbook
0
780
私はどうやって技術力を上げたのか
yusukebe
43
17k
NetworkXとGNNで学ぶグラフデータ分析入門〜複雑な関係性を解き明かすPythonの力〜
mhrtech
3
1k
uniqueパッケージの内部実装を支えるweak pointerの話
magavel
0
920
実践AIチャットボットUI実装入門
syumai
7
2.5k
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
140
Чего вы не знали о строках в Python – Василий Рябов, PythoNN
sobolevn
0
160
Introducing ReActionView: A new ActionView-Compatible ERB Engine @ Kaigi on Rails 2025, Tokyo, Japan
marcoroth
3
920
Web Components で実現する Hotwire とフロントエンドフレームワークの橋渡し / Bridging with Web Components
da1chi
3
1.9k
overlayPreferenceValue で実現する ピュア SwiftUI な AdMob ネイティブ広告
uhucream
0
110
CSC305 Lecture 02
javiergs
PRO
1
260
Featured
See All Featured
Rebuilding a faster, lazier Slack
samanthasiow
84
9.2k
How to Think Like a Performance Engineer
csswizardry
27
2k
Writing Fast Ruby
sferik
629
62k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.2k
Documentation Writing (for coders)
carmenintech
75
5k
Music & Morning Musume
bryan
46
6.8k
Why Our Code Smells
bkeepers
PRO
339
57k
The World Runs on Bad Software
bkeepers
PRO
71
11k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Raft: Consensus for Rubyists
vanstee
139
7.1k
The Power of CSS Pseudo Elements
geoffreycrofte
79
6k
Done Done
chrislema
185
16k
Transcript
@ryanlemmer a perfect storm for legacy migration CAPE TOWN @clj_ug_ct
legacy monolith Customer Accounting Billing Product Catalog CRM ... MySQL
Ruby on Rails
legacy Billing Run Customer Accounting Billing Product Catalog CRM ...
Bank Recon MySQL Ruby Ruby
legacy backlog bugs
legacy replacement replace this
legacy replacement replace substitute something that is broken, old or
inoperative
the “legacy problem” can’t fix bugs can’t add features not
performant
a “legacy solution” immutable It’s just too risky to do
in-situ changes
a “legacy solution” vintage the grapes or wine produced in
a particular season
The situation It’s not broken, just Immutable It’s valuable vintage
- still generating revenue We don’t need to “replace” We need to “make the Legacy Problem go away”
vintage migration vintage ?
vintage migration vintage We chose to migrate “financial” parts first
because it posed the highest risk to the business ?
vintage migration vintage statements MySQL Mongo & Redis
feeding off vintage vintage clients invoices ... ...
feeding off vintage statements clients invoices ? ... ...
feeding off vintage clients invoices transform old client write new
client write new invoice transform old invoice ... ...
... ... migration bridge statemen tage Big Run every night
+ incremental run every 10 mins Bridge is one-directional, Statements is read-only Imperative, sequential code
... ... new migration ? full text search stateme vintage
bridge
migration bridge: search clients invoices index- entity index-field index-field index-field
index-field index-field contacts ... ... ...
migration bridge clients invoices index-field index-field index-field index-field index-field write
client write invoice contacts index- entity search statements transform client transform invoice ... ... ... clients invoices ... ... }
... ... ... statements age search statements (batched) bridge search
About 10 million rows several hours to migrate sequentially
first pass solution Batched data migration BUT WHAT NEXT? it
was the easiest thing to do it is not performant not fault tolerant fragile because of data dependencies go parallel and distributed have fault tolerance go real-time served as scaffolding for the next solution
storm Apache Thrift + Nimbus Ingredients: Zookeeper Clojure (> 50%)
* suitable for polyglots
... storm - spouts clients index-field index-field index-field index-field index-field
write client index- entity transform client ... clients
... storm - spout SPOUT TUPLE
storm - data model TUPLE named list of values [“seekoei”
7] [“panda” 10] [147 {:name ‘John’ ...}] [253 {:name ‘Mary’ ...}] word frequency ID client
... storm - spout a SPOUT emits TUPLES UNBOUNDED STREAM
of TUPLES continuously over time a SPOUT is an
... storm - client spout [“client” {:id 147, ...}] CLIENT
SPOUT CLIENT TUPLE periodically emits a entity values
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) creates a pulse
clojure spout (defspout client-‐spout ["entity" “values”] [conf context collector]
(let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id]))))
clojure spout [“client” {:id 147, ...}] CLIENT TUPLE (defspout client-‐spout
["entity" “values”] [conf context collector] (let [next-‐client (next-‐legacy-‐client) tuple [“client” next-‐client]] (spout (nextTuple [] (Thread/sleep 100) (emit-‐spout! collector tuple)) (ack [id])))) TUPLE SCHEMA
... storm - spout [“client” {:id 147, ...}] [“client” {:id
201, ...}] [“client” {:id 407, ...}] [“client” {:id 101, ...}] The client SPOUT packages input and emits TUPLES continuously over time
... storm - bolts transform client CLIENT SPOUT BOLT
storm - bolts (defbolt transform-‐client-‐bolt ["client"]
{:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - bolts [{:id 147, ...}] OUTGOING TUPLE [“client” {:id
147, ...}] INCOMING TUPLE (defbolt transform-‐client-‐bolt ["client"] {:prepare true} [conf context collector] (bolt (execute [tuple] (let [h (.getValue tuple 1)] (emit-‐bolt! collector [(transform-‐tuple h)]) (ack! collector tuple)))))
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
storm - topology (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 1)})) 1 2 ...
bolt tasks (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
which task? (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ? ...
grouping - “shuffle” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" :shuffle} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] count invoices per client (in memory)
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] TUPLE SCHEMA ["client-‐id" “invoice-‐vals”] group by field “client-id”
grouping - “ field” (topology {"1" (spout-‐spec (client-‐spout)
:p 1)} {"2" (bolt-‐spec {"1" [“client-‐id”]} transform-‐client-‐bolt :p 3)})) 1 2 ...
grouping - “ field” 1 2 ... [“active” {:id 147,
...}] [12 {:inv-id 147, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [401 {:inv-id 32, ...}] [“active” {:id 147, ...}] [“active” {:id 147, ...}] [232 {:inv-id 45, ...}] 2 2 similar “client-id” vals go to the same Bolt Task
grouping - “ field” ... field compute aggregation
bridge - topology index-field write client write invoice index- fields
transform client transform invoice ... ... ... clients invoices contacts
storm - failure success! oops! a failure! ...
storm reliability Build a tree of tuples so that Storm
knows which tuples are related ack/fail Spouts + Bolts
storm guarantees Storm will re-process the entire tuple tree on
failure First attempt fails Storm retries the tuple tree until it succeeds
failure + idempotency write client transform client x2 x2 side-effects!
...
transactional topologies write client transform client x1 x1 run-once semantics
... strong ordering on data processing Storm Trident
search statements storm topologies real-time bridge age
topology design ... ... ...
topology design ... ... ... design the (directed) graph
grouping + parallelism index-field write client write invoice index- fields
transform client transform invoice :shuffle :shuffle :shuffle :shuffle :shuffle :shuffle :p 1 :p 1 :p 1 :p 10 :p 3 :p 3 ... ... ... tune the runtime by annotating the graph edges
topology - tuple schema [“client”] [“entity” “values”] [“invoice”] [“entity” “values”]
[“entity” “values”] [“client”] [“invoice”] [“key_val_pairs”] [“key_val”] We are actually processing streams of tuples continuously
ntage topology design clients context sales context billing context (queue)
(queue) .. .. .. .. .. ..
storm “real-time, distributed, fault-tolerant, computation system” stream processing realtime analytics
continuous computation distributed RPC ...
reflections
search statements age storm topologies vintage is first- class
search statements age storm topologies transform data
search statements age storm topologies not code refactor if you
can! (but only if it’s worth the effort)
search statements age storm topologies not a picnic because we’re
still replacing code and now we’ve added replication
but worth it Big Replace Smaller replacements In-situ changes Augment:
new alongside old Replace Evolve new Kill Starve (until irrelevant)
EUROCLOJURE Berlin 2013 thanks