10 insane things on Big Data

10 insane things on big data
Luis Belloch, MoneyMate / AccuDelta
Sept. 2016, ETSINF UPV

Since 1991, ~100 employees
Offices in Dublin, Boston, London, New York, Stockholm, Milan and Valencia. Valencia is an engineering office only.
Black Rock, Fidelity, J.P. Morgan US, M&G, Prudential, Charles Schwab, Schroders, State Street, Columbia Threadneedle, Canada Life, IFDS, New Ireland, ...

#1 Wild Data
"So… is that a bunch of Excel and CSV files randomly piled up?"
- Day 1, MoneyMate developer

#2 Timing ⏰
Data is inconsistent most of the time!

#3 Schema Agnostic
Every client has its own schema, and the loading system has to be fast.

#3 Schema Agnostic
• Reduced load time from 22 h to 9 min
• In-Memory and DB modes
• Avoid write-locks as much as possible
• Homeostasis: resilient/adaptive loading
• Reactive async publishing
(Diagram labels: LOADING, PUBLISHING)

#4 Parallel Testing
Replay one month of events in the system using two software versions, then compare row-by-row, cell-by-cell.
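The comparison step can be sketched roughly as follows. This is only an illustration, assuming each software version produces its output as a list of rows (lists of cell values) in the same order; the function name `diff_tables` and the sample data are hypothetical and not taken from the talk.

```python
from itertools import zip_longest


def diff_tables(old_rows, new_rows):
    """Compare two result sets row-by-row, cell-by-cell.

    Each argument is a list of rows; each row is a list of cell values.
    Returns a list of (row_index, col_index, old_value, new_value) tuples.
    A col_index of None means the whole row exists on one side only.
    """
    missing = object()  # sentinel for a row or cell present on one side only

    def show(value):
        # Report the sentinel as None so callers never see it.
        return None if value is missing else value

    diffs = []
    rows = zip_longest(old_rows, new_rows, fillvalue=missing)
    for r, (old, new) in enumerate(rows):
        if old is missing or new is missing:
            diffs.append((r, None, show(old), show(new)))
            continue
        cells = zip_longest(old, new, fillvalue=missing)
        for c, (a, b) in enumerate(cells):
            if a != b:
                diffs.append((r, c, show(a), show(b)))
    return diffs


# Hypothetical outputs of the two software versions after a replay.
version_a = [["FUND-A", 1.0], ["FUND-B", 2.0]]
version_b = [["FUND-A", 1.0], ["FUND-B", 2.5], ["FUND-C", 3.0]]

for difference in diff_tables(version_a, version_b):
    print(difference)
```

An empty result means both versions agree on every cell; a real setup would likely also need a stable row ordering and a tolerance for floating-point values.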

#5 Schema Evolutions
• ~50MB of SQL, several more CSVs
• VCS and code review friendly
• Test-data & container migrations
• Forward-only, no rollbacks
• Exercised many times per day through CI builds
• etcd distributed locks, coordination
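A forward-only migration runner can be sketched in a few lines. This is a minimal illustration using SQLite, not the system described in the talk: the `schema_version` table, the `funds` table and the `migrate` function are hypothetical names, and the etcd-based locking mentioned above is left out.

```python
import sqlite3

# Hypothetical, ordered list of (version, SQL) pairs kept under version control.
MIGRATIONS = [
    (1, "CREATE TABLE funds (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE funds ADD COLUMN currency TEXT"),
]


def migrate(conn, migrations):
    """Apply every migration newer than the recorded version, in order.

    There is no 'down' step: a mistake is fixed by adding a new migration.
    Returns the list of versions applied by this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    current = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM schema_version"
    ).fetchone()[0]

    applied = []
    for version, sql in sorted(migrations):
        if version <= current:
            continue  # already applied on this database
        with conn:  # commits on success, rolls back pending changes on error
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
        applied.append(version)
    return applied


conn = sqlite3.connect(":memory:")
print(migrate(conn, MIGRATIONS))  # first run applies everything pending
print(migrate(conn, MIGRATIONS))  # second run finds nothing to do
```

Because running it twice is harmless, the same code path can be exercised on every CI build against a fresh container or a copy of test data.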

#6a Market
Right after Brexit, one of our clients started loading data on a daily basis instead of monthly.

#6b Government
The Solvency II regulation was delayed by more than 2 years.

#7 Latency, the hard way
Minimum network latency between New York and Dublin
Distance: d = 5111.28 km
Best fiber refractive index: n = 1.5 (n = c / v)
Max speed on that fiber: v_max = c / n = 199,861,639 m/s
t_fiber = d / v_max ≈ 25.57 ms
t_min = d / c ≈ 17.05 ms
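The arithmetic on this slide can be reproduced with a short script. It is a sketch under the slide's own assumptions (a straight-line distance of 5111.28 km and a refractive index of 1.5); the function name `one_way_latency_ms` is made up for illustration.

```python
C = 299_792_458.0  # speed of light in vacuum, in m/s


def one_way_latency_ms(distance_km, refractive_index=1.0):
    """One-way propagation delay in milliseconds.

    The signal travels at v = c / n, so a refractive index of 1.0 gives
    the vacuum lower bound and 1.5 gives the fiber figure from the slide.
    """
    speed_m_per_s = C / refractive_index
    seconds = distance_km * 1000.0 / speed_m_per_s
    return seconds * 1000.0


distance_km = 5111.28  # New York to Dublin, as given on the slide

t_min = one_way_latency_ms(distance_km)         # vacuum lower bound
t_fiber = one_way_latency_ms(distance_km, 1.5)  # best-case fiber

print(f"t_min   = {t_min:.2f} ms")
print(f"t_fiber = {t_fiber:.2f} ms")
print(f"round trip over fiber = {2 * t_fiber:.2f} ms")
```

These are one-way figures for propagation only; real cable routes are longer than the straight-line distance, and switching equipment adds delay on top.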

(http://www.nanex.net/aqck2/4680.html)

#8 DIY Cluster
"Cloud? Over my dead body."
- One of our lovely customers

That moment when you realize an undersea cable broke and the cluster is down (2014)

#9 Who needs a cluster?
Most of the problems are small. Distributed systems are hard.

#10 Small Data
Big data is an excuse, a catalyst for improving the tools we have today.

thanks! @luisbelloch
