Building a real time analytics engine in JRuby
David Dahl
March 02, 2013
Transcript
Building a real time analytics engine in JRuby
David Dahl, @effata

whoami
‣ Senior developer at Burt
‣ Analytics for online advertising
‣ Ruby lovers since 2009
‣ AWS

Getting started
‣ Writing everything to MySQL, querying for every report
  - Broke down on first major campaign
‣ Precalculate all the things!
‣ Every operation in one application
  - Extremely scary to deploy
‣ Still sticking to MRI

Stuck
‣ Separate and buffer with RabbitMQ
  - EventMachine
‣ Store stuff with MongoDB
  - Blocking operations
‣ Bad things

Java?
‣ Threading
‣ “Enterprise”
‣ Lots of libraries

Java ecosystem
1. Think about creating something
2. Discover someone has made it for you already
3. Profit!

Moving to JRuby
‣ Threads!
‣ A real GC
‣ JIT
‣ Every Java, Scala, Ruby lib ever made
‣ Wrapping Java libraries is fun!
‣ Bonus: Not hating yourself

Challenges

“100%” uptime
‣ We can “never” be down!
‣ But we can pause
‣ Don’t want to fail on errors
‣ But it’s ok to die

Buffering
‣ Split into isolated services
‣ Add a buffering pipeline between
  - We LOVE RabbitMQ
‣ Ack and persist in a “transaction”
‣ Figure out if you want
  - at most once
  - at least once

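The choice between "at most once" and "at least once" largely comes down to whether a consumer acks before or after it persists. The following is a minimal sketch of that ordering, assuming an in-memory stand-in for the broker; `FakeBroker`, `consume`, and the simulated crash are hypothetical names for illustration and are not RabbitMQ API.

```ruby
# In-memory stand-in for a broker: a message stays pending until it is acked,
# and anything still pending is delivered again after a consumer crash.
class FakeBroker
  def initialize(messages)
    @pending = messages.dup
  end

  def next_message
    @pending.first
  end

  def ack(message)
    @pending.delete(message)
  end

  def empty?
    @pending.empty?
  end
end

# Consumes every message, crashing once on the first delivery of crash_on,
# in the gap between the persist step and the ack step.
def consume(broker, mode, crash_on)
  store = []
  crashed = false
  maybe_crash = lambda do |message|
    next if crashed || message != crash_on
    crashed = true
    raise "simulated crash"
  end

  until broker.empty?
    message = broker.next_message
    begin
      if mode == :at_least_once
        store << message            # persist first ...
        maybe_crash.call(message)
        broker.ack(message)         # ... then ack
      else
        broker.ack(message)         # ack first ...
        maybe_crash.call(message)
        store << message            # ... then persist
      end
    rescue RuntimeError
      # The consumer "restarts"; the broker redelivers whatever is unacked.
    end
  end
  store
end

p consume(FakeBroker.new(%w[a b c]), :at_least_once, "b")  # => ["a", "b", "b", "c"]
p consume(FakeBroker.new(%w[a b c]), :at_most_once, "b")   # => ["a", "c"]
```

Persisting before the ack can duplicate a message after a crash, so downstream writes need to tolerate repeats; acking first never duplicates but can lose a message.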
Databases
‣ Pick the right tool for the job
‣ MongoDB everywhere = bad
‣ Cassandra
‣ Redis
‣ NoDB
  - keep it streaming!

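java.util.concurrent

Shortcut

Executors
Better than doing Thread.new

    require 'java'
    java_import java.util.concurrent.Executors

    # 16 worker threads are shared by all submitted blocks
    thread_pool = Executors.new_fixed_thread_pool(16)
    stuff.each do |item|
      thread_pool.submit do
        crunch_stuff(item)
      end
    end
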
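What a fixed pool buys over one `Thread.new` per item is a hard cap on the number of threads. As a rough illustration of the idea, here is a plain-Ruby sketch that runs on MRI as well as JRuby; `run_in_pool` is a hypothetical helper, not part of any library, and on JRuby the executor shown above does this work for you.

```ruby
# Hypothetical helper: processes items with a fixed number of worker threads
# instead of starting one thread per item.
def run_in_pool(items, pool_size, &work)
  jobs = Queue.new
  items.each { |item| jobs << item }
  jobs.close                        # pop returns nil once the queue is drained

  results = Queue.new
  workers = Array.new(pool_size) do
    Thread.new do
      # Note: a nil or false item would end this loop early.
      while (item = jobs.pop)
        results << work.call(item)
      end
    end
  end
  workers.each(&:join)

  # Results arrive in completion order, not input order.
  Array.new(results.size) { results.pop }
end

squares = run_in_pool((1..10).to_a, 4) { |n| n * n }
puts squares.sort.inspect  # prints [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```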
Blocking queues
Producer/consumer pattern made easy
Don’t forget back pressure!

    # Unbounded by default, so producers never block on a full queue
    queue = JavaConcurrent::LinkedBlockingQueue.new

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)  # false if it timed out
    queue.poll(60, Java::TimeUnit::SECONDS)         # nil if it timed out

    # Blocking
    queue.put(data)
    queue.take

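Back pressure
[Diagram: Queue, Data processing, State, Timer, Storage]

    # Bounded at 100 items, so a full queue pushes back on producers
    queue = JavaConcurrent::ArrayBlockingQueue.new(100)

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)  # false if still full after 60 s
    queue.poll(60, Java::TimeUnit::SECONDS)         # nil if still empty after 60 s

    # Blocking
    queue.put(data)
    queue.take
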
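The effect of a bounded queue is easiest to see with a slow consumer. The sketch below uses Ruby's stdlib `SizedQueue` rather than the Java queues on the slides, so it runs on MRI and JRuby alike; `run_pipeline` and its parameters are made up for the illustration.

```ruby
# SizedQueue is Ruby's stdlib bounded queue: push blocks while the queue is
# full, much like put on a bounded Java blocking queue.
def run_pipeline(item_count, capacity)
  queue = SizedQueue.new(capacity)
  consumed = []
  max_depth = 0

  consumer = Thread.new do
    while (item = queue.pop)        # pop returns nil once closed and drained
      consumed << item
      sleep 0.001                   # simulate a slow consumer
    end
  end

  item_count.times do |i|
    queue.push(i + 1)               # blocks here whenever the queue is full
    depth = queue.size
    max_depth = depth if depth > max_depth
  end
  queue.close                       # lets the consumer drain and exit
  consumer.join

  { consumed: consumed, max_depth: max_depth }
end

result = run_pipeline(50, 5)
puts result[:consumed].size    # prints 50
puts result[:max_depth] <= 5   # prints true
```

The producer is throttled to the consumer's pace instead of filling memory, which is the point of swapping an unbounded queue for a bounded one.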
More awesomeness
‣ java.util.concurrent
  - Atomic(Boolean/Integer/Long)
  - ConcurrentHashMap
  - CountDownLatch / Semaphore
‣ Google Guava
‣ LMAX Disruptor

Easy mode
‣ Thread safety is hard
‣ Use j.u.c
‣ Avoid shared mutable state if possible
‣ Back pressure

Actors
Another layer of abstractions

Akka
Concurrency library in Scala
Most famous for its actor implementation

Mikka
Small Ruby wrapper around Akka

    class SomeActor < Mikka::Actor
      def receive(message)
        # do the thing
      end
    end

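For readers new to actors, the core idea is a private mailbox drained by a single thread, so the actor's state is never shared. Here is a toy sketch in plain Ruby, not Mikka or Akka; `CounterActor`, `tell`, and `stop` are hypothetical names chosen for the example.

```ruby
# Toy actor: one thread drains a private mailbox, so @counts is only ever
# touched by that single thread and needs no lock.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @counts = Hash.new(0)
    @thread = Thread.new { run }
  end

  # Asynchronous send: callers enqueue a message and return immediately.
  def tell(message)
    @mailbox << message
  end

  # Handles everything already queued, stops the actor, returns its state.
  def stop
    @mailbox << :stop
    @thread.join
    @counts
  end

  private

  def run
    loop do
      message = @mailbox.pop
      break if message == :stop
      receive(message)
    end
  end

  def receive(message)
    @counts[message] += 1
  end
end

actor = CounterActor.new
senders = Array.new(4) { Thread.new { 100.times { actor.tell(:click) } } }
senders.each(&:join)
puts actor.stop[:click]  # prints 400
```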
Storm
github.com/colinsurprenant/redstorm

We broke it
But YOU should definitely try it out!

Hadoop
github.com/iconara/rubydoop

    module WordCount
      class Mapper
        def map(key, value, context)
          # ...
        end
      end

      class Reducer
        def reduce(key, value, context)
          # ...
        end
      end
    end

    Rubydoop.configure do |input_path, output_path|
      job 'word_count' do
        input input_path
        output output_path

        mapper WordCount::Mapper
        reducer WordCount::Reducer

        output_key Hadoop::Io::Text
        output_value Hadoop::Io::IntWritable
      end
    end

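To make the division of labour between a mapper and a reducer concrete, here is a plain-Ruby simulation of a word count that runs without Hadoop. It does not use Rubydoop's API or fill in the elided method bodies from the slides; `map_line`, `reduce_counts`, and `word_count` are hypothetical names, and the grouping step stands in for the framework's shuffle.

```ruby
# Map phase: emit a [word, 1] pair for every word in a line.
def map_line(line)
  line.downcase.scan(/\w+/).map { |word| [word, 1] }
end

# Reduce phase: sum all the counts emitted for one word.
def reduce_counts(word, counts)
  [word, counts.sum]
end

# Stand-in for the framework: map every line, group the pairs by key
# (the "shuffle"), then reduce each group.
def word_count(lines)
  pairs = lines.flat_map { |line| map_line(line) }
  grouped = pairs.group_by(&:first)
  grouped.map { |word, word_pairs| reduce_counts(word, word_pairs.map(&:last)) }.to_h
end

counts = word_count(["to be or not to be", "To JRuby"])
puts counts["to"]  # prints 3
```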
Other cool stuff
‣ Hotbunnies
‣ Eurydice
‣ Bundesstrasse
‣ Multimeter

Thank you
@effata