Building a real time analytics engine in JRuby
David Dahl
March 02, 2013
Transcript
Building a real time analytics engine in JRuby
David Dahl
@effata
whoami
‣ Senior developer at Burt
‣ Analytics for online advertising
‣ Ruby lovers since 2009
‣ AWS
Getting started
‣ Writing everything to MySQL, querying for every report
  - Broke down on first major campaign
‣ Precalculate all the things!
‣ Every operation in one application
  - Extremely scary to deploy
‣ Still sticking to MRI
Stuck
‣ Separate and buffer with RabbitMQ
  - EventMachine
‣ Store stuff with MongoDB
  - Blocking operations
‣ Bad things
Java?
‣ Threading
‣ “Enterprise”
‣ Lots of libraries
Think about creating something
Java ecosystem
Discover someone has made it for you already
Profit!
Moving to JRuby
‣ Threads!
‣ A real GC
‣ JIT
‣ Every Java, Scala, Ruby lib ever made
‣ Wrapping Java libraries is fun!
‣ Bonus: Not hating yourself
Challenges
“100%” uptime
‣ We can “never” be down!
‣ But we can pause
‣ Don’t want to fail on errors
‣ But it’s ok to die
Buffering
‣ Split into isolated services
‣ Add a buffering pipeline between
  - We LOVE RabbitMQ
‣ Ack and persist in a “transaction”
‣ Figure out if you want
  - at most once
  - at least once
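The difference between "at most once" and "at least once" comes down to whether the consumer acks before or after doing the work. The sketch below is a plain-Ruby simulation, not RabbitMQ client code: `run_consumer`, its `mode:` and `crash_on:` parameters, and the in-memory `pending` array standing in for the broker are all hypothetical names made up for illustration.

```ruby
# Simulates a consumer that crashes once while handling `crash_on`,
# then restarts and drains whatever the broker still holds.
def run_consumer(messages, mode:, crash_on:)
  pending = messages.dup   # unacked messages, as the broker sees them
  stored  = []             # what actually reached storage
  crashed = false

  until pending.empty?
    message = pending.first
    pending.shift if mode == :at_most_once     # ack BEFORE doing the work

    if message == crash_on && !crashed
      crashed = true
      # At least once: the write succeeded, but the crash hits before the ack.
      stored << message if mode == :at_least_once
      next                                     # "restart" and read the queue again
    end

    stored << message
    pending.shift if mode == :at_least_once    # ack AFTER doing the work
  end

  stored
end

lost       = run_consumer(%w[a b c], mode: :at_most_once,  crash_on: "b")
duplicated = run_consumer(%w[a b c], mode: :at_least_once, crash_on: "b")
```

In this simulation the at-most-once run drops the message that was in flight during the crash, while the at-least-once run stores it twice, so downstream writes need to be idempotent if the second mode is chosen.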
Databases ‣ Pick the right tool for the job ‣
MongoDB everywhere = bad ‣ Cassandra ‣ Redis ‣ NoDB - keep it streaming!
java.util.concurrent
Shortcut
Executors
Better than doing Thread.new
    thread_pool = Executors.new_fixed_thread_pool(16)
    stuff.each do |item|
      thread_pool.submit do
        crunch_stuff(item)
      end
    end
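To show why a fixed pool beats spawning one thread per item, here is a rough plain-Ruby analogue that also runs without a JVM. It is not the Executors API: `each_in_pool` is a hypothetical helper built on Ruby's own `Queue` and `Thread`, and it assumes the items are never `nil` or `false`.

```ruby
# Runs the given block for every item on a fixed number of worker threads.
# The thread count stays at pool_size no matter how many items arrive.
def each_in_pool(items, pool_size)
  jobs = Queue.new
  items.each { |item| jobs << item }
  jobs.close                      # pop returns nil once the queue is closed and empty

  results = Queue.new
  workers = Array.new(pool_size) do
    Thread.new do
      while (item = jobs.pop)     # stops on nil, hence the non-nil assumption
        results << yield(item)
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }   # completion order, not input order
end

squares = each_in_pool((1..20).to_a, 4) { |n| n * n }
```

Results come back in completion order, so sort them or carry an index along if the input order matters.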
Blocking queues
Producer/consumer pattern made easy
Don’t forget back pressure!
    queue = JavaConcurrent::LinkedBlockingQueue.new

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)

    # Blocking
    queue.put(data)
    queue.take
Back pressure
[Diagram labels: Storage, Timer, Data processing, Queue, State]
    queue = JavaConcurrent::ArrayBlockingQueue.new(100)

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)

    # Blocking
    queue.put(data)
    queue.take
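The bounded queue is what creates back pressure: once it is full, the producer either waits or is told no. As a small sketch of that behaviour, the example below uses Ruby's built-in `SizedQueue` in place of `ArrayBlockingQueue` so it runs on any Ruby; the capacity of 2 and the symbols pushed are arbitrary choices for illustration.

```ruby
queue = SizedQueue.new(2)    # bounded, like an ArrayBlockingQueue with capacity 2

queue.push(:a)
queue.push(:b)               # the queue is now full

# A non-blocking push on a full queue raises ThreadError instead of waiting.
rejected = begin
  queue.push(:c, true)
  false
rescue ThreadError
  true
end

# A blocking push waits until the consumer frees a slot.
producer = Thread.new { queue.push(:c) }
first = queue.pop            # makes room, which lets the producer finish
producer.join

remaining = [queue.pop, queue.pop]
```

The producer can never get more than two items ahead of the consumer, which is the property that keeps a slow storage layer from being flooded.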
More awesomeness
‣ java.util.concurrent
  - Atomic(Boolean/Integer/Long)
  - ConcurrentHashMap
  - CountDownLatch / Semaphore
‣ Google Guava
‣ LMAX Disruptor
Easy mode
‣ Thread safety is hard
‣ Use j.u.c
‣ Avoid shared mutable state if possible
‣ Back pressure
Actors
Another layer of abstractions
Akka
Concurrency library in Scala
Most famous for its actor implementation
Mikka
Small Ruby wrapper around Akka
    class SomeActor < Mikka::Actor
      def receive(message)
        # do the thing
      end
    end
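To make the actor idea concrete without Akka or Mikka, here is a minimal plain-Ruby sketch: one thread, one mailbox, and state that only that thread touches. `CounterActor`, `tell` and `stop` are hypothetical names for this illustration and say nothing about how Mikka or Akka are implemented.

```ruby
# A tiny actor: messages go into a mailbox and are handled one at a time
# by a single thread, so @count needs no lock.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0
    @thread = Thread.new { run }
  end

  # Asynchronous send: enqueue the message and return immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Asks the actor to finish after the messages already queued,
  # then returns the final count.
  def stop
    @mailbox << :stop
    @thread.value
  end

  private

  def run
    loop do
      message = @mailbox.pop
      return @count if message == :stop
      receive(message)
    end
  end

  # Same shape as the receive method on the slide: one message in, state updated.
  def receive(message)
    @count += message
  end
end

actor = CounterActor.new
senders = Array.new(4) { Thread.new { 100.times { actor.tell(1) } } }
senders.each(&:join)
total = actor.stop
```

Four threads send concurrently, yet the count is updated by one thread only, which is the "avoid shared mutable state" advice applied in practice.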
Storm
github.com/colinsurprenant/redstorm
We broke it
But YOU should definitely try it out!
Hadoop
github.com/iconara/rubydoop
    module WordCount
      class Mapper
        def map(key, value, context)
          # ...
        end
      end

      class Reducer
        def reduce(key, value, context)
          # ...
        end
      end
    end
    Rubydoop.configure do |input_path, output_path|
      job 'word_count' do
        input input_path
        output output_path

        mapper WordCount::Mapper
        reducer WordCount::Reducer

        output_key Hadoop::Io::Text
        output_value Hadoop::Io::IntWritable
      end
    end
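The mapper and reducer bodies are elided on the slides, so here is a plain-Ruby sketch of what a word count job computes in each phase. It runs in a single process without Hadoop or Rubydoop; `map_line`, `shuffle`, `reduce_word` and `word_count` are hypothetical helper names, and the grouping step stands in for the shuffle that Hadoop performs between map and reduce.

```ruby
# Map phase: emit a [word, 1] pair for every word in a line.
def map_line(line)
  line.downcase.scan(/\w+/).map { |word| [word, 1] }
end

# Shuffle phase: collect all emitted counts under their word.
def shuffle(pairs)
  pairs.group_by(&:first).transform_values { |group| group.map(&:last) }
end

# Reduce phase: sum the counts for a single word.
def reduce_word(word, counts)
  [word, counts.sum]
end

# Wires the three phases together for an in-memory list of lines.
def word_count(lines)
  pairs = lines.flat_map { |line| map_line(line) }
  shuffle(pairs).map { |word, counts| reduce_word(word, counts) }.to_h
end

counts = word_count(["to be or not", "To be"])
```

In a real job the map and reduce calls run on different machines and write through the context object rather than returning arrays, but the data flow is the same.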
Other cool stuff
‣ Hotbunnies
‣ Eurydice
‣ Bundesstrasse
‣ Multimeter
Thank you
@effata