Building a real time analytics engine in JRuby
David Dahl
March 02, 2013
Transcript
Building a real time analytics engine in JRuby
David Dahl
@effata
whoami
‣ Senior developer at Burt
‣ Analytics for online advertising
‣ Ruby lovers since 2009
‣ AWS
Getting started
‣ Writing everything to MySQL, querying for every report
  - Broke down on first major campaign
‣ Precalculate all the things!
‣ Every operation in one application
  - Extremely scary to deploy
‣ Still sticking to MRI
Stuck
‣ Separate and buffer with RabbitMQ
  - EventMachine
‣ Store stuff with MongoDB
  - Blocking operations
‣ Bad things
Java?
‣ Threading
‣ “Enterprise”
‣ Lots of libraries
Java ecosystem
Think about creating something
Discover someone has made it for you already
Profit!
Moving to JRuby
‣ Threads!
‣ A real GC
‣ JIT
‣ Every Java, Scala, Ruby lib ever made
‣ Wrapping Java libraries is fun!
‣ Bonus: Not hating yourself
Challenges
“100%” uptime
‣ We can “never” be down!
‣ But we can pause
‣ Don’t want to fail on errors
‣ But it’s ok to die
Buffering
‣ Split into isolated services
‣ Add a buffering pipeline between
  - We LOVE RabbitMQ
‣ Ack and persist in a “transaction”
‣ Figure out if you want
  - at most once
  - at least once
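The choice between at most once and at least once comes down to whether a message is acknowledged before or after it is persisted. A minimal sketch in plain Ruby, assuming a hypothetical in-memory `FakeBroker` that redelivers unacknowledged messages (it is not the RabbitMQ client API) and a hypothetical `flaky_store` that fails once on message `:b`:

```ruby
# Hypothetical in-memory stand-in for a broker: unacked messages are redelivered.
class FakeBroker
  def initialize(messages)
    @pending = messages.dup
  end

  # Returns the next unacknowledged message without removing it.
  def next_message
    @pending.first
  end

  # Acknowledging removes the message for good.
  def ack(message)
    @pending.delete(message)
  end

  def empty?
    @pending.empty?
  end
end

# Builds a store that fails the first time it sees :b.
# With write_first set, the write lands before the failure,
# simulating a crash between persisting and acking.
def flaky_store(sink, write_first)
  failed = false
  lambda do |message|
    fail_now = message == :b && !failed
    sink << message if write_first || !fail_now
    if fail_now
      failed = true
      raise "simulated failure"
    end
  end
end

# Ack first, then persist: a failed write loses the message (at most once).
def consume_at_most_once(broker, store)
  until broker.empty?
    message = broker.next_message
    broker.ack(message)
    begin
      store.call(message)
    rescue StandardError
      # Already acked, so the message is gone.
    end
  end
end

# Persist first, then ack: a failure before the ack means redelivery,
# so the same message can be stored twice (at least once).
def consume_at_least_once(broker, store)
  until broker.empty?
    message = broker.next_message
    begin
      store.call(message)
      broker.ack(message)
    rescue StandardError
      # Not acked: the message comes back on the next iteration.
    end
  end
end

lost = []
consume_at_most_once(FakeBroker.new([:a, :b, :c]), flaky_store(lost, false))

duplicated = []
consume_at_least_once(FakeBroker.new([:a, :b, :c]), flaky_store(duplicated, true))
```

In this sketch `lost` ends up without `:b`, while `duplicated` contains `:b` twice. At least once delivery therefore usually calls for idempotent writes downstream.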
Databases
‣ Pick the right tool for the job
‣ MongoDB everywhere = bad
‣ Cassandra
‣ Redis
‣ NoDB
  - keep it streaming!
java.util.concurrent
Shortcut
Executors
Better than doing Thread.new
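    require 'java'
    java_import 'java.util.concurrent.Executors'

    # Fixed pool of 16 worker threads shared by all submitted tasks
    thread_pool = Executors.new_fixed_thread_pool(16)
    stuff.each do |item|
      thread_pool.submit do
        crunch_stuff(item)
      end
    end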
Blocking queues
Producer/consumer pattern made easy
Don’t forget back pressure!
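    # JavaConcurrent and Java::TimeUnit are shorthand from the slide,
    # assumed to refer to classes in the java.util.concurrent package
    queue = JavaConcurrent::LinkedBlockingQueue.new
    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)
    # Blocking
    queue.put(data)
    queue.take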
Back pressure
[Diagram: Storage, Timer, Data processing, Queue, State]
    # Bounded to 100 items, so producers block (or time out) when consumers lag
    queue = JavaConcurrent::ArrayBlockingQueue.new(100)
    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)
    # Blocking
    queue.put(data)
    queue.take
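Back pressure falls out of a bounded queue: when the consumer lags, the blocking push stalls the producer instead of letting the buffer grow. A minimal sketch using Ruby's standard-library `SizedQueue` as a stand-in for `ArrayBlockingQueue`; the capacity of 2, the item count and the `:done` sentinel are illustrative assumptions:

```ruby
require 'thread' # no-op on modern Rubies, needed on older ones

capacity = 2
queue = SizedQueue.new(capacity)

# Producer: push blocks whenever the queue already holds `capacity` items,
# so it can never run more than `capacity` items ahead of the consumer.
producer = Thread.new do
  1.upto(10) { |i| queue.push(i) }
  queue.push(:done) # sentinel telling the consumer to stop
end

# Consumer: drains the queue in order until it sees the sentinel.
consumed = []
while (item = queue.pop) != :done
  consumed << item
end

producer.join
```

The same shape applies to the `put`/`take` pair above: the bound on the queue is what slows the producer down.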
More awesomeness
‣ java.util.concurrent
  - Atomic(Boolean/Integer/Long)
  - ConcurrentHashMap
  - CountDownLatch / Semaphore
‣ Google Guava
‣ LMAX Disruptor
Easy mode
‣ Thread safety is hard
‣ Use j.u.c
‣ Avoid shared mutable state if possible
‣ Back pressure
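One way to avoid shared mutable state is to give each thread its own slice of the input and combine the results only after the threads have finished. A small sketch in plain Ruby; the data and the slice size are made up for illustration:

```ruby
numbers = (1..100).to_a

# Each thread sums its own slice and touches nothing another thread can see.
threads = numbers.each_slice(25).map do |slice|
  Thread.new { slice.sum }
end

# Thread#value joins the thread and returns the result of its block,
# so the partial sums are merged on a single thread.
total = threads.map(&:value).sum
```

Because no thread writes to a shared variable, no locks or atomics are needed here.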
Actors
Another layer of abstractions
Akka
Concurrency library in Scala
Most famous for its actor implementation
Mikka
Small Ruby wrapper around Akka
    class SomeActor < Mikka::Actor
      def receive(message)
        # do the thing
      end
    end
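The core idea behind an actor is that it owns its state and handles one message at a time from a mailbox. A rough plain-Ruby sketch of that idea follows. It is not the Mikka or Akka API: `TinyActor`, `tell`, `stop` and `Counter` are hypothetical names, and a real actor library adds supervision, routing and much more.

```ruby
require 'thread'

# Hypothetical minimal actor: one mailbox, one thread, one message at a time.
class TinyActor
  STOP = Object.new # private sentinel used to shut the actor down

  def initialize
    @mailbox = Queue.new
    @thread = Thread.new do
      loop do
        message = @mailbox.pop
        break if message.equal?(STOP)
        receive(message)
      end
    end
  end

  # Asynchronous send: enqueue the message and return immediately.
  def tell(message)
    @mailbox << message
    self
  end

  # Processes everything queued before the sentinel, then waits for the thread.
  def stop
    @mailbox << STOP
    @thread.join
  end

  def receive(message)
    raise NotImplementedError, "subclasses handle messages here"
  end
end

# State is only ever touched from the actor's own thread.
class Counter < TinyActor
  attr_reader :total

  def initialize
    @total = 0
    super
  end

  def receive(message)
    @total += message
  end
end

counter = Counter.new
senders = 4.times.map { Thread.new { 100.times { counter.tell(1) } } }
senders.each(&:join)
counter.stop
```

Four threads send concurrently, yet `@total` needs no lock because only the actor's thread ever updates it.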
Storm
github.com/colinsurprenant/redstorm
We broke it
But YOU should definitely try it out!
Hadoop
github.com/iconara/rubydoop
    module WordCount
      class Mapper
        def map(key, value, context)
          # ...
        end
      end

      class Reducer
        def reduce(key, value, context)
          # ...
        end
      end
    end
    Rubydoop.configure do |input_path, output_path|
      job 'word_count' do
        input input_path
        output output_path

        mapper WordCount::Mapper
        reducer WordCount::Reducer

        output_key Hadoop::Io::Text
        output_value Hadoop::Io::IntWritable
      end
    end
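For readers new to MapReduce, a word-count job can be pictured as three phases: map, shuffle and reduce. The following plain-Ruby sketch runs them in memory. It does not use Hadoop or the Rubydoop API and does not fill in the elided mapper and reducer bodies; `map_line`, `reduce_word` and the sample lines are illustrative assumptions.

```ruby
# Map phase: emit a [word, 1] pair for every word in a line.
def map_line(line)
  line.downcase.scan(/\w+/).map { |word| [word, 1] }
end

# Reduce phase: sum all counts emitted for one word.
def reduce_word(word, counts)
  [word, counts.sum]
end

lines = ["Ruby on the JVM", "the JVM runs Ruby"]

pairs = lines.flat_map { |line| map_line(line) }

# Shuffle phase: group the emitted counts by word.
grouped = pairs.group_by(&:first).map { |word, ps| [word, ps.map(&:last)] }

word_counts = grouped.map { |word, counts| reduce_word(word, counts) }.to_h
```

On a cluster the map and reduce calls run on different machines and the framework performs the shuffle. Here everything happens in a single process.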
Other cool stuff
‣ Hotbunnies
‣ Eurydice
‣ Bundesstrasse
‣ Multimeter
Thank you
@effata
david@burtcorp.com