Building a real time analytics engine in JRuby
David Dahl
March 02, 2013
Transcript
Building a real time analytics engine in JRuby
David Dahl, @effata

whoami
‣ Senior developer at Burt
‣ Analytics for online advertising
‣ Ruby lovers since 2009
‣ AWS

Getting started
‣ Writing everything to MySQL, querying for every report
  - Broke down on first major campaign
‣ Precalculate all the things!
‣ Every operation in one application
  - Extremely scary to deploy
‣ Still sticking to MRI

Stuck
‣ Separate and buffer with RabbitMQ
  - EventMachine
‣ Store stuff with MongoDB
  - Blocking operations
‣ Bad things

Java?
‣ Threading
‣ “Enterprise”
‣ Lots of libraries
Think about creating something → Java ecosystem → Discover someone has made it for you already → Profit!

Moving to JRuby
‣ Threads!
‣ A real GC
‣ JIT
‣ Every Java, Scala, Ruby lib ever made
‣ Wrapping Java libraries is fun!
‣ Bonus: Not hating yourself

Challenges

“100%” uptime
‣ We can “never” be down!
‣ But we can pause
‣ Don’t want to fail on errors
‣ But it’s ok to die

Buffering
‣ Split into isolated services
‣ Add a buffering pipeline between
  - We LOVE RabbitMQ
‣ Ack and persist in a “transaction”
‣ Figure out if you want
  - at most once
  - at least once

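The difference between the two delivery guarantees comes down to the order of "ack" and "persist". The sketch below is only an illustration under stated assumptions: `FakeBroker`, `run_demo` and the two consumer methods are hypothetical in-memory stand-ins, not RabbitMQ client calls, and the crash is simulated with an exception.

```ruby
# Hypothetical in-memory broker: messages stay pending until acknowledged,
# and pending messages are redelivered after a consumer restart.
class FakeBroker
  def initialize(messages)
    @pending = messages.dup
  end

  # Returns the next unacknowledged message without removing it.
  def peek
    @pending.first
  end

  def ack(message)
    @pending.delete(message)
  end

  def empty?
    @pending.empty?
  end
end

# At least once: persist first, ack second. A crash between the two steps
# causes a redelivery, so the store may see the message twice.
def consume_at_least_once(broker, store, crash_before_ack: false)
  message = broker.peek
  store << message
  raise "crash" if crash_before_ack
  broker.ack(message)
end

# At most once: ack first, persist second. A crash between the two steps
# loses the message.
def consume_at_most_once(broker, store, crash_before_persist: false)
  message = broker.peek
  broker.ack(message)
  raise "crash" if crash_before_persist
  store << message
end

# Runs one consumer with a simulated crash on the first attempt, then
# "restarts" it and drains whatever the broker still holds.
def run_demo(consumer, crash_key)
  broker = FakeBroker.new(["event-1"])
  store = []
  begin
    send(consumer, broker, store, crash_key => true)
  rescue RuntimeError
    # the service died; fall through to the restart below
  end
  send(consumer, broker, store) until broker.empty?
  store
end

p run_demo(:consume_at_least_once, :crash_before_ack)     # prints ["event-1", "event-1"]
p run_demo(:consume_at_most_once, :crash_before_persist)  # prints []
```

With at least once, downstream storage has to tolerate duplicates (for example by making writes idempotent); with at most once, it has to tolerate gaps.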
Databases
‣ Pick the right tool for the job
‣ MongoDB everywhere = bad
‣ Cassandra
‣ Redis
‣ NoDB
  - keep it streaming!

java.util.concurrent

Shortcut

Executors
Better than doing Thread.new

    thread_pool = Executors.new_fixed_thread_pool(16)
    stuff.each do |item|
      thread_pool.submit do
        crunch_stuff(item)
      end
    end

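To make it clearer what a fixed thread pool buys you over `Thread.new` per item, here is a minimal plain-Ruby stand-in: a fixed number of worker threads draining one shared job queue. It is not the java.util.concurrent implementation, and `TinyPool`, `crunch_all` and this version of `crunch_stuff` are hypothetical names made up for the sketch.

```ruby
# A plain-Ruby stand-in for a fixed thread pool (NOT java.util.concurrent):
# a fixed number of workers drain one shared job queue.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job arrives; a nil job tells the worker to stop.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  def submit(&block)
    @jobs << block
  end

  # Queue one stop marker per worker, then wait for the queued jobs to finish.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

# Hypothetical stand-in for the slide's crunch_stuff.
def crunch_stuff(item)
  item * item
end

def crunch_all(items, pool_size: 4)
  results = Queue.new               # thread-safe place to collect results
  pool = TinyPool.new(pool_size)
  items.each do |item|
    pool.submit { results << crunch_stuff(item) }
  end
  pool.shutdown
  Array.new(results.size) { results.pop }.sort
end

p crunch_all([1, 2, 3, 4, 5])       # prints [1, 4, 9, 16, 25]
```

The number of threads stays constant no matter how many items are submitted, which is the main point of using an executor instead of spawning a thread per item.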
Blocking queues
Producer/consumer pattern made easy
Don’t forget back pressure!

    queue = JavaConcurrent::LinkedBlockingQueue.new

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)

    # Blocking
    queue.put(data)
    queue.take

Back pressure
(Diagram labels: Storage, Timer, Data processing, Queue, State)

    queue = JavaConcurrent::ArrayBlockingQueue.new(100)

    # With timeout
    queue.offer(data, 60, Java::TimeUnit::SECONDS)
    queue.poll(60, Java::TimeUnit::SECONDS)

    # Blocking
    queue.put(data)
    queue.take

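The effect of a bounded queue can be sketched in portable Ruby as well. This example swaps in Ruby's built-in `SizedQueue` for the slide's `ArrayBlockingQueue` so that it runs on any Ruby; `run_pipeline` and `try_offer` are hypothetical helper names, and the sleep only simulates a slow consumer.

```ruby
# SizedQueue is Ruby's built-in bounded queue; here it plays the role that
# ArrayBlockingQueue plays on the slide.
def run_pipeline(item_count, capacity: 2)
  queue = SizedQueue.new(capacity)
  max_seen = 0
  processed = []

  consumer = Thread.new do
    # A nil item is used as an end-of-stream marker in this sketch.
    while (item = queue.pop)
      sleep 0.001                   # simulate slow processing
      processed << item
    end
  end

  item_count.times do |i|
    queue.push(i)                   # blocks while the queue is full
    max_seen = [max_seen, queue.size].max
  end
  queue.push(nil)
  consumer.join

  { processed: processed, max_queue_size: max_seen }
end

# Non-blocking variant: give up instead of waiting when the queue is full.
def try_offer(queue, item)
  queue.push(item, true)            # non_block = true raises ThreadError if full
  true
rescue ThreadError
  false
end

result = run_pipeline(20)
puts result[:max_queue_size] <= 2   # prints true
```

Because `push` blocks once the queue holds `capacity` items, a fast producer is slowed to the consumer's pace instead of filling memory, which is the back pressure the slide asks for.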
More awesomeness
‣ java.util.concurrent
  - Atomic(Boolean/Integer/Long)
  - ConcurrentHashMap
  - CountDownLatch / Semaphore
‣ Google Guava
‣ LMAX Disruptor

Easy mode
‣ Thread safety is hard
‣ Use j.u.c
‣ Avoid shared mutable state if possible
‣ Back pressure

Actors
Another layer of abstraction

Akka
Concurrency library in Scala
Most famous for its actor implementation

Mikka
Small Ruby wrapper around Akka

    class SomeActor < Mikka::Actor
      def receive(message)
        # do the thing
      end
    end

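For readers new to the actor model, the idea behind `receive` can be sketched in plain Ruby: one mailbox, one thread, and state that only that thread touches. This is not Mikka's or Akka's API; `CounterActor`, `tell`, `ask` and `stop` are hypothetical names chosen for the illustration.

```ruby
# A minimal actor sketch in plain Ruby (NOT Mikka's or Akka's API):
# one mailbox, one thread, and state that only that thread touches.
class CounterActor
  def initialize
    @mailbox = Queue.new
    @count = 0                      # private state, never shared directly
    @thread = Thread.new { run }
  end

  # Fire-and-forget send.
  def tell(message)
    @mailbox << message
    self
  end

  # Send a message and wait for the actor's reply.
  def ask(message)
    reply = Queue.new
    @mailbox << [message, reply]
    reply.pop
  end

  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  def run
    loop do
      # A bare symbol leaves reply as nil; an [message, reply] pair fills both.
      message, reply = @mailbox.pop
      break if message == :stop
      result = receive(message)
      reply << result if reply
    end
  end

  # Messages are handled one at a time, so no locks are needed here.
  def receive(message)
    case message
    when :increment then @count += 1
    when :get       then @count
    end
  end
end

actor = CounterActor.new
senders = Array.new(4) { Thread.new { 100.times { actor.tell(:increment) } } }
senders.each(&:join)
puts actor.ask(:get)                # prints 400
actor.stop
```

Four threads send messages concurrently, yet the counter needs no mutex because only the actor's own thread ever reads or writes it.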
Storm
github.com/colinsurprenant/redstorm

We broke it
But YOU should definitely try it out!

Hadoop
github.com/iconara/rubydoop

    module WordCount
      class Mapper
        def map(key, value, context)
          # ...
        end
      end

      class Reducer
        def reduce(key, value, context)
          # ...
        end
      end
    end

    Rubydoop.configure do |input_path, output_path|
      job 'word_count' do
        input input_path
        output output_path

        mapper WordCount::Mapper
        reducer WordCount::Reducer

        output_key Hadoop::Io::Text
        output_value Hadoop::Io::IntWritable
      end
    end

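The mapper and reducer bodies are elided on the slide. As a rough illustration of the general word-count pattern only, here are the map, shuffle and reduce phases in plain Ruby, run in memory without Hadoop. This is not Rubydoop's API and not necessarily what the speaker's code did; all method names below are hypothetical.

```ruby
# Map phase: emit a [word, 1] pair for every word in a line.
def map_line(line)
  line.downcase.scan(/\w+/).map { |word| [word, 1] }
end

# Shuffle phase: group the emitted counts by word
# (Hadoop does this between the map and reduce steps).
def shuffle(pairs)
  pairs.group_by(&:first).transform_values { |group| group.map(&:last) }
end

# Reduce phase: sum the counts collected for one word.
def reduce_word(word, counts)
  [word, counts.sum]
end

def word_count(lines)
  pairs = lines.flat_map { |line| map_line(line) }
  shuffle(pairs).map { |word, counts| reduce_word(word, counts) }.to_h
end

counts = word_count(["JRuby loves threads", "threads love JRuby"])
puts counts["jruby"]                # prints 2
```

In a real job the map and reduce steps run on different machines, which is why each phase only sees its own inputs and communicates through emitted key/value pairs.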
Other cool stuff
‣ Hotbunnies
‣ Eurydice
‣ Bundesstrasse
‣ Multimeter

Thank you @effata