This presentation discusses Storm, the distributed computation system, and how it’s used at Keen IO. It also covers RedStorm, which makes it possible to build Storm topologies in Ruby. First given at #sfrails in October 2013.
About me
* Full stack developer spanning 2 millennia
* Helped found & build Togetherville (Disney). Ruby 1.8.7 and Rails 2.3.8 FTW!
* Author of mongoid_alize, four & keen-gem
* Mentor at HackBright & HackReactor
* Currently VP Engineering at Keen IO
Would I like Keen IO?
* I know I need analytics, but I’d rather spend time building features
* I use APIs: Sendgrid, Iron.io, Twilio, Heroku, Pusher
YES. You probably would!
What is Storm?
Pop quiz! Storm is:
a) A project with 7,000+ followers on GitHub
b) Low-latency distributed computation system
c) WNBA team in Seattle
d) Capable of streaming map-reduce
e) All of the above
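To make "streaming map-reduce" concrete, here is a minimal plain-Ruby sketch. It does not use Storm or RedStorm; the class name `StreamingWordCount` and the sample lines are made up for illustration. The point is only that each event is mapped and reduced as it arrives, so the result is current after every event instead of after a batch job finishes.

```ruby
# Plain-Ruby sketch of streaming map-reduce. No Storm APIs are used;
# StreamingWordCount is a hypothetical name for this example.
class StreamingWordCount
  attr_reader :totals

  def initialize
    @totals = Hash.new(0) # running result, valid after every event
  end

  # "Map": split one incoming line into words.
  # "Reduce": fold each word into the running totals immediately.
  def push(line)
    line.downcase.scan(/\w+/).each { |word| @totals[word] += 1 }
    self
  end
end

counter = StreamingWordCount.new
["Storm is fast", "Storm is distributed"].each { |line| counter.push(line) }
p counter.totals
```

In a real topology the map and reduce steps would likely live in separate components running on different machines; here they share one method to keep the sketch short.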
Storm at Keen IO
Storm is the primary logical layer for storing events and performing queries. Cassandra distributes the data & Storm distributes the computation. Because Storm and Cassandra scale linearly, we can perform writes and queries with low latency and high throughput, all while remaining fault tolerant.
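The split between distributing data and distributing computation can be sketched in plain Ruby. This is an illustration under stated assumptions, not Keen IO's implementation: `NODE_COUNT`, `node_for`, and the sample events are hypothetical, a CRC32 hash stands in for a real partitioner, and threads stand in for separate machines.

```ruby
require 'zlib'

NODE_COUNT = 3 # hypothetical cluster size for this sketch

# Pick a node for an event id with a stable hash (a stand-in partitioner).
def node_for(id, node_count = NODE_COUNT)
  Zlib.crc32(id.to_s) % node_count
end

# Made-up sample events.
events = (1..12).map { |i| { id: "event-#{i}", amount: i } }

# Distribute the data: each event lands on exactly one node.
nodes = Array.new(NODE_COUNT) { [] }
events.each { |event| nodes[node_for(event[:id])] << event }

# Distribute the computation: every node sums only its local events.
partial_sums = nodes.map do |local_events|
  Thread.new { local_events.sum { |event| event[:amount] } }
end.map(&:value)

# Combine the partial results into the final answer.
total = partial_sums.sum
puts "partial sums: #{partial_sums.inspect}, total: #{total}"
```

Because each node only touches its own slice, adding nodes shrinks the slice per node, which is the intuition behind the linear scaling claim above.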
How fast is this?

The Write Topology

Storm Nodes   Cassandra Nodes   Events/Sec
3             6                 50,000+

The Query Topology

Query Type          Collection Size (events)   Mean Response Time
Full Count          100M                       >100ms
Average w/ groups   100M                       300ms
Sum over a field    600M                       800ms
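As a back-of-the-envelope reading of these numbers, the Ruby snippet below derives per-node write rates and an effective scan rate. It assumes load is spread evenly across nodes and treats 50,000 as the write rate even though the slide says "50,000+"; neither assumption comes from the talk, and the variable names are made up.

```ruby
# Figures taken from the tables above; integer math keeps the results exact.
events_per_sec  = 50_000 # lower bound from the slide ("50,000+")
storm_nodes     = 3
cassandra_nodes = 6

# Assumes an even spread of load, which the slides do not state.
writes_per_storm_node     = events_per_sec / storm_nodes     # integer division
writes_per_cassandra_node = events_per_sec / cassandra_nodes

# "Sum over a field": 600M events in a mean of 800 ms.
events_scanned   = 600_000_000
response_time_ms = 800
events_scanned_per_sec = events_scanned * 1000 / response_time_ms

puts "per Storm node:     ~#{writes_per_storm_node} events/sec"
puts "per Cassandra node: ~#{writes_per_cassandra_node} events/sec"
puts "sum query:          #{events_scanned_per_sec} events/sec scanned"
```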
Haz Storm for Ruby?
REDSTORM
https://github.com/colinsurprenant/redstorm
* Elegant JRuby bindings for Storm.
* Includes batteries: CLI scripts to package jars, work with Storm locally, and deploy to a cluster.
* Very easy way to get familiar with Storm.
* Simple Twitter streaming example: https://github.com/dzello/ontweet
Thanks #sfrails!
More resources for Storm & distributed systems
* http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/
* https://speakerdeck.com/dzello/distributed-systems-are-everywhere-where-the-full-stack-is-headed
* http://storm-project.net/
* https://github.com/colinsurprenant/redstorm/wiki/Ruby-DSL-Documentation
Coming to Defrag? (November 4th - 6th)
Check out my talk: One Billion Per Second: The Rise of Designer Data Architectures
http://defragcon.com/2013/agenda/