Processing events at scale

Mariusz Gil

WROCŁAW, POLAND

Who is working on an application that is running on more than X servers?

at SCALE

user browser & data processing, rendering request, routing

faster is better, slower is unusable

requests are sometimes heavy… too heavy

event processing logic should be moved out of the request-response loop
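
To make the idea concrete, here is a minimal in-memory sketch in which the request handler only enqueues an event and a background worker does the heavy part. All names (`OffloadSketch`, `handleRequest`, `POISON`) are hypothetical, and the `BlockingQueue` merely stands in for a real message broker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; an in-memory stand-in for a real broker.
class OffloadSketch {
    // Sentinel event that tells the worker to stop.
    static final String POISON = "__stop__";

    final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final List<String> processed = Collections.synchronizedList(new ArrayList<>());

    // Request handler: only enqueues the event and returns immediately.
    String handleRequest(String event) {
        queue.add(event);
        return "202 Accepted";
    }

    // Background worker: does the heavy part outside the request-response loop.
    Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take();
                    if (POISON.equals(event)) {
                        return;
                    }
                    // Placeholder for the heavy work.
                    processed.add("processed:" + event);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        return worker;
    }

    // Handles two requests, then waits for the worker to drain the queue.
    static List<String> demo() {
        OffloadSketch app = new OffloadSketch();
        Thread worker = app.startWorker();
        app.handleRequest("tweet-1");
        app.handleRequest("tweet-2");
        app.queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(app.processed);
    }
}
```

The handler's response time no longer depends on how long the event takes to process, only on how long it takes to enqueue it.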

RabbitMQ is a platform to send and receive messages

Firstly: producer, consumer

Basic QoS settings: producer, consumer1, consumer2

producer consumer1 consumer2
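
Why the QoS setting matters with two consumers can be shown with a small simulation. This is a simplified model rather than RabbitMQ code: it assumes all messages are already queued, that message costs are abstract time units, and that a prefetch count of 1 behaves like "the next message goes to whichever consumer is free first". The class and method names are hypothetical.

```java
// Simplified model, not RabbitMQ code: message costs are abstract time units.
class DispatchSketch {
    // Round-robin: message i goes to consumer i % n, however busy it is.
    static int[] roundRobin(int[] costs, int consumers) {
        int[] load = new int[consumers];
        for (int i = 0; i < costs.length; i++) {
            load[i % consumers] += costs[i];
        }
        return load;
    }

    // Prefetch of 1: each message goes to the consumer that becomes free first
    // (ties go to the lowest index).
    static int[] fairDispatch(int[] costs, int consumers) {
        int[] freeAt = new int[consumers]; // time at which each consumer is idle again
        for (int cost : costs) {
            int next = 0;
            for (int c = 1; c < consumers; c++) {
                if (freeAt[c] < freeAt[next]) {
                    next = c;
                }
            }
            freeAt[next] += cost;
        }
        return freeAt;
    }
}
```

With alternating heavy and light messages, `roundRobin(new int[] {5, 1, 5, 1, 5, 1}, 2)` gives loads of 15 and 3, while `fairDispatch` on the same input gives 11 and 7, so the whole batch finishes earlier.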

<?php

namespace Acme\DemoBundle\Controller;

use Symfony\Bundle\FrameworkBundle\Controller\Controller;

class TweetController extends Controller
{
    public function newTweetAction()
    {
        // ...

        // EXAMPLE AND VERY NAIVE IMPLEMENTATION
        $form->handleRequest($request);

        if ($form->isValid()) {
            $this->get('tweet_feed_producer')->publish(array(
                'user' => $user,
                'tweet' => 'Lorem ipsum dolor sit amet...'
            ));
        }

        // ...
    }
}

<?php

namespace Acme\DemoBundle\Consumer;

use OldSound\RabbitMqBundle\RabbitMq\ConsumerInterface;
use PhpAmqpLib\Message\AMQPMessage;

class TweetFeedsConsumer implements ConsumerInterface
{
    public function execute(AMQPMessage $msg)
    {
        // ...

        // EXAMPLE AND VERY NAIVE IMPLEMENTATION
        $friends = $user->getFriends();

        foreach ($friends as $friend) {
            $friend->getFeed()->push($tweet);
        }

        return true;
    }
}

How do we know whether our consumer layer is efficient or not?

Directed Acyclic Graphs (diagram: producer, consumer, producer, consumer, producer, consumer)

Apache Storm is a distributed realtime computation system

doing for realtime processing what Hadoop did for batch processing

written in Clojure, but language agnostic

use cases: realtime analytics, online machine learning, continuous computation, distributed RPC

Storm's small set of primitives satisfies a stunning number of use cases.

stream: unbounded sequence of tuples

spout: source of streams

(Joke about TCP and UDP connections)

bolt: processes input streams and produces new ones

Topology is a network of spouts and bolts: a Directed Acyclic Multi-Graph
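
The three concepts can be modelled in a few lines of plain Java. The interfaces below are hypothetical and are not Storm's classes. Two simplifications are assumed: the stream is finite, whereas a real one is unbounded, and the bolts form a simple chain rather than a general graph.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical interfaces for illustration only; these are not Storm's classes.
class TopologySketch {
    // A spout is a source of tuples; here a tuple is just a String.
    interface Spout extends Iterator<String> {}

    // A bolt consumes one tuple and emits zero or more new tuples.
    interface Bolt extends Function<String, List<String>> {}

    // A finite spout backed by a list (a real stream would be unbounded).
    static Spout fixedSpout(List<String> tuples) {
        Iterator<String> it = tuples.iterator();
        return new Spout() {
            @Override public boolean hasNext() { return it.hasNext(); }
            @Override public String next() { return it.next(); }
        };
    }

    // Pushes every tuple from the spout through the bolts, in order.
    static List<String> run(Spout spout, List<Bolt> bolts) {
        List<String> current = new ArrayList<>();
        while (spout.hasNext()) {
            current.add(spout.next());
        }
        for (Bolt bolt : bolts) {
            List<String> next = new ArrayList<>();
            for (String tuple : current) {
                next.addAll(bolt.apply(tuple));
            }
            current = next;
        }
        return current;
    }
}
```

A bolt can be written as a lambda, for example `s -> Arrays.asList(s.split(" "))` for a splitter that turns one sentence tuple into several word tuples.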

Infrastructure level: Nimbus, supervisors, workers, Apache ZooKeeper

public class RandomSentenceSpout extends BaseRichSpout {
    SpoutOutputCollector _collector;
    Random _rand;

    @Override
    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        _collector = collector;
        _rand = new Random();
    }

    @Override
    public void nextTuple() {
        Utils.sleep(100);
        String[] sentences = new String[] {
            "the cow jumped over the moon",
            "an apple a day keeps the doctor away",
            "four score and seven years ago",
            "snow white and the seven dwarfs",
            "i am at two with nature"};
        String sentence = sentences[_rand.nextInt(sentences.length)];
        _collector.emit(new Values(sentence));
    }

    @Override
    public void ack(Object id) {
    }

    @Override
    public void fail(Object id) {
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word"));
    }
}

public static class WordCount extends BaseBasicBolt {
    Map<String, Integer> counts = new HashMap<String, Integer>();

    @Override
    public void execute(Tuple tuple, BasicOutputCollector collector) {
        String word = tuple.getString(0);
        Integer count = counts.get(word);
        if (count == null)
            count = 0;
        count++;
        counts.put(word, count);
        collector.emit(new Values(word, count));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "count"));
    }
}

public class WordCountTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        builder.setSpout("spout", new RandomSentenceSpout(), 5);
        builder.setBolt("split", new SplitSentence(), 8)
            .shuffleGrouping("spout");
        builder.setBolt("count", new WordCount(), 12)
            .fieldsGrouping("split", new Fields("word"));

        Config conf = new Config();
        conf.setDebug(true);

        if (args != null && args.length > 0) {
            conf.setNumWorkers(3);
            StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
        } else {
            conf.setMaxTaskParallelism(3);
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("word-count", conf, builder.createTopology());
            Thread.sleep(10000);
            cluster.shutdown();
        }
    }
}
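
The topology above uses a shuffle grouping for "split" but a fields grouping on "word" for "count". Each counting task keeps its own in-memory map, so every occurrence of a given word has to reach the same task. The sketch below models that routing as hash-modulo partitioning. This is an assumption made for illustration, not Storm's actual implementation, and the names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a fields grouping; not Storm's actual implementation.
class FieldsGroupingSketch {
    // Equal keys always map to the same task index in [0, numTasks).
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Distributes words over numTasks buckets, one bucket per counting task.
    static List<List<String>> partition(List<String> words, int numTasks) {
        List<List<String>> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new ArrayList<>());
        }
        for (String word : words) {
            tasks.get(taskFor(word, numTasks)).add(word);
        }
        return tasks;
    }
}
```

With a shuffle grouping the same word could land on any of the 12 tasks, and each task would then hold only a partial count for it.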

Trident: a high-level abstraction for realtime processing

FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
    new Values("the cow jumped over the moon"),
    new Values("the man went to the store and bought some candy"),
    new Values("four score and seven years ago"),
    new Values("how many apples can you eat"));
spout.setCycle(true);

TridentTopology topology = new TridentTopology();
TridentState wordCounts = topology.newStream("spout1", spout)
    .each(new Fields("sentence"), new Split(), new Fields("word"))
    .groupBy(new Fields("word"))
    .persistentAggregate(
        new MemoryMapState.Factory(),
        new Count(),
        new Fields("count")
    ).parallelismHint(6);
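
For readers who know Java streams, the shape of that Trident pipeline (split, group by word, count) resembles the plain-Java sketch below. It is only an analogy under stated assumptions: it processes one finite pass over the sentences on a single machine and assumes splitting on single spaces, whereas the Trident topology runs continuously over a cycling spout and keeps its counts in state. `WordCountSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Single-machine analogy of split -> groupBy -> count; not Trident code.
class WordCountSketch {
    static Map<String, Long> count(List<String> sentences) {
        return sentences.stream()
                // One sentence becomes many words (the "split" step).
                .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
                // Group equal words and count each group.
                .collect(Collectors.groupingBy(
                        word -> word, TreeMap::new, Collectors.counting()));
    }
}
```

For a single pass over the four sentences from the slide, this counts "the" four times and "and" twice.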

Where are my events after all?

NOWHERE unfortunately…

DON'T overengineer

Redis pub/sub

@mariuszgil

THANKS

( it depends )
