
stomppdf.pdf

mpobrien
April 11, 2012


Transcript

  1. Problem
    • heavy processing tasks (encoding video, audio) can’t be done in a main thread
    • real time data from external sources
    Solution
    • Queue up tasks and execute in background
    • Queue up data and listen as needed
    Wednesday, April 11, 12
  2. [Diagram: producers 1, 2, ... n -> queue(s) -> consumers 1, 2, ... n]
    FIFO, priority based, 1-1
  3. Messaging w/ Node
    Redis does it nicely:

        //set up an event handler for new messages
        client.on("message", function (channel, message) {
          console.log("I received message", message);
          console.log("on channel", channel);
        });
        //subscribe to a channel
        client.subscribe("blarg");
        //publish to a channel
        pub_client.publish("blarg", "sup bro?");

    messaging should be this easy with mongo + nodejs
  4. MongoDB for Queues?
    • Messages are full-fledged Mongo documents
      - message backlog can do: indexes, query, map reduce
      - nested objects and atomic updates
    • Avoid running another server, if it makes sense
    • You get durability and failover for free
  5. strategy #1 - Atomic Queue (findAndModify)
    Producer - put something on the queue

        db.queue.insert( { message:"iceberg right ahead", sender:"some guy", priority: 1000 } );

    Consumer - remove and return highest priority item from the queue

        db.queue.findAndModify({sort:{priority:-1}, remove:true})
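The consumer semantics above ("remove and return highest priority item") can be illustrated without a database. The following is only a sketch: `createQueue` and `takeHighestPriority` are hypothetical names for an in-memory stand-in, not part of MongoDB or its driver, and the second message is invented for the example.

```javascript
// In-memory stand-in for the queue collection (an assumption for
// illustration; not a MongoDB API).
function createQueue() {
  const docs = [];
  return {
    // Producer: put a message document on the queue.
    insert(doc) {
      docs.push(doc);
    },
    // Consumer: remove and return the highest-priority document,
    // mirroring findAndModify with sort {priority:-1} and remove:true.
    takeHighestPriority() {
      if (docs.length === 0) return null;
      let best = 0;
      for (let i = 1; i < docs.length; i++) {
        if (docs[i].priority > docs[best].priority) best = i;
      }
      return docs.splice(best, 1)[0];
    },
    size() {
      return docs.length;
    },
  };
}

const queue = createQueue();
queue.insert({ message: "iceberg right ahead", sender: "some guy", priority: 1000 });
queue.insert({ message: "tea is served", sender: "steward", priority: 1 });
```

Calling `queue.takeHighestPriority()` here hands back the priority-1000 document and leaves the other one queued. In MongoDB the find and the remove happen as one atomic operation on the server, which is what keeps two consumers from taking the same message.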
  6. strategy #1 - Atomic Queue (findAndModify)
    Pros
    • Messages consumed atomically
    • Shardable
    • Queries are flexible
    • Message is mutable
    Cons
    • Consuming message holds a write lock
    • Not pure asynchronous (poll)
  7. strategy #1 - Atomic Queue (findAndModify)
    Implementation in NodeJS
    We need abstractions for:
    • Polling - make an asynchronous external API
    • Failover - should be reliable and automatic
    • Message consumption - customizable using EventEmitter interface (like redis client)
  8. PollQueue Class
    • clean interface to message queue (EventEmitter)
    • hide all the nasty details

        var pq = new PollQueue('localhost', 27017, "dbname", "colname",
                               {maxInterval : 500, remove : true});
        pq.addQuery("announcements", {"destination":"announcements"});
        pq.on("message", function(doc, filterId){
          console.log("received from ", filterId, doc);
        });
        pq.start();

    • remove : true is optional - we could also update
  9. Implementation
    inherit from EventEmitter: now we can emit() incoming data from mongo

        var PollQueue = function(host, port, dbName, collectionName, options){
          EventEmitter.call(this);
          . . .
        }
        inherits(PollQueue, EventEmitter);
        ...
        this.emit("ready", ...);
        this.emit("message", ...);
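The inheritance pattern above can be tried on its own with Node's built-in `events` and `util` modules. This is a minimal sketch, not the PollQueue implementation: `ArrayQueue` is a hypothetical class that drains an in-memory array instead of a Mongo collection.

```javascript
const { EventEmitter } = require("events");
const { inherits } = require("util");

// Hypothetical stand-in for PollQueue: it reads from an in-memory array
// instead of a MongoDB collection.
function ArrayQueue(docs) {
  EventEmitter.call(this); // run the EventEmitter constructor on this instance
  this.docs = docs;
}
inherits(ArrayQueue, EventEmitter);

// Emit "ready" once, then one "message" event per queued document.
ArrayQueue.prototype.start = function () {
  this.emit("ready");
  while (this.docs.length > 0) {
    this.emit("message", this.docs.shift());
  }
};

const received = [];
const q = new ArrayQueue([{ body: "first" }, { body: "second" }]);
q.on("message", function (doc) {
  received.push(doc.body);
});
q.start();
```

Listeners registered with `on()` before `start()` see every document in order, which is the same shape of interface the redis client offers.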
  10. fetching queue items:

        var queueGetCallBack = function(err, doc){
          ...
          if(activeFilters.length > 0){
            process.nextTick(function(){
              self.pollQuery(activeFilters, lastFullScan);
            });
          }else{
            setTimeout(function(){ self.pollQuery(null, lastFullScan); }, 1000);
          }
          ...
        };

        for(var i=0;i<filtersToScan.length;i++){
          var filterId = filtersToScan[i];
          var filter = self.filters[filterId];
          self.collection.findAndModify(filter, {priority:-1}, {}, {remove:true}, queueGetCallBack);
        }
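The scheduling idea in this slide (poll again right away while messages keep arriving, back off when the queue is empty) can be isolated from Mongo. The sketch below assumes a hypothetical `createPoller` helper with an injectable `schedule` function, so the example can record delays instead of actually waiting; in real use `setTimeout` could be passed as `schedule`.

```javascript
// Hypothetical helper: fetchOne() returns a document or null,
// schedule(fn, delayMs) arranges the next poll.
function createPoller(fetchOne, onMessage, idleDelayMs, schedule) {
  function poll() {
    const doc = fetchOne();
    if (doc !== null) {
      onMessage(doc);
      schedule(poll, 0); // more work may be waiting: poll again right away
    } else {
      schedule(poll, idleDelayMs); // queue was empty: back off
    }
  }
  return { poll };
}

const backlog = ["a", "b"];
const delays = [];
const seen = [];
const poller = createPoller(
  () => (backlog.length > 0 ? backlog.shift() : null),
  (doc) => seen.push(doc),
  1000,
  (fn, delayMs) => delays.push(delayMs) // fake scheduler: records, does not run
);

// Drive three polls by hand: two find a message, the third finds none.
poller.poll();
poller.poll();
poller.poll();
```

After the three polls, `seen` holds both messages and `delays` shows two immediate re-polls followed by one 1000 ms back-off.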
  11. findAndModify queue results + notes
    • between 1k-2k processed per second
    • make sure you’re hitting indexes
    • suitable for light workloads
    • useful for scheduling background tasks
  12. strategy #2 - topic Publish/Subscribe (tailable cursors)
    • Tailable cursor - works on capped collections
    • capped collection - works like a circular queue

        db.createCollection("queue", {capped:true, size:10000000, autoIndexId:true})

    • autoIndexId:true - important if you’re using replica set
  13. strategy #2 - topic Publish/Subscribe (tailable cursors)
    • Pass {tailable:true} to options in find()
    • Cursor returns new documents as they are added
    • Initial query will cover the entire collection
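To make the "circular queue" and "cursor returns new documents as they are added" behavior concrete, here is a small in-memory model. It is only a sketch under stated assumptions: `CappedLog` and `tail()` are hypothetical names, not the MongoDB driver API, and the cap is counted in documents, whereas a real capped collection is sized in bytes.

```javascript
// Hypothetical in-memory model of a capped collection with a tailing cursor.
class CappedLog {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
    this.nextSeq = 0; // sequence number given to the next inserted document
  }

  insert(doc) {
    this.docs.push({ seq: this.nextSeq++, doc });
    if (this.docs.length > this.maxDocs) this.docs.shift(); // drop the oldest
  }

  // Return a cursor whose next() yields documents in insertion order,
  // or null once it has caught up with the end of the log.
  tail() {
    let lastSeq = -1;
    return {
      next: () => {
        const entry = this.docs.find((e) => e.seq > lastSeq);
        if (!entry) return null;
        lastSeq = entry.seq;
        return entry.doc;
      },
    };
  }
}

const log = new CappedLog(2);
const cursor = log.tail();
log.insert("a");
log.insert("b");
log.insert("c"); // the log holds two documents, so "a" is dropped
const firstSeen = cursor.next();
const secondSeen = cursor.next();
const caughtUp = cursor.next(); // nothing new yet
log.insert("d");
const afterInsert = cursor.next();
```

The cursor starts from the oldest surviving document (the initial query covers the whole collection), reports nothing once it has caught up, and picks up `"d"` as soon as it is inserted.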
  14. strategy #2 - topic Publish/Subscribe (tailable cursors)
    Gotchas:
    • If collection is empty, cursor returns null instantly - need to poll for first msg
    • On failover, need to re-initialize cursors
    • If collection is non-empty, need to find latest element to start from
  15. Summary
    • findAndModify (PollQueue): slower (1-2k+ messages/s), atomic queue, not really async (but we faked it)
    • tailable cursor (TailQueue): faster (8-10k+ messages/s), topic (pub/sub), async
  16. Interfacing with other systems
    STOMP protocol - human readable messaging protocol
    supported by:
    • ActiveMQ
    • RabbitMQ
    • OpenMQ
    • MorbidQ
    • and others
  17. STOMP frames: http-ish

        SEND                       <- frame command
        destination:/queue/a       <- headers
        content-type:text/plain

        hello queue a              <- body
        ^@                         <- frame terminator (null byte)
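The frame layout above (command line, header lines, blank line, body, null byte) is simple enough to serialize and parse in a few lines. This is a minimal sketch, not the nodestomp parser: `serializeFrame` and `parseFrame` are hypothetical helpers, and the sketch assumes a text body with no embedded null bytes and does no header escaping or content-length handling.

```javascript
// Build the wire form of a frame: command, headers, blank line, body, NUL.
function serializeFrame(command, headers, body) {
  const headerLines = Object.keys(headers).map((k) => k + ":" + headers[k]);
  return [command].concat(headerLines).join("\n") + "\n\n" + body + "\0";
}

// Parse one frame from the start of text; returns null if no terminator yet.
function parseFrame(text) {
  const end = text.indexOf("\0");
  if (end === -1) return null; // frame is incomplete
  const frame = text.slice(0, end);
  const split = frame.indexOf("\n\n"); // first blank line ends the headers
  const head = split === -1 ? frame : frame.slice(0, split);
  const body = split === -1 ? "" : frame.slice(split + 2);
  const lines = head.split("\n");
  const headers = {};
  for (const line of lines.slice(1)) {
    const colon = line.indexOf(":");
    headers[line.slice(0, colon)] = line.slice(colon + 1);
  }
  return { command: lines[0], headers, body };
}

const wire = serializeFrame(
  "SEND",
  { destination: "/queue/a", "content-type": "text/plain" },
  "hello queue a"
);
const parsed = parseFrame(wire);
```

Round-tripping the slide's SEND frame gives back the same command, headers and body, and `parseFrame` returns `null` for input that has not yet received its terminating null byte, which is useful when frames arrive in pieces over a socket.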
  18. client / server exchange (steps #1, #2, #3A, #3B)

        client:  CONNECT                    (negotiate protocol version)
                 accept-version:1.1
                 host:10gen.com
                 ^@

        server:  CONNECTED
                 version:1.1
                 ^@

        client:  SUBSCRIBE
                 id:0
                 destination:/queue/foo
                 ^@

        client:  SEND
                 destination:/queue/a
                 content-type:text/plain

                 hello queue a
                 ^@

        server:  MESSAGE
                 subscription:0
                 message-id:007
                 destination:/queue/a
                 content-type:text/plain

                 hello queue a
                 ^@
  19. Wrap data sources and make them look like STOMP servers
    [Diagram: msg producers and msg consumers, a STOMP interpreter, PollQueue and TailQueue, and replica sets]
  20. Wrap data sources and make them look like STOMP servers
    • Message producers/consumers no longer need to touch DB
    • Consistent abstractions for message ACK/NACK
    • Supports transactions (batch SEND with BEGIN/COMMIT/ABORT)
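The "batch SEND with BEGIN/COMMIT/ABORT" point can be sketched as a small buffer keyed by the STOMP `transaction` header. This is an illustration under assumptions, not the nodestomp code: `createTransactionBuffer` and `deliver` are hypothetical names, and frames are plain objects of the form `{command, headers, body}`.

```javascript
// Hypothetical helper: hold SEND frames that belong to an open transaction
// and hand them to deliver() only on COMMIT.
function createTransactionBuffer(deliver) {
  const pending = new Map(); // transaction id -> buffered SEND frames
  return function handle(frame) {
    const tx = frame.headers.transaction;
    switch (frame.command) {
      case "BEGIN":
        pending.set(tx, []);
        break;
      case "SEND":
        if (tx !== undefined && pending.has(tx)) {
          pending.get(tx).push(frame); // part of an open transaction: hold it
        } else {
          deliver(frame); // not transactional: deliver immediately
        }
        break;
      case "COMMIT":
        for (const buffered of pending.get(tx) || []) deliver(buffered);
        pending.delete(tx);
        break;
      case "ABORT":
        pending.delete(tx); // discard everything buffered for this transaction
        break;
    }
  };
}

const delivered = [];
const handle = createTransactionBuffer((f) => delivered.push(f.body));
handle({ command: "BEGIN", headers: { transaction: "tx1" } });
handle({ command: "SEND", headers: { transaction: "tx1", destination: "/queue/a" }, body: "one" });
handle({ command: "SEND", headers: { transaction: "tx1", destination: "/queue/a" }, body: "two" });
const beforeCommit = delivered.length;
handle({ command: "COMMIT", headers: { transaction: "tx1" } });
handle({ command: "BEGIN", headers: { transaction: "tx2" } });
handle({ command: "SEND", headers: { transaction: "tx2", destination: "/queue/a" }, body: "dropped" });
handle({ command: "ABORT", headers: { transaction: "tx2" } });
```

Nothing reaches `deliver` before the COMMIT, the two committed messages arrive in order, and the aborted message is never delivered. With a Mongo-backed queue, `deliver` would be the point where the batch is inserted into the collection.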
  21. nodestomp - STOMP handler in NodeJS
    API:

        var myserver = new StompServer(8124);
        myserver.on("open", function(){
          console.log("Server ready.");
        });
        myserver.on("error", function(err){
          console.error("error", err);
        });
        myserver.on("connect", function(client){
          client.state = "new";
          console.log("New client connected");
          // Set up client connections
          ...
        });
        myserver.on("frame", function(frame, client){
          ...
          // handle an incoming STOMP frame from client
          ...
        });
  22. interfacing mongo queue and STOMP
    publishing

        if(cmd == "send"){
          var destination = frame.headers['destination'];
          this.queue.getCollection().insert(
            {"message":frame.body.toString("utf-8"), "destination":destination},
            {safe:true},
            function(err, doc){
              if(err){
                StompFrame.build('ERROR',
                    {"message-id" : JSON.stringify(doc._id),
                     "message" : "message could not be sent."},
                    "", true)
                  .serialize(client);
              }
            }
          );
        }
  23. interfacing mongo queue and STOMP
    pushing messages

        // Inside the listener, "this" is the emitting datasource,
        // so keep a reference to the outer object.
        var self = this;
        for(var i=0;i<this.datasources.length;i++){
          var datasource = this.datasources[i];
          datasource.on("message", function(doc, filterId){
            var subscribers = self.subscriptions[filterId];
            var msgFrame = StompFrame.build(
                'MESSAGE',                            //frame command
                {"message-id" : doc._id.toString()},  //headers
                JSON.stringify(doc), true)            //body
              .serialize(subscribers);
          });
        }
  24. TODOs
    • full STOMP compliance
    • benchmarks
    • aggressive failover testing
    • in-memory buffering during failovers
    • interface STOMP to other datastores: redis, sql, etc.