RTDB Technology

chicagozer
November 16, 2013

The technology and architecture behind RTDB, a real-time database.

Transcript

  1. Reboot…
    •  Start with a requirement that analytics should be delivered in real time.
    •  Introduce a web-based “subscriber” modeled after message push.
    •  But instead of delivering individual messages, deliver the aggregated analytic results.
    •  Highly useful for executive dashboards, retail analytics, mobile real-time updates and many more.
  2. A node.js database solution
    •  RTDB is a real-time JSON document database implemented entirely in node.js.
    •  Documents are inserted via REST API.
    •  Analytics are made available via map/reduce.
    •  When data changes, queries are updated in real time and subscribers are updated immediately using HTML5 server-sent events.
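The push step described above can be sketched with Node's built-in modules only. This is a minimal illustration, not RTDB's code: the `formatSseEvent` and `subscribe` helpers, the `update` event name, and the `reduction` emitter event are all assumptions made for the example.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: serialize one result set as a server-sent event frame.
// A frame is an "event:" line and a "data:" line, terminated by a blank line.
function formatSseEvent(eventName, payload) {
  return 'event: ' + eventName + '\n' +
         'data: ' + JSON.stringify(payload) + '\n\n';
}

// Hypothetical subscriber hookup: keep the HTTP response open and write a
// frame every time the (assumed) results emitter reports a new reduction.
function subscribe(res, results) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });
  function push(reduction) {
    res.write(formatSseEvent('update', reduction));
  }
  results.on('reduction', push);
  // Return an unsubscribe function for when the client disconnects.
  return function unsubscribe() {
    results.removeListener('reduction', push);
  };
}

// Demo with a stand-in response object so the sketch runs without a server.
var frames = [];
var fakeRes = {
  writeHead: function () {},
  write: function (chunk) { frames.push(chunk); }
};
var results = new EventEmitter();
var unsubscribe = subscribe(fakeRes, results);
results.emit('reduction', [['Neil Diamond', 1]]);
unsubscribe();
console.log(frames[0]);
```

In a browser, a subscriber would typically consume such a stream with `new EventSource(url)` and a listener for the `update` event.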
  3. Comparison
    SQL/Document DB          RTDB
    Relational/JSON          JSON
    SQL or Map/Reduce        Map/Reduce
    DB-specific driver       REST
    Polling                  Push
    Dynamic/Prepared         Prepared
    Separate BI/Analytics    Integrated analytics
  4. RTDB implementation
    [Architecture diagram. Components: Publisher, Client, Subscriber, Express, Event Emitter, Configurable File System. Connections: JSON over REST; JSON via HTML5 Server Sent Event.]
  5. Map/Reduce Pipeline
    •  Map: organize by “key”
    •  Reduce: rollup/aggregate
    •  Finalize: sorting/filter
    •  Personalize: apply user-specific filter/function
    Note: RTDB will perform an “incremental” map/reduce when possible. This is very powerful. Add one record to a million-record collection? No need to “remap” the entire dataset.
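To make the “incremental” idea concrete, here is a minimal sketch for a count-style reduction. It is not RTDB's algorithm: the `IncrementalCount` class and its method names are hypothetical, and the approach only works as written for reductions such as sums, where a value can be added or subtracted without revisiting other records.

```javascript
// Hypothetical incremental counter: keeps one running total per key so a new
// document only touches its own key instead of remapping the whole collection.
function IncrementalCount(mapFn) {
  this.mapFn = mapFn;   // returns [key, value] for one document
  this.totals = {};     // key -> running reduced value
}

// Fold one new document into the running totals.
IncrementalCount.prototype.add = function (doc) {
  var pair = this.mapFn(doc);
  var key = pair[0], value = pair[1];
  this.totals[key] = (this.totals[key] || 0) + value;
};

// Back one document out of the running totals.
IncrementalCount.prototype.remove = function (doc) {
  var pair = this.mapFn(doc);
  var key = pair[0], value = pair[1];
  if (!Object.prototype.hasOwnProperty.call(this.totals, key)) return;
  this.totals[key] -= value;
  if (this.totals[key] === 0) delete this.totals[key];
};

// Current reduction as [key, total] pairs.
IncrementalCount.prototype.result = function () {
  var totals = this.totals;
  return Object.keys(totals).map(function (key) { return [key, totals[key]]; });
};

var byArtist = new IncrementalCount(function (doc) { return [doc.artist, 1]; });
byArtist.add({ artist: 'Neil Diamond', song: 'Forever in Blue Jeans' });
byArtist.add({ artist: 'Neil Diamond', song: 'Sweet Caroline' });
byArtist.add({ artist: 'Queen', song: 'Bohemian Rhapsody' });
console.log(byArtist.result());
```

Each `add` costs the same regardless of how many documents are already in the collection, which is the property the note on the slide is pointing at.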
  6. Map
        emit(item.artist, 1);
    Reduce
        emit(values.reduce(function (a, b) { return a + b; }));
    Finalize
        reduction.sort(function (a, b) { return b[1] - a[1]; });
        if (reduction.length > 30) reduction.length = 30;
        emit(reduction);
    Sample document
        { artist: "Neil Diamond", song: "Forever in Blue Jeans", album: "You Don't Bring me flowers", year: 1978 }
    Result
        [['Neil Diamond', 1]]
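The three fragments above can be exercised together with a small stand-alone harness. This is a sketch under assumptions: `runPipeline` and the way `emit` groups values by key are hypothetical, chosen only so the map, reduce and finalize steps from the slide can run over an in-memory array.

```javascript
// Hypothetical harness: run map, reduce and finalize over an in-memory array.
function runPipeline(docs) {
  // Map: emit(key, value) for every document, grouped by key.
  var groups = {};
  function emit(key, value) {
    (groups[key] = groups[key] || []).push(value);
  }
  docs.forEach(function (item) {
    emit(item.artist, 1);
  });

  // Reduce: collapse each key's values to a single number.
  var reduction = Object.keys(groups).map(function (key) {
    var total = groups[key].reduce(function (a, b) { return a + b; });
    return [key, total];
  });

  // Finalize: sort by count, descending, and keep at most 30 rows.
  reduction.sort(function (a, b) { return b[1] - a[1]; });
  if (reduction.length > 30) reduction.length = 30;
  return reduction;
}

var sample = [{
  artist: 'Neil Diamond',
  song: 'Forever in Blue Jeans',
  album: "You Don't Bring me flowers",
  year: 1978
}];
console.log(runPipeline(sample)); // [ [ 'Neil Diamond', 1 ] ]
```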
  7. Why node.js?
    •  node.js was the initial choice for prototyping Server Sent Events.
    •  As the prototype matured into an application, the application never outgrew node.js.
    •  node.js is interpreted, so the map/reduce scripts can be changed at runtime.
    •  Very fast; very stable.
    •  Scalable I/O. Used heavily for files and subscribers.
    •  Async model is very well suited to the internal architecture.
    •  NPM provides an excellent set of API building blocks for rapid assembly.
    •  Extremely PaaS friendly (Heroku, Amazon Elastic Beanstalk, Modulus, Cloudnode and more).
  8. APIs leveraged via NPM
    •  Express – used for REST API and web admin
    •  Jade – templates for web admin
    •  AWS-SDK – access to the S3 file system
    •  Async – concurrency framework
    •  Symmetry – JSON delta processing
    •  Winston – logging
    Note: NPM offers several overlapping APIs. You are free to choose the best fit for your needs.
  9. Configurable File System (CFS)
    •  RTDB can use a variety of backing stores; even other databases.
    •  Loads all JavaScript modules in the specified directory via “require”.
        var cfslist = fs.readdirSync('./cfs');
        var cfsTypes = {};
        cfslist.forEach(function (file) {
          var cfs = require('./cfs/' + file);
          cfsTypes[cfs.name] = cfs;
        });
        self.cfs = new cfsTypes[self.globalSettings.cfs]();
        self.cfs.init(self.globalSettings.cfsinit);
  10. Configurable File System
    •  Small set of required methods:
        function name() - return a unique name for this provider
        function init(parms) - initialize with params from settings.json
        function exists(dir, callback) - does this exist?
        function get(key, callback) - return object by key
        function del(key, callback) - delete object by key
        function put(prefix, item, callback, expires) - put object
        function list(prefix, callback) - list objects
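As an illustration of how small a provider can be, here is a sketch of an in-memory one that implements the seven methods listed above. Several details are assumptions, since the slide does not specify them: the Node-style `(err, result)` callback shape, the key layout (`prefix + item._id`), the `MemoryCFS` name, and the decision to ignore `expires`.

```javascript
// Hypothetical in-memory provider. Callback shapes (err, result) and the
// key layout (prefix + item._id) are assumptions, not RTDB's contract.
function MemoryCFS() {
  this.store = {};
}

function has(store, key) {
  return Object.prototype.hasOwnProperty.call(store, key);
}

MemoryCFS.prototype.name = function () { return 'memory'; };

MemoryCFS.prototype.init = function (parms) {
  this.parms = parms || {};
};

// A "directory" exists if at least one stored key starts with it.
MemoryCFS.prototype.exists = function (dir, callback) {
  var found = Object.keys(this.store).some(function (key) {
    return key.indexOf(dir) === 0;
  });
  callback(null, found);
};

MemoryCFS.prototype.get = function (key, callback) {
  if (!has(this.store, key)) return callback(new Error('not found: ' + key));
  callback(null, this.store[key]);
};

MemoryCFS.prototype.del = function (key, callback) {
  delete this.store[key];
  callback(null);
};

// "expires" is accepted to match the listed signature but ignored here.
MemoryCFS.prototype.put = function (prefix, item, callback, expires) {
  this.store[prefix + item._id] = item;
  callback(null);
};

MemoryCFS.prototype.list = function (prefix, callback) {
  var keys = Object.keys(this.store).filter(function (key) {
    return key.indexOf(prefix) === 0;
  });
  callback(null, keys);
};

var cfs = new MemoryCFS();
cfs.init({});
cfs.put('songs/', { _id: 'a1', artist: 'Neil Diamond' }, function (err) {
  cfs.list('songs/', function (err, keys) {
    console.log(keys); // [ 'songs/a1' ]
  });
});
```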
  11. JSON simplifies everything. Here is all the code for inserting documents.
        app.post('/db/collections/:id/documents', function (req, res) {
          var c = database.collectionAt(req.params.id);
          var docs = [];
          if (!Array.isArray(req.body)) docs.push(req.body);
          else docs = req.body;
          c.put(docs, function (err) {
            if (!err) res.send(201);
            else res.send(500, err);
          });
        });
    Note: REST and JSON make it very easy to interact with the database using command-line tools such as curl.
  12. Node.js events are the real-time glue
    •  Create emitter
        _emitter = new events.EventEmitter();
    •  Register reduce function
        _emitter.once('change', doReduce);
    •  When there is work…
        _emitter.emit('change');
    •  Use “once” versus “on” to manage flow.
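One way to read “once versus on” is that a `once` listener is removed before it runs, so a burst of changes triggers a single reduce until the listener is registered again. The sketch below shows that behavior; the re-arming inside `doReduce` and the `reduceRuns` counter are assumptions for illustration, not RTDB's actual flow control.

```javascript
var events = require('events');

var emitter = new events.EventEmitter();
var reduceRuns = 0;

// Hypothetical reduce step: the listener registered with "once" is removed
// before it runs, so further 'change' events are ignored until we re-arm.
function doReduce() {
  reduceRuns += 1;
  // Simulate asynchronous work, then listen for the next change.
  setImmediate(function () {
    emitter.once('change', doReduce);
  });
}

emitter.once('change', doReduce);

// A burst of three changes in the same tick triggers a single reduce.
emitter.emit('change');
emitter.emit('change');
emitter.emit('change');
console.log(reduceRuns); // 1
```

With `on` instead of `once`, the same burst would run the reduce three times. Note that this sketch simply drops changes that arrive while a reduce is in progress; a real implementation would presumably need to remember that more work is pending.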
  13. Async framework expedites concurrent programming
    •  async.each – process in parallel
    •  async.eachSeries – process sequentially
        •  Load collections in priority order
    •  async.eachLimit – parallel, but with limit
        •  Load files from file system without running out of system resources
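To show what “parallel, but with limit” means in practice, here is a dependency-free stand-in for the idea behind `async.eachLimit`. It is not the Async library's implementation; the local `eachLimit` function and the file names in the usage are hypothetical.

```javascript
// Stand-in for the eachLimit idea written with no dependencies: run
// "iterator" over "items" with at most "limit" calls in flight at once.
function eachLimit(items, limit, iterator, done) {
  var next = 0;          // index of the next item to start
  var running = 0;       // iterator calls currently in flight
  var finished = false;  // guards against calling "done" twice

  function finish(err) {
    if (finished) return;
    finished = true;
    done(err || null);
  }

  function launch() {
    if (finished) return;
    if (next >= items.length && running === 0) return finish();
    while (!finished && running < limit && next < items.length) {
      running += 1;
      iterator(items[next++], function (err) {
        running -= 1;
        if (err) return finish(err);
        launch();
      });
    }
  }

  launch();
}

// Usage: "load" three files, never more than two at a time.
eachLimit(['a.json', 'b.json', 'c.json'], 2, function (file, callback) {
  // Placeholder for real I/O such as fs.readFile(file, callback).
  setImmediate(callback);
}, function (err) {
  console.log(err ? 'failed' : 'all files loaded');
});
```

Capping the number of in-flight operations is what keeps a large directory load from exhausting file descriptors, which is the use case the slide mentions.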
  14. Symmetry for wire protocol performance
        a = { x: 3, y: 5, z: 1 };
        b = { x: 3, y: 8, z: 1 };
        Symmetry.diff(a, b)  // => { t: 'o', s: { y: 8 } }

        obj = { x: 3, y: 5, z: 1 };
        diff = { t: 'o', s: { y: 8 } };
        Symmetry.patch(obj, diff);
        obj  // => { x: 3, y: 8, z: 1 }
    Example from https://github.com/Two-Screen/symmetry
    Symmetry will “delta” the JSON result set and can significantly reduce the bytes transferred.
  15. GUIDs
    •  Every persistent object gets a GUID.
    •  Easy to share data between implementations.
        var uuid = require('node-uuid');
        this._id = uuid.v4();
  16. For more info (I would love to hear from you.)
    •  [email protected]
    •  Twitter @rheosoft
    •  http://facebook.com/rheosoft