Sprayer: low latency, reliable multichannel messaging

Pablo E
November 30, 2013

At Telefonica PDI we are developing an internal messaging service to be used by our own products.

Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, predefined group of receivers or specific list of receivers over different channels (SMS, HTTP, WebSockets, Email, Android, iOS and Firefox OS native push…). We are using Redis, MongoDB and RabbitMQ to implement Sprayer.

In this talk we will review Sprayer’s architecture. For each of these technologies we will explain why, where and for what it is used, and share some tips.

Talk given together with Javier Arias (@javier_arilos) at NoSQL Matters Barcelona 2013.

Transcript

  1. who are we?

    Javier Arias (@javier_arilos): Javier is a Software Architect and developer who has worked in different sectors such as M2M, Telcos, Finance and Airports.
    Pablo Enfedaque (@pablitoev56): Pablo is a SW R&D engineer with a strong background in high performance computing, big data and distributed systems.
  2. some context

    Telefónica is the 4th largest telco in the world
    2 years ago Telefonica Digital was established to spread our business to the digital world
    former Telefonica R&D / PDI was merged into this new company
  3. overview

    we are developing an internal messaging service to be used by our own products
    we have polyglot persistence using different NoSQL technologies
    in this talk we will review Sprayer’s architecture and, for each technology, how it is used
  4. why sprayer?

    a common push messaging service. why?
    ➔ each project with messaging needs was implementing its own server its own way
    ➔ 5 push messaging systems in the company
    ➔ none of them supporting a wide variety of transports
    ➔ independent deployment and operations
  5. the problem

    cross technology push, point to point and pubsub; PaaS, multitenant
    channels: iOS, Android, Websockets, HTTP, eMail, SMS, FirefoxOS
    delivery: 1 to 1, 1 to N, 1 to Group
  6. inspiration

    ➔ Google’s Thialfi: http://research.google.com/pubs/pub37474.html
    ➔ Twitter Timeline: http://www.infoq.com/presentations/Twitter-Timeline-Scalability
    ➔ Pusher: http://www.pusher.com
    ➔ Pubnub: http://www.pubnub.com
    ➔ Amazon SNS: http://aws.amazon.com/sns/
  7. SPRAYER! (the proposal)

    Sprayer is a low latency, reliable messaging system supporting delivery of messages to a single receiver, to a predefined group of receivers or to a specific list of receivers over different channels (WebSockets, SMS, Email, HTTP and iOS, Android or Firefox OS native push…)
  8. server side API challenges

    ➔ common interface for all channels
    ➔ reliable, consistent, idempotent
    ➔ route messages efficiently
    ➔ simple and user oriented
      ◆ manage subscriptions
      ◆ send messages: to list or group (topic)
      ◆ get delivery feedback
    ➔ standards based (HTTP + JSON)
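The "reliable, consistent, idempotent" requirement above can be illustrated with a minimal sketch. The `Accepter` class, its `send` method and the client-supplied `message_id` are hypothetical names chosen for illustration, not Sprayer's actual API; the sketch only shows one way a retried request can be made safe.

```python
from dataclasses import dataclass, field


@dataclass
class Accepter:
    """Hypothetical in-memory stand-in for the REST API accepter."""
    accepted: dict = field(default_factory=dict)  # message_id -> message
    jobs: list = field(default_factory=list)      # jobs handed to dispatchers

    def send(self, message_id, group, body):
        # A repeated request with the same id is acknowledged but not re-queued.
        if message_id in self.accepted:
            return {"id": message_id, "status": "duplicate"}
        self.accepted[message_id] = {"group": group, "body": body}
        self.jobs.append(message_id)
        return {"id": message_id, "status": "accepted"}


api = Accepter()
first = api.send("msg-1", "news", "hello")
retry = api.send("msg-1", "news", "hello")  # e.g. a client retry after a timeout
```

With this shape, a client that never saw the first response can resend the same request without the group receiving the message twice.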
  9. architecture

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, Operational storage, sms gateway, email gateway, GCM, APNs, MESSAGES DISPATCHING]
  10. message dispatching challenges

    ➔ scaling horizontally
    ➔ reliability
    ➔ different channels:
      ◆ HTTP (outbound)
      ◆ Websockets (inbound)
      ◆ iOS push (APNs)
      ◆ Android push (GCM)
      ◆ SMS
      ◆ eMail
  11. architecture

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, Operational storage, dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs, MESSAGES ROUTING]
  12. outbound-stateless dispatchers

    simple dispatchers: HTTP, iOS, Android...
    ➔ Take message, get msg subscribers, dispatch to receiver, report feedback
    ➔ Completely stateless
    [diagram labels: ANDROID, GCM, Operational storage, ACCEPTER REST API]
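The four steps listed above (take message, get subscribers, dispatch, report feedback) can be sketched as one function that keeps no state of its own. All names here (`dispatch`, `fake_push`, the `subscribers` dict) are hypothetical stand-ins for the subscriptions storage, a push gateway and the feedback queue; this is an illustration under those assumptions, not Sprayer's code.

```python
def dispatch(message, get_subscribers, push, report):
    """Hypothetical stateless dispatcher step: all state arrives as arguments."""
    delivered = 0
    for receiver in get_subscribers(message["group"]):
        try:
            push(receiver, message["body"])
            report(message["id"], receiver, "delivered")
            delivered += 1
        except ConnectionError:
            # A failed push is reported; it does not stop the other deliveries.
            report(message["id"], receiver, "failed")
    return delivered


# Stand-ins for subscriptions storage, a push gateway and the feedback queue.
subscribers = {"news": ["device-a", "device-b", "device-c"]}
feedback = []


def fake_push(receiver, body):
    if receiver == "device-b":
        raise ConnectionError("gateway unreachable")


count = dispatch(
    {"id": "msg-1", "group": "news", "body": "hello"},
    get_subscribers=lambda group: subscribers.get(group, []),
    push=fake_push,
    report=lambda msg_id, receiver, status: feedback.append((msg_id, receiver, status)),
)
```

Because the function holds nothing between calls, any number of identical dispatcher processes could consume jobs from the same queue.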
  13. connection aware dispatchers

    clients (websockets, HTTP long poll…)
    ➔ messages are stored until clients connect
    ➔ client inits a persistent connection
    ➔ potentially, millions of clients
    [diagram labels: WEBSOCKETS, ROUTER, DELIVERER, inboxes, Operational storage, ACCEPTER REST API]
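The "messages are stored until clients connect" behaviour can be sketched with per-receiver inboxes. The `Inboxes` class below is a hypothetical in-memory stand-in (the real system would keep inboxes in an external store); it only illustrates the store-then-flush idea.

```python
from collections import defaultdict, deque


class Inboxes:
    """Hypothetical per-receiver inboxes; an in-memory stand-in for a real store."""

    def __init__(self):
        self._pending = defaultdict(deque)  # receiver -> queued messages
        self._connections = {}              # receiver -> callable sending to the client

    def deliver(self, receiver, message):
        send = self._connections.get(receiver)
        if send is None:
            self._pending[receiver].append(message)  # keep until the client connects
        else:
            send(message)

    def connect(self, receiver, send):
        self._connections[receiver] = send
        inbox = self._pending.pop(receiver, deque())
        while inbox:
            send(inbox.popleft())  # flush stored messages in arrival order

    def disconnect(self, receiver):
        self._connections.pop(receiver, None)


inboxes = Inboxes()
received = []
inboxes.deliver("alice", "m1")              # alice is offline: stored
inboxes.deliver("alice", "m2")
inboxes.connect("alice", received.append)   # stored messages are flushed
inboxes.deliver("alice", "m3")              # delivered immediately
```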
  14. message routing challenges

    routing (two-steps):
    ➔ API routes messages to N dispatchers
    ➔ Each dispatcher routes message to M receivers (subscribers of a group)
    both steps must be decoupled
    The number of receivers could be thousands
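The two routing steps above can be sketched as two independent functions that only share a queue per channel. The `SUBSCRIPTIONS` data and the `channel_queues` lists are hypothetical in-memory stand-ins for the subscriptions storage and the job queues; the point of the sketch is the decoupling, not the storage.

```python
from collections import defaultdict

# Hypothetical subscription data: group -> channel -> receivers.
SUBSCRIPTIONS = {
    "news": {"android": ["a1", "a2"], "email": ["bob@example.com"]},
}

channel_queues = defaultdict(list)  # stand-in for one job queue per dispatcher


def route_to_dispatchers(message):
    """Step 1 (API): enqueue one job per channel that has subscribers in the group."""
    channels = SUBSCRIPTIONS.get(message["group"], {})
    for channel in channels:
        channel_queues[channel].append(message)
    return sorted(channels)


def route_to_receivers(channel):
    """Step 2 (dispatcher): expand each queued job to the group's receivers."""
    deliveries = []
    while channel_queues[channel]:
        message = channel_queues[channel].pop(0)
        for receiver in SUBSCRIPTIONS[message["group"]][channel]:
            deliveries.append((receiver, message["body"]))
    return deliveries


channels = route_to_dispatchers({"group": "news", "body": "hello"})
android_deliveries = route_to_receivers("android")
```

Step 1 does a constant amount of work per channel regardless of group size; the expansion to thousands of receivers happens only inside each dispatcher.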
  15. architecture

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, Operational storage, Subscriptions storage, queues (email, sms, android, WS, HTTP, iOS), dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs, FEEDBACK]
  16. async delivery feedback challenges

    make msg feedback available through API to clients
    feedback must not compromise message delivery or API
    feedback: msg delivery, connections, push
    The number of updates could be millions
  17. architecture

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, STATUS FEEDER, Operational storage, Subscriptions storage, queues (email, sms, android, WS, HTTP, iOS, feedback), dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs]
  18. subscriptions storage

    [architecture diagram, same components as slide 17, with a "?" next to Subscriptions storage]
  19. subscriptions storage

    [architecture diagram, same components as slide 17, without the Subscriptions storage label]
  20. redis

    Redis is an open source, advanced key-value store. It is often referred to as a data structure server (...) - (redis.io)
    why redis?
    - amazingly fast
    - easy to use
    - usage patterns: shared cache, queues, pubsub, distributed lock, counting things
  21. redis use cases

    use cases in Sprayer:
    ➔ group subscribers x channel
    ➔ channels x group
    ➔ websockets channel queues (potentially millions of receivers)
    limitations for our use cases:
    ➔ memory bound
    ➔ queries and pagination
    ➔ high throughput queues
  22. redis concerns

    ➔ what happens when dataset does not fit in memory? two strategies
      ◆ partition datasets to different redis clusters
      ◆ sharding: based on tenant would be easy
    ➔ FT and HA
      ◆ easy way: master-slave with virtual IPs, switch slave’s IP when master’s out. home made daemon
      ◆ sentinel based, some tests done, needs to be supported by client library
      ◆ redis cluster being implemented; limited features
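The tenant-based sharding idea above can be sketched as a stable hash from tenant name to shard. The shard addresses, the `shard_for` and `key_for` helpers and the key layout are all assumptions made for illustration; the talk does not say how Sprayer names its keys or picks a shard.

```python
import hashlib

# Hypothetical shard addresses; a real deployment would list its own Redis hosts.
SHARDS = ["redis-0:6379", "redis-1:6379", "redis-2:6379"]


def shard_for(tenant, shards=SHARDS):
    """Map a tenant to a shard with a stable hash (not Python's salted hash())."""
    digest = hashlib.sha1(tenant.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def key_for(tenant, group, channel):
    # Every key of a tenant carries the tenant prefix, so it lives on one shard.
    return f"{tenant}:subs:{group}:{channel}"


shard = shard_for("acme")
key = key_for("acme", "news", "android")
```

Since all data of one tenant sits on a single shard, set operations over that tenant's groups never cross shards. The trade-off of plain modulo hashing is that changing the number of shards remaps most tenants.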
  23. operational storage

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, STATUS FEEDER, Operational storage marked with "?", queues (email, sms, android, WS, HTTP, iOS, feedback), dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs]
  24. operational storage

    [same architecture diagram, without the Operational storage label]
  25. mongodb

    mongoDB (from "humongous") is a document database (...) features: full index support, replication & HA, auto-sharding... (mongodb.org)
    why mongoDB?
    ➔ scaling & HA
    ➔ great performance
    ➔ dynamic schemas
    ➔ versatile
  26. mongodb use cases

    use cases in Sprayer:
    ➔ operational DB, administrative data
    ➔ message delivery feedback updates (potentially millions of records)
    limitations for our use cases:
    ➔ operations with sets of subscribers
    ➔ high throughput queues
  27. mongodb concerns

    no concerns about mongodb for our use case.
    maybe, in the long term: can it handle the huge amount of feedback write operations without affecting the API?
  28. async queues

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, STATUS FEEDER, queues (email, sms, android, WS, HTTP, iOS, feedback) marked with "?", dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs]
  29. async queues

    [same architecture diagram, without the queue labels]
  30. rabbitmq

    robust messaging for applications, easy to use (www.rabbitmq.com)
    why rabbitmq?
    ➔ very fast
    ➔ reliable
    ➔ builtin clustering
  31. rabbitmq use cases

    use cases in Sprayer:
    ➔ jobs for dispatchers (API => dispatchers)
    ➔ feedback status updates: message delivery, connections, device status (dispatchers => API)
    limitations for our use cases:
    ➔ not scaling well to millions of queues (websocket receiver inboxes)
  32. full tech stack

    [architecture diagram; labels: APPLICATION <BACKEND>, sprayer backend, ACCEPTER REST API, STATUS FEEDER, dispatchers (IOS, HTTP, WEBSOCKETS, ANDROID, SMS, EMAIL), sms gateway, email gateway, GCM, APNs]
  33. design threats

    related data in different places: redis, rabbitmq and mongo
    we are not transactional; our components remain sane in case of a DB failure, and idempotent operations help here
    light implementation of Unit of Work architectural pattern
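How idempotent operations help when related data lives in several stores can be sketched with a very light unit of work. The `UnitOfWork` class and the `flaky_audit` step below are hypothetical and are not the talk's implementation; the sketch assumes every registered step is idempotent, so a failed commit can simply be retried as a whole.

```python
class UnitOfWork:
    """Hypothetical light unit of work: collect idempotent steps, run them in order."""

    def __init__(self):
        self._steps = []

    def register(self, name, action):
        self._steps.append((name, action))

    def commit(self):
        # Stops at the first failure; steps already applied stay applied.
        # Because every step is idempotent, calling commit() again is safe.
        for _name, action in self._steps:
            action()


store = {"subscriptions": set(), "audit": set()}
failures = {"remaining": 1}  # the second store fails once, then recovers


def flaky_audit():
    if failures["remaining"]:
        failures["remaining"] -= 1
        raise ConnectionError("store unavailable")
    store["audit"].add("alice subscribed to news")


uow = UnitOfWork()
uow.register("subscribe", lambda: store["subscriptions"].add(("news", "alice")))
uow.register("audit", flaky_audit)

try:
    uow.commit()
except ConnectionError:
    uow.commit()  # retry: the first step is repeated without side effects
```

There is no rollback here: consistency comes from retrying until every step has been applied, which is only safe because repeating a step changes nothing.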
  34. architecture guidelines

    ➔ asynchronous processing / queues everywhere
    ➔ dedicated dispatchers for each transport
    ➔ common API interface
    ➔ used the best tool for each responsibility: polyglot persistence
    ➔ processes as stateless as possible