message queues

Jaigouk Kim
September 01, 2018

kafka vs NATS vs etc


Transcript

  1. summary
    - NATS shouldn’t be ignored
    - Pulsar is still young
    - NATS + NATS Streaming brings what we need
    - NATS is easier to maintain and easier to use compared to Kafka
    - NATS Streaming can be deployed with a persistent storage config
    - It’s possible to use protobuf to version messages. Example (using go-micro for it)
  2. What do we want?
    - Persist messages so that we can replay them in case of emergency
    - Subscribe to a group of messages and run services based on each message
    - Structured payload
    - High performance
    - Easy to maintain and operate
    - Reliable (check # of issues)
      https://issues.apache.org/jira/projects/KAFKA/issues/KAFKA-7114?filter=allopenissues
      https://github.com/apache/incubator-pulsar/issues
      https://github.com/nsqio/nsq/issues
  3. Guarantees
    https://kafka.apache.org/documentation/#intro_guarantees
    https://nats.io/documentation/faq/#gmd
    https://nsq.io/overview/design.html#message-delivery-guarantees
    https://segment.com/blog/scaling-nsq/
    No replication - unlike other queues, NSQ doesn’t provide any sort of replication or clustering. This is part of what makes running it so simple, but it does force some hard guarantees on reliability for published messages. We partially get around this by lowering the file sync time (configurable via a flag) and backing our queues with EBS. But there’s still the possibility that a queue could indicate it’s published a message and then die immediately, effectively losing the write.
    http://pulsar.incubator.apache.org/docs/latest/getting-started/ConceptsAndArchitecture/#Persistentstorage-dsay9f
  4. NATS Streaming differs from Kafka in a number of key ways. One of these differences is that NATS Streaming attempts to provide a sort of unified API for streaming and queueing semantics not too dissimilar from Apache Pulsar. This means that while it has a notion of a log, it also has subscriptions on that log. Unlike Kafka, NATS Streaming tracks these subscriptions and metadata associated with them, such as where a client is in the log. These have definite "state machines" affiliated with them, like creating and deleting subscriptions, positions in the log, clients joining or leaving queue groups, and message-redelivery information.
    https://dzone.com/articles/building-a-distributed-log-from-scratch-part-2-dat
  5. NATS Streaming - persistency
    https://nats.io/documentation/streaming/nats-streaming-intro/
    https://nats.io/blog/use-cases-for-persistent-logs-with-nats-streaming/
    • Message/event persistence - NATS Streaming offers configurable message persistence either in-memory or via flat files. The storage subsystem uses a public interface that allows contributors to develop their own custom implementations.
    • At-least-once-delivery - NATS Streaming offers message acknowledgements between publisher and server (for publish operations) and between subscriber and server (to confirm message delivery). Messages are persisted by the server in memory or secondary storage (or other external storage) and will be redelivered to eligible subscribing clients as needed.
    NATS Streaming is a log-based streaming system built on top of NATS, and NATS is a lightweight pub/sub messaging system.
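The at-least-once behaviour described on this slide can be illustrated with a small toy model. This is only a sketch in plain Python, not the NATS Streaming client API: the `Server` class and its `publish`, `deliver` and `ack` methods are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy in-memory stand-in for a streaming server (hypothetical, not a real API)."""
    log: list = field(default_factory=list)      # persisted messages, in publish order
    pending: dict = field(default_factory=dict)  # seq -> payload awaiting a subscriber ack

    def publish(self, payload: str) -> int:
        """Persist the message, then return its sequence number as the publish ack."""
        self.log.append(payload)
        seq = len(self.log)
        self.pending[seq] = payload
        return seq

    def deliver(self) -> list:
        """Return every message not yet acknowledged (first delivery or redelivery)."""
        return sorted(self.pending.items())

    def ack(self, seq: int) -> None:
        """Subscriber confirms processing; the server stops redelivering this message."""
        self.pending.pop(seq, None)


server = Server()
server.publish("order-created")
server.publish("order-paid")

first_round = server.deliver()
server.ack(1)                    # the subscriber acks only the first message
second_round = server.deliver()  # the unacked message is delivered again
```

Because a message may arrive more than once under this model, subscribers would typically need to make their handlers idempotent.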
  6. NATS Streaming - persistent logs
    Read the entire history of the stream
    All changes are written to a log first (for durability), and then an internal process applies those changes to in-memory indexes to support fast lookups. Unless you are building a one-off index, in general you want to use a durable subscription so on restart, only a small set of the changes needs to be processed. Starting from the beginning is just another option.
    https://nats.io/blog/use-cases-for-persistent-logs-with-nats-streaming/
    OpenFaaS is using NATS Streaming internally, but in memory mode
    https://github.com/openfaas/nats-queue-worker/blob/63809adcf6f17c7f8acf04489f78fd90e6475b18/main.go#L204
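A rough sketch of why a durable subscription reduces work after a restart, compared with replaying from the beginning. Everything here is a toy: `ChangeLog`, `apply_changes` and the durable names are hypothetical, and the sketch assumes the index itself survives the restart (for example through a snapshot).

```python
class ChangeLog:
    """Toy append-only log that remembers one position per durable name."""

    def __init__(self):
        self.entries = []    # (key, value) changes in write order
        self.positions = {}  # durable name -> number of entries already acked

    def append(self, key, value):
        self.entries.append((key, value))

    def unprocessed(self, durable_name):
        """Entries the named durable subscription has not acked yet."""
        return self.entries[self.positions.get(durable_name, 0):]

    def ack_all(self, durable_name):
        self.positions[durable_name] = len(self.entries)


def apply_changes(index, changes):
    """Apply logged changes to an index and report how many were processed."""
    for key, value in changes:
        index[key] = value
    return len(changes)


log = ChangeLog()
log.append("user:1", "alice")
log.append("user:2", "bob")

index = {}
processed_first = apply_changes(index, log.unprocessed("indexer"))
log.ack_all("indexer")

log.append("user:1", "alice-renamed")  # written while the indexer is down

# After a restart, the durable subscription only sees the new change.
processed_after_restart = apply_changes(index, log.unprocessed("indexer"))
log.ack_all("indexer")

# A one-off index built under a fresh name replays the entire history.
one_off = {}
processed_one_off = apply_changes(one_off, log.unprocessed("one-off"))
```

Both indexes end up identical; the difference is only how many log entries each had to process to get there.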
  7. NATS Streaming Replay Example
    Subscription Start (i.e. Replay) Options
    NATS Streaming subscriptions are similar to NATS subscriptions, but clients may start their subscription at an earlier point in the message stream, allowing them to receive messages that were published before this client registered interest.
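The idea of choosing a start position can be sketched over a plain list standing in for the message log. The mode names below (`all_available`, `last_received`, `at_sequence`, `new_only`) are hypothetical labels for this illustration, not the option names of any client library.

```python
def start_index(log, start="new_only", sequence=None):
    """Return the list index where a new subscription begins reading."""
    if start == "all_available":
        return 0                      # replay the whole stream
    if start == "last_received":
        return max(len(log) - 1, 0)   # begin with the most recent message
    if start == "at_sequence":
        return max(sequence - 1, 0)   # sequences are 1-based in this sketch
    if start == "new_only":
        return len(log)               # only messages published from now on
    raise ValueError(f"unknown start option: {start}")


log = ["m1", "m2", "m3", "m4"]

replay_all = log[start_index(log, "all_available"):]
from_last = log[start_index(log, "last_received"):]
from_seq_3 = log[start_index(log, "at_sequence", sequence=3):]
new_only = log[start_index(log, "new_only"):]
```

A plain pub/sub subscription corresponds to the `new_only` case: messages published before the client registered interest are never seen.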
  8. In memory vs db for NATS Streaming
    If you use a NATS Streaming server with memory store, it is true that if the server is restarted, since no state is being restored, the previously "connected" clients will stop receiving messages. Publishers would fail too since the server would reject published messages for unknown client IDs. The streaming server and streaming clients communicate through some inboxes. When the Streaming server is restarted, since it lost that knowledge, it can't communicate with existing clients. Moreover, even internal subjects used to communicate between the server and its clients contain a unique id that won't be the same after the restart.
    Note: If the NATS Streaming server connects to a non-embedded NATS Server, then if the NATS Server itself is restarted, that is fine, the client library's use of the underlying NATS connection will reconnect and everything would work fine (some timeout may occur for the operations that were inflight when the NATS server was restarted). This is because the Streaming server would still be running and its state maintained, so the communication can continue.
    https://github.com/openfaas/nats-queue-worker/pull/17#issuecomment-377000157
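The restart behaviour quoted above can be mimicked with a toy server that rejects publishes from client IDs it does not know. This is a simplified sketch, not how the real server is implemented: `StreamingServer`, `publish_after_restart` and the `billing-service` client ID are hypothetical, and a plain Python list stands in for a persistent store.

```python
class StreamingServer:
    """Toy server that only accepts publishes from client IDs it knows about."""

    def __init__(self, persisted_clients=None):
        # With a memory store nothing is passed in, so a restart begins empty.
        self.persisted_clients = persisted_clients
        self.clients = set(persisted_clients or [])

    def connect(self, client_id):
        self.clients.add(client_id)
        if self.persisted_clients is not None:
            self.persisted_clients.append(client_id)

    def publish(self, client_id, payload):
        if client_id not in self.clients:
            raise ConnectionError(f"unknown client ID: {client_id}")
        return "ack"


def publish_after_restart(persisted_clients):
    """Connect a client, replace the server instance, then publish again."""
    server = StreamingServer(persisted_clients)
    server.connect("billing-service")
    server = StreamingServer(persisted_clients)  # simulated restart
    try:
        return server.publish("billing-service", "invoice-created")
    except ConnectionError as err:
        return str(err)


memory_result = publish_after_restart(None)  # memory store: client state is lost
file_result = publish_after_restart([])      # persistent store: client state is restored
```

In the memory-store case the only recovery in this model is for the client to connect again, which matches the slide's point that existing clients cannot simply continue after the server restarts.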
  9. benchmark
    https://www.datanami.com/2018/03/06/streamlio-claims-pulsar-performance-advantages-kafka/
    “So we’ve gone through a lot of interesting architecture lessons and came up with a new architecture that solves a lot of the pain points that Kafka and Storm had,” Ramasamy says. “That’s why we wanted [to use on the new products] that solve underlying issues that we face in production. We know very well, Twitter and Yahoo are three to five years ahead in terms of the infrastructure they use, so hence we decided to use those new projects.”
    The company unveiled results of a performance benchmark performed by Gigaom that pitted Pulsar against Kafka. The OpenMessaging benchmark, as it’s called, showed up to 150% improvement for Pulsar over Kafka in terms of throughput, while maintaining up to 60% lower latency.
  10. Production users of Pulsar
    Pulsar at Yahoo
    Pulsar backs major Yahoo applications like Mail, Finance, Sports, Gemini Ads, and Sherpa, Yahoo’s distributed key-value service. We deployed our first Pulsar instance in Q2 2015. Pulsar use has rapidly grown since then, and as of today, Yahoo runs Pulsar at scale.
    • Deployed globally, in 10+ data-centers, with full mesh replication capability
    • Greater than 100 billion messages/day published
    • More than 1.4 million topics
    • Average publish latency across the service of less than 5 ms
    https://yahooeng.tumblr.com/post/150078336821/open-sourcing-pulsar-pub-sub-messaging-at-scale
  11. ref
    https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
    NATS Intro – Colin Sullivan & Waldemar Quevedo, Synadia (Any Skill Level)
    SCaLE 13x Derek Collison NATS A new nervous system for distributed cloud platforms
    Pulsar intro slide