Slide 1

Slide 1 text

Event Driven Architectures with Apache Kafka on Heroku
Chris Castle, Developer Advocate
Rand Fitzpatrick, Director of Product
November 3, 2016

Slide 2

Slide 2 text

What problems does Apache Kafka solve? What are the core concepts of Kafka? Why Apache Kafka on Heroku?

Slide 3

Slide 3 text

Forward-Looking Statements Statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter. 
These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

Slide 4

Slide 4 text

What problems does Apache Kafka solve?

Slide 5

Slide 5 text

Event-Driven Architecture Event-driven architecture (EDA), also known as message-driven architecture, is a software architecture pattern promoting the production, detection, consumption of, and reaction to events. Source: Wikipedia

Slide 6

Slide 6 text

What Are Events?
Context
  When was the event? (event time, process time)
  What produced the event? (causal history, device, etc.)
  Where did the event occur? (system location, geo location)
Operation
  What function was applied? (create, update, delete, etc.)
  What are the characteristics of the function?
State
  What is the data involved in the event?
  How is that data identified?
"Contextualized operation on state"

Slide 7

Slide 7 text

Event Examples
  Product views
  Completed sales
  Page visits
  Site logins
  Shipping notifications
  Inventory received
  IoT sensor values
  Weather data
  Traffic data
  Tweets
  Election polling data!
Example: Completed sale
  Context: 2016-11-03T15:13:27Z, Retail www site, referrer Google search
  Operation: Inventory item purchased
  State: Amazon Echo, Black, $179.99, ID B00X4WHP5E
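The "Completed sale" example can be written down as a small data structure that keeps context, operation, and state separate. This is only a sketch: the class and field names (`Context`, `State`, `Event`, `event_time`, `referrer`, and so on) are assumptions made for illustration, and the grouping of the slide's values under each heading is one reading of the slide, not a schema defined by Kafka or Heroku.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Context:
    event_time: datetime  # when the event happened
    source: str           # what produced the event
    referrer: str         # how the visitor arrived


@dataclass(frozen=True)
class State:
    item_id: str
    description: str
    price_usd: str        # kept as a string to avoid float rounding


@dataclass(frozen=True)
class Event:
    name: str
    context: Context
    operation: str        # the function applied to the state
    state: State

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses; the datetime
        # is the only non-JSON value, so it is rendered as ISO 8601.
        return json.dumps(asdict(self), default=lambda o: o.isoformat())


completed_sale = Event(
    name="Completed sale",
    context=Context(
        event_time=datetime(2016, 11, 3, 15, 13, 27, tzinfo=timezone.utc),
        source="Retail www site",
        referrer="Google search",
    ),
    operation="Inventory item purchased",
    state=State(
        item_id="B00X4WHP5E",
        description="Amazon Echo, Black",
        price_usd="179.99",
    ),
)
```

Freezing the dataclasses mirrors the idea that an event is a record of something that already happened, so it should not change after it is created.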

Slide 8

Slide 8 text

Why Should I Care?
  Scaling too slowly leads to dropped data
  Overprovisioning leads to inefficient systems
  Dataflow between processing stages requires coordination
  Parallel pipelines with the same data can be nontrivial
  Service discovery must support current and future processes
  Sequencing service availability is critical to system function
  Possible loss of state when individual services fail

Slide 9

Slide 9 text

Why Should I Care?
Inbound Streams
  Scaling too slowly leads to dropped data
  Overprovisioning leads to inefficient systems
  Backpressure and other coordination is hard!
Data Pipelines
  Dataflow between processing stages requires coordination
  Parallel pipelines with the same data can be nontrivial
  Provenance and auditability!?
Microservices
  Service discovery must support current and future processes
  Sequencing service availability is critical to system function
  Possible loss of state when individual services fail

Slide 10

Slide 10 text

Why Should I Care?
Inbound Streams
  Event streams in Kafka allow other resources to pull when ready
  Resources can fail and reconnect without dropping events
  Kafka provides elasticity, reducing the need for backpressure
Data Pipelines
  Dataflow coordination is reduced via event stream structure
  The immutability of data allows for trivial parallel processing
  Tracking provenance and lineage of data becomes possible
Microservices
  Services now only need to discover topics in Kafka
  Service availability sequencing is relaxed
  Inter-service communication is more robust
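The "pull when ready" and "fail and reconnect without dropping events" points can be illustrated with a toy append-only log and a consumer that remembers its offset. This is an in-memory sketch, not a Kafka client: `EventLog`, `Consumer`, and `poll` are hypothetical names, and a real cluster persists and replicates the log across brokers.

```python
class EventLog:
    """Append-only, in-memory stand-in for one stream of events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset assigned to the new event

    def read(self, offset, max_events):
        # Reading never removes anything; events stay in the log.
        return self._events[offset:offset + max_events]


class Consumer:
    """Pulls batches at its own pace and tracks how far it has read."""

    def __init__(self, log, committed_offset=0):
        self.log = log
        self.offset = committed_offset

    def poll(self, max_events=2):
        batch = self.log.read(self.offset, max_events)
        self.offset += len(batch)
        return batch


log = EventLog()
for i in range(5):
    log.append(f"event-{i}")

consumer = Consumer(log)
first = consumer.poll()   # the consumer pulls only when it is ready
saved = consumer.offset   # position it would have committed

# The consumer "crashes" here while the producer keeps writing.
log.append("event-5")

# A replacement consumer resumes from the saved offset.
restarted = Consumer(log, committed_offset=saved)
rest = []
while True:
    batch = restarted.poll()
    if not batch:
        break
    rest.extend(batch)
```

Because the log keeps events independently of any reader, the producer never waits for the consumer, and the restarted consumer sees every event written while it was down.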

Slide 11

Slide 11 text

Use Cases Heroku Platform Event Stream Learn more at https://blog.heroku.com/powering-the-heroku-platform-api-a-distributed-systems-approach-using-streams-and-apache-kafka

Slide 12

Slide 12 text

Use Cases Heroku Operational Experience: App Metrics

Slide 13

Slide 13 text

Use Cases Heroku App Metrics Learn more at https://engineering.heroku.com/blogs/2016-05-26-heroku-metrics-there-and-back-again/

Slide 14

Slide 14 text

Use Cases Twitter Analytics Dashboard

Slide 15

Slide 15 text

Use Cases Generalized Inbound Streams Data Pipelines Microservices Platform Event Stream App Metrics Twitter Analytics

Slide 16

Slide 16 text

What are the core concepts of Kafka?

Slide 17

Slide 17 text

Apache Kafka Core Concepts
Brokers: The instances running Kafka and managing streams of events in a cluster.
Producers + Consumers: Clients that write to or read from a Kafka cluster.
Topics: Streams of events that are replicated across the brokers. Configured with time-based retention or log compaction.
Partitions: Discrete subsets of topics, and important tuning points for parallelism and ordering.
Diagram labels: PRODUCERS, CONSUMERS, BROKER, TOPIC, PARTITION
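Why partitions matter for both parallelism and ordering can be sketched with key-based partition assignment: events with the same key always land in the same partition, so their relative order is preserved there, while different keys spread across partitions. The hash below (`zlib.crc32`) is a stand-in chosen for this illustration and is not the hash Kafka's own partitioner uses; `partition_for` and `NUM_PARTITIONS` are hypothetical names.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition deterministically (stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "view"),
    ("user-2", "purchase"),
    ("user-1", "logout"),
]

# Each partition is its own ordered list of events.
partitions = defaultdict(list)
for key, value in events:
    partitions[partition_for(key)].append((key, value))

# All events for one key sit in one partition, in the order produced.
user_1_events = [
    value
    for key, value in partitions[partition_for("user-1")]
    if key == "user-1"
]
```

Ordering is only guaranteed within a partition, which is why the choice of key is a tuning point: it decides which events must stay in sequence and which can be processed in parallel.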

Slide 18

Slide 18 text

Example Producers
  Web server -> Product views
  Payment processor -> Completed sales
  Browser -> Page visits
  Authentication service -> Site logins
  Shipping provider -> Shipping notifications
  Warehouse -> Inventory received
  Motion sensor -> IoT data
  Rain gauge -> Weather data
  Vehicle sensor -> Traffic data
  Twitter -> Tweets
  Online/phone survey -> Election polling data!

Slide 19

Slide 19 text

Example Consumers
  Product views -> Personalization engine
  Completed sales -> Accounting system
  Page visits -> Reporting dashboard
  Site logins -> Security audit service
  Shipping notifications -> Shipping provider
  Inventory received -> Inventory database
  IoT data -> Actuator
  Weather data -> Climate model
  Traffic data -> Traffic map
  Tweets -> Analytics dashboard
  Election polling data! -> Election forecast

Slide 20

Slide 20 text

Complex Architecture

Slide 21

Slide 21 text

Complex Controls
Other Kafka primitives to provide structure to Kafka event streams
  Retention
  Log compaction
  Replication factor
  Delivery guarantees
Diagram labels: TOPIC, PARTITION
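Retention and log compaction are two different ways of bounding a log, and the difference is easy to see on a small list of records. The functions below are a simplified sketch with hypothetical names (`apply_retention`, `compact`) and made-up records of the form `(timestamp, key, value)`; real Kafka applies these policies to log segments on the brokers and has additional rules (for example, delete markers) that are not modeled here.

```python
def apply_retention(log, now, max_age):
    """Time-based retention: drop records older than max_age."""
    return [rec for rec in log if now - rec[0] <= max_age]


def compact(log):
    """Log compaction: keep only the latest record for each key."""
    latest_offset = {}
    for offset, (_ts, key, _value) in enumerate(log):
        latest_offset[key] = offset  # later records overwrite earlier ones
    return [
        rec
        for offset, rec in enumerate(log)
        if latest_offset[rec[1]] == offset
    ]


# (timestamp, key, value): inventory counts per SKU, made up for the sketch
inventory_log = [
    (0, "sku-1", 5),
    (10, "sku-2", 7),
    (20, "sku-1", 4),
    (30, "sku-3", 1),
    (40, "sku-2", 9),
]

recent = apply_retention(inventory_log, now=45, max_age=20)
compacted = compact(inventory_log)
```

Retention answers "what happened recently?", while compaction answers "what is the latest known value for each key?", so the right choice depends on how consumers use the topic.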

Slide 22

Slide 22 text

Interacting with Kafka and many more...

Slide 23

Slide 23 text

Kafka Connect Some examples: HDFS, JDBC, Elasticsearch, Couchbase, Oracle, MS SQL Server, Cassandra, DynamoDB, Salesforce Streaming API, Splunk Image credit: Confluent Kafka Connect announcement blog post

Slide 24

Slide 24 text

Why Apache Kafka on Heroku?

Slide 25

Slide 25 text

Without Heroku
Apache Kafka: The heart of the event management system, with a broad variety of configurations and options.
Apache Zookeeper: The system’s consensus and coordination cluster is vital for Kafka’s operation.
OS + JVM Tuning: Tuning the cluster runtimes can be an art.
Instances + Networking: Physical or virtual, the infrastructure behind clusters must be well considered.
Myriad Moving Pieces

Slide 26

Slide 26 text

Apache Kafka on Heroku Simple Configuration

Slide 27

Slide 27 text

Apache Kafka on Heroku Automated Operations

Slide 28

Slide 28 text

Apache Kafka on Heroku
  Experienced Staff
  Self-Healing
  Current Version
  No-Downtime Upgrades
Heroku engineers have contributed patches to the core open source Kafka project.

Slide 29

Slide 29 text

Apache Kafka on Heroku
Global
  US West
  US East
  Ireland
  Germany
  Japan
  Sydney

Slide 30

Slide 30 text

Let's Review... ...and get you started with Kafka!
Apache Kafka is a valuable tool for building architectures to support inbound event streams, data processing pipelines, and microservices coordination.
The primitives provided by Kafka -- topics, partitions, retention duration, log compaction, and replication -- provide the tools to manage structured event streams.
Apache Kafka on Heroku simplifies operational complexity so that any developer can get started quickly and feel confident that their application is supported by a rock-solid, production service.
Get started at hrku.co/use-kafka

Slide 31

Slide 31 text

Q&A
Rand Fitzpatrick, Director of Product
Chris Castle, Developer Advocate
But first, please take one minute to answer a few quick questions so we can make webinars like this even better for you.

Slide 32

Slide 32 text

Learn More
Apache Kafka on Heroku: https://www.heroku.com/kafka
Get Started: https://elements.heroku.com/addons/heroku-kafka
Documentation: https://devcenter.heroku.com/articles/kafka-on-heroku
Kafka Event Stream Modeling: https://devcenter.heroku.com/articles/kafka-event-stream-modeling
Podcast: Managed Kafka with Heroku Engineer Tom Crayford: http://softwareengineeringdaily.com/2016/10/25/managed-kafka-with-tom-crayford/

Slide 33

Slide 33 text

Thank you!