Slide 28
Spark Streaming
Provides a way to consume continuous streams of data.
Scalable, high-throughput, fault-tolerant.
Built on top of Spark Core.
The API is very similar to the Spark Core API.
If you already know Spark, learning Streaming is easier than
learning the API of a completely unrelated product (like Apache
Storm).
Supports many input sources, such as TCP sockets, Kafka, Flume,
HDFS, S3, and Kinesis (and even Twitter, which is cool for demos).
Currently based on RDDs, but work is underway to
integrate DataFrames.
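
To make the "based on RDDs" point concrete: Spark Streaming treats a stream as a sequence of small batches and runs ordinary batch logic on each one. The sketch below imitates that idea in plain Python and is not the Spark API. The names micro_batches, word_count, and run_stream are hypothetical, and batches are cut by record count here, whereas Spark Streaming cuts them by time interval.

```python
from collections import Counter
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def micro_batches(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Cut a (possibly unbounded) stream into consecutive small batches.

    Assumption: batches are cut by record count to keep the sketch simple.
    """
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def word_count(batch: List[str]) -> Counter:
    """Ordinary batch logic: count the words in a list of lines."""
    return Counter(word for line in batch for word in line.split())


def run_stream(
    records: Iterable[str],
    batch_size: int,
    batch_fn: Callable[[List[str]], Counter],
) -> List[Counter]:
    """Apply the same batch function to every micro-batch, in arrival order."""
    return [batch_fn(batch) for batch in micro_batches(records, batch_size)]
```

The batch function knows nothing about streaming, which is roughly why knowing the core API carries over: the streaming layer only decides where each batch begins and ends.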