Big data processing with Apache Beam

In this talk, we present the new Python SDK for Apache Beam, a parallel programming model for implementing batch and streaming data processing jobs that can run on a variety of execution engines, such as Apache Spark and Google Cloud Dataflow. We will use examples to discuss some of the interesting challenges in providing a Pythonic API and execution environment for distributed processing.

Sourabh

July 06, 2017