© 2016 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
Apache Spark
• We use Apache Spark with Scala
• A fast and general engine for large-scale data processing (Big Data)
• API:
  – Functional (Scala-like)
    • map, flatMap, filter, sort
  – Relational (SQL-like)
    • select, where, groupBy, join
• Distributed
  – A Driver node submits work to Executor nodes
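
The two API styles listed above can be sketched roughly as follows. This is a minimal illustration, not code from the deck: it assumes Spark 2.x or later (for SparkSession) on the classpath, and the object name SparkApiSketch, the method names, and the sample data are all hypothetical. The local[2] master used in the usage note runs the driver and the executors inside one JVM, which is convenient for trying things out; on a real cluster the driver would submit the same work to executor nodes.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch object; names and sample data are illustrative only.
object SparkApiSketch {

  // Functional (Scala-like) style on an RDD: flatMap, filter, map, sortBy.
  def functionalStyle(spark: SparkSession): Array[String] = {
    val lines = spark.sparkContext.parallelize(
      Seq("spark with scala", "big data with spark"))

    lines
      .flatMap(line => line.split(" ")) // one record per word
      .filter(word => word.length > 4)  // keep words longer than 4 characters
      .map(word => word.toUpperCase)
      .sortBy(word => word)
      .collect()                        // action: the driver submits the job and gathers results
  }

  // Relational (SQL-like) style on DataFrames: where, join, groupBy, select.
  def relationalStyle(spark: SparkSession): Array[(String, Long)] = {
    import spark.implicits._

    val users  = Seq((1, "ana"), (2, "bo")).toDF("id", "name")
    val visits = Seq((1, "home"), (1, "docs"), (2, "home"), (2, "login"))
      .toDF("userId", "page")

    visits
      .where($"page" =!= "login")         // drop login-page visits
      .join(users, $"userId" === $"id")   // attach the user name to each visit
      .groupBy($"name")
      .count()                            // adds a "count" column of type Long
      .select($"name", $"count")
      .sort($"name")
      .as[(String, Long)]
      .collect()
  }
}
```

A possible way to try it, assuming a local Spark installation: build a session with `SparkSession.builder().appName("sketch").master("local[2]").getOrCreate()`, pass it to either method, and call `spark.stop()` afterwards. Transformations such as map or where are lazy; nothing is sent to the executors until an action like collect() runs.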