What's new in Spark 2.0?

API Improvements
– SparkSession: the new entry point
– Unified DataFrame & Dataset API
– Structured Streaming / Continuous Applications

Performance Improvements
– Tungsten Phase 2
– Whole-stage code generation

ML
– ML model persistence
– Distributed R algorithms (GLM, Naïve Bayes, K-Means, Survival Regression)

Spark SQL
– SQL:2003 support (new ANSI SQL parser, subquery support)