Apache Spark is an emerging cluster computing platform that can run data processing programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. Spark also ships with a built-in machine learning library, MLlib, which implements many common supervised and unsupervised learning algorithms. In this talk we will discuss how Spark has improved cluster computing and data processing, and give an overview of the algorithms available in MLlib. After covering the basics, we will explore how to build a product recommendation engine for e-commerce using collaborative filtering with the Alternating Least Squares (ALS) algorithm.