Introduction to Spark with Scala

These slides are from my presentation at the monthly Scala Meetup in Lagos, organized by Africa's Talking.

Spark is a fast, general-purpose cluster computing system for unified data processing at large scale.
The presentation gives a simple introduction to the Spark engine and to the underlying Spark Core, which is built on the Resilient Distributed Dataset (RDD), and covers the operations possible on an RDD.

An example written in Scala can be found at https://github.com/LagosScala/introduction-scala-spark

Transcript

  1. Introduction to Spark with Scala: just a simple and quick but interesting introduction - Be Happy With Scala!!! Adekunle Babatunde - Seamfix Nig Ltd.
  2. What About Spark • The Underlying Idea • Its Capabilities • Introducing RDD • Transformations and Actions • Brief Intro to Dataset/DataFrame API • A simple tweet analysis
  3. The Underlying Idea • A fast and general-purpose cluster computing system • Unified engine for big data applications • Why a cluster computing platform? ◦ Single processor maxed out ◦ Hadoop • Why a new big data application? ◦ Scheduling and a good distribution system ◦ Good monitoring ◦ Speed
  4. What Spark Components Mean • Spark ML/MLlib: ★ Machine learning API for Spark ★ ML is Dataset/DataFrame based ★ MLlib is RDD based • Spark SQL (structured data): ★ Optimized for SQL-like processing ★ SQL (SQL/HQL) and Dataset API • Spark Streaming: ★ Ingest and process data in real time ★ Abstraction with DStream ★ Initiated by StreamingContext() • Spark GraphX (graph processing): ★ Graphs and graph-parallel processing API ★ Has some other cool features
  5. Spark RDD ➔ Create an RDD ➔ Operations on RDD ◆ Transformations ◆ Actions ➔ Transformations
  6. Transformations and Actions II ➔ Actions ➔ Others ◆ take() ◆ takeOrdered() ◆ saveAsSequenceFile() ◆ etc.
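
The RDD slides above name the steps (create an RDD, apply transformations, run actions) without showing them. Here is a minimal sketch of those steps in Scala. It is not taken from the deck or its repository: the object name `RddBasics`, the helper `analyze`, the sample tweets, and the `local[*]` master are all assumptions made for illustration, and it assumes Spark Core is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and sample data are illustrative only.
object RddBasics {

  // Returns (total word count, first two words, two most frequent words).
  def analyze(sc: SparkContext): (Long, Seq[String], Seq[(String, Int)]) = {
    // Create an RDD from an in-memory collection.
    val tweets = sc.parallelize(Seq(
      "spark is fast",
      "scala makes spark fun",
      "be happy with scala"
    ))

    // Transformations are lazy: they only describe the computation.
    val words  = tweets.flatMap(_.split(" "))
    val counts = words.map(w => (w, 1)).reduceByKey(_ + _)

    // Actions trigger the actual work and return results to the driver.
    val total    = words.count()
    val firstTwo = words.take(2).toSeq
    // Order by descending count, then alphabetically to break ties.
    val byCountDesc =
      Ordering.by[(String, Int), (Int, String)](p => (-p._2, p._1))
    val top = counts.takeOrdered(2)(byCountDesc).toSeq

    (total, firstTwo, top)
  }

  def main(args: Array[String]): Unit = {
    // local[*] runs Spark inside this JVM, which is enough for a demo.
    val conf = new SparkConf().setAppName("rdd-basics").setMaster("local[*]")
    val sc   = new SparkContext(conf)
    try {
      val (total, firstTwo, top) = analyze(sc)
      println(s"total words: $total")
      println(s"first two:   ${firstTwo.mkString(", ")}")
      println(s"top two:     ${top.mkString(", ")}")
    } finally {
      sc.stop()
    }
  }
}
```

`flatMap`, `map`, and `reduceByKey` only build up a lineage, so no job runs until `count()`, `take()`, or `takeOrdered()` is called. `saveAsSequenceFile()` is left out because it writes to a path on disk or HDFS.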