An Introduction to Amazon S3, RDS, and EMR

CUSSW Hosted
November 14, 2016

Presentation 2 of 2 courtesy of Zhiming Shen and Weijia Song.

In this talk, we give an introduction to Amazon Web Services (AWS), the biggest and most widely used cloud computing platform. We show live demos of allocating, connecting to, and using several AWS services, including virtual machines (EC2), storage (S3), databases (RDS), and distributed computation platforms (Elastic MapReduce).

Presented at SSW: https://cornell-ssw.github.io/meetings/2016-11-14

Transcript

  1. Amazon Simple Storage Service (S3)

    • An easy-to-use, scalable, reliable, and secure cloud storage service
    • Scalable: no limit on total data stored
    • Reliable: 99.99% availability and 99.999999999% durability
    • Secure: SSL transfer, data encryption, and access control
    • Easy to use: web interface, REST API, and SDKs
    • Price: about 3 cents per GB-month (much cheaper with Infrequent Access or Glacier)
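
The availability, durability, and price figures on this slide are easier to judge as concrete numbers. The sketch below is only back-of-the-envelope arithmetic: it assumes both percentages are annual figures, that durability applies per object, and that the price is a flat $0.03 per GB-month. The 10 million objects and 500 GB are made-up inputs for illustration, not values from the talk.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: str) -> float:
    """Expected unavailable minutes per year for an availability percentage."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return float(unavailable * MINUTES_PER_YEAR)


def objects_lost_per_year(object_count: int, durability_percent: str) -> float:
    """Expected objects lost per year, assuming an annual per-object durability."""
    loss_rate = 1 - Fraction(durability_percent) / 100
    return float(object_count * loss_rate)


def monthly_storage_cost(gigabytes: int, dollars_per_gb_month: str = "0.03") -> float:
    """Storage cost per month at a flat (assumed) price per GB-month."""
    return float(gigabytes * Fraction(dollars_per_gb_month))


# Illustrative inputs only: 10 million objects, 500 GB stored.
downtime = downtime_minutes_per_year("99.99")
lost = objects_lost_per_year(10_000_000, "99.999999999")
cost = monthly_storage_cost(500)

print(f"Downtime at 99.99% availability: {downtime:.2f} minutes/year")
print(f"Expected loss among 10M objects: {lost:.4f} objects/year")
print(f"500 GB at $0.03/GB-month: ${cost:.2f}/month")
```

Exact fractions are used instead of floats because subtracting 99.999999999% from 1 in floating point loses most of the significant digits.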
  2. Use cases of Amazon S3

    • File backup storage
    • Sharing and collaboration (hosting static web pages, Git repositories)
    • Hosting data for applications
  3. Amazon Relational Database Service (RDS)

    • Easy-to-use relational database in the cloud
    • Six engines available: Amazon Aurora, MySQL, MariaDB, Oracle, PostgreSQL, SQL Server
    • High availability: backup, multi-AZ, read replicas, snapshot transfer…
    • Scalability: vertical scaling, data sharding, and clustering
    • Security: SSL, data encryption
    • Price covers instances, storage and I/O, and data transfer
  4. Amazon RDS Demo

    • Creating a DB service in the cloud
    • Manipulating data using a DBMS client
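
The transcript does not record what the demo actually ran, so the following is only a sketch of the kind of statements a DBMS client session involves. It uses Python's built-in `sqlite3` module as a local stand-in so that it runs without an AWS account. Against a real RDS instance you would connect to the instance endpoint with a driver for the chosen engine, and some drivers use `%s` rather than `?` as the parameter placeholder. The `notes` table and its rows are hypothetical.

```python
import sqlite3


def run_demo(conn):
    """Create a table, then insert, update, delete, and query rows."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, body VARCHAR(100) NOT NULL)"
    )
    # Parameterized inserts keep values out of the SQL string.
    cur.executemany(
        "INSERT INTO notes (id, body) VALUES (?, ?)",
        [(1, "first note"), (2, "second note"), (3, "scratch")],
    )
    cur.execute("UPDATE notes SET body = ? WHERE id = ?", ("first note, edited", 1))
    cur.execute("DELETE FROM notes WHERE id = ?", (3,))
    conn.commit()
    cur.execute("SELECT id, body FROM notes ORDER BY id")
    return cur.fetchall()


# Local in-memory database standing in for a cloud-hosted one.
conn = sqlite3.connect(":memory:")
rows = run_demo(conn)
conn.close()

for row in rows:
    print(row)
```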
  5. Amazon Elastic MapReduce (EMR)

    • MapReduce is a distributed application framework
    • Processes vast amounts of data (TBs) in parallel on a large cluster
    • Reliable, fault-tolerant
    [Figure: MapReduce data flow from Input Data through Map() and Reduce() tasks to Output Data; labels [K1,V1], [K2,V2], [K3,V3]]
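
The map and reduce stages can be sketched in a single process. The code below is a toy word count in plain Python, not Hadoop or EMR code: `map_phase` emits key-value pairs, `shuffle` groups values by key (the step a real framework performs between the two stages), and `reduce_phase` combines each group. The function names and the sample lines are illustrative assumptions.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["the quick brown fox", "the lazy dog", "The fox"])
print(counts)
```

On a cluster, each call to the map function can run on a different node over a different slice of the input, and each key group can be reduced independently, which is where the parallelism comes from.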
  6. Amazon Elastic MapReduce (EMR)

    • Amazon EMR
        • Easy deployment and use of a Hadoop cluster
        • Hadoop-based tools: Hive, Pig, Hue, HBase, …
        • Spark, Mahout, …
    • Demo
        • Create a Hadoop cluster
        • Run a “wordcount” application