Big Data and Google Cloud Platform

An introduction to the services in Google Cloud Platform for use with Big Data processing including BigQuery, Google Cloud Storage and Google Compute Engine for running Hadoop.

sharifsalah

October 14, 2014

Transcript

  1. “Bigtable: A Distributed Storage System for Structured Data

    Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber

    Abstract: Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications place very different demands on Bigtable, both in terms of data size (from URLs to web pages to satellite imagery) and latency requirements (from backend bulk processing to real-time data serving). Despite these varied demands, Bigtable has successfully provided a flexible, high-performance solution for all of these Google products. In this paper we describe the simple data model provided by Bigtable, which gives clients dynamic control over data layout and format, and we describe the design and implementation of Bigtable.”

    To appear in: OSDI'06: Seventh Symposium on Operating System Design and Implementation, Seattle, WA, November, 2006.

    Source: http://research.google.com/archive/bigtable.html
  2. Compute Engine: IaaS; Compute, Storage & Network; Sub-hour billing; Hugely scalable; Consistent performance; Support for containers
  3. BigQuery: SQL-like queries; Easy to use interface; Supports 3rd party tools; Analyse terabytes of data; Streaming data; Batch processing
  4. Further reading relating to Google Cloud Platform: Case studies; Compliance information; Pricing calculator; Research at Google; Support packages
  5. Further reading relating to BigQuery: Dremel; Getting started with BigQuery; An inside look at BigQuery; Cloud Dataflow
  6. Google Cloud Platform would like to offer you $500 of credit to build your applications! To get started, follow the three steps below:

    1. http://g.co/cloudstarterpack
    2. Click "apply now"
    3. Use the code: gde-in