big data
ngarneau
March 09, 2012
Programming
Big data keynote at OpenCode Quebec, introducing Cassandra, Hadoop, and Pig.
Transcript
big data
cassandra
- Facebook - 2007.
- Apache - 2008.
- Netflix, Digg, Twitter, Rackspace...

cassandra
- non-relational
- schema-less
- open-source
- horizontally scalable
- easy replication
- large datasets

cassandra
- datacenters
- "no single point of failure".

cassandra data model
- no joins (maybe joints, we don't know as of version 1.0.9...)
- denormalization

cassandra data model
- keyspace
- column family
- row key
- super column
- column / value
cassandra data model
application = { users = { 'ngarneau': { 'first_name': 'nicolas', 'last_name': 'garneau' } } }
- keyspace: application
- column family: users
- row key: 'ngarneau'
- column: 'first_name'
- value: 'nicolas'
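The nesting on this slide can be mimicked with plain Python dictionaries. This is only a rough sketch of the hierarchy (keyspace, column family, row key, column, value), not how Cassandra stores data, and `get_column` is a hypothetical helper introduced for illustration.

```python
# Nested dicts standing in for: keyspace -> column family -> row key -> column -> value.
application = {                       # keyspace
    "users": {                        # column family
        "ngarneau": {                 # row key
            "first_name": "nicolas",  # column: value
            "last_name": "garneau",
        }
    }
}


def get_column(keyspace, column_family, row_key, column):
    """Hypothetical helper: walk the nesting and return a single column value."""
    return keyspace[column_family][row_key][column]


first_name = get_column(application, "users", "ngarneau", "first_name")
print(first_name)
```

Because there are no joins, everything a query needs is expected to live under one row key, which is why the slides pair this model with denormalization.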
cassandra keep in mind
[Diagram, built up over several slides. Labels: memory, disk, memtable, commit log, memtables, SSTables]
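The memtable, commit log, and SSTable labels on these slides correspond to Cassandra's write path: a write is appended to the commit log on disk and applied to an in-memory memtable, and full memtables are flushed to disk as immutable SSTables. The toy class below sketches that flow under heavy simplification. `TinyStore`, `flush_threshold`, and the `jdoe` row are hypothetical names, and real Cassandra adds compaction, commit log recycling, and much more.

```python
class TinyStore:
    """Toy model of the write path: commit log + memtable, flushed to SSTables."""

    def __init__(self, flush_threshold=2):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # recent writes, held in memory
        self.sstables = []     # immutable, sorted tables "on disk"
        self.flush_threshold = flush_threshold  # hypothetical tuning knob

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value            # then the in-memory table
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Freeze the memtable as a sorted, immutable table and start a new one.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for stored_key, value in table:
                if stored_key == key:
                    return value
        return None


store = TinyStore(flush_threshold=2)
store.write("ngarneau", "nicolas")
store.write("jdoe", "john")       # second write reaches the threshold and flushes
store.write("ngarneau", "nico")   # lands in a fresh memtable
```

Reads may have to consult several SSTables, which is one reason the number of SSTables on disk is worth keeping in mind.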
hadoop
- Yahoo! - 2006.
- Apache - 2008.

hadoop
- mapreduce
- hadoop distributed filesystem

hadoop mapreduce
- map
- reduce
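The map and reduce steps can be sketched in a few lines of single-process Python. This is not the Hadoop API, only an illustration of the idea: map emits key/value pairs, the framework groups them by key, and reduce folds each group. The function names are hypothetical, and the order records are shaped like the `Commandes` entries in the deck's dataset, with two made-up rows added.

```python
from collections import defaultdict


def map_order(order):
    """Map step: emit one (salon id, amount) pair per order."""
    yield order["salon"], order["amount"]


def reduce_amounts(salon, amounts):
    """Reduce step: fold all amounts for one salon into a total."""
    return salon, sum(amounts)


def run_map_reduce(records, mapper, reducer):
    """Hypothetical single-process driver: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # the "shuffle": group values by key
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


orders = [
    {"amount": 799, "salon": 1},  # from the deck's dataset
    {"amount": 201, "salon": 1},  # hypothetical
    {"amount": 500, "salon": 2},  # hypothetical
]
totals = run_map_reduce(orders, map_order, reduce_amounts)
print(totals)
```

In Hadoop the map and reduce calls run in parallel across machines, but the contract between the two steps is the same.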
hadoop HDFS
[Diagram, built up over several slides. Labels: hadoop, data, cassandra]
hadoop keep in mind
- business intelligence
- machine learning
- collective intelligence

pig
- Yahoo! - 2007.
- Apache - 2008.

pig
- pigs eat anything.
- pigs live anywhere.
- pigs are domestic.
- pigs fly.
pig keep in mind
let's play!
https://[email protected]/ngarneau/opencode.git
let's play! dataset
Salons = { '1': { 'id': 1, 'attendants': 47, 'name': 'Salon Laval', 'year': 2010 } }
Commandes = { '1': { 'amount': 799, 'salon': 1 } }
let's play!
we want to know the correlation between the number of attendants and the total revenue per salon.
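As a small-scale sketch of that question, the snippet below sums order amounts per salon and computes a Pearson correlation against attendance. It runs in plain Python rather than Pig or Hadoop. Only the first salon and the first order come from the deck; every other row is hypothetical sample data, and `revenue_by_salon` and `pearson` are illustrative helper names.

```python
from math import sqrt

salons = {
    "1": {"id": 1, "attendants": 47, "name": "Salon Laval", "year": 2010},  # from the deck
    "2": {"id": 2, "attendants": 120, "name": "Salon A", "year": 2010},     # hypothetical
    "3": {"id": 3, "attendants": 80, "name": "Salon B", "year": 2010},      # hypothetical
}
commandes = {
    "1": {"amount": 799, "salon": 1},   # from the deck
    "2": {"amount": 1500, "salon": 2},  # hypothetical
    "3": {"amount": 600, "salon": 2},   # hypothetical
    "4": {"amount": 1200, "salon": 3},  # hypothetical
}


def revenue_by_salon(orders):
    """Sum order amounts per salon id."""
    totals = {}
    for order in orders.values():
        totals[order["salon"]] = totals.get(order["salon"], 0) + order["amount"]
    return totals


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


totals = revenue_by_salon(commandes)
attendants = [salon["attendants"] for salon in salons.values()]
revenues = [totals.get(salon["id"], 0) for salon in salons.values()]
r = pearson(attendants, revenues)
print(f"correlation between attendants and revenue: {r:.3f}")
```

With a single salon, as in the slide's excerpt, the correlation is undefined, so at least two salons with different attendance and revenue are needed.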