Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
big data
Search
ngarneau
March 09, 2012
Programming
5
400
big data
big data keynote at the opencode quebec, introducing cassandra, hadoop and pig.
ngarneau
March 09, 2012
Tweet
Share
More Decks by ngarneau
See All by ngarneau
Introduction au machine learning avec Scitkit-learn
ngarneau
0
46
Mocks, stubs & seams
ngarneau
0
110
Other Decks in Programming
See All in Programming
高速開発のためのコード整理術
sutetotanuki
1
390
LLM Observabilityによる 対話型音声AIアプリケーションの安定運用
gekko0114
2
420
疑似コードによるプロンプト記述、どのくらい正確に実行される?
kokuyouwind
0
380
MDN Web Docs に日本語翻訳でコントリビュート
ohmori_yusuke
0
640
2026年 エンジニアリング自己学習法
yumechi
0
130
登壇資料を作る時に意識していること #登壇資料_findy
konifar
4
930
CSC307 Lecture 05
javiergs
PRO
0
490
The Past, Present, and Future of Enterprise Java
ivargrimstad
0
490
[KNOTS 2026登壇資料]AIで拡張‧交差する プロダクト開発のプロセス および携わるメンバーの役割
hisatake
0
250
SourceGeneratorのススメ
htkym
0
190
Architectural Extensions
denyspoltorak
0
270
責任感のあるCloudWatchアラームを設計しよう
akihisaikeda
3
160
Featured
See All Featured
A Modern Web Designer's Workflow
chriscoyier
698
190k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.4k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
2k
エンジニアに許された特別な時間の終わり
watany
106
230k
We Analyzed 250 Million AI Search Results: Here's What I Found
joshbly
1
670
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
220
Design in an AI World
tapps
0
140
Visualization
eitanlees
150
17k
Producing Creativity
orderedlist
PRO
348
40k
WCS-LA-2024
lcolladotor
0
440
Are puppies a ranking factor?
jonoalderson
1
2.7k
Highjacked: Video Game Concept Design
rkendrick25
PRO
1
280
Transcript
big data
cassandra - Facebook - 2007. - Apache - 2008. -
Netflix, Digg, Twitter, Rackspace...
cassandra - non-relationnal - schema-less - open-source - horizontally scalable
- easy replication - large datasets
cassandra - datacenters - «no single point of failure».
cassandra data model - no joins (maybe joints, we don’t
know as of version 1.0.9..) - denormalization
cassandra data model - keyspace - column family - row
key - super column - column / value
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } }
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column value
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables SSTables
hadoop - Yahoo! - 2006. - Apache - 2008.
hadoop - mapreduce - hadoop distributed filesystem
hadoop mapreduce - map - reduce
hadoop mapreduce
hadoop mapreduce
hadoop mapreduce
hadoop HDFS
hadoop HDFS data
hadoop HDFS hadoop data
hadoop HDFS hadoop data
hadoop HDFS hadoop data hadoop
hadoop HDFS hadoop data hadoop hadoop
hadoop HDFS hadoop data hadoop hadoop data
hadoop HDFS hadoop data hadoop hadoop data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra cassandra
hadoop keep in mind - business intelligence - machine learning
- collective intelligence
pig - Yahoo! - 2007. - Apache - 2008.
pig - pigs eat anything. - pigs live anywhere. -
pigs are domestic. - pigs fly.
pig keep in mind
let’s play! https://
[email protected]
/ngarneau/opencode.git
let’s play! dataset Salons = { ’1’: { ‘id’: 1,
‘attendants’: 47, ‘name’: ‘Salon Laval’, ‘year’: 2010 } } Commandes = { ’1’: { ‘amount’: 799, ‘salon’: 1 } }
let’s play! we want to know what is the correlation
between the number of attendants and the total revenues by salon.