Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
big data
Search
ngarneau
March 09, 2012
Programming
5
400
big data
big data keynote at the opencode quebec, introducing cassandra, hadoop and pig.
ngarneau
March 09, 2012
Tweet
Share
More Decks by ngarneau
See All by ngarneau
Introduction au machine learning avec Scitkit-learn
ngarneau
0
46
Mocks, stubs & seams
ngarneau
0
110
Other Decks in Programming
See All in Programming
perlをWebAssembly上で動かすと何が嬉しいの??? / Where does Perl-on-Wasm actually make sense?
mackee
0
330
re:Invent 2025 トレンドからみる製品開発への AI Agent 活用
yoskoh
0
630
2年のAppleウォレットパス開発の振り返り
muno92
PRO
0
180
AI Agent Dojo #4: watsonx Orchestrate ADK体験
oniak3ibm
PRO
0
130
Implementation Patterns
denyspoltorak
0
150
GISエンジニアから見たLINKSデータ
nokonoko1203
0
190
AgentCoreとHuman in the Loop
har1101
4
150
Grafana:建立系統全知視角的捷徑
blueswen
0
280
CSC307 Lecture 04
javiergs
PRO
0
630
LLM Observabilityによる 対話型音声AIアプリケーションの安定運用
gekko0114
2
300
Python札幌 LT資料
t3tra
7
1.1k
Giselleで作るAI QAアシスタント 〜 Pull Requestレビューに継続的QAを
codenote
0
340
Featured
See All Featured
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
Speed Design
sergeychernyshev
33
1.5k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
Game over? The fight for quality and originality in the time of robots
wayneb77
1
80
Testing 201, or: Great Expectations
jmmastey
46
7.9k
Producing Creativity
orderedlist
PRO
348
40k
Site-Speed That Sticks
csswizardry
13
1k
Building a Modern Day E-commerce SEO Strategy
aleyda
45
8.5k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
32
2.8k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.5k
Build your cross-platform service in a week with App Engine
jlugia
234
18k
Transcript
big data
cassandra - Facebook - 2007. - Apache - 2008. -
Netflix, Digg, Twitter, Rackspace...
cassandra - non-relationnal - schema-less - open-source - horizontally scalable
- easy replication - large datasets
cassandra - datacenters - «no single point of failure».
cassandra data model - no joins (maybe joints, we don’t
know as of version 1.0.9..) - denormalization
cassandra data model - keyspace - column family - row
key - super column - column / value
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } }
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column value
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables SSTables
hadoop - Yahoo! - 2006. - Apache - 2008.
hadoop - mapreduce - hadoop distributed filesystem
hadoop mapreduce - map - reduce
hadoop mapreduce
hadoop mapreduce
hadoop mapreduce
hadoop HDFS
hadoop HDFS data
hadoop HDFS hadoop data
hadoop HDFS hadoop data
hadoop HDFS hadoop data hadoop
hadoop HDFS hadoop data hadoop hadoop
hadoop HDFS hadoop data hadoop hadoop data
hadoop HDFS hadoop data hadoop hadoop data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra cassandra
hadoop keep in mind - business intelligence - machine learning
- collective intelligence
pig - Yahoo! - 2007. - Apache - 2008.
pig - pigs eat anything. - pigs live anywhere. -
pigs are domestic. - pigs fly.
pig keep in mind
let’s play! https://
[email protected]
/ngarneau/opencode.git
let’s play! dataset Salons = { ’1’: { ‘id’: 1,
‘attendants’: 47, ‘name’: ‘Salon Laval’, ‘year’: 2010 } } Commandes = { ’1’: { ‘amount’: 799, ‘salon’: 1 } }
let’s play! we want to know what is the correlation
between the number of attendants and the total revenues by salon.