Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
big data
Search
ngarneau
March 09, 2012
Programming
400
5
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
big data
big data keynote at the opencode quebec, introducing cassandra, hadoop and pig.
ngarneau
March 09, 2012
More Decks by ngarneau
See All by ngarneau
Introduction au machine learning avec Scitkit-learn
ngarneau
0
49
Mocks, stubs & seams
ngarneau
0
110
Other Decks in Programming
See All in Programming
ECSアプリログをFireLensでコスト削減しようとしたけど諦めた話 in Fargate×Node.js
akihisaikeda
2
4.2k
ユニットテストの先へ:テスト技法で要求・仕様を整理するJava開発実践 / Beyond_Unit_Testing_Practical_Java_Development_Techniques_for_Organizing_Requirements_and_Specifications
shimashima35
0
400
Semantic Version 単位で戦略を柔軟に変えて、パッケージアップデートを自動化する
daitasu
1
240
[2026年度第1回ORセミナー] 計画最適化ベンチャーと競技プログラミング人材
terryu16
0
260
LLMによるContent Moderationの本番運用の裏側と品質担保への挑戦
suikabar
3
680
PHPで使える日時の表現と、その知り方 #frontend_phpcon_do
o0h
PRO
0
240
メソッドのジェネリクスでGoの夢は広がるか? / Kyoto.go #65
utgwkk
3
780
Skillsは効率化、Agentsは"自分の拡張"——Builder時代のエージェント編成(CC Night 2026)
wemra
1
130
その問い、本当に正しいですか?AI時代のエンジニアに必要な哲学と認知科学 / ai-philosophy-cognitive-science
minodriven
9
5.4k
依存関係から依存物へ―Dependencyという言葉の歴史をひも解く
j_lee
0
120
Observability in Practice:Grafana 與 Edge Device SRE 的那些事
blueswen
0
160
コンテキストの使い捨てをやめる — ビジネスルール駆動開発と miko —
ioki
0
200
Featured
See All Featured
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
56k
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
1
1.7k
How GitHub (no longer) Works
holman
316
150k
Optimising Largest Contentful Paint
csswizardry
37
3.7k
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
11k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
21
1.5k
Abbi's Birthday
coloredviolet
2
8.1k
Fashionably flexible responsive web design (full day workshop)
malarkey
408
66k
SEO for Brand Visibility & Recognition
aleyda
0
4.6k
Noah Learner - AI + Me: how we built a GSC Bulk Export data pipeline
techseoconnect
PRO
0
200
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
3
160
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
410
Transcript
big data
cassandra - Facebook - 2007. - Apache - 2008. -
Netflix, Digg, Twitter, Rackspace...
cassandra - non-relationnal - schema-less - open-source - horizontally scalable
- easy replication - large datasets
cassandra - datacenters - «no single point of failure».
cassandra data model - no joins (maybe joints, we don’t
know as of version 1.0.9..) - denormalization
cassandra data model - keyspace - column family - row
key - super column - column / value
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } }
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column
cassandra data model application = { users = { ‘ngarneau’:
{ ‘first_name’: ‘nicolas’, ‘last_name’: ‘garneau’ } } } keyspace column family row key column value
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log
cassandra keep in mind memory disk memtable commit log memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables
cassandra keep in mind memory disk memtable commit log memtable
memtable memtables SSTables SSTables SSTables SSTables SSTables SSTables
hadoop - Yahoo! - 2006. - Apache - 2008.
hadoop - mapreduce - hadoop distributed filesystem
hadoop mapreduce - map - reduce
hadoop mapreduce
hadoop mapreduce
hadoop mapreduce
hadoop HDFS
hadoop HDFS data
hadoop HDFS hadoop data
hadoop HDFS hadoop data
hadoop HDFS hadoop data hadoop
hadoop HDFS hadoop data hadoop hadoop
hadoop HDFS hadoop data hadoop hadoop data
hadoop HDFS hadoop data hadoop hadoop data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra
hadoop HDFS hadoop data hadoop hadoop data data data cassandra
cassandra cassandra
hadoop keep in mind - business intelligence - machine learning
- collective intelligence
pig - Yahoo! - 2007. - Apache - 2008.
pig - pigs eat anything. - pigs live anywhere. -
pigs are domestic. - pigs fly.
pig keep in mind
let’s play! https://
[email protected]
/ngarneau/opencode.git
let’s play! dataset Salons = { ’1’: { ‘id’: 1,
‘attendants’: 47, ‘name’: ‘Salon Laval’, ‘year’: 2010 } } Commandes = { ’1’: { ‘amount’: 799, ‘salon’: 1 } }
let’s play! we want to know what is the correlation
between the number of attendants and the total revenues by salon.