Distributed Systems
Albert Bifet
August 25, 2012
Transcript
Distributed Streaming, Albert Bifet, May 2012
COMP423A/COMP523A Data Stream Mining: Outline
1. Introduction
2. Stream Algorithmics
3. Concept Drift
4. Evaluation
5. Classification
6. Ensemble Methods
7. Regression
8. Clustering
9. Frequent Pattern Mining
10. Distributed Streaming
Data Streams: Big Data & Real Time
Distributed Systems: Hadoop, S4 and Storm
Hadoop
Hadoop architecture
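The architecture slide is an image and its content is not in the transcript. As a rough illustration of the MapReduce programming model that Hadoop implements, here is a minimal single-process sketch of a word count. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in document.split():
        yield (word.lower(), 1)


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would
    do between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return (key, sum(values))


def word_count(documents):
    """Run map, shuffle and reduce in sequence on a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big streams"]))
```

In a real cluster the map and reduce calls run in parallel on different machines and the shuffle moves data over the network; this sketch only shows the data flow.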
Apache Mahout: an open source machine learning framework
Pig: similar to SQL
Pig example:
    A = LOAD 'data' USING PigStorage() AS (f1:int, f2:int, f3:int);
    B = GROUP A BY f1;
    -- COUNT takes a bag, so count the grouped tuples in A rather than the group key $0
    C = FOREACH B GENERATE COUNT(A);
    DUMP C;
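To make the data flow of the Pig script concrete, the following sketch reproduces the same load, group and count steps in plain Python. The function name `group_count` and the sample rows are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict


def group_count(rows):
    """Group (f1, f2, f3) rows by f1 and count the rows in each group.

    Returns the counts ordered by the group key f1.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)  # corresponds to GROUP A BY f1
    # corresponds to FOREACH ... GENERATE COUNT(...): one count per group
    return [len(bag) for _, bag in sorted(groups.items())]


if __name__ == "__main__":
    # Hypothetical input standing in for the 'data' file
    rows = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]
    print(group_count(rows))
```

Pig compiles such a script into jobs that run on Hadoop, so the grouping happens across machines instead of in one in-memory dictionary.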
Apache S4
Storm: from Twitter
Storm concepts: Stream, Spout, Bolt, Topology
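The four Storm concepts can be sketched in a few lines of plain Python: a stream is a sequence of tuples, a spout is a source that emits tuples, a bolt consumes tuples and emits new ones, and a topology wires spouts and bolts together. This is not the Storm API; every class and function name below (`SentenceSpout`, `SplitBolt`, `CountBolt`, `run_topology`) is hypothetical, and the topology is simplified to a linear chain in one process.

```python
from collections import Counter


class SentenceSpout:
    """Spout: the source of a stream; emits one tuple per sentence."""

    def __init__(self, sentences):
        self.sentences = sentences

    def emit(self):
        for sentence in self.sentences:
            yield (sentence,)


class SplitBolt:
    """Bolt: consumes sentence tuples and emits one tuple per word."""

    def process(self, tup):
        for word in tup[0].split():
            yield (word,)


class CountBolt:
    """Bolt: keeps a running count per word and emits (word, count)."""

    def __init__(self):
        self.counts = Counter()

    def process(self, tup):
        word = tup[0]
        self.counts[word] += 1
        yield (word, self.counts[word])


def _apply(bolt, stream):
    """Feed every tuple of a stream through one bolt."""
    for tup in stream:
        yield from bolt.process(tup)


def run_topology(spout, bolts):
    """Topology: connect the spout to a chain of bolts and collect
    the tuples emitted by the last bolt."""
    stream = spout.emit()
    for bolt in bolts:
        stream = _apply(bolt, stream)
    return list(stream)


if __name__ == "__main__":
    spout = SentenceSpout(["storm storm from twitter"])
    print(run_topology(spout, [SplitBolt(), CountBolt()]))
```

In Storm itself a topology is a graph rather than a chain, spouts and bolts run as parallel tasks across a cluster, and streams are unbounded, so the topology runs until it is stopped.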