Distributed Systems
Albert Bifet
August 25, 2012
Transcript
Distributed Streaming
Albert Bifet, May 2012
COMP423A/COMP523A Data Stream Mining
Outline
1. Introduction
2. Stream Algorithmics
3. Concept Drift
4. Evaluation
5. Classification
6. Ensemble Methods
7. Regression
8. Clustering
9. Frequent Pattern Mining
10. Distributed Streaming
Data Streams: Big Data & Real Time
Distributed Systems: Hadoop, S4 and Storm
Hadoop
Hadoop architecture
Apache Mahout: open source framework
Pig: Similar to SQL
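Pig example:
-- Load three integer columns from the file 'data'
A = LOAD 'data' USING PigStorage() AS (f1:int, f2:int, f3:int);
-- Group the rows by the value of f1
B = GROUP A BY f1;
-- Count the rows in each group (COUNT expects the bag A, not the group key $0)
C = FOREACH B GENERATE COUNT(A);
DUMP C;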
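For readers who do not know Pig Latin, the same group-and-count logic can be sketched in plain Python. This is only an illustration of what the script computes, not how Pig executes it: the sample rows and the name group_count are hypothetical, and everything runs in a single process rather than as Hadoop jobs.

```python
from collections import Counter

# Hypothetical sample rows standing in for the 'data' file: (f1, f2, f3).
rows = [(1, 2, 3), (4, 2, 1), (8, 3, 4), (4, 3, 3), (7, 2, 5), (8, 4, 3)]


def group_count(rows):
    """Count how many rows share each f1 value (GROUP A BY f1, then COUNT)."""
    return Counter(f1 for f1, _f2, _f3 in rows)


counts = group_count(rows)
for f1 in sorted(counts):
    print(f1, counts[f1])
```

Unlike this sketch, the Pig script emits only the counts, because its GENERATE clause does not include the group key.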
Apache S4
Storm (from Twitter)
Storm: Stream, Spout, Bolt, Topology
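The four Storm terms can be illustrated with a toy word count written with Python generators. This is a conceptual sketch only and does not use the Storm API: all function names are hypothetical, and it runs sequentially in one process, whereas Storm distributes spouts and bolts across a cluster. A stream is a sequence of tuples, a spout is the source of a stream, a bolt consumes streams and may emit new ones, and a topology is the wiring between them.

```python
from collections import Counter


def sentence_spout(sentences):
    """Spout: the source of a stream; emits one tuple per sentence."""
    for sentence in sentences:
        yield (sentence,)


def split_bolt(stream):
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word,)


def count_bolt(stream):
    """Bolt: keeps running counts and emits (word, count) after each update."""
    counts = Counter()
    for (word,) in stream:
        counts[word] += 1
        yield (word, counts[word])


def run_topology(sentences):
    """Topology: wires spout -> split bolt -> count bolt and drains the stream."""
    latest = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        latest[word] = count
    return latest


print(run_topology(["storm processes streams", "streams of tuples"]))
```

Because each stage is a generator, tuples flow through the pipeline one at a time, which loosely mirrors how a streaming system processes unbounded input instead of waiting for a complete batch.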