Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation Data is growing. Source: IDC’s Digital Universe Study (EMC), June 2011.
Motivation Data is growing
Memory unit        Size    Binary size
kilobyte (kB/KB)   10^3    2^10
megabyte (MB)      10^6    2^20
gigabyte (GB)      10^9    2^30
terabyte (TB)      10^12   2^40
petabyte (PB)      10^15   2^50
exabyte (EB)       10^18   2^60
zettabyte (ZB)     10^21   2^70
yottabyte (YB)     10^24   2^80
Streaming Data Big Data & Real Time
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology Sampling and distributed systems
Methodology Paolo Boldi: Big Data does not need big machines, it needs big intelligence.
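The sampling mentioned under Methodology can be illustrated with reservoir sampling, a standard way to keep a fixed-size uniform sample from a stream of unknown length in one pass. The slides do not name a specific sampling algorithm, so this choice is an assumption, and the function name `reservoir_sample` and its parameters are hypothetical. A minimal sketch:

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from an iterable.

    Illustrative sketch only: the function name and parameters are
    assumptions, not something taken from the slides.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Pick a slot in [0, i]; the new item is kept with
            # probability k / (i + 1), replacing a current member.
            j = rng.randint(0, i)
            if j < k:
                reservoir[j] = item
    return reservoir


# Usage: sample 10 items from a stream of 1000 without storing the stream.
sample = reservoir_sample(range(1000), 10, random.Random(0))
```

Memory use stays at k items regardless of stream length, and each item is processed once.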
Real time analytics We want to analyze what is happening
now.
Time and Memory Number 8 Wire Mentality Time and memory
are the resource dimensions of the process.
Algorithms Classification, Regression, Clustering, Frequent Pattern Mining.
Applications sensor data: industry, cities; telecom data; social networks: Twitter, Facebook, Yahoo; marketing: sales business. Data may come from: humans, sensors, or machines.
Data Streams Big Data & Real Time