Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Data Stream Mining
Search
Albert Bifet
August 25, 2012
Research
1
240
Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Tweet
Share
More Decks by Albert Bifet
See All by Albert Bifet
Distributed Systems
abifet
1
270
Frequent Pattern Mining
abifet
1
250
Regression
abifet
0
260
Evaluation
abifet
1
230
Stream Algorithmics
abifet
1
380
Clustering
abifet
2
300
Ensemble Methods
abifet
0
250
Classification
abifet
0
330
Concept Drift
abifet
0
400
Other Decks in Research
See All in Research
Combinatorial Search with Generators
kei18
0
740
[RSJ25] Enhancing VLA Performance in Understanding and Executing Free-form Instructions via Visual Prompt-based Paraphrasing
keio_smilab
PRO
0
100
EOGS: Gaussian Splatting for Efficient Satellite Image Photogrammetry
satai
4
500
20250624_熊本経済同友会6月例会講演
trafficbrain
1
600
SSII2025 [TS1] 光学・物理原理に基づく深層画像生成
ssii
PRO
4
4.2k
「どう育てるか」より「どう働きたいか」〜スクラムマスターの最初の一歩〜
hirakawa51
0
850
大規模な2値整数計画問題に対する 効率的な重み付き局所探索法
mickey_kubo
1
360
論文紹介:Not All Tokens Are What You Need for Pretraining
kosuken
0
170
SNLP2025:Can Language Models Reason about Individualistic Human Values and Preferences?
yukizenimoto
0
120
MetaEarth: A Generative Foundation Model for Global-Scale Remote Sensing Image Generation
satai
4
180
2025年度人工知能学会全国大会チュートリアル講演「深層基盤モデルの数理」
taiji_suzuki
25
18k
SSII2025 [SS1] レンズレスカメラ
ssii
PRO
2
1.1k
Featured
See All Featured
GraphQLとの向き合い方2022年版
quramy
49
14k
Code Reviewing Like a Champion
maltzj
525
40k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
23
1.4k
How STYLIGHT went responsive
nonsquared
100
5.8k
Imperfection Machines: The Place of Print at Facebook
scottboms
268
13k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.1k
The Invisible Side of Design
smashingmag
301
51k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
229
22k
Mobile First: as difficult as doing things right
swwweet
224
9.9k
GraphQLの誤解/rethinking-graphql
sonatard
72
11k
Bash Introduction
62gerente
615
210k
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Memory unit Size Binary size kilobyte (kB/KB) 103 210
megabyte (MB) 106 220 gigabyte (GB) 109 230 terabyte (TB) 1012 240 petabyte (PB) 1015 250 exabyte (EB) 1018 260 zettabyte (ZB) 1021 270 yottabyte (YB) 1024 280 Data is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Streaming Data Big Data & Real Time
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology Sampling and distributed systems
Methodology Paolo Boldi Big Data does not need big machines,
it needs big intelligence
Real time analytics We want to analyze what is happening
now.
Real time analytics We want to analyze what is happening
now.
Time and Memory Number 8 Wire Mentality Time and memory
are the resource dimensions of the process.
Time and Memory Time and memory are the resource dimensions
of the process.
Algorithms Classification, Regression, Clustering, Frequent Pattern Mining.
Applications sensor data: industry, cities telecomm data social networks: twitter,
facebook, yahoo marketing: sales business Data may come from: humans, sensors, or machines.
Data Streams Big Data & Real Time