Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Data Stream Mining
Search
Albert Bifet
August 25, 2012
Research
1
240
Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Tweet
Share
More Decks by Albert Bifet
See All by Albert Bifet
Distributed Systems
abifet
1
280
Frequent Pattern Mining
abifet
1
260
Regression
abifet
0
260
Evaluation
abifet
1
230
Stream Algorithmics
abifet
1
390
Clustering
abifet
2
310
Ensemble Methods
abifet
0
260
Classification
abifet
0
340
Concept Drift
abifet
0
410
Other Decks in Research
See All in Research
地域丸ごとデイサービス「Go トレ」の紹介
smartfukushilab1
0
310
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues
satai
3
290
論文紹介: ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement
hisaokatsumi
0
110
説明可能な機械学習と数理最適化
kelicht
0
360
令和最新技術で伝統掲示板を再構築: HonoX で作る型安全なスレッドフロート型掲示板 / かろっく@calloc134 - Hono Conference 2025
calloc134
0
390
Towards a More Efficient Reasoning LLM: AIMO2 Solution Summary and Introduction to Fast-Math Models
analokmaus
2
980
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
satai
3
400
Integrating Static Optimization and Dynamic Nature in JavaScript (GPCE 2025)
tadd
0
110
若手研究者が国際会議(例えばIROS)でワークショップを企画するメリットと成功法!
tanichu
0
100
論文紹介:Not All Tokens Are What You Need for Pretraining
kosuken
0
200
AWSで実現した大規模日本語VLM学習用データセット "MOMIJI" 構築パイプライン/buiding-momiji
studio_graph
2
820
LLM-jp-3 and beyond: Training Large Language Models
odashi
1
540
Featured
See All Featured
Building a Scalable Design System with Sketch
lauravandoore
463
33k
Balancing Empowerment & Direction
lara
5
710
Side Projects
sachag
455
43k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
Art, The Web, and Tiny UX
lynnandtonic
303
21k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.8k
"I'm Feeling Lucky" - Building Great Search Experiences for Today's Users (#IAC19)
danielanewman
231
22k
Java REST API Framework Comparison - PWX 2021
mraible
34
8.9k
The Straight Up "How To Draw Better" Workshop
denniskardys
239
140k
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.7k
Music & Morning Musume
bryan
46
6.9k
Testing 201, or: Great Expectations
jmmastey
46
7.7k
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Memory unit Size Binary size kilobyte (kB/KB) 103 210
megabyte (MB) 106 220 gigabyte (GB) 109 230 terabyte (TB) 1012 240 petabyte (PB) 1015 250 exabyte (EB) 1018 260 zettabyte (ZB) 1021 270 yottabyte (YB) 1024 280 Data is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Streaming Data Big Data & Real Time
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology Sampling and distributed systems
Methodology Paolo Boldi Big Data does not need big machines,
it needs big intelligence
Real time analytics We want to analyze what is happening
now.
Real time analytics We want to analyze what is happening
now.
Time and Memory Number 8 Wire Mentality Time and memory
are the resource dimensions of the process.
Time and Memory Time and memory are the resource dimensions
of the process.
Algorithms Classification, Regression, Clustering, Frequent Pattern Mining.
Applications sensor data: industry, cities telecomm data social networks: twitter,
facebook, yahoo marketing: sales business Data may come from: humans, sensors, or machines.
Data Streams Big Data & Real Time