Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Transcript
Introduction to Data Stream Mining. Albert Bifet, March 2012.
Motivation: Data is growing. Source: IDC’s Digital Universe Study (EMC), June 2011.
Motivation: Data is growing.

Memory unit        Size     Binary size
kilobyte (kB/KB)   10^3     2^10
megabyte (MB)      10^6     2^20
gigabyte (GB)      10^9     2^30
terabyte (TB)      10^12    2^40
petabyte (PB)      10^15    2^50
exabyte (EB)       10^18    2^60
zettabyte (ZB)     10^21    2^70
yottabyte (YB)     10^24    2^80
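
The decimal and binary sizes of the memory units listed above drift apart as the units grow. The following is a minimal Python sketch of that arithmetic; the helper name unit_sizes is hypothetical and not part of the slides.

```python
# Decimal (SI) and binary sizes for the memory units on the slide.
# unit_sizes is an illustrative helper, not something the deck defines.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def unit_sizes():
    """Return a list of (unit name, decimal size, binary size) in bytes."""
    rows = []
    for step, prefix in enumerate(PREFIXES, start=1):
        decimal_size = 10 ** (3 * step)   # 10^3, 10^6, ...
        binary_size = 2 ** (10 * step)    # 2^10, 2^20, ...
        rows.append((prefix + "byte", decimal_size, binary_size))
    return rows


for name, decimal_size, binary_size in unit_sizes():
    # The ratio shows how far the binary size exceeds the decimal one.
    ratio = binary_size / decimal_size
    print(f"{name:<10} binary/decimal = {ratio:.3f}")
```

The ratio starts at 1.024 for the kilobyte and grows with every step, since each step multiplies it by another factor of 1.024.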
Streaming Data: Big Data & Real Time
Big Data: McKinsey Global Institute (MGI) Report on Big Data, 2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology: Sampling and distributed systems.
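
The slide names sampling but does not say which technique is meant. As one possible illustration, here is a minimal Python sketch of reservoir sampling (Algorithm R), a standard way to keep a uniform random sample of fixed size from a stream of unknown length. The function name reservoir_sample and its parameters are assumptions made for this example.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from an iterable.

    Illustrative sketch of Algorithm R; the deck does not specify
    which sampling method it has in mind.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item n replaces a random slot with probability k / n.
            j = rng.randrange(n)  # uniform integer in [0, n)
            if j < k:
                reservoir[j] = item
    return reservoir


# One pass over the stream, with memory bounded by k items.
sample = reservoir_sample(range(100_000), 10, random.Random(42))
print(sample)
```

The sketch reads each item once and stores at most k items, so it fits a setting where the stream cannot be held in memory.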
Methodology: Paolo Boldi: "Big Data does not need big machines, it needs big intelligence."
Real time analytics: We want to analyze what is happening now.
Time and Memory: Number 8 Wire Mentality. Time and memory are the resource dimensions of the process.
Algorithms: Classification, Regression, Clustering, Frequent Pattern Mining.
Applications: sensor data (industry, cities); telecom data; social networks (Twitter, Facebook, Yahoo); marketing (sales, business). Data may come from humans, sensors, or machines.
Data Streams: Big Data & Real Time