Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Data Stream Mining
Search
Albert Bifet
August 25, 2012
Research
1
240
Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Tweet
Share
More Decks by Albert Bifet
See All by Albert Bifet
Distributed Systems
abifet
1
270
Frequent Pattern Mining
abifet
1
250
Regression
abifet
0
260
Evaluation
abifet
1
230
Stream Algorithmics
abifet
1
380
Clustering
abifet
2
300
Ensemble Methods
abifet
0
250
Classification
abifet
0
330
Concept Drift
abifet
0
400
Other Decks in Research
See All in Research
VectorLLM: Human-like Extraction of Structured Building Contours via Multimodal LLMs
satai
4
190
ストレス計測方法の確立に向けたマルチモーダルデータの活用
yurikomium
0
1.4k
「どう育てるか」より「どう働きたいか」〜スクラムマスターの最初の一歩〜
hirakawa51
0
850
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
satai
1
190
AI エージェントを活用した研究再現性の自動定量評価 / scisci2025
upura
1
150
電力システム最適化入門
mickey_kubo
1
910
業界横断 副業・兼業者の実態調査
fkske
0
240
問いを起点に、社会と共鳴する知を育む場へ
matsumoto_r
PRO
0
610
生成的推薦の人気バイアスの分析:暗記の観点から / JSAI2025
upura
0
260
Delta Airlines® Customer Care in the U.S.: How to Reach Them Now
bookingcomcustomersupportusa
0
110
単施設でできる臨床研究の考え方
shuntaros
0
2.7k
SSII2025 [SS2] 横浜DeNAベイスターズの躍進を支えたAIプロダクト
ssii
PRO
7
4k
Featured
See All Featured
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
800
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
51
5.6k
Embracing the Ebb and Flow
colly
87
4.8k
Typedesign – Prime Four
hannesfritz
42
2.8k
Designing Experiences People Love
moore
142
24k
Scaling GitHub
holman
463
140k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.9k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
31
2.2k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
Speed Design
sergeychernyshev
32
1.1k
Imperfection Machines: The Place of Print at Facebook
scottboms
268
13k
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Memory unit Size Binary size kilobyte (kB/KB) 103 210
megabyte (MB) 106 220 gigabyte (GB) 109 230 terabyte (TB) 1012 240 petabyte (PB) 1015 250 exabyte (EB) 1018 260 zettabyte (ZB) 1021 270 yottabyte (YB) 1024 280 Data is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Streaming Data Big Data & Real Time
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology Sampling and distributed systems
Methodology Paolo Boldi Big Data does not need big machines,
it needs big intelligence
Real time analytics We want to analyze what is happening
now.
Real time analytics We want to analyze what is happening
now.
Time and Memory Number 8 Wire Mentality Time and memory
are the resource dimensions of the process.
Time and Memory Time and memory are the resource dimensions
of the process.
Algorithms Classification, Regression, Clustering, Frequent Pattern Mining.
Applications sensor data: industry, cities telecomm data social networks: twitter,
facebook, yahoo marketing: sales business Data may come from: humans, sensors, or machines.
Data Streams Big Data & Real Time