Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation: Data is growing. Source: IDC’s Digital Universe Study (EMC), June 2011.
Motivation: Data is growing.

Memory unit        Size     Binary size
kilobyte (kB/KB)   10^3     2^10
megabyte (MB)      10^6     2^20
gigabyte (GB)      10^9     2^30
terabyte (TB)      10^12    2^40
petabyte (PB)      10^15    2^50
exabyte (EB)       10^18    2^60
zettabyte (ZB)     10^21    2^70
yottabyte (YB)     10^24    2^80
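The two size columns in the memory unit table drift apart as the units grow. The following is a minimal sketch of that arithmetic; the names `UNITS`, `unit_sizes`, and `relative_gap` are hypothetical helpers introduced here for illustration and do not come from the slides.

```python
# Decimal (power of 10) vs. binary (power of 2) sizes for each memory unit.
# UNITS, unit_sizes and relative_gap are illustrative names, not from the deck.
UNITS = ["kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]


def unit_sizes(index):
    """Return (decimal_bytes, binary_bytes) for the unit at 0-based index."""
    decimal = 10 ** (3 * (index + 1))
    binary = 2 ** (10 * (index + 1))
    return decimal, binary


def relative_gap(index):
    """Fraction by which the binary size exceeds the decimal size."""
    decimal, binary = unit_sizes(index)
    return (binary - decimal) / decimal


for i, name in enumerate(UNITS):
    exp10, exp2 = 3 * (i + 1), 10 * (i + 1)
    print(f"{name:10s} 10^{exp10:<2d}  2^{exp2:<2d}  gap={relative_gap(i):.1%}")
```

The gap is 2.4% for a kilobyte (1024 vs. 1000 bytes) and grows with every step up the table, which is why it matters whether a capacity figure uses the decimal or the binary convention.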
Streaming Data: Big Data & Real Time
Big Data: McKinsey Global Institute (MGI) Report on Big Data, 2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology: Sampling and distributed systems.
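The slide names sampling as a methodology but does not say which technique is meant. As one possible illustration, here is a minimal sketch of reservoir sampling (Algorithm R), a standard way to keep a uniform random sample of fixed size from a stream of unknown length. The function name `reservoir_sample` and its parameters are assumptions made for this example, not something taken from the deck.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of up to k items from an iterable.

    Illustrative sketch of Algorithm R: memory use is O(k) regardless of
    how long the stream is, and each item is inspected exactly once.
    """
    rng = rng or random.Random()
    reservoir = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the n-th item with probability k/n by drawing a slot
            # uniformly from [0, n) and replacing only if it falls in [0, k).
            j = rng.randrange(n)
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(1_000_000), k=10, rng=random.Random(42))
print(len(sample))
```

Passing a seeded `random.Random` instance makes the sample reproducible; in a real stream setting the iterable would be a live source rather than a `range`.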
Methodology: Paolo Boldi: "Big Data does not need big machines, it needs big intelligence."
Real time analytics: We want to analyze what is happening now.
Time and Memory: Number 8 Wire Mentality. Time and memory are the resource dimensions of the process.
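To make the two resource dimensions concrete, the sketch below computes a running mean and variance in a single pass using constant memory (Welford's update). It is an illustration chosen for this note, not a method described on the slides; the class name `RunningStats` is hypothetical.

```python
class RunningStats:
    """One-pass mean and sample variance with O(1) memory (Welford's update).

    Illustrative only: each item costs constant time and is never stored,
    so both time per item and memory stay bounded as the stream grows.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; defined as 0.0 for fewer than two items."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


stats = RunningStats()
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    stats.update(value)
print(stats.mean, stats.variance)
```

The state is three numbers no matter how many items arrive, which is the kind of budget a stream algorithm has to respect.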
Algorithms: Classification, Regression, Clustering, Frequent Pattern Mining.
Applications:
  sensor data: industry, cities
  telecom data
  social networks: Twitter, Facebook, Yahoo
  marketing: sales business
Data may come from: humans, sensors, or machines.
Data Streams: Big Data & Real Time