Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Introduction to Data Stream Mining
Search
Albert Bifet
August 25, 2012
Research
1
240
Introduction to Data Stream Mining
Albert Bifet
August 25, 2012
Tweet
Share
More Decks by Albert Bifet
See All by Albert Bifet
Distributed Systems
abifet
1
270
Frequent Pattern Mining
abifet
1
250
Regression
abifet
0
260
Evaluation
abifet
1
220
Stream Algorithmics
abifet
1
380
Clustering
abifet
2
300
Ensemble Methods
abifet
0
250
Classification
abifet
0
330
Concept Drift
abifet
0
390
Other Decks in Research
See All in Research
Galileo: Learning Global & Local Features of Many Remote Sensing Modalities
satai
3
120
Type Theory as a Formal Basis of Natural Language Semantics
daikimatsuoka
1
270
「どう育てるか」より「どう働きたいか」〜スクラムマスターの最初の一歩〜
hirakawa51
0
420
SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery
satai
3
240
EOGS: Gaussian Splatting for Efficient Satellite Image Photogrammetry
satai
4
350
Adaptive fusion of multi-modal remote sensing data for optimal sub-field crop yield prediction
satai
3
230
Delta Airlines® Customer Care in the U.S.: How to Reach Them Now
bookingcomcustomersupportusa
PRO
0
100
20250605_新交通システム推進議連_熊本都市圏「車1割削減、渋滞半減、公共交通2倍」から考える地方都市交通政策
trafficbrain
0
630
電力システム最適化入門
mickey_kubo
1
750
定性データ、どう活かす? 〜定性データのための分析基盤、はじめました〜 / How to utilize qualitative data? ~We have launched an analysis platform for qualitative data~
kaminashi
7
1.1k
問いを起点に、社会と共鳴する知を育む場へ
matsumoto_r
PRO
0
480
データサイエンティストの就労意識~2015→2024 一般(個人)会員アンケートより
datascientistsociety
PRO
0
750
Featured
See All Featured
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.8k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
8
850
Gamification - CAS2011
davidbonilla
81
5.4k
Embracing the Ebb and Flow
colly
86
4.8k
The Art of Programming - Codeland 2020
erikaheidi
54
13k
A Tale of Four Properties
chriscoyier
160
23k
How to train your dragon (web standard)
notwaldorf
96
6.1k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.9k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
21
1.4k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
331
22k
Java REST API Framework Comparison - PWX 2021
mraible
31
8.7k
Transcript
Introduction to Data Stream Mining Albert Bifet March 2012
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Memory unit Size Binary size kilobyte (kB/KB) 103 210
megabyte (MB) 106 220 gigabyte (GB) 109 230 terabyte (TB) 1012 240 petabyte (PB) 1015 250 exabyte (EB) 1018 260 zettabyte (ZB) 1021 270 yottabyte (YB) 1024 280 Data is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Motivation Source: IDC’s Digital Universe Study (EMC), June 2011 Data
is growing
Streaming Data Big Data & Real Time
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Big Data McKinsey Global Institute (MGI) Report on Big Data,
2011. Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze.
Methodology Sampling and distributed systems
Methodology Paolo Boldi Big Data does not need big machines,
it needs big intelligence
Real time analytics We want to analyze what is happening
now.
Real time analytics We want to analyze what is happening
now.
Time and Memory Number 8 Wire Mentality Time and memory
are the resource dimensions of the process.
Time and Memory Time and memory are the resource dimensions
of the process.
Algorithms Classification, Regression, Clustering, Frequent Pattern Mining.
Applications sensor data: industry, cities telecomm data social networks: twitter,
facebook, yahoo marketing: sales business Data may come from: humans, sensors, or machines.
Data Streams Big Data & Real Time