Introduction to Data Stream Mining

Albert Bifet
August 25, 2012

Transcript

  1. Introduction to Data Stream Mining. Albert Bifet, March 2012

  2. Motivation: Data is growing. Source: IDC’s Digital Universe Study (EMC), June 2011.
  3. Motivation: Data is growing.

     Memory unit        Size     Binary size
     kilobyte (kB/KB)   10^3     2^10
     megabyte (MB)      10^6     2^20
     gigabyte (GB)      10^9     2^30
     terabyte (TB)      10^12    2^40
     petabyte (PB)      10^15    2^50
     exabyte (EB)       10^18    2^60
     zettabyte (ZB)     10^21    2^70
     yottabyte (YB)     10^24    2^80
  7. Streaming Data: Big Data & Real Time

  8. Big Data: Big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze. Source: McKinsey Global Institute (MGI) Report on Big Data, 2011.
  10. Methodology: Sampling and distributed systems

  11. Methodology: "Big Data does not need big machines, it needs big intelligence." (Paolo Boldi)
  12. Real time analytics: We want to analyze what is happening now.
  14. Time and Memory: Number 8 Wire Mentality. Time and memory are the resource dimensions of the process.
  15. Time and Memory: Time and memory are the resource dimensions of the process.
  16. Algorithms: Classification, Regression, Clustering, Frequent Pattern Mining.

  17. Applications:
      - sensor data: industry, cities
      - telecom data
      - social networks: Twitter, Facebook, Yahoo
      - marketing: sales, business
      Data may come from humans, sensors, or machines.
  18. Data Streams: Big Data & Real Time