Streaming Algorithms

majek04
March 15, 2016

Transcript

  1. Two DoS types (slide 7)
     • L3: spoofed IP packets
       • source IP addresses are fake
       • very large
       • this is what you hear about in the news
     • L7: fully established TCP connections
       • IP reputation is effective
  2. Streaming algorithms (slide 11)
     • infinite data stream on input
     • approximate
     Diagram: Data stream → Streaming algorithm → Results
  3. Attack detection is streaming! (slide 12)
     • sFlow packet samples as input
     • detected attacks on output
     Diagram: Packet samples → Streaming algorithms → Attacks
  4. Streaming algorithms (slide 13)
     • EWMA: exponentially weighted moving average
       • Counting rates of packets
     • Space Saving
       • Known as Top-N or Heavy Hitters
     • Simplified hierarchical heavy hitters
     • HyperLogLog
       • Cardinality estimation: counting unique things
  5. The problem: PPS (slide 14)

     Mpps   Descr
     3.878  --ip=141.245.59.191/32
     2.878  --ip=141.245.59.192/32
     1.878  --ip=141.245.59.193/32
     1.878  --ip=141.245.59.194/32
     1.878  --ip=141.245.59.195/32
     1.878  --ip=141.245.59.196/32
     1.878  --ip=141.245.59.197/32
     1.878  --ip=141.245.59.198/32
     1.878  --ip=141.245.59.199/32
     ...
  6. Naive: Moving average (slide 17)
     Sample timestamps: 1.0s, 1.1s, 1.3s, 1.8s, 1.99s, 2.1s, 2.4s, 2.41s
     At t=2.50s: precisely 5 samples
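
The naive moving average on slide 17 can be sketched as a sliding window over sample timestamps. This is a minimal illustration, not the speaker's code: the `WindowCounter` name is hypothetical, and the 1-second window is an assumption chosen because it reproduces the "precisely 5 samples" count at t=2.50s.

```python
from collections import deque


class WindowCounter:
    """Counts samples seen during the last `window` seconds (hypothetical helper)."""

    def __init__(self, window):
        self.window = window
        self.times = deque()  # one entry per sample still inside the window

    def add(self, t):
        self.times.append(t)

    def count(self, now):
        # Drop timestamps that have fallen out of the window ending at `now`.
        while self.times and self.times[0] <= now - self.window:
            self.times.popleft()
        return len(self.times)

    def rate(self, now):
        # Samples per second, averaged over the window.
        return self.count(now) / self.window


counter = WindowCounter(window=1.0)  # assumed 1 s window
for t in [1.0, 1.1, 1.3, 1.8, 1.99, 2.1, 2.4, 2.41]:
    counter.add(t)
print(counter.count(2.5))  # 5
```

The deque keeps one timestamp per sample in the window, so memory grows with the packet rate, which is what makes this approach naive for a high-volume stream.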
  7. (slide 23, no text)

  8. (slide 24, no text)

  9. EWMA - summary (slide 25)
     • Smoothed average
     • The same maths as Linux "load average"
     • Charges slowly (half-life)
     • Discharges quickly
     • Can also be used to count rates of packets
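
The EWMA idea summarised on slide 25 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the implementation from the talk: the class names, the `half_life` parameter, and the choice to feed `1 / dt` as the instantaneous packet rate are all hypothetical.

```python
class EWMA:
    """Time-aware exponentially weighted moving average (hypothetical sketch)."""

    def __init__(self, half_life):
        self.half_life = half_life  # seconds for an old value to lose half its weight
        self.value = None
        self.last = None

    def update(self, sample, now):
        if self.value is None:
            self.value = sample
        else:
            dt = now - self.last
            keep = 0.5 ** (dt / self.half_life)  # weight kept by the old value
            self.value = self.value * keep + sample * (1.0 - keep)
        self.last = now
        return self.value


class EWMARate:
    """Estimates packets per second from packet arrival times."""

    def __init__(self, half_life):
        self.ewma = EWMA(half_life)
        self.last = None

    def packet(self, now):
        if self.last is not None and now > self.last:
            # Instantaneous rate implied by the gap since the previous packet.
            self.ewma.update(1.0 / (now - self.last), now)
        self.last = now
        return self.ewma.value or 0.0


avg = EWMA(half_life=1.0)
avg.update(0.0, now=0.0)
print(avg.update(10.0, now=1.0))  # 5.0: one half-life later, old and new weigh the same
```

Unlike a sliding window, the state here is a couple of floats per counter, regardless of how many packets arrive.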
  10. The problem: PPS (slide 26)

      Mpps   Descr
      3.878  --ip=141.245.59.191/32
      2.878  --ip=141.245.59.192/32
      1.878  --ip=141.245.59.193/32
      1.878  --ip=141.245.59.194/32
      1.878  --ip=141.245.59.195/32
      1.878  --ip=141.245.59.196/32
      1.878  --ip=141.245.59.197/32
      1.878  --ip=141.245.59.198/32
      1.878  --ip=141.245.59.199/32
      ...
  11. The problem: Memory (slide 27)

      pps    IP
      12.2M  1.2.3.4
      2.4M   42.1.2.4
      0.01M  2.4.3.1
      0.01M  192.168.1.1
      ...
  12. Top-N problem (slide 28)
      • aka: heavy hitters
      • A fixed-memory data structure that can "count" top-N items
      • think: top URLs, top customer IPs, etc.
      • Count-Min sketch, Space Saving
  13. Space saving (slide 38)

      error  count  key
      0      2      Alice
      1      1      Eric
      0      1      Charlie
      2 Counter? 1 .. 2 1
  14. (slide 39, no text)

  15. What about rates? (slide 40)
      • It's hard
      • was: GetAll()
      • now: GetAll(time.Time)
      • No longer O(1)
      • Instead O(log n)
  16. (slide 41, no text)

  17. Summary - Space saving (slide 42)
      • Top-N / Heavy-hitter algorithm
      • Fixed memory size
      • Strong error guarantees
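
The Space Saving algorithm summarised above can be sketched in a few lines. This is a textbook-style illustration, not the code behind the talk: the `SpaceSaving` class and its method names are hypothetical, and the linear scan for the smallest counter stands in for the more efficient structures a production version would use.

```python
class SpaceSaving:
    """Tracks approximate top-N keys with a fixed number of counters."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counters = {}  # key -> [count, error]

    def add(self, key):
        if key in self.counters:
            self.counters[key][0] += 1
        elif len(self.counters) < self.capacity:
            self.counters[key] = [1, 0]
        else:
            # Evict the key with the smallest count; the newcomer inherits
            # that count as its maximum possible overestimate (error).
            victim = min(self.counters, key=lambda k: self.counters[k][0])
            floor = self.counters.pop(victim)[0]
            self.counters[key] = [floor + 1, floor]

    def top(self, n=None):
        rows = [(count, error, key) for key, (count, error) in self.counters.items()]
        rows.sort(key=lambda row: -row[0])
        return rows[:n]


ss = SpaceSaving(capacity=2)
for name in ["Alice", "Alice", "Eric", "Charlie"]:
    ss.add(name)
print(ss.top())  # [(2, 0, 'Alice'), (2, 1, 'Charlie')]
```

Each reported count overestimates the true count by at most its error value, and the counts always sum to the number of items seen, which is the source of the error guarantees.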
  18. Aggregating attacks (slide 43)

      Mpps   Descr
      3.878  --ip=141.245.59.191/32
      2.878  --ip=141.245.59.192/32
      1.878  --ip=141.245.59.193/32
      1.878  --ip=141.245.59.194/32
      1.878  --ip=141.245.59.195/32
      1.878  --ip=141.245.59.196/32
      1.878  --ip=141.245.59.197/32
      1.878  --ip=141.245.59.198/32
      1.878  --ip=141.245.59.199/32
      ...

      vs

      Mpps    Descr
      35.878  --ip=141.245.59.0/24
  19. Multiple dimensions (slide 46)

      pps    IP:port
      12.2M  1.2.3.4:53
      2.4M   42.1.2.4:80
      0.01M  2.4.3.1:80
      0.01M  192.168.1.1:443

      pps    IP
      12.2M  1.2.3.4
      2.4M   42.1.2.4
      0.01M  2.4.3.1
      0.01M  192.168.1.1

      pps    subnet
      12.2M  1.2.3.0/24
      2.4M   42.1.2.0/24
      0.01M  2.4.3.0/24
      0.01M  192.168.1.0/24
  20. Multiple dimensions (slide 47)

      pps    IP:port
      12.2M  1.2.3.4:53
      2.4M   42.1.2.4:80
      0.01M  2.4.3.1:80
      0.01M  192.168.1.1:443

      pps    IP
      12.2M  1.2.3.4
      2.4M   42.1.2.4
      0.01M  2.4.3.1
      0.01M  192.168.1.1

      pps    subnet
      12.2M  1.2.3.0/24
      2.4M   42.1.2.0/24
      0.01M  2.4.3.0/24
      0.01M  192.168.1.0/24

      incoming sample: 42.1.2.4:80
  21. Multiple dimensions (slide 48)

      pps    IP:port
      12.2M  1.2.3.4:53
      2.4M   42.1.2.4:80
      0.01M  2.4.3.1:80
      0.01M  192.168.1.1:443

      pps    IP
      12.2M  1.2.3.4
      2.4M   42.1.2.4
      0.01M  2.4.3.1
      0.01M  192.168.1.1

      pps    subnet
      12.2M  1.2.3.0/24
      2.4M   42.1.2.0/24
      0.01M  2.4.3.0/24
      0.01M  192.168.1.0/24

      reporting threshold: 1M
  22. Attack report (slide 49)

      Mpps  Descr
      12.2  --ip=1.2.3.4 --port=53
      2.4   --ip=42.1.2.4 --port=80
      12.2  --ip=1.2.3.4
      2.4   --ip=42.1.2.4
      12.2  --ip=1.2.3.0/24
      2.4   --ip=42.1.2.0/24
  23. Multiple dimensions (slide 50)

      pps    IP:port
      12.2M  1.2.3.4:53
      2.4M   42.1.2.4:80
      0.01M  2.4.3.1:80
      0.01M  192.168.1.1:443

      pps    IP
      0.1M   1.2.3.4
      0M     42.1.2.4
      0.01M  2.4.3.1
      0.01M  192.168.1.1

      pps    subnet
      0.1M   1.2.3.0/24
      0M     42.1.2.0/24
      0.01M  2.4.3.0/24
      0.01M  192.168.1.0/24

      incoming sample: 42.1.2.4:80
  24. Summary - SHHH (slide 53)
      • Approximate
      • High error in pps
      • Works well in practice
      • Scales well
      • Fast and simple to implement
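
One plausible reading of the "Multiple dimensions" slides is sketched below. It is an assumption, not the algorithm as presented in the talk: each sample is counted from the most specific key (IP:port) towards the least specific (subnet), and propagation stops at the first level where the key already meets the reporting threshold, so traffic explained by a specific key is not counted again at coarser levels. All names are hypothetical, and plain counters stand in for the fixed-memory, rate-based tables a real detector would need.

```python
from collections import Counter


def ip_port(sample):
    return f"{sample[0]}:{sample[1]}"


def ip_only(sample):
    return sample[0]


def subnet24(sample):
    return ".".join(sample[0].split(".")[:3]) + ".0/24"


class SimplifiedHHH:
    """Hypothetical sketch of a simplified hierarchical heavy hitters counter."""

    def __init__(self, levels, threshold):
        self.levels = levels        # key functions, most specific first
        self.threshold = threshold  # reporting threshold, in samples
        self.tables = [Counter() for _ in levels]

    def add(self, sample):
        for key_fn, table in zip(self.levels, self.tables):
            key = key_fn(sample)
            table[key] += 1
            if table[key] >= self.threshold:
                break  # already explained by this key; skip coarser levels

    def report(self):
        return [
            (count, key)
            for table in self.tables
            for key, count in table.items()
            if count >= self.threshold
        ]


hhh = SimplifiedHHH([ip_port, ip_only, subnet24], threshold=3)
for _ in range(5):
    hhh.add(("1.2.3.4", 53))
print(hhh.report())  # [(5, '1.2.3.4:53')]
```

In this sketch the coarser tables only ever see the samples that arrived before the specific key crossed the threshold, which is one way to end up with the small per-IP and per-subnet numbers shown on slide 50.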
  25. Spoofed Source IP? (slide 54)

      Mpps    Description
      23.833  --target=173.245.59.2 --agent=WAW --iface=659 Est= 57364
      23.067  --target=173.245.58.1 --agent=WAW --iface=659 Est= 56995
      7.139   --target=173.245.58.1 --agent=DUS --iface=893 Est= 11493
      6.366   --target=173.245.59.2 --agent=DUS --iface=893 Est= 11240
      2.590   --target=173.245.58.1 --agent=SIN --iface=657 Est=219987
      2.557   --target=173.245.59.2 --agent=SIN --iface=657 Est=220380
      1.045   --target=173.245.58.1 --agent=MAN --iface=756 Est=   207
      1.039   --target=173.245.59.2 --agent=MAN --iface=756 Est=   200
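
Assuming the Est= column is a HyperLogLog-style estimate of unique source IPs (the deck lists HyperLogLog for "counting unique things", but the slide does not say so explicitly), the estimator can be sketched as below. The class name, the SHA-1 based hash, and the default of 1024 registers are illustrative choices, not details from the talk.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog cardinality estimator (illustrative sketch)."""

    def __init__(self, p=10):
        self.p = p                # number of hash bits used to pick a register
        self.m = 1 << p           # register count; formula below assumes m >= 128
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha1(item.encode()).digest()
        x = int.from_bytes(digest[:8], "big")    # 64-bit hash value
        index = x >> (64 - self.p)               # top p bits select the register
        rest = x & ((1 << (64 - self.p)) - 1)    # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(20000):
    hll.add(f"10.0.{i // 256}.{i % 256}")
print(round(hll.estimate()))  # close to 20000; typical error is a few percent
```

With 1024 registers the state stays fixed in size while the expected relative error is roughly 3%, which is enough to tell a few hundred sources from hundreds of thousands.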
  26. (slide 57, no text)

  27. Summary (slide 60)
      • Attack detection is a streaming problem
      • Streaming algorithms are awesome
      • Applicable to many more problems
      Thanks! marek@cloudflare.com