Streaming Algorithms

majek04
March 15, 2016

Transcript

  1. Two DoS types

    • L3 - spoofed IP packets
      • source IP addresses are fake
      • very large
      • this is what you hear in news
    • L7 - fully established TCP connections
      • IP reputation is effective
  2. Streaming algorithms

    • infinite data stream on input
    • approximate
    Diagram: Data stream → Streaming algorithm → Results
  3. Attack detection is streaming!

    • sFlow packet samples as input
    • detected attacks on output
    Diagram: Packet samples → Streaming algorithms → Attacks
  4. Streaming algorithms

    • EWMA - Exponentially weighted moving average
      • Counting rates of packets
    • Space saving
      • Known as Top-N or Heavy Hitters
    • Simplified hierarchical heavy hitters
    • HyperLogLog
      • Cardinality estimation - Counting unique things
  5. The problem: PPS

    Mpps   Descr
    3.878  --ip=141.245.59.191/32
    2.878  --ip=141.245.59.192/32
    1.878  --ip=141.245.59.193/32
    1.878  --ip=141.245.59.194/32
    1.878  --ip=141.245.59.195/32
    1.878  --ip=141.245.59.196/32
    1.878  --ip=141.245.59.197/32
    1.878  --ip=141.245.59.198/32
    1.878  --ip=141.245.59.199/32
    ...
  6. Naive: Moving average

    Sample arrival times: 1.0s, 1.1s, 1.3s, 1.8s, 1.99s, 2.1s, 2.4s, 2.41s
    At t=2.50s: precisely 5 samples
  9. EWMA - summary

    • Smoothed average
    • The same maths as Linux "load average"
    • Charges slowly (half-life)
    • Discharges quickly
    • Can also be used to count rates of packets
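One common way to turn the EWMA idea into a packet-rate estimate is an exponentially decaying counter. The sketch below is not necessarily the formulation used in the talk: the `DecayingRate` type, its method names, and the use of float seconds for time are assumptions made for illustration.

```go
package main

import "math"

// DecayingRate estimates events per second in constant memory.
// Each event adds 1 to a weight that halves every halfLife seconds.
type DecayingRate struct {
	lambda float64 // decay constant ln(2)/halfLife, in 1/seconds
	weight float64 // exponentially decayed event count
	last   float64 // time of the last update, in seconds
}

// NewDecayingRate returns an estimator with the given half-life in seconds.
func NewDecayingRate(halfLifeSeconds float64) *DecayingRate {
	return &DecayingRate{lambda: math.Ln2 / halfLifeSeconds}
}

// decayTo ages the stored weight up to time now.
func (d *DecayingRate) decayTo(now float64) {
	if now > d.last {
		d.weight *= math.Exp(-d.lambda * (now - d.last))
		d.last = now
	}
}

// Add records one event at time now (seconds, non-decreasing).
func (d *DecayingRate) Add(now float64) {
	d.decayTo(now)
	d.weight++
}

// Rate returns the estimated events per second at time now.
// For a steady stream of r events/s the weight settles near r/lambda,
// so weight*lambda approximates r.
func (d *DecayingRate) Rate(now float64) float64 {
	d.decayTo(now)
	return d.weight * d.lambda
}
```

The state is three floats regardless of traffic volume, and the half-life controls how quickly the estimate reacts to changes.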
  10. The problem: PPS

    Mpps   Descr
    3.878  --ip=141.245.59.191/32
    2.878  --ip=141.245.59.192/32
    1.878  --ip=141.245.59.193/32
    1.878  --ip=141.245.59.194/32
    1.878  --ip=141.245.59.195/32
    1.878  --ip=141.245.59.196/32
    1.878  --ip=141.245.59.197/32
    1.878  --ip=141.245.59.198/32
    1.878  --ip=141.245.59.199/32
    ...
  11. The problem: Memory

    pps    IP
    12.2M  1.2.3.4
    2.4M   42.1.2.4
    0.01M  2.4.3.1
    0.01M  192.168.1.1
    ...
  12. Top-N problem

    • aka: heavy hitters
    • A fixed-memory data structure that can "count" top-N items
    • think: top URLs, top customer IPs, etc.
    • Count-Min sketch, Space Saving
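A minimal sketch of the textbook Space Saving algorithm shows how a fixed number of counters can track the heaviest keys. The `SpaceSaving` and `Counter` names are assumptions for illustration, the eviction uses a linear scan for clarity rather than speed, and this is not the implementation from the talk. Capacity is assumed to be at least 1.

```go
package main

import "sort"

// Counter is one monitored key.
type Counter struct {
	Key   string
	Count uint64 // upper bound on the true count
	Error uint64 // maximum possible overestimation of Count
}

// SpaceSaving tracks at most capacity keys.
type SpaceSaving struct {
	capacity int
	counters map[string]*Counter
}

// NewSpaceSaving returns a sketch that monitors at most capacity keys.
func NewSpaceSaving(capacity int) *SpaceSaving {
	return &SpaceSaving{capacity: capacity, counters: make(map[string]*Counter)}
}

// Add counts one occurrence of key.
func (s *SpaceSaving) Add(key string) {
	if c, ok := s.counters[key]; ok {
		c.Count++
		return
	}
	if len(s.counters) < s.capacity {
		s.counters[key] = &Counter{Key: key, Count: 1}
		return
	}
	// Table is full: evict the smallest counter (ties broken by key)
	// and let the new key inherit its count as the error bound.
	var min *Counter
	for _, c := range s.counters {
		if min == nil || c.Count < min.Count ||
			(c.Count == min.Count && c.Key < min.Key) {
			min = c
		}
	}
	delete(s.counters, min.Key)
	s.counters[key] = &Counter{Key: key, Count: min.Count + 1, Error: min.Count}
}

// Top returns all monitored keys, largest count first.
func (s *SpaceSaving) Top() []Counter {
	out := make([]Counter, 0, len(s.counters))
	for _, c := range s.counters {
		out = append(out, *c)
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Count != out[j].Count {
			return out[i].Count > out[j].Count
		}
		return out[i].Key < out[j].Key
	})
	return out
}
```

With a capacity of 2, feeding "alice" three times, then "bob", then "eric" leaves alice with count 3 and error 0, and eric with count 2 and error 1: eric took over bob's slot and inherited bob's count as its error.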
  13. Space saving

    error  count  key
    0      2      Alice
    1      1      Eric
    0      1      Charlie
    2 Counter? 1 .. 2 1
  15. What about rates?

    • It's hard
    • was: GetAll()
    • now: GetAll(time.Time)
    • No longer O(1)
    • Instead O(log n)
  17. Summary - Space saving

    • Top-N / Heavy-hitter algorithm
    • Fixed memory size
    • Strong error guarantees
  18. Aggregating attacks

    Mpps   Descr
    3.878  --ip=141.245.59.191/32
    2.878  --ip=141.245.59.192/32
    1.878  --ip=141.245.59.193/32
    1.878  --ip=141.245.59.194/32
    1.878  --ip=141.245.59.195/32
    1.878  --ip=141.245.59.196/32
    1.878  --ip=141.245.59.197/32
    1.878  --ip=141.245.59.198/32
    1.878  --ip=141.245.59.199/32
    ...

    vs

    Mpps    Descr
    35.878  --ip=141.245.59.0/24
  19. Multiple dimensions

    pps    IP:port
    12.2M  1.2.3.4:53
    2.4M   42.1.2.4:80
    0.01M  2.4.3.1:80
    0.01M  192.168.1.1:443

    pps    IP
    12.2M  1.2.3.4
    2.4M   42.1.2.4
    0.01M  2.4.3.1
    0.01M  192.168.1.1

    pps    subnet
    12.2M  1.2.3.0/24
    2.4M   42.1.2.0/24
    0.01M  2.4.3.0/24
    0.01M  192.168.1.0/24
  20. Multiple dimensions

    pps    IP:port
    12.2M  1.2.3.4:53
    2.4M   42.1.2.4:80
    0.01M  2.4.3.1:80
    0.01M  192.168.1.1:443

    pps    IP
    12.2M  1.2.3.4
    2.4M   42.1.2.4
    0.01M  2.4.3.1
    0.01M  192.168.1.1

    pps    subnet
    12.2M  1.2.3.0/24
    2.4M   42.1.2.0/24
    0.01M  2.4.3.0/24
    0.01M  192.168.1.0/24

    incoming sample: 42.1.2.4:80
  21. Multiple dimensions

    pps    IP:port
    12.2M  1.2.3.4:53
    2.4M   42.1.2.4:80
    0.01M  2.4.3.1:80
    0.01M  192.168.1.1:443

    pps    IP
    12.2M  1.2.3.4
    2.4M   42.1.2.4
    0.01M  2.4.3.1
    0.01M  192.168.1.1

    pps    subnet
    12.2M  1.2.3.0/24
    2.4M   42.1.2.0/24
    0.01M  2.4.3.0/24
    0.01M  192.168.1.0/24

    reporting threshold: 1M
  22. Attack report

    Mpps  Descr
    12.2  --ip=1.2.3.4 --port=53
    2.4   --ip=42.1.2.4 --port=80
    12.2  --ip=1.2.3.4
    2.4   --ip=42.1.2.4
    12.2  --ip=1.2.3.0/24
    2.4   --ip=42.1.2.0/24
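The report above can be reproduced with a straightforward sketch that counts every sample in all three tables (IP:port, IP, /24 subnet) and lists each key over the threshold. This is only the plain multi-table counting that yields such a redundant report, not the simplified hierarchical heavy hitters algorithm itself. The `Dimensions` type, its methods, and the choice to feed it pre-computed pps values are assumptions for illustration.

```go
package main

import (
	"net/netip"
	"sort"
)

// Dimensions keeps one rate table per aggregation level:
// index 0 is IP:port, index 1 is IP, index 2 is the /24 subnet.
type Dimensions struct {
	tables [3]map[string]float64
}

// NewDimensions returns an empty set of tables.
func NewDimensions() *Dimensions {
	d := &Dimensions{}
	for i := range d.tables {
		d.tables[i] = make(map[string]float64)
	}
	return d
}

// Add credits pps to the target ("ip:port") in every table.
func (d *Dimensions) Add(target string, pps float64) error {
	ap, err := netip.ParseAddrPort(target)
	if err != nil {
		return err
	}
	subnet, err := ap.Addr().Prefix(24)
	if err != nil {
		return err
	}
	d.tables[0][ap.String()] += pps
	d.tables[1][ap.Addr().String()] += pps
	d.tables[2][subnet.String()] += pps
	return nil
}

// Report returns, in sorted order, every key whose rate reaches the threshold.
func (d *Dimensions) Report(threshold float64) []string {
	var keys []string
	for _, table := range d.tables {
		for key, pps := range table {
			if pps >= threshold {
				keys = append(keys, key)
			}
		}
	}
	sort.Strings(keys)
	return keys
}
```

Fed with the four rows from the tables and a threshold of 1M pps, this reports six keys for only two attacks: each attack appears once per dimension, which is the redundancy the hierarchical approach is meant to remove.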
  23. Multiple dimensions

    pps    IP:port
    12.2M  1.2.3.4:53
    2.4M   42.1.2.4:80
    0.01M  2.4.3.1:80
    0.01M  192.168.1.1:443

    pps    IP
    0.1M   1.2.3.4
    0M     42.1.2.4
    0.01M  2.4.3.1
    0.01M  192.168.1.1

    pps    subnet
    0.1M   1.2.3.0/24
    0M     42.1.2.0/24
    0.01M  2.4.3.0/24
    0.01M  192.168.1.0/24

    incoming sample: 42.1.2.4:80
  24. Summary - SHHH

    • Approximate
    • High error in pps
    • Works well in practice
    • Scales well
    • Fast and simple to implement
  25. Spoofed Source IP?

    Mpps    Description
    23.833  --target=173.245.59.2 --agent=WAW --iface=659 Est=57364
    23.067  --target=173.245.58.1 --agent=WAW --iface=659 Est=56995
    7.139   --target=173.245.58.1 --agent=DUS --iface=893 Est=11493
    6.366   --target=173.245.59.2 --agent=DUS --iface=893 Est=11240
    2.590   --target=173.245.58.1 --agent=SIN --iface=657 Est=219987
    2.557   --target=173.245.59.2 --agent=SIN --iface=657 Est=220380
    1.045   --target=173.245.58.1 --agent=MAN --iface=756 Est=207
    1.039   --target=173.245.59.2 --agent=MAN --iface=756 Est=200
  27. Summary

    • Attack detection is a streaming problem
    • Streaming algorithms are awesome
    • Applicable to many more problems

    Thanks! marek@cloudflare.com