Streaming Algorithms

majek04
March 15, 2016

Transcript

  1. Automatic DoS mitigation (streaming algorithms). Marek Majkowski, marek@cloudflare.com, @majek04

  2. Who we are

  3. Large network

  4. Content neutral

  5. DoS is a problem (chart: DoS events per day)

  6. Defending from DoS is hard (diagram labels: X, example.com)

  7. Two DoS types
    • L3 - spoofed IP packets
      • source IP addresses are fake
      • very large
      • this is what you hear about in the news
    • L7 - fully established TCP connections
      • IP reputation is effective

  8. L3 volume per server (chart: packets per second)

  9. Automatic attack handling (diagram labels: Attack Detection, Mitigation database, Reactive Automation, sflow, iptables)

  10. Automatic attack detection (diagram labels: Attack Detection, sflow)

  11. Streaming algorithms
    • infinite data stream on input
    • approximate
    (diagram labels: Data stream, Streaming algorithm, Results)

  12. Attack detection is streaming!
    • sflow packet samples as input
    • detected attacks on output
    (diagram labels: Packet samples, Streaming algorithms, Attacks)
  13. Streaming algorithms
    • EWMA - Exponentially weighted moving average
      • Counting rates of packets
    • Space saving
      • Known as Top-N or Heavy Hitters
      • Simplified hierarchical heavy hitters
    • Hyper log log
      • Cardinality estimation - Counting unique things
  14. The problem: PPS

    Mpps    Descr
    3.878   --ip=141.245.59.191/32
    2.878   --ip=141.245.59.192/32
    1.878   --ip=141.245.59.193/32
    1.878   --ip=141.245.59.194/32
    1.878   --ip=141.245.59.195/32
    1.878   --ip=141.245.59.196/32
    1.878   --ip=141.245.59.197/32
    1.878   --ip=141.245.59.198/32
    1.878   --ip=141.245.59.199/32
    ...

  15. Naive approach

    pps     IP
    12.2M   1.2.3.4
    2.4M    42.1.2.4
    0.01M   2.4.3.1
    0.01M   192.168.1.1

  16. There is no such thing as pps

  17. Naive: Moving average
    • timeline: 1.0s, 1.1s, 1.3s, 1.8s, 1.99s, 2.1s, 2.4s, 2.41s
    • t=2.50s
    • Precisely 5 samples

  18. Not-smoothed values
    • timeline: 1.0s, 1.1s, 1.3s, 1.8s, 1.99s, 2.1s, 2.4s, 2.41s
    • raw pps = 100, 3, 50, 5, 2, 5, 10

  19. Not-smoothed values

  20. Linux load average - charge

  21. Linux load average - discharge

  22. Better: EWMA (formula annotations: old load, difference, dampening factor, measurement frequency, half-life time)

  23. (no text on this slide)

  24. (no text on this slide)

  25. EWMA - summary
    • Smoothed average
    • The same maths as Linux "load average"
    • Charges slowly (half-life)
    • Discharges quickly
    • Can also be used to count rates of packets
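The EWMA slides list the ingredients (old load, difference, dampening factor, measurement frequency, half-life time) but the formula itself did not survive in the transcript. The following is a minimal sketch of one common formulation, assuming the dampening factor is derived from a half-life so that samples may arrive at irregular intervals. The class name `EWMA` and the parameters `half_life_s` and `dt_s` are hypothetical and are not taken from the talk.

```python
class EWMA:
    """Exponentially weighted moving average with a configurable half-life.

    Assumption: the dampening factor is 0.5 ** (dt / half_life), so the
    weight of the old value halves every `half_life_s` seconds.
    """

    def __init__(self, half_life_s: float, initial: float = 0.0):
        self.half_life_s = half_life_s
        self.value = initial

    def update(self, measurement: float, dt_s: float) -> float:
        """Fold in a measurement taken `dt_s` seconds after the previous one."""
        # Fraction of the old value that survives after dt_s seconds.
        decay = 0.5 ** (dt_s / self.half_life_s)
        # new = old + (1 - decay) * (measurement - old)
        self.value += (1.0 - decay) * (measurement - self.value)
        return self.value


rate = EWMA(half_life_s=1.0)
rate.update(100.0, dt_s=1.0)  # one half-life elapsed: moves halfway to 100
rate.update(100.0, dt_s=1.0)  # moves halfway again
```

Because the decay depends on the elapsed time rather than on a fixed tick, the same object can be fed samples whenever they arrive, which is the situation the "Naive: Moving average" slides describe.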
  26. The problem: PPS

    Mpps    Descr
    3.878   --ip=141.245.59.191/32
    2.878   --ip=141.245.59.192/32
    1.878   --ip=141.245.59.193/32
    1.878   --ip=141.245.59.194/32
    1.878   --ip=141.245.59.195/32
    1.878   --ip=141.245.59.196/32
    1.878   --ip=141.245.59.197/32
    1.878   --ip=141.245.59.198/32
    1.878   --ip=141.245.59.199/32
    ...

  27. The problem: Memory

    pps     IP
    12.2M   1.2.3.4
    2.4M    42.1.2.4
    0.01M   2.4.3.1
    0.01M   192.168.1.1
    ...
  28. Top-N problem
    • aka: heavy hitters
    • A fixed-memory data structure that can "count" top-N items
    • think: top URLs, top customer IPs, etc.
    • Count-Min sketch, Space Saving
  29. Space saving. Table columns: error, count, key

  30. Space saving. Incoming: Alice. Rows (error, count, key): (0, 1, Alice)

  31. Space saving. Incoming: Alice. Rows (error, count, key): (0, 2, Alice)

  32. Space saving. Incoming: Ben. Rows (error, count, key): (0, 2, Alice), (0, 1, Ben)

  33. Space saving. Incoming: Charlie. Rows (error, count, key): (0, 2, Alice), (0, 1, Ben), (0, 1, Charlie)

  34. Space saving. Incoming: Eric? Rows (error, count, key): (0, 2, Alice), (0, 1, Ben), (0, 1, Charlie)

  35. Space saving. Incoming: Eric? Rows (error, count, key): (0, 2, Alice), (0, 1, Ben), (0, 1, Charlie)

  36. Space saving. Incoming: + Eric. Rows (error, count, key): (0, 2, Alice), (1, 0, Eric), (0, 1, Charlie)

  37. Space saving. Incoming: Eric. Rows (error, count, key): (0, 2, Alice), (1, 1, Eric), (0, 1, Charlie)

  38. Space saving. Rows (error, count, key): (0, 2, Alice), (1, 1, Eric), (0, 1, Charlie). Other labels: 2, Counter?, 1 .. 2, 1
  39. (no text on this slide)

  40. What about rates?
    • It's hard
    • was: GetAll()
    • now: GetAll(time.Time)
    • No longer O(1)
    • Instead O(log n)

  41. (no text on this slide)

  42. Summary - Space saving
    • Top-N / Heavy-hitter algorithm
    • Fixed memory size
    • Strong error guarantees
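The Space saving slides walk through a three-slot table with error, count and key columns as Alice, Ben, Charlie and Eric arrive. The following is a minimal sketch that reproduces that walkthrough, assuming the layout shown on the slides: when the table is full, the key with the smallest estimated total is evicted and its total becomes the newcomer's error. The class and method names are hypothetical, and the linear scan for the eviction victim is a simplification; the talk does not show how the minimum is found.

```python
class SpaceSaving:
    """Fixed-memory top-N counter (sketch of the Space Saving idea)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.slots = {}  # key -> [error, count]

    def add(self, key: str) -> None:
        slot = self.slots.get(key)
        if slot is not None:
            slot[1] += 1  # known key: bump its count
            return
        if len(self.slots) < self.capacity:
            self.slots[key] = [0, 1]  # free slot: exact count, no error
            return
        # Table full: evict the key with the smallest estimated total.
        # Ties go to the key that entered the table first.
        victim = min(self.slots, key=lambda k: sum(self.slots[k]))
        error, count = self.slots.pop(victim)
        # The newcomer inherits the victim's total as its possible overcount.
        self.slots[key] = [error + count, 1]

    def top(self, n: int):
        """Return up to n (key, estimated_total, error) tuples, largest first."""
        rows = [(key, err + cnt, err) for key, (err, cnt) in self.slots.items()]
        rows.sort(key=lambda row: row[1], reverse=True)
        return rows[:n]


ss = SpaceSaving(capacity=3)
for name in ["Alice", "Alice", "Ben", "Charlie", "Eric"]:
    ss.add(name)
```

After this sequence Eric holds Ben's old slot with error 1 and count 1, matching the table on slide 37. The true count of any key lies between its estimated total minus its error and its estimated total.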
  43. Aggregating attacks

    Mpps    Descr
    3.878   --ip=141.245.59.191/32
    2.878   --ip=141.245.59.192/32
    1.878   --ip=141.245.59.193/32
    1.878   --ip=141.245.59.194/32
    1.878   --ip=141.245.59.195/32
    1.878   --ip=141.245.59.196/32
    1.878   --ip=141.245.59.197/32
    1.878   --ip=141.245.59.198/32
    1.878   --ip=141.245.59.199/32
    ...

    vs

    Mpps    Descr
    35.878  --ip=141.245.59.0/24

  44. Hierarchical Heavy Hitters

  45. Simplified HHH

  46. Multiple dimensions

    pps     IP:port
    12.2M   1.2.3.4:53
    2.4M    42.1.2.4:80
    0.01M   2.4.3.1:80
    0.01M   192.168.1.1:443

    pps     IP
    12.2M   1.2.3.4
    2.4M    42.1.2.4
    0.01M   2.4.3.1
    0.01M   192.168.1.1

    pps     subnet
    12.2M   1.2.3.0/24
    2.4M    42.1.2.0/24
    0.01M   2.4.3.0/24
    0.01M   192.168.1.0/24

  47. Multiple dimensions (same three tables as slide 46). Incoming sample: 42.1.2.4:80

  48. Multiple dimensions (same three tables as slide 46). Reporting threshold: 1M
  49. Attack report

    Mpps    Descr
    12.2    --ip=1.2.3.4 --port=53
    2.4     --ip=42.1.2.4 --port=80
    12.2    --ip=1.2.3.4
    2.4     --ip=42.1.2.4
    12.2    --ip=1.2.3.0/24
    2.4     --ip=42.1.2.0/24

  50. Multiple dimensions. Incoming sample: 42.1.2.4:80

    pps     IP:port
    12.2M   1.2.3.4:53
    2.4M    42.1.2.4:80
    0.01M   2.4.3.1:80
    0.01M   192.168.1.1:443

    pps     IP
    0.1M    1.2.3.4
    0M      42.1.2.4
    0.01M   2.4.3.1
    0.01M   192.168.1.1

    pps     subnet
    0.1M    1.2.3.0/24
    0M      42.1.2.0/24
    0.01M   2.4.3.0/24
    0.01M   192.168.1.0/24

  51. Attack report

    Mpps    Descr
    12.2    --ip=1.2.3.4 --port=53
    2.4     --ip=42.1.2.4 --port=80
  52. Scales well

  53. Summary - SHHH
    • Approximate
    • High error in pps
    • Works well in practice
    • Scales well
    • Fast and simple to implement
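The "Multiple dimensions" slides suggest that a sample already explained by a heavy hitter at a specific level (IP:port) is no longer charged to the coarser levels (IP, subnet), which is why the second attack report lists only the two IP:port rows. The following is a minimal sketch of that reading, and it is an interpretation of the slides rather than the implementation from the talk. Plain counters stand in for the Space Saving tables and packet rates, the threshold is a sample count instead of pps, and all names are hypothetical.

```python
import ipaddress
from collections import Counter


class SimplifiedHHH:
    """Sketch: count each sample at the most specific level that explains it."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        # Levels ordered from most specific to least specific.
        self.levels = [("ip:port", Counter()), ("ip", Counter()), ("subnet", Counter())]

    @staticmethod
    def _keys(ip: str, port: int):
        # strict=False lets ip_network mask the host bits for us.
        subnet = ipaddress.ip_network(f"{ip}/24", strict=False)
        return [f"{ip}:{port}", ip, str(subnet)]

    def add(self, ip: str, port: int, weight: int = 1) -> None:
        for (_name, counter), key in zip(self.levels, self._keys(ip, port)):
            counter[key] += weight
            if counter[key] >= self.threshold:
                # This key is a heavy hitter at this level, so the sample
                # is not charged to the coarser levels below it.
                break

    def report(self):
        """Return (level, key, count) for every key at or above the threshold."""
        return [
            (name, key, count)
            for name, counter in self.levels
            for key, count in counter.items()
            if count >= self.threshold
        ]


focused = SimplifiedHHH(threshold=3)
for _ in range(5):
    focused.add("42.1.2.4", 80)  # one IP:port, reported at the IP:port level

spread = SimplifiedHHH(threshold=3)
for port in range(1000, 1010):
    spread.add("1.2.3.4", port)  # many ports, reported at the IP level
```

In the first case only the IP:port row crosses the threshold, because later samples stop at that level. In the second case no single port is heavy, so the samples accumulate on the IP row instead.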
  54. Spoofed Source IP?

    Mpps    Description
    23.833  --target=173.245.59.2 --agent=WAW --iface=659 Est= 57364
    23.067  --target=173.245.58.1 --agent=WAW --iface=659 Est= 56995
     7.139  --target=173.245.58.1 --agent=DUS --iface=893 Est= 11493
     6.366  --target=173.245.59.2 --agent=DUS --iface=893 Est= 11240
     2.590  --target=173.245.58.1 --agent=SIN --iface=657 Est=219987
     2.557  --target=173.245.59.2 --agent=SIN --iface=657 Est=220380
     1.045  --target=173.245.58.1 --agent=MAN --iface=756 Est=   207
     1.039  --target=173.245.59.2 --agent=MAN --iface=756 Est=   200
  55. Hyper log log (diagram labels: "Alice", HLL, 22 unique items!)

  56. Hyper log log: ( HLL#1 OR HLL#2 ) = 44 unique items
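The two Hyper log log slides show items being added to a sketch that reports an approximate unique count, and two sketches being combined into one. The following is a minimal HyperLogLog sketch under stated assumptions: a 64-bit slice of SHA-1 as the hash, 2^12 registers, and the usual small-range correction. The slide labels the union as OR; in this register-based form the union is commonly a register-wise maximum, which is what `merge` does. All names and parameters are hypothetical.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate distinct counter with 2**p small registers."""

    def __init__(self, p: int = 12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash value
        index = h >> (64 - self.p)  # top p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # 1-based position of the leftmost 1-bit within the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        """Union of two sketches: keep the larger value of each register."""
        if self.p != other.p:
            raise ValueError("sketches must use the same precision")
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant, valid for m >= 128
        estimate = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            estimate = self.m * math.log(self.m / zeros)
        return estimate


hll_1, hll_2 = HyperLogLog(), HyperLogLog()
for i in range(1000):
    hll_1.add(f"user-{i}")        # user-0 .. user-999
    hll_2.add(f"user-{i + 500}")  # user-500 .. user-1499
both = hll_1.merge(hll_2)         # roughly 1500 distinct users
```

Merging never needs the original items, only the registers, so sketches built on different machines or for different time windows can be combined after the fact.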
  57. (no text on this slide)

  58. What about rates? (diagram labels: HLL #1, HLL #2, HLL #3, HLL #4)

  59. Hard drives

  60. Summary
    • Attack detection is a streaming problem
    • Streaming algorithms are awesome
    • Applicable to many more problems
    Thanks! marek@cloudflare.com