How to double* the performance of vSwitch-based deployments

NUMA-aware vSwitches (and more) in action

Stephen Finucane

November 13, 2018

Transcript

  1. How to double* the performance of vSwitch-based deployments: NUMA-aware vSwitches (and more) in action

    Stephen Finucane, OpenStack Software Developer, 13th November 2018
  2. Agenda

    • What is NUMA?
    • The Problem
    • A Solution
    • Common Questions
    • Bonus Section
    • Summary
    • Questions?
  3. What is NUMA?

  4. What is NUMA?

    Non-Uniform Memory Architecture
  5. What is NUMA?

    UMA (Uniform Memory Access): Historically, all memory on x86 systems was equally accessible by all CPUs. Under this model, known as Uniform Memory Access (UMA), access times are the same no matter which CPU performs the operation.

    NUMA (Non-Uniform Memory Access): In Non-Uniform Memory Access (NUMA), system memory is divided into zones (called nodes), which are allocated to particular CPUs or sockets. Access to memory that is local to a CPU is faster than access to memory connected to remote CPUs on that system.
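
One way to see the node-to-CPU split described above on a Linux host is to read it from sysfs. The following is a minimal sketch, assuming the standard `/sys/devices/system/node/nodeN/cpulist` layout; the function name `numa_nodes` and its `root` parameter are illustrative and not part of any OpenStack tool.

```python
from pathlib import Path


def numa_nodes(root: str = "/sys/devices/system/node") -> dict[int, str]:
    """Map each NUMA node ID to its local CPU list, as reported by sysfs.

    The CPU list is returned as the raw kernel string, for example "0-9,20-29".
    """
    nodes: dict[int, str] = {}
    for node_dir in Path(root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            node_id = int(node_dir.name[len("node"):])
            nodes[node_id] = cpulist.read_text().strip()
    return nodes


if __name__ == "__main__":
    # On a host without NUMA information in sysfs this prints nothing.
    for node_id, cpus in sorted(numa_nodes().items()):
        print(f"node {node_id}: CPUs {cpus}")
```

The `root` parameter exists only so the function can be pointed at a copy of the directory tree for testing.
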
  6. What is NUMA?

    [Diagram: node A and node B, with labels for local access, remote access, the memory channels, and the interconnect]
  7. What is NUMA?

    [Diagram: four nodes, A, B, C, and D]
  10. The Problem

  11. Types of Networking*

    • Kernel vHost (or virtio): low performance, flexible
    • Userspace vHost (DPDK): high performance, moderately flexible
    • SR-IOV: high performance, inflexible
  21. A Solution

  22. Neutron?

  23. What do nova and neutron know?

    • Nova knows
      ◦ How much RAM, disk, and CPU do I have?
      ◦ What is the NUMA topology of my hardware?
      ◦ What hypervisor am I using? etc.
    • Neutron knows
      ◦ What networking driver(s) are available?
      ◦ How much bandwidth is available for a given interface?
      ◦ How do networks map to NICs? etc.
  25. Placement?

  26. Placement couldn’t do it…

    • No nested resource providers
    • No NUMA modelling
    • No interaction between different services
    • Placement models what it’s told to
  27. Nova?

  30. Nova ✅ (with caveats)

  33. Determining NUMA affinity of networks

    • Provider networks vs. tenant networks? ❌
    • Pre-created networking vs. self-serviced networking? ❌
    • L2 networks vs. L3 networks ✅
  34. L2 network configuration (neutron), openvswitch_agent.ini

        [ovs]
        bridge_mappings = physnet0:br-physnet0
  35. L2 network configuration (nova), nova.conf

        [neutron]
        physnets = physnet0,physnet1

        [neutron_physnet_physnet0]
        numa_nodes = 0
  38. L2 network configuration (nova), nova.conf

        [neutron]
        physnets = physnet0,physnet1

        [neutron_physnet_physnet0]
        numa_nodes = 0,1
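
To make the shape of this configuration concrete, here is a minimal sketch that reads the same options with Python's standard `configparser` and builds a physnet-to-NUMA-node mapping. It is an illustration only: nova itself does not parse its configuration this way, the helper name `physnet_numa_affinity` is hypothetical, and treating a physnet with no matching section (such as `physnet1` on the slide) as having no affinity is an assumption of the sketch.

```python
import configparser

# The nova.conf fragment from the slide above.
NOVA_CONF = """
[neutron]
physnets = physnet0,physnet1

[neutron_physnet_physnet0]
numa_nodes = 0,1
"""


def physnet_numa_affinity(conf_text: str) -> dict[str, set[int]]:
    """Return {physnet name: set of NUMA node IDs} from nova.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)

    raw_physnets = parser.get("neutron", "physnets", fallback="")
    physnets = [name.strip() for name in raw_physnets.split(",") if name.strip()]

    affinity: dict[str, set[int]] = {}
    for physnet in physnets:
        # Each physnet gets its own section, named after the physnet.
        section = f"neutron_physnet_{physnet}"
        raw_nodes = parser.get(section, "numa_nodes", fallback="")
        affinity[physnet] = {int(n) for n in raw_nodes.split(",") if n.strip()}
    return affinity


if __name__ == "__main__":
    for physnet, nodes in physnet_numa_affinity(NOVA_CONF).items():
        print(physnet, sorted(nodes))
```
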
  39. L3 network configuration (neutron), openvswitch_agent.ini

        [ovs]
        local_ip = OVERLAY_INTERFACE_IP_ADDRESS
  40. L3 network configuration (nova), nova.conf

        [neutron_tunnel]
        numa_nodes = 1
  41. L3 network configuration (nova), nova.conf

        [neutron_tunnel]
        numa_nodes = 0,1
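
The point of declaring these affinities is that a host NUMA node can then be accepted or rejected for an instance based on the networks the instance needs. The following is a simplified model of such a check, not nova's scheduler code; the `HostNode` class and `nodes_support_networks` function are hypothetical names, and the example host assumes `physnet0` is attached to node 0 and tunnelled traffic to node 1, as in the single-node slides above.

```python
from dataclasses import dataclass, field


@dataclass
class HostNode:
    """One host NUMA node and the networks it has affinity to."""
    node_id: int
    physnets: set[str] = field(default_factory=set)
    tunneled: bool = False


def nodes_support_networks(nodes, required_physnets, needs_tunnel=False):
    """Check whether the chosen host nodes cover every network an instance needs.

    Every required physnet must be affined to at least one chosen node, and if
    the instance uses a tunnelled (L3) network, at least one chosen node must
    have affinity to the tunnel interface.
    """
    available = set().union(*(node.physnets for node in nodes))
    if not set(required_physnets) <= available:
        return False
    if needs_tunnel and not any(node.tunneled for node in nodes):
        return False
    return True


# Assumed example host: physnet0 on node 0, tunnel endpoint on node 1.
node0 = HostNode(0, physnets={"physnet0"})
node1 = HostNode(1, tunneled=True)

if __name__ == "__main__":
    print(nodes_support_networks([node0], {"physnet0"}))
    print(nodes_support_networks([node1], {"physnet0"}))
    print(nodes_support_networks([node0, node1], {"physnet0"}, needs_tunnel=True))
```
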
  42. Common Questions

  45. Common Questions

    • Why so manual?
    • Can I automate any of this?
    • Will this ever move to placement?
  46. Bonus Section

  48. Configurable TX/RX Queue Size (Rocky)

    • Pre-emption can result in packet drops
    • Solution: make queue sizes bigger! (256 → 1024)
  49. Configurable TX/RX Queue Size, nova.conf (Rocky)

        [libvirt]
        tx_queue_size = 1024
        rx_queue_size = 1024
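
Why a larger queue helps can be shown with a toy model: packets arrive at a steady rate, the consumer normally drains them faster than they arrive, but while the consumer is pre-empted nothing is drained and anything that does not fit in the queue is dropped. All rates and durations below are made-up values chosen for illustration, and `simulate_drops` is a hypothetical helper, not part of nova or libvirt.

```python
def simulate_drops(queue_size, ticks=100, arrivals_per_tick=10,
                   drain_per_tick=20, stall_start=10, stall_len=60):
    """Count packets dropped by a bounded queue whose consumer is pre-empted.

    During ticks [stall_start, stall_start + stall_len) the consumer does not
    run, so the queue only fills.
    """
    queued = 0
    dropped = 0
    for tick in range(ticks):
        # Producer: enqueue what fits, drop the rest.
        accepted = min(arrivals_per_tick, queue_size - queued)
        queued += accepted
        dropped += arrivals_per_tick - accepted

        # Consumer: drains the queue unless it is currently pre-empted.
        stalled = stall_start <= tick < stall_start + stall_len
        if not stalled:
            queued -= min(queued, drain_per_tick)
    return dropped


if __name__ == "__main__":
    for size in (256, 1024):
        print(f"queue size {size}: {simulate_drops(size)} packets dropped")
```

With these assumed numbers, the burst that builds up during the stall overflows a 256-entry queue but fits within a 1024-entry one. A larger queue only absorbs longer stalls; it does not remove the pre-emption itself.
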
  52. Emulator Thread Pinning

    • Hypervisor overhead tasks can steal resources from your vCPUs
    • Solution: ensure overhead tasks run on a dedicated core (Ocata)
    • Solution: ensure overhead tasks run on a dedicated pool of cores (Rocky)
  54. Emulator Thread Pinning

    Ocata:

        $ openstack flavor set $flavor \
            --property 'hw:emulator_threads_policy=isolate'

    Rocky:

        $ openstack flavor set $flavor \
            --property 'hw:emulator_threads_policy=share'
  55. Emulator Thread Pinning, nova.conf (Ocata, Rocky)

        [compute]
        cpu_shared_set = 0-1
  57. Tracking pCPUs via Placement (Stein?)

    • CPU pinning and NUMA are hard to configure and understand
    • No way to use vCPUs and pCPUs on the same host
    • No way to use vCPUs and pCPUs in the same instance
    • Solution: track PCPUs as resources in placement
  58. Tracking pCPUs via Placement (Stein?)

        $ openstack flavor set $flavor \
            --property 'resources:PCPU=10' \
            --property 'resources:VCPU=10'
  59. Tracking pCPUs via Placement, nova.conf (Stein?)

        [compute]
        cpu_shared_set = 0-9,20-29
        cpu_dedicated_set = 10-19,30-39
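
The two CPU sets above split one host's cores into a shared pool and a dedicated pool. As a sanity check a deployer might run before rolling out such a config, the sketch below expands both range strings and confirms they do not overlap. It is an illustration under stated assumptions: `parse_cpu_spec` is a hypothetical helper that handles only comma-separated values and `a-b` ranges, not any other syntax nova may accept.

```python
def parse_cpu_spec(spec: str) -> set[int]:
    """Expand a CPU spec such as "0-9,20-29" into a set of CPU IDs."""
    cpus: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = (int(value) for value in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid CPU range: {part!r}")
            cpus.update(range(start, end + 1))
        else:
            cpus.add(int(part))
    return cpus


# Values taken from the nova.conf fragment on the slide.
shared = parse_cpu_spec("0-9,20-29")
dedicated = parse_cpu_spec("10-19,30-39")
overlap = shared & dedicated

if __name__ == "__main__":
    print(f"shared: {len(shared)} CPUs, dedicated: {len(dedicated)} CPUs")
    print("overlap:", sorted(overlap) if overlap else "none")
```
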
  61. Live migration with pCPUs (Stein?)

    • Live migration of instances with a NUMA topology is broken
    • Solution: fix it
  62. Summary

  66. Summary

    • Not accounting for NUMA can cause huge performance hits
    • NUMA-aware vSwitches have been a thing since Rocky
      ◦ nova.conf-based configuration, mostly a deployment issue
    • Future work will explore moving this to placement
    • Lots of other features can also help, now and in the future
      ◦ TX/RX queue sizes, emulator thread pinning, vCPU-pCPU coexistence, live migration with NUMA topologies
  67. Questions?

  68. THANK YOU

    plus.google.com/+RedHat, linkedin.com/company/red-hat, youtube.com/user/RedHatVideos, facebook.com/redhatinc, twitter.com/RedHatNews

  69. Resources: You might want to know about these...

    • RHEL NUMA Tuning Guide
    • Attaching physical PCI devices to guests
    • Nova Flavors Guide
    • NUMA-aware vSwitches spec
    • Emulator Thread Pinning spec (out-of-date!)
    • TX/RX Queue Sizes spec
    • CPU Tracking via Placement spec (draft)