
How to double* the performance of vSwitch-based deployments

NUMA-aware vSwitches (and more) in action

Stephen Finucane

November 13, 2018

Transcript

  1. How to double* the performance of vSwitch-based deployments: NUMA-aware vSwitches (and more) in action. Stephen Finucane, OpenStack Software Developer, 13th November 2018

  2. Agenda
     • What is NUMA?
     • The Problem
     • A Solution
     • Common Questions
     • Bonus Section
     • Summary
     • Questions?

  3. What is NUMA?
     • UMA (Uniform Memory Access): historically, all memory on x86 systems was equally accessible by all CPUs. Known as Uniform Memory Access (UMA), access times were the same no matter which CPU performed the operation.
     • NUMA (Non-Uniform Memory Access): in Non-Uniform Memory Access (NUMA), system memory is divided into zones (called nodes), which are allocated to particular CPUs or sockets. Access to memory that is local to a CPU is faster than access to memory connected to remote CPUs on that system.

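     The NUMA layout of a compute host can be checked from the shell. A minimal sketch, assuming the numactl and util-linux (lscpu) tools are installed on the node:

        # Show nodes, the CPUs and memory attached to each, and inter-node distances
        $ numactl --hardware

        # Quick summary of NUMA node count and per-node CPU lists
        $ lscpu | grep -i numa
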
  4. What is NUMA?
     [Diagram: two NUMA nodes (node A and node B), each with its own memory channel; local access stays within a node, while remote access crosses the interconnect between nodes.]

  5. Types of Networking*
     • Kernel vHost (or virtio): low performance, flexible
     • Userspace vHost (DPDK): high performance, moderately flexible
     • SR-IOV: high performance, inflexible

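     By way of illustration: the SR-IOV case versus the vSwitch cases maps to the port's vnic_type, while kernel versus userspace (DPDK) vhost is decided by how OVS is deployed on the host. A rough sketch using the standard openstack CLI (the network and port names are placeholders):

        # Kernel or userspace vhost: a "normal" port, wired up by the vSwitch (OVS or OVS-DPDK)
        $ openstack port create --network net0 --vnic-type normal port-vswitch

        # SR-IOV: a "direct" port, backed by a VF passed through to the guest
        $ openstack port create --network net0 --vnic-type direct port-sriov
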
  6. What do nova and neutron know?
     • Nova knows:
       ◦ How much RAM, disk and CPU do I have?
       ◦ What is the NUMA topology of my hardware?
       ◦ What hypervisor am I using? etc.
     • Neutron knows:
       ◦ What networking driver(s) are available?
       ◦ How much bandwidth is available for a given interface?
       ◦ How do networks map to NICs? etc.

  7. Placement couldn't do it…
     • No nested resource providers
     • No NUMA modelling
     • No interaction between different services
     • Placement models what it's told to

  8. Determining NUMA affinity of networks
     • Provider networks vs. tenant networks? ❌
     • Pre-created networking vs. self-service networking? ❌
     • L2 networks vs. L3 networks ✅

  9. L2 network configuration (neutron), in openvswitch_agent.ini:

        [ovs]
        bridge_mappings = physnet0:br-physnet0

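     To exercise that mapping, a provider network is created on the physnet and a subnet attached to it. A sketch only; the VLAN ID, CIDR and names are arbitrary:

        $ openstack network create --provider-physical-network physnet0 \
            --provider-network-type vlan --provider-segment 42 physnet0-vlan42
        $ openstack subnet create --network physnet0-vlan42 \
            --subnet-range 192.0.2.0/24 physnet0-vlan42-subnet
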
  10. L2 network configuration (nova), in nova.conf:

        [neutron]
        physnets = physnet0,physnet1

        [neutron_physnet_physnet0]
        numa_nodes = 0

  11. L2 network configuration (nova), in nova.conf; a physnet can also be associated with more than one NUMA node:

        [neutron]
        physnets = physnet0,physnet1

        [neutron_physnet_physnet0]
        numa_nodes = 0,1

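     The affinity only comes into play for instances that have a NUMA topology of their own (for example via hw:numa_nodes, CPU pinning or hugepages). A sketch of a single-NUMA-node flavor booted on a network backed by physnet0 ($flavor, $image and the names are placeholders, reusing the network created above):

        $ openstack flavor set $flavor --property hw:numa_nodes=1
        $ openstack server create --flavor $flavor --image $image \
            --network physnet0-vlan42 numa-affined-instance
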
  12. L3 network configuration (neutron), in openvswitch_agent.ini:

        [ovs]
        local_ip = OVERLAY_INTERFACE_IP_ADDRESS

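     The nova side of the tunnelled (L3/overlay) case mirrors the physnet case. A sketch, assuming the [neutron_tunnel] option group described in the NUMA-aware vSwitches spec, in nova.conf:

        [neutron_tunnel]
        numa_nodes = 0
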
  13. Common Questions
      • Why so manual?
      • Can I automate any of this?
      • Will this ever move to placement?

  14. Configurable TX/RX Queue Size (Rocky)
      • Pre-emption can result in packet drops
      • Solution: make queue sizes bigger! (256 → 1024)

  15. Configurable TX/RX Queue Size (Rocky), in nova.conf:

        [libvirt]
        tx_queue_size = 1024
        rx_queue_size = 1024

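     The result can be checked in the generated libvirt domain XML, where the virtio interface's <driver> element should carry the larger ring sizes (the instance name is a placeholder, and tx_queue_size generally only applies to vhost-user interfaces):

        $ virsh dumpxml instance-00000001 | grep queue_size
        # expect something like: <driver ... rx_queue_size='1024' tx_queue_size='1024'/>
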
  16. Emulator Thread Pinning
      • Hypervisor overhead tasks can steal resources from your vCPUs
      • Solution: ensure overhead tasks run on a dedicated core (Ocata)
      • Solution: ensure overhead tasks run on a dedicated pool of cores (Rocky)

  17. Emulator Thread Pinning

        $ openstack flavor set $flavor \
            --property 'hw:emulator_threads_policy=isolate'   # Ocata

        $ openstack flavor set $flavor \
            --property 'hw:emulator_threads_policy=share'     # Rocky

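     With the Rocky-era 'share' policy, the pool the emulator threads float over comes from host configuration rather than the flavor. A sketch, assuming the [compute] cpu_shared_set option added in Rocky, in nova.conf:

        [compute]
        cpu_shared_set = 0-1,20-21
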
  18. Tracking pCPUs via Placement (Stein?)
      • CPU pinning and NUMA are hard to configure and understand
      • No way to use vCPUs and pCPUs on the same host
      • No way to use vCPUs and pCPUs in the same instance
      • Solution: track PCPUs as resources in placement

  19. Tracking pCPUs via Placement (Stein?)

        $ openstack flavor set $flavor \
            --property 'resources:PCPU=10' \
            --property 'resources:VCPU=10'

  20. Tracking pCPUs via Placement (Stein?), in nova.conf:

        [compute]
        cpu_shared_set = 0-9,20-29
        cpu_dedicated_set = 10-19,30-39

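     Once a compute node reports PCPU inventory, it should show up alongside VCPU in placement. A sketch using the osc-placement CLI plugin (the resource provider UUID is a placeholder):

        $ openstack resource provider list
        $ openstack resource provider inventory list <compute-node-uuid>
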
  21. Live migration with pCPUs (Stein?)
      • Live migration of instances with a NUMA topology is broken
      • Solution: fix it

  22. Summary
      • Not accounting for NUMA can cause huge performance hits
      • NUMA-aware vSwitches have been a thing since Rocky
        ◦ nova.conf-based configuration, mostly a deployment issue
      • Future work will explore moving this to placement
      • Lots of other features can also help, now and in the future
        ◦ TX/RX queue sizes, emulator thread pinning, vCPU-pCPU coexistence, live migration with NUMA topologies

  23. Resources. You might want to know about these...
      • RHEL NUMA Tuning Guide
      • Attaching physical PCI devices to guests
      • Nova Flavors Guide
      • NUMA-aware vSwitches spec
      • Emulator Thread Pinning spec (out-of-date!)
      • TX/RX Queue Sizes spec
      • CPU Tracking via Placement spec (draft)