Neutron L2 and L3 agents, How They Work and How Kilo Improves Them

Rossella Sblendido

May 18, 2015

Transcript

  1. Neutron L2 and L3 agents: How They Work and How Kilo Improves Them

    Carl Baldwin, Rossella Sblendido / May 18, 2015
  2. L2 Agent

    • Runs on compute node
    • Configures the local virtual bridges (br-int, br-tun)
    • Wires new devices
    • Applies security group rules
    • Communicates with the Neutron server over RPC
  3. Agent loop events

    • OVSDB monitor has updates
    • Neutron server messages
      – Security groups change (rule updated, member added, provider rule updated)
      – Port update
    • OVS restarted
  4. Detect port changes

    • OVSDB monitor signals when something has changed on the host
    • OVS agent scans all the ports on the machine
    • It keeps track of the ports it has already processed using an internal dict (registered_ports)
    • Diff registered_ports against the result of the scan → infer which devices were added and deleted
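The diff step above can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual code: the function name `diff_ports` and the device names are assumptions, and a plain set stands in for the `registered_ports` structure mentioned on the slide.

```python
def diff_ports(registered_ports, scanned_ports):
    """Compare already-processed devices with a fresh scan.

    Returns a tuple (added, removed) of device-name sets.
    """
    scanned = set(scanned_ports)
    added = scanned - registered_ports      # seen now, not processed yet
    removed = registered_ports - scanned    # processed earlier, gone now
    return added, removed


# Devices the agent has already processed (hypothetical names).
registered_ports = {"tap-a", "tap-b"}
# Hypothetical result of scanning every port on the host.
current_scan = {"tap-b", "tap-c"}

added, removed = diff_ports(registered_ports, current_scan)
print(sorted(added))    # ['tap-c']
print(sorted(removed))  # ['tap-a']
```

The cost of this approach is that every iteration needs a full scan of the host, even when only one port changed.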
  5. Process network ports – Port added

    • Request the device details
    • Provision local VLAN and install proper flows
    • Set up port filters
    • update_device_up
  6. Process network ports – Port deleted

    • Remove filters
    • update_device_down
    • Reclaim local VLAN if it's the last device
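The local VLAN lifecycle from the two slides above (provision when a network's first port appears, reclaim when its last device goes away) might look roughly like this. The class `LocalVlanManager` and its method names are hypothetical and exist only for illustration; flows and filters are left out.

```python
class LocalVlanManager:
    """Hypothetical helper: one local VLAN per network on this host."""

    def __init__(self, vlan_ids):
        # Sorted in reverse so that pop() hands out the lowest free ID first.
        self._free = sorted(vlan_ids, reverse=True)
        self._vlan_by_net = {}
        self._devices_by_net = {}

    def port_added(self, network_id, device):
        """Provision a VLAN for the network if needed; return its ID."""
        if network_id not in self._vlan_by_net:
            self._vlan_by_net[network_id] = self._free.pop()
            self._devices_by_net[network_id] = set()
        self._devices_by_net[network_id].add(device)
        return self._vlan_by_net[network_id]

    def port_deleted(self, network_id, device):
        """Return the reclaimed VLAN ID if this was the last device, else None."""
        devices = self._devices_by_net.get(network_id)
        if devices is None:
            return None
        devices.discard(device)
        if devices:
            return None  # other devices still use this network's VLAN
        del self._devices_by_net[network_id]
        vlan = self._vlan_by_net.pop(network_id)
        self._free.append(vlan)
        return vlan


manager = LocalVlanManager(range(1, 4))
manager.port_added("net1", "tap-a")
manager.port_added("net1", "tap-b")
manager.port_deleted("net1", "tap-a")          # VLAN still in use
print(manager.port_deleted("net1", "tap-b"))   # last device: VLAN reclaimed
```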
  7. Processing Neutron server messages

    • Updated port: same process as added ports
    • Security group changes: filters are reapplied for all affected devices
  8. OVS restarted

    • Detected using a canary flow
    • Reconfigure bridges
    • registered_ports is cleared, all ports are reprocessed
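The canary idea can be illustrated with a small simulation: the agent installs a marker flow, and because a restarted switch comes back without its flows, a missing marker signals a restart. Everything here is an assumption made for the sketch: `FakeBridge`, the `CANARY` string and `agent_iteration` are hypothetical, and a real agent would query OVS rather than an in-memory set.

```python
class FakeBridge:
    """In-memory stand-in for an OVS bridge (illustration only)."""

    def __init__(self):
        self.flows = set()

    def add_flow(self, flow):
        self.flows.add(flow)

    def has_flow(self, flow):
        return flow in self.flows

    def restart(self):
        # Simulates a switch restart: all installed flows are lost.
        self.flows.clear()


CANARY = "canary-flow"  # hypothetical marker; real flow syntax is not shown


def agent_iteration(bridge, registered_ports):
    """Return True if a restart was detected and a full reprocess is needed."""
    if bridge.has_flow(CANARY):
        return False
    # Canary is gone: reconfigure the bridge and forget what was processed,
    # so the next scan treats every port as new.
    bridge.add_flow(CANARY)
    registered_ports.clear()
    return True


bridge = FakeBridge()
bridge.add_flow(CANARY)
ports = {"tap-a", "tap-b"}
print(agent_iteration(bridge, ports))  # False: nothing happened
bridge.restart()
print(agent_iteration(bridge, ports))  # True: canary missing
print(ports)                           # set()
```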
  9. What if an exception is thrown?

    • registered_ports is cleared, all the ports are reprocessed
    • Full resync!
  10. L3 Agent

    © Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The OpenStack TM attribution statement should be used: The OpenStack wordmark and the Square O Design, together or apart, are trademarks or registered trademarks of OpenStack Foundation in the United States and other countries, and are used with the OpenStack Foundation’s permission.
  11. Deployment

    • Network Hosts
      – Legacy with 1 Agent
      – HA with more than 1 Agent
      – DVR
        • Centralized part is like Legacy
      – API Available to manage association
    • Compute Hosts
      – DVR
        • Distributed part bound to multiple hypervisors
  12. L3 Agent

    • Receives update notifications for routers
    • Router Processing Queue
      – Prioritizes user actions so the agent stays responsive
      – Lower priority for full sync
    • Sends status updates

    Router                                  Status
    51812f4e-e0a8-479a-a116-f588cb020b91    Processing…
    5b80e13e-cd2d-40d6-aaea-856bcc4242f6    Processing…
    d95effe5-11ca-4450-ba45-615e40d159c6    Processing…
    e50750d2-42e3-4e34-888f-cef236a993f7    Processing…
    be19c28c-6789-44ce-bb29-8dd4a9944deb    Waiting…
    6f81708c-404e-4738-a21c-73eb2b8c2599    Waiting…
    4206b114-2e97-4963-9a5d-140cfec95977    Waiting…
    …                                       …
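A priority queue of this kind can be sketched with Python's standard `heapq` module. The class `RouterQueue` and the two priority constants are hypothetical names chosen for this example, not the agent's own; the point is only that user-triggered updates are served before routers queued by a full sync, with first-in, first-out order inside each priority.

```python
import heapq
import itertools

# Hypothetical priority levels: a lower value is served first.
PRIORITY_USER_ACTION = 0
PRIORITY_FULL_SYNC = 1


class RouterQueue:
    """Minimal priority queue of router IDs waiting to be processed."""

    def __init__(self):
        self._heap = []
        # A running counter breaks ties so equal priorities stay FIFO.
        self._counter = itertools.count()

    def add(self, router_id, priority):
        heapq.heappush(self._heap, (priority, next(self._counter), router_id))

    def pop(self):
        _priority, _order, router_id = heapq.heappop(self._heap)
        return router_id

    def __len__(self):
        return len(self._heap)


queue = RouterQueue()
queue.add("router-1", PRIORITY_FULL_SYNC)
queue.add("router-2", PRIORITY_FULL_SYNC)
queue.add("router-3", PRIORITY_USER_ACTION)  # arrives last, served first

order = [queue.pop() for _ in range(len(queue))]
print(order)  # ['router-3', 'router-1', 'router-2']
```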
  13. Router Internals

    • Network namespaces (ip netns)
    • L2 interfaces moved into namespace
      – OVS port
      – Veth pair (virtual cables)
    • IP address configured on interfaces
    • Simple routing and extra routes
    • Iptables for NAT and metadata
    • Proxy for metadata access
    • External access for instances without floating IP
    • Advanced Services
      – FWaaS
      – VPNaaS
  14. Compute Host L3 Agent

    • DVR only
      – Floating IPs for north/south IPv4 routing
      – East/west IPv4 routing
    • FWaaS integrated (partially)

    [Diagram labels: VM1-1, VM2-1, patch-tun, br-int, eth0, QRouter-X]
  15. Restructuring work

    • Get more info from OVSDB monitor
    • Improve RPC calls
    • Improve resync
  16. OVSDB monitor events

    • Improve OvsdbMonitor so that it can pass the agent the devices that were added or deleted
    • The agent consumes the events instead of scanning all the ports every time
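With events available, the agent no longer needs a full scan to learn what changed. The sketch below assumes a simple `(action, device)` event shape, which is an invention for this example; the monitor's real output format is not shown on the slide. When a device appears in several events, the latest one wins.

```python
def collect_events(events):
    """Fold a stream of (action, device) events into added and removed sets."""
    added, removed = set(), set()
    for action, device in events:
        if action == "added":
            added.add(device)
            removed.discard(device)
        elif action == "removed":
            removed.add(device)
            added.discard(device)
        else:
            raise ValueError(f"unknown action: {action!r}")
    return added, removed


# Hypothetical events received since the previous agent iteration.
events = [
    ("added", "tap-c"),
    ("removed", "tap-a"),
    ("added", "tap-d"),
    ("removed", "tap-d"),  # plugged and unplugged between two iterations
]

added, removed = collect_events(events)
print(sorted(added))    # ['tap-c']
print(sorted(removed))  # ['tap-a', 'tap-d']
```

A removal for a device the agent never processed (such as `tap-d` here) can simply be ignored by the caller.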
  17. Improve RPC calls

    • Use a bulk call to update the status (up/down) of several devices
    • Add a parameter: failed devices
    • When security_groups_provider_updated is received, refresh only the affected devices instead of all of them
    • Include the modified attributes in the port update so that the L2 agent can decide whether reprocessing is needed
  18. Improve resync

    • Don't resync all the devices when an error occurs
    • Add a parameter in the RPC calls that collects the devices that caused an error
    • The OVS agent can resync only the devices that failed
    • The operation can be retried or the failure ignored
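The combination of a bulk status call and a list of failed devices could be sketched as follows. This is a simulation under stated assumptions: `update_device_list`, the result keys and the in-memory `server_state` dict are hypothetical stand-ins for the RPC call and the server, chosen only to show how the agent ends up with a small retry set instead of a full resync.

```python
def update_device_list(devices_up, devices_down, server_state):
    """Hypothetical bulk call: report many devices at once.

    Devices unknown to the (simulated) server are returned as failures
    instead of aborting the whole request.
    """
    result = {"failed_devices_up": [], "failed_devices_down": []}
    for device in devices_up:
        if device in server_state:
            server_state[device] = "ACTIVE"
        else:
            result["failed_devices_up"].append(device)
    for device in devices_down:
        if device in server_state:
            server_state[device] = "DOWN"
        else:
            result["failed_devices_down"].append(device)
    return result


def report_status(devices_up, devices_down, server_state):
    """Return only the devices that need to be retried."""
    result = update_device_list(devices_up, devices_down, server_state)
    return set(result["failed_devices_up"]) | set(result["failed_devices_down"])


server_state = {"tap-a": "DOWN", "tap-b": "ACTIVE"}
to_retry = report_status(["tap-a", "tap-x"], ["tap-b"], server_state)
print(to_retry)      # {'tap-x'}
print(server_state)  # {'tap-a': 'ACTIVE', 'tap-b': 'DOWN'}
```

The agent can then retry just `to_retry` on its next iteration, or drop those devices if the failure is not worth retrying.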
  19. Did this improve the situation? Let's test!

    • VM running Devstack
    • Rally scenario

        "args": {
            "flavor": { "name": "m1.tiny" },
            "image": { "name": "cirros-0.3.4-x86_64-uec" }
        },
        "runner": {
            "concurrency": 2,
            "times": 20,
            "type": "constant"
        }
  20. It worked!

    • Min time 0.6% better
    • Avg time 4% better
    • 95th percentile 5.9% better
  21. There's still work to do...

    • Instead of using the command line for OVSDB monitor, use the OVS Python library
    • Create a queue of events to be processed so that multiple workers can be introduced
    • Add priority to events so that higher priority events can be processed first
    • Improve state convergence between agent and the server (resilience in case of failure)
  22. L3 Agent Restructuring
  23. Handyman Model

    • One big file, one object: the agent
    • Jack of all trades
      – Worse: it was a bit forgetful
  24. Contractor Model

    • Time to move to a contractor model
      – Agent is the contractor
      – Calls in specialists to do the work
      – One contractor for network node, other for hypervisor
  25. Specialists

    • New specialist for each type of router
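The contractor and specialist split can be pictured as an agent that keeps almost no router logic of its own and delegates to one class per router type. The class names, the `SPECIALISTS` mapping and the step labels below are all hypothetical, picked to mirror the Legacy, HA and DVR types named earlier in the deck; they are not a description of the actual module layout.

```python
class BaseRouter:
    """Work common to every router type (step names are illustrative)."""

    def __init__(self, router_id):
        self.router_id = router_id

    def process(self):
        return ["namespace", "interfaces", "routes"]


class LegacyRouter(BaseRouter):
    def process(self):
        return super().process() + ["nat"]


class HaRouter(BaseRouter):
    def process(self):
        return super().process() + ["nat", "failover"]


class DvrRouter(BaseRouter):
    def process(self):
        return super().process() + ["floating-ips"]


# The contractor's address book: which specialist handles which router type.
SPECIALISTS = {"legacy": LegacyRouter, "ha": HaRouter, "dvr": DvrRouter}


class Agent:
    """The contractor: tracks routers and calls in the right specialist."""

    def __init__(self):
        self.routers = {}

    def router_updated(self, router_id, router_type):
        router = self.routers.get(router_id)
        if router is None:
            router = SPECIALISTS[router_type](router_id)
            self.routers[router_id] = router
        return router.process()


agent = Agent()
print(agent.router_updated("r1", "legacy"))
print(agent.router_updated("r2", "dvr"))
```

Adding a new router type then means adding one class and one mapping entry, without touching the agent itself.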
  26. More Specialized
  27. Future Work

    • Eliminate full sync on router
    • Too much internal state
    • Simplify DVR
    • L3 VPN
    • Eliminate IPv4 waste
    • DVR for IPv6