eXpress Data Path (XDP) の概要とLINEにおける利活用 / Brief summary of XDP and use-case at LINE

yunazuno
November 28, 2018

Internet Week 2018, Software Router/Switch BoF: "Brief summary of XDP and use-case at LINE"

Transcript

  1. Overview of eXpress Data Path (XDP) and its use at LINE

    Yohei Kanemaru, LINE Corporation
    Internet Week 2018, Software Router/Switch BoF, 2018-11-28
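  2. Self-introduction

    Yohei Kanemaru (yunazuno)
    • Network team
      • Operation of the data center network, peering coordination
      • Construction and operation of the office wireless LAN
    • Development team for Verda, the OpenStack-based private cloud
      • Design, development, construction, and operation of Load Balancer as a Service (LBaaS)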
  3. Agenda

    eXpress Data Path (XDP):
    • Overview
    • Use cases
    • Components and operation
    • Program implementation and deployment examples
  4. XDP overview

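  5. XDP

    A high-speed packet processing framework that runs inside the Linux kernel and is built on eBPF.
    • Packet processing is applied at the NIC driver level
    • First released in kernel 4.8 (October 2016)
    Compared with other high-speed packet processing frameworks:
    • Many of the features and mechanisms already implemented in Linux remain available
    • It can be combined transparently with existing applications
    • No CPU has to be dedicated to packet processing
    • A running program can be swapped out dynamically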
  6. What XDP and eBPF can do

    [Diagram: packets from the NIC reach the XDP hook in the device driver, ahead of skb handling and the kernel network stack (IP, TCP/UDP/ICMP, sockets, user-space applications). An XDP program can parse, look up (BPF maps), modify, and push/pop headers, then issue a verdict: drop, pass, transmit, or redirect (to another network interface, or to a user-space application via AF_XDP).]
  7. XDP performance

    Throughput: regular network stack < XDP < DPDK [1]
    [Figure panels: Packet drop; Packet forwarding (L2)]

    [1] T. Høiland-Jørgensen et al. The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel, ACM CoNEXT '18, Heraklion, Greece, December 2018.
  8. XDP use cases

  9. Various use cases

    • Load balancing: Facebook Katran, LINE Fabric LB
    • Container networking: Cilium
    • (D)DoS mitigation / security: Cloudflare Gatebot, Suricata
  10. Use case at LINE

    Layer 4 load balancer
    • Adopted for the L4LB part of LBaaS and put into production this year (2018) [2][3]

    [Diagram: Client → L3 Switch (Hardware) → L4LB (Software) → L7LB (Software) → Real Server; labels: ECMP, Stateless, L3DSR, TCP/HTTP(S) Proxy]

    [2] https://www.slideshare.net/linecorp/ss-116879618
    [3] https://www.slideshare.net/linecorp/lines-infrastructure-platform-how-it-scales-massive-services-and-maintains-low-operational-cost
  11. Concept of the L4LB implementation using XDP

    A stateless data plane that needs no fine-grained protocol processing.

    [Diagram: NIC Rx (from L3 Switch) → L4LB Program (XDP): Match → Lookup → Rewrite → NIC Tx (to L7 LB); BPF maps: VIP-Backend table, Statistics table]
    1. Match: Dst. IP/Port
    2. Lookup: Hash → Real Server
    3. Rewrite: IP (Dst. IP, DSCP)
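The match, lookup, and rewrite steps on this slide can be sketched in plain user-space C. This is only an illustration under stated assumptions: the struct layouts, the table size, the FNV-1a hash, and the names `flow`, `vip_entry`, `flow_hash`, and `l4lb_process` are all hypothetical and are not taken from LINE's implementation, which runs as an XDP program against BPF maps.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BACKENDS 4

/* Hypothetical packet summary: only the fields this sketch touches. */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  dscp;
};

/* Hypothetical VIP-backend table entry (a BPF map in a real data plane). */
struct vip_entry {
    uint32_t vip;
    uint16_t port;
    uint8_t  dscp;                    /* value written into the packet */
    uint32_t n_backends;
    uint32_t backends[MAX_BACKENDS];  /* real server addresses */
};

/* FNV-1a over the flow 4-tuple; chosen for brevity (an assumption). */
static uint32_t flow_hash(const struct flow *f)
{
    uint32_t words[3] = { f->src_ip, f->dst_ip,
                          ((uint32_t)f->src_port << 16) | f->dst_port };
    const uint8_t *p = (const uint8_t *)words;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < sizeof(words); i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 and rewrites the flow when it matches a VIP, 0 otherwise. */
static int l4lb_process(struct flow *f, const struct vip_entry *tbl, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct vip_entry *e = &tbl[i];

        /* 1. Match on destination IP and port. */
        if (e->vip != f->dst_ip || e->port != f->dst_port || e->n_backends == 0)
            continue;
        /* 2. Lookup: the same flow always hashes to the same real server. */
        uint32_t backend = e->backends[flow_hash(f) % e->n_backends];
        /* 3. Rewrite destination IP and DSCP; no per-flow state is kept. */
        f->dst_ip = backend;
        f->dscp = e->dscp;
        return 1;
    }
    return 0;
}
```

Because the backend is derived purely from the packet's own fields, no connection table is needed, which is what keeps the data plane stateless.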
  12. XDP components and operation

    From the paper by the XDP developers [1]

    [1] T. Høiland-Jørgensen et al. The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel, ACM CoNEXT '18, Heraklion, Greece, December 2018.
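  13. From source code to a running XDP program

    [Diagram: eBPF C code (D-Plane) is compiled by Clang/LLVM into eBPF bytecode. A loader hands the bytecode to the kernel through bpf(2), where the eBPF verifier checks it and the eBPF JIT/VM runs it as the XDP program at the XDP driver hook, processing packets from the NIC. BPF maps are created at load time and are read and written by both the XDP program and the user-space C-Plane program. Other labels: BPFFS, Compiler.]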
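  14. Static verification in advance by the eBPF verifier

    Confirms at load time that an XDP program cannot harm the kernel.
    • Termination check
      • Loops are not permitted (*1)
      • Program length is limited: at most 4096 instructions (*2)
    • Memory access safety check
      • Enforces that a NULL check and a bounds check have been done before each memory access
    • Argument check for function calls
    Note that this is not a verification of the correctness of the program itself.

    (*1), (*2): footnote text not recoverable; (*2) mentions tail calls.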
  15. BPF maps

    A key-value store that XDP programs can read and write.
    • Generally created by the XDP program's loader (iproute2, bcc, etc.)
    • Also readable and writable from user space
    • Various types are defined in enum bpf_map_type:
      • (PERCPU_)ARRAY, (PERCPU_)HASH, LRU_(PERCPU_)HASH, LPM_TRIE
      • ARRAY_OF_MAPS, HASH_OF_MAPS
      • PROG_ARRAY, DEVMAP, CPUMAP, XSKMAP
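For the `PERCPU_` map types, a user-space lookup returns one value slot per possible CPU rather than a single value, so the reader has to aggregate the slots itself. A minimal sketch of that aggregation step in plain C with no BPF dependency follows; the function name `sum_percpu` is hypothetical, and the buffer is assumed to have been filled already by a map lookup.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum the per-CPU slots of one per-CPU map element.
 * `values` stands in for the buffer that a user-space lookup on a
 * BPF_MAP_TYPE_PERCPU_ARRAY element would fill (one slot per possible CPU).
 */
static uint64_t sum_percpu(const uint64_t *values, size_t n_cpus)
{
    uint64_t total = 0;

    for (size_t cpu = 0; cpu < n_cpus; cpu++)
        total += values[cpu];
    return total;
}
```

Since each CPU only ever writes its own slot, the XDP side of such a counter needs no atomic operation; the cost moves to the reader, which sums on demand.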
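  16. The final fate of a packet is decided by the return value

    Defined in enum xdp_action:
    • XDP_DROP: drop the packet
    • XDP_PASS: pass the packet on to the regular network stack
    • XDP_TX: transmit the packet out of the same port
    • XDP_REDIRECT: hand the packet to another {interface, CPU, program}
      + DEVMAP: hand it to another interface via ndo_xdp_xmit
      + CPUMAP: hand it to the network stack on a core other than the one that received it
      + XSKMAP: hand it to user space via AF_XDP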
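The verdict codes can be illustrated with a small stand-alone helper. The enum below mirrors the values of `enum xdp_action` in `<linux/bpf.h>` (including `XDP_ABORTED`, which this slide does not list) under local names, so that the sketch compiles without kernel headers; `enum verdict` and `verdict_name` are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Local mirror of enum xdp_action from <linux/bpf.h>. */
enum verdict {
    V_ABORTED = 0,  /* XDP_ABORTED: program error, packet is dropped */
    V_DROP,         /* XDP_DROP */
    V_PASS,         /* XDP_PASS */
    V_TX,           /* XDP_TX */
    V_REDIRECT,     /* XDP_REDIRECT */
};

/* Map a verdict code to a short name; unknown codes yield "unknown". */
static const char *verdict_name(unsigned int v)
{
    static const char *const names[] = {
        [V_ABORTED]  = "aborted",
        [V_DROP]     = "drop",
        [V_PASS]     = "pass",
        [V_TX]       = "tx",
        [V_REDIRECT] = "redirect",
    };

    if (v >= sizeof(names) / sizeof(names[0]))
        return "unknown";
    return names[v];
}
```

Because `XDP_REDIRECT` is the largest of these values, an array with `XDP_REDIRECT + 1` entries has exactly one slot per verdict, which is how the counter map in slide 18 is sized.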
  17. XDP program implementation and deployment examples

  18. Example: count packets and drop them

        #include <linux/bpf.h>
        #include "bpf_helpers.h"

        /* Define a BPF map. */
        struct bpf_map_def SEC("maps") counter = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(__u32),
            .value_size = sizeof(__u64),
            .max_entries = XDP_REDIRECT + 1,
        };

        /* The entry point receives a struct xdp_md. */
        SEC("prog")
        int xdp_prog(struct xdp_md *ctx)
        {
            __u64 data_len = ctx->data_end - ctx->data;
            int action = XDP_DROP;

            /* Look up the BPF map through a helper function and update it. */
            long *value = bpf_map_lookup_elem(&counter, &action);
            if (value)
                *value += data_len; // non-atomic (*1)

            /* Decide the fate of the packet. */
            return action;
        }

    Compile:
        $ clang -target bpf -c prog.c -o prog.o
    Load:
        $ ip link set dev IF xdp obj prog.o

    (*1) See __sync_fetch_and_add for an atomic update.
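Footnote (*1) points at `__sync_fetch_and_add`. As a hedged illustration, the builtin can be exercised in ordinary user-space C, where both GCC and Clang provide it; the names `bytes_dropped` and `counter_add` are hypothetical, and the global variable merely stands in for a BPF map value.

```c
#include <stdint.h>

/* Hypothetical shared counter; in the slide this is a BPF map value. */
static uint64_t bytes_dropped;

/* Atomically add `len` to the counter and return the previous value. */
static uint64_t counter_add(uint64_t *counter, uint64_t len)
{
    return __sync_fetch_and_add(counter, len);
}
```

With a plain `+=`, two CPUs updating the same slot can lose an increment; the atomic form avoids that at the price of contention on the shared location.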
  19. Note: the way a program is written differs slightly depending on the environment

    iproute2 + bpf_helpers.h:

        #include <linux/bpf.h>
        #include "bpf_helpers.h"

        struct bpf_map_def SEC("maps") counter = {
            .type = BPF_MAP_TYPE_ARRAY,
            .key_size = sizeof(__u32),
            .value_size = sizeof(__u64),
            .max_entries = XDP_REDIRECT + 1,
        };

        SEC("prog")
        int xdp_prog(struct xdp_md *ctx)
        {
            __u64 data_len = ctx->data_end - ctx->data;
            int action = XDP_DROP;
            long *value = bpf_map_lookup_elem(&counter, &action);
            if (value)
                *value += data_len; // non-atomic
            return action;
        }

    BCC:

        BPF_ARRAY(counter, u64, XDP_REDIRECT + 1);

        int xdp_prog(struct xdp_md *ctx)
        {
            u64 data_len = ctx->data_end - ctx->data;
            int action = XDP_DROP;
            // non-atomic
            counter.increment(action, data_len);
            return action;
        }

    Differences in the BCC version:
    • include handling
    • Syntax sugar for maps
  20. Note: unsafe memory accesses are not permitted

    Based on samples/bpf/xdp1_kern.c [4]:

        SEC("xdp1")
        int xdp_prog1(struct xdp_md *ctx)
        {
            void *data_end = (void *)(long)ctx->data_end;
            void *data = (void *)(long)ctx->data;
            struct ethhdr *eth = data;
            int rc = XDP_DROP;
            long *value;
            u16 h_proto;
            u64 nh_off;
            u32 ipproto;

            nh_off = sizeof(*eth);
            if (data + nh_off > data_end)   /* bounds check */
                return rc;
            h_proto = eth->h_proto;
            ...
            value = bpf_map_lookup_elem(&rxcnt, &ipproto);
            if (value)                      /* NULL check */
                *value += 1;
            ...
        }

    With the bounds check omitted (omit_check1.o):

        $ sudo ip link set dev eth1 xdp obj omit_check1.o sec xdp1
        Prog section 'xdp1' rejected: Permission denied (13)!
        ...
        Verifier analysis:
        0: (61) r2 = *(u32 *)(r1 +4)
        1: (61) r1 = *(u32 *)(r1 +0)
        2: (71) r4 = *(u8 *)(r1 +13)
        invalid access to packet, off=13 size=1, R1(id=0,off=0,r=0)
        R1 offset is outside of the packet

    With the NULL check omitted (omit_check2.o):

        $ sudo ip link set dev eth1 xdp obj omit_check2.o sec xdp1
        Prog section 'xdp1' rejected: Permission denied (13)!
        ...
        Verifier analysis:
        0: (61) r2 = *(u32 *)(r1 +4)
        ...
        54: (85) call bpf_map_lookup_elem#1
        55: (79) r1 = *(u64 *)(r0 +0)
        R0 invalid mem access 'map_value_or_null'

    [4] https://github.com/torvalds/linux/blob/v4.19/samples/bpf/xdp1_kern.c
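Both rejections on this slide come down to an access that was not guarded first: a packet read without a comparison against `data_end`, and a dereference of a lookup result that may be NULL. The same two guards can be shown in a user-space analogue. Everything here is an assumption made for illustration: the `counters` array and `lookup` function stand in for a BPF map and `bpf_map_lookup_elem`, and `count_frame` and `ETH_HDR_LEN` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14   /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Stand-in for a BPF map: slot 0 counts IPv4 frames, slot 1 IPv6 frames. */
static long counters[2];

/* Stand-in for bpf_map_lookup_elem(): NULL when the key has no entry. */
static long *lookup(uint16_t ethertype)
{
    switch (ethertype) {
    case 0x0800: return &counters[0];   /* IPv4 */
    case 0x86DD: return &counters[1];   /* IPv6 */
    default:     return NULL;
    }
}

/* Returns 0 when the frame was counted, -1 when it was rejected. */
static int count_frame(const uint8_t *data, const uint8_t *data_end)
{
    /*
     * Bounds check first: the whole Ethernet header must lie inside the
     * packet. An XDP program writes this as `data + nh_off > data_end`.
     */
    if ((size_t)(data_end - data) < ETH_HDR_LEN)
        return -1;

    /* Safe only because the length was checked above. */
    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);

    /* NULL check before dereferencing the lookup result. */
    long *value = lookup(ethertype);
    if (!value)
        return -1;

    *value += 1;
    return 0;
}
```

In user space a missing guard shows up as a crash or silent corruption at run time; the eBPF verifier instead refuses to load the program at all.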
  21. Deployment example of an XDP program at LINE

    TCP, ICMP, and VRF use the kernel's functionality as is.

    [Diagram: normal packets are handled by the XDP program at the device driver; "special" packets (BGP, management) are passed on to the kernel network stack (skb, VRF (l3mdev), IP, TCP/UDP/ICMP). In user space, a C-Plane Agent (on top of bcc) works with the BPF maps, and a BGP Daemon (FRR) speaks BGP.]
  22. Summary

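  23. Summary

    • XDP is gradually coming into use in many places
      • Its strength is the range of mechanisms that make it easy to concentrate on developing the data plane and control plane
    • That said, it is still very much under development
      • Performance improvements, QoS, ...
      • AF_XDP zerocopy, Hardware Offload, ...
      • Enhanced helper functions
    • Looking forward to further development and wider use, eBPF included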
  24. References

    • A practical introduction to XDP
      • From LPC 2018
      • https://linuxplumbersconf.org/event/2/contributions/71/attachments/17/9/presentation-lpc2018-xdp-tutorial.pdf
    • XDP - challenges and future work
      • From LPC 2018
      • https://linuxplumbersconf.org/event/2/contributions/92/attachments/91/102/presentation-lpc2018-xdp-future.pdf
    • The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel
      • Paper by the XDP authors
      • https://github.com/tohojo/xdp-paper/blob/master/xdp-the-express-data-path.pdf
    • BPF and XDP Reference Guide
      • eBPF and XDP documentation from Cilium
      • https://cilium.readthedocs.io/en/latest/bpf/
    • XDP: 1.5 years in production. Evolution and lessons learned.
      • Facebook's XDP talk from LPC 2018
      • http://vger.kernel.org/lpc_net2018_talks/LPC_XDP_Shirokov_v2.pdf
  25. Appendix

  26. Driver support status

    Drivers that implement ndo_bpf:

        linux-4.19.4/drivers/net$ grep -R -F -l .ndo_bpf .
        ./tun.c
        ./ethernet/intel/i40e/i40e_main.c
        ./ethernet/intel/ixgbe/ixgbe_main.c
        ./ethernet/intel/ixgbevf/ixgbevf_main.c
        ./ethernet/qlogic/qede/qede_main.c
        ./ethernet/broadcom/bnxt/bnxt.c
        ./ethernet/mellanox/mlx4/en_netdev.c
        ./ethernet/mellanox/mlx5/core/en_main.c
        ./ethernet/netronome/nfp/nfp_net_common.c
        ./ethernet/cavium/thunder/nicvf_main.c
        ./veth.c
        ./netdevsim/netdev.c
        ./virtio_net.c

    Drivers that implement ndo_xdp_xmit:

        linux-4.19.4/drivers/net$ grep -R -F -l .ndo_xdp_xmit .
        ./tun.c
        ./ethernet/intel/i40e/i40e_main.c
        ./ethernet/intel/ixgbe/ixgbe_main.c
        ./ethernet/mellanox/mlx5/core/en_main.c
        ./veth.c
        ./virtio_net.c

    Notes: Generic XDP is available for drivers without native XDP support; redirecting a packet to an interface with XDP_REDIRECT needs a driver that implements ndo_xdp_xmit.