Slide 1

Slide 1 text

Overview of eXpress Data Path (XDP) and Its Use at LINE
Yohei Kanemaru, LINE Corporation (yoheikanemaru at linecorp.com)
Internet Week 2018, Software Router/Switch BoF

Slide 2

Slide 2 text

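Self-introduction
Yohei Kanemaru (yunazuno)
Network team
• Operation of the data center network, peering coordination
• Build-out and operation of the office wireless LAN
Development team for Verda, the OpenStack-based private cloud
• Design, development, deployment, and operation of Load Balancer as a Service (LBaaS)
Internet Week 2018 BoF, 2018-11-28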

Slide 3

Slide 3 text

Agenda
eXpress Data Path (XDP):
• Overview
• Use cases
• Components and operation
• Program implementation and deployment examples

Slide 4

Slide 4 text

XDP overview

Slide 5

Slide 5 text

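XDP
• A high-speed packet processing framework that runs inside the Linux kernel and is built on eBPF
• Packet processing is applied at the NIC driver level
• First released in kernel 4.8 (October 2016)
Compared with other high-speed packet processing frameworks:
• Can reuse the many features and mechanisms already implemented in Linux
• Can be combined transparently with existing applications
• Does not need CPUs dedicated to packet processing
• A running program can be swapped out dynamically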

Slide 6

Slide 6 text

What XDP and eBPF can do
[Figure: packet path from the NIC into the device driver, where the XDP hook runs before an skb is built. Operations available to the XDP program: Parse, Lookup (BPF maps), Modify, Push/Pop Hdr, Verdict. Verdicts: Drop; Pass (skb → IP → TCP / UDP / ICMP → Socket → App); Transmit; Redirect (to another network interface, or to a userspace App via AF_XDP).]

Slide 7

Slide 7 text

XDP performance
• Throughput: regular network stack < XDP < DPDK [1]
[Figure: benchmark panels "Packet drop" and "Packet forwarding (L2)"]
[1] T. Høiland-Jørgensen et al. The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel, ACM CoNEXT '18, Heraklion, Greece, December 2018.

Slide 8

Slide 8 text

XDP use cases

Slide 9

Slide 9 text

Various use cases
• Load Balancing: Facebook Katran, LINE Fabric LB
• Container Networking: Cilium
• (D)DoS Mitigation, Security: Cloudflare Gatebot, Suricata

Slide 10

Slide 10 text

Use case at LINE
• Layer 4 load balancer
• Adopted for the L4LB part of LBaaS and put into production this year [2][3]
[Figure: Client → L3 Switch (Hardware) → L4LB (Software) → L7LB (Software) → Real Server. Labels: ECMP, Stateless, L3DSR, TCP/HTTP(S) Proxy]
[2] https://www.slideshare.net/linecorp/ss-116879618
[3] LINE, https://www.slideshare.net/linecorp/lines-infrastructure-platform-how-it-scales-massive-services-and-maintains-low-operational-cost

Slide 11

Slide 11 text

Concept of the XDP-based L4LB implementation
• A stateless data plane that needs no fine-grained protocol processing
[Figure: From L3 Switch → NIC Rx → L4LB Program (XDP): Match → Lookup → Rewrite → NIC Tx → To L7 LB. BPF maps: VIP-Backend table, Statistics table]
1. Match: destination IP/port
2. Lookup: hash to select a real server
3. Rewrite: IP header (Dst. IP, DSCP)
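The Match / Lookup / Rewrite steps can be sketched in plain C. This is a minimal userspace model and not LINE's implementation: the struct names, the FNV-1a hash, and the table layout are all assumptions made for illustration. A real XDP program would keep the tables in BPF maps and rewrite the packet headers in place.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: the fields a stateless L4LB might hash. */
struct flow {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

/* Hypothetical VIP-Backend table entry. */
struct vip_entry {
    uint32_t vip;
    uint16_t port;
    const uint32_t *backends;
    size_t n_backends;
};

/* FNV-1a over the flow fields (an assumed hash; any stable hash would do). */
static uint32_t flow_hash(const struct flow *f)
{
    uint32_t words[3] = { f->src_ip, f->dst_ip,
                          ((uint32_t)f->src_port << 16) | f->dst_port };
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < 3; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            h ^= (words[i] >> shift) & 0xff;
            h *= 16777619u;
        }
    }
    return h;
}

/* Returns the chosen backend IP, or 0 when no VIP matches. */
static uint32_t select_backend(const struct vip_entry *table, size_t n,
                               const struct flow *f)
{
    for (size_t i = 0; i < n; i++) {
        /* 1. Match on destination IP/port. */
        if (table[i].vip != f->dst_ip || table[i].port != f->dst_port)
            continue;
        if (table[i].n_backends == 0)
            return 0;
        /* 2. Lookup: hashing the flow needs no per-flow state. */
        return table[i].backends[flow_hash(f) % table[i].n_backends];
    }
    return 0;
}
/* 3. Rewrite: the caller would write the returned address into the IP header. */
```

Because the backend is a pure function of the packet's own fields, every L4LB node picks the same backend for a given flow as long as the backend list is unchanged, which is what lets the data plane stay stateless behind ECMP.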

Slide 12

Slide 12 text

XDP components and operation
From the paper by the XDP developers [1]

Slide 13

Slide 13 text

How an XDP program gets to run
[Figure. User space: C code (D-Plane) → Compiler (Clang/LLVM) → eBPF bytecode → Loader → bpf(2). Kernel: eBPF Verifier → eBPF JIT / eBPF VM → XDP program attached to the XDP driver hook in the device driver, which receives packets from the NIC. BPF maps are created at load time and are read and written by both the XDP program and the C-Plane program (via BPFFS).]

Slide 14

Slide 14 text

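Static verification by the eBPF verifier before execution
The verifier confirms at load time that an XDP program cannot harm the kernel.
• Termination check
  • Loops are not allowed (*1)
  • Program length is limited (at most 4096 instructions) (*2)
• Memory access safety check
  • A NULL check and a bounds check are enforced before any memory access
• Argument check on function calls
Note that this does not verify the correctness of the program itself.
(*1) [...]
(*2) [...] tail call [...]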

Slide 15

Slide 15 text

BPF maps
• A key-value store that XDP programs can read and write
• Usually created by the XDP program's loader (iproute2, bcc, etc.)
• Also readable and writable from user space
• Many types are defined in enum bpf_map_type:
  • (PERCPU_)ARRAY, (PERCPU_)HASH, LRU_(PERCPU_)HASH, LPM_TRIE
  • ARRAY_OF_MAPS, HASH_OF_MAPS
  • PROG_ARRAY, DEVMAP, CPUMAP, XSKMAP
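As an illustration of the PERCPU_ variants: each CPU updates its own copy of a slot, and a userspace reader gets one value per CPU and adds them up. The helper below is a minimal sketch of that aggregation step only. The function name is hypothetical, and fetching the values from the kernel is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums the per-CPU copies of one PERCPU_ARRAY slot, as a userspace reader
 * would after fetching one value per CPU. */
static uint64_t sum_percpu(const uint64_t *per_cpu_values, size_t n_cpus)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n_cpus; i++)
        total += per_cpu_values[i];
    return total;
}
```

The trade-off is a slightly more expensive read in exchange for updates that need no atomic operations in the packet path.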

Slide 16

Slide 16 text

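The final disposition of a packet is decided by the program's return value, defined in enum xdp_action:
• XDP_DROP: drop the packet
• XDP_PASS: pass the packet on to the regular network stack
• XDP_TX: transmit the packet back out of the same port
• XDP_REDIRECT: hand the packet to another {interface, CPU, program}
  + DEVMAP: to another interface via ndo_xdp_xmit
  + CPUMAP: to the network stack on a core other than the one that received the packet
  + XSKMAP: to user space via AF_XDP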
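A small userspace sketch of a verdict function may make the return-value convention concrete. The enum below repeats the values of enum xdp_action from <linux/bpf.h> so that the example builds without kernel headers. The policy itself (drop UDP to one blocked port, pass everything else) and the function name are assumptions chosen only for illustration.

```c
#include <stdint.h>

/* Same values as enum xdp_action in <linux/bpf.h>, redefined here so the
 * sketch builds in user space without kernel headers. */
enum xdp_action {
    XDP_ABORTED = 0,
    XDP_DROP,
    XDP_PASS,
    XDP_TX,
    XDP_REDIRECT,
};

#define SKETCH_ETHERTYPE_IPV4 0x0800
#define SKETCH_PROTO_UDP      17

/* Hypothetical policy: drop UDP to a blocked port, pass everything else. */
static enum xdp_action choose_verdict(uint16_t ethertype, uint8_t ip_proto,
                                      uint16_t dst_port, uint16_t blocked_port)
{
    if (ethertype != SKETCH_ETHERTYPE_IPV4)
        return XDP_PASS;            /* leave non-IPv4 to the kernel stack */
    if (ip_proto == SKETCH_PROTO_UDP && dst_port == blocked_port)
        return XDP_DROP;            /* discard before an skb is allocated */
    return XDP_PASS;
}
```

Since XDP_REDIRECT is the largest value (4), an array sized XDP_REDIRECT + 1 has exactly one slot per action.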

Slide 17

Slide 17 text

Implementation and deployment examples of XDP programs

Slide 18

Slide 18 text

Example: count packets and drop them

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    /* Define the BPF map */
    struct bpf_map_def SEC("maps") counter = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(__u32),
        .value_size = sizeof(__u64),
        .max_entries = XDP_REDIRECT + 1,
    };

    /* The entry point receives a struct xdp_md * */
    SEC("prog")
    int xdp_prog(struct xdp_md *ctx)
    {
        __u64 data_len = ctx->data_end - ctx->data;
        int action = XDP_DROP;

        /* Look up the BPF map through a helper function and update it */
        long *value = bpf_map_lookup_elem(&counter, &action);
        if (value)
            *value += data_len; // non-atomic (*1)

        /* Decide the disposition of the packet */
        return action;
    }

Compile:
    $ clang -target bpf -c prog.c -o prog.o
Load:
    $ ip link set dev IF xdp obj prog.o

(*1) For an atomic update, __sync_fetch_and_add can be used.
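The atomic variant mentioned in the footnote can be tried out in user space, since __sync_fetch_and_add is a GCC/Clang builtin. The sketch below models the array map as a plain C array. The names lookup, count_bytes, and N_ACTIONS are hypothetical stand-ins and not part of any BPF API.

```c
#include <stddef.h>
#include <stdint.h>

#define N_ACTIONS 5  /* XDP_REDIRECT + 1, as in the map definition */

/* Stand-in for bpf_map_lookup_elem on an array map: NULL when out of range. */
static uint64_t *lookup(uint64_t *slots, uint32_t key)
{
    return key < N_ACTIONS ? &slots[key] : NULL;
}

/* Adds len to the slot for the given action with an atomic read-modify-write. */
static void count_bytes(uint64_t *slots, uint32_t action, uint64_t len)
{
    uint64_t *value = lookup(slots, action);
    if (value)
        __sync_fetch_and_add(value, len);
}
```

With the plain `+=` form, two CPUs updating the same slot at the same time can lose an update. The atomic add avoids that at some cost per packet.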

Slide 19

Slide 19 text

Note: the way a program is written differs slightly depending on the environment

iproute2 + bpf_helpers.h:

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    struct bpf_map_def SEC("maps") counter = {
        .type = BPF_MAP_TYPE_ARRAY,
        .key_size = sizeof(__u32),
        .value_size = sizeof(__u64),
        .max_entries = XDP_REDIRECT + 1,
    };

    SEC("prog")
    int xdp_prog(struct xdp_md *ctx)
    {
        __u64 data_len = ctx->data_end - ctx->data;
        int action = XDP_DROP;
        long *value = bpf_map_lookup_elem(&counter, &action);
        if (value)
            *value += data_len; // non-atomic
        return action;
    }

BCC:

    BPF_ARRAY(counter, u64, XDP_REDIRECT + 1);

    int xdp_prog(struct xdp_md *ctx)
    {
        u64 data_len = ctx->data_end - ctx->data;
        int action = XDP_DROP;
        // non-atomic
        counter.increment(action, data_len);
        return action;
    }

Differences in the BCC version:
• include handling
• Syntax sugar for maps

Slide 20

Slide 20 text

Note: unsafe memory accesses are not permitted

From samples/bpf/xdp1_kern.c [4]:

    SEC("xdp1")
    int xdp_prog1(struct xdp_md *ctx)
    {
        void *data_end = (void *)(long)ctx->data_end;
        void *data = (void *)(long)ctx->data;
        struct ethhdr *eth = data;
        int rc = XDP_DROP;
        long *value;
        u16 h_proto;
        u64 nh_off;
        u32 ipproto;

        nh_off = sizeof(*eth);
        if (data + nh_off > data_end)   /* bounds check */
            return rc;

        h_proto = eth->h_proto;
        ...
        value = bpf_map_lookup_elem(&rxcnt, &ipproto);
        if (value)                      /* NULL check */
            *value += 1;
        ...
    }

The first load below is rejected for a packet access that has not been bounds-checked; the second is rejected for dereferencing a map value that has not been NULL-checked.

    $ sudo ip link set dev eth1 xdp obj omit_check1.o sec xdp1
    Prog section 'xdp1' rejected: Permission denied (13)!
    ...
    Verifier analysis:
    0: (61) r2 = *(u32 *)(r1 +4)
    1: (61) r1 = *(u32 *)(r1 +0)
    2: (71) r4 = *(u8 *)(r1 +13)
    invalid access to packet, off=13 size=1, R1(id=0,off=0,r=0)
    R1 offset is outside of the packet

    $ sudo ip link set dev eth1 xdp obj omit_check2.o sec xdp1
    Prog section 'xdp1' rejected: Permission denied (13)!
    ...
    Verifier analysis:
    0: (61) r2 = *(u32 *)(r1 +4)
    ...
    54: (85) call bpf_map_lookup_elem#1
    55: (79) r1 = *(u64 *)(r0 +0)
    R0 invalid mem access 'map_value_or_null'

[4] https://github.com/torvalds/linux/blob/v4.19/samples/bpf/xdp1_kern.c
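The bounds-check pattern that the verifier insists on can be modelled in ordinary C. The function below is a sketch under assumptions, not code from the kernel sample: read_ethertype and SKETCH_ETH_HDR_LEN are hypothetical names, and the length comparison is written as a pointer difference so that it is also well-defined in user space.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_HDR_LEN 14  /* dst MAC (6) + src MAC (6) + EtherType (2) */

/* Reads the EtherType into *proto in host byte order.
 * Returns 0 on success, -1 if the frame is shorter than an Ethernet header. */
static int read_ethertype(const uint8_t *data, const uint8_t *data_end,
                          uint16_t *proto)
{
    /* Same intent as `data + nh_off > data_end` in the XDP program:
     * prove the header lies inside the packet before touching it. */
    if ((size_t)(data_end - data) < SKETCH_ETH_HDR_LEN)
        return -1;
    *proto = (uint16_t)((data[12] << 8) | data[13]);
    return 0;
}
```

The EtherType occupies bytes 12 and 13 of the frame, which matches the offset 13 reported in the first verifier message.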

Slide 21

Slide 21 text

Example deployment of XDP programs at LINE
• TCP, ICMP, and VRF use the kernel's own implementation as-is
[Figure. Kernel: NIC → Device Driver (XDP) → skb → Network Interface → IP → TCP / UDP / ICMP, inside VRF (l3mdev). User space: C-Plane Agent (on top of bcc), which accesses BPF maps, and BGP Daemon (FRR). Traffic classes: "Special" packets, Normal packets, Management (BGP).]

Slide 22

Slide 22 text

Summary

Slide 23

Slide 23 text

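Summary
• XDP is gradually being put to use in many places
• Its strength is the range of mechanisms that let developers concentrate on D-Plane / C-Plane development
• That said, it is still very much a work in progress
  • Performance improvements, QoS, ...
  • AF_XDP zerocopy, Hardware Offload, ...
  • Richer helper functions
• Looking forward to further development and wider adoption, eBPF included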

Slide 24

Slide 24 text

References
• A practical introduction to XDP (from LPC 2018)
  https://linuxplumbersconf.org/event/2/contributions/71/attachments/17/9/presentation-lpc2018-xdp-tutorial.pdf
• XDP - challenges and future work (from LPC 2018)
  https://linuxplumbersconf.org/event/2/contributions/92/attachments/91/102/presentation-lpc2018-xdp-future.pdf
• The eXpress Data Path: Fast Programmable Packet Processing in the Operating System Kernel (paper by the XDP authors)
  https://github.com/tohojo/xdp-paper/blob/master/xdp-the-express-data-path.pdf
• BPF and XDP Reference Guide (Cilium's eBPF and XDP documentation)
  https://cilium.readthedocs.io/en/latest/bpf/
• XDP: 1.5 years in production. Evolution and lessons learned. (Facebook, from LPC 2018)
  http://vger.kernel.org/lpc_net2018_talks/LPC_XDP_Shirokov_v2.pdf

Slide 25

Slide 25 text

Appendix

Slide 26

Slide 26 text

Driver support status

    linux-4.19.4/drivers/net$ grep -R -F -l .ndo_bpf .
    ./tun.c
    ./ethernet/intel/i40e/i40e_main.c
    ./ethernet/intel/ixgbe/ixgbe_main.c
    ./ethernet/intel/ixgbevf/ixgbevf_main.c
    ./ethernet/qlogic/qede/qede_main.c
    ./ethernet/broadcom/bnxt/bnxt.c
    ./ethernet/mellanox/mlx4/en_netdev.c
    ./ethernet/mellanox/mlx5/core/en_main.c
    ./ethernet/netronome/nfp/nfp_net_common.c
    ./ethernet/cavium/thunder/nicvf_main.c
    ./veth.c
    ./netdevsim/netdev.c
    ./virtio_net.c

    linux-4.19.4/drivers/net$ grep -R -F -l .ndo_xdp_xmit .
    ./tun.c
    ./ethernet/intel/i40e/i40e_main.c
    ./ethernet/intel/ixgbe/ixgbe_main.c
    ./ethernet/mellanox/mlx5/core/en_main.c
    ./veth.c
    ./virtio_net.c

• Drivers that do not implement .ndo_bpf can still run XDP programs through Generic XDP
• .ndo_xdp_xmit is what lets an interface be the target of XDP_REDIRECT