An Introduction to netmap, an Ultra-Fast Packet I/O Framework

netmap: a novel framework for fast packet I/O

Yuuki Tsubouchi (yuuk1)

August 01, 2013

Transcript

  1. netmap: a novel framework for fast packet I/O

    Luigi Rizzo, Università di Pisa, Italy. In Proceedings of the 2012 USENIX Annual Technical Conference, June 2012 (Best Paper award at USENIX ATC'12). Reading-group material by id:y_uuki.
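  2. Introduction (1)

    General-purpose OSes do not support the high-rate raw packet I/O required by applications such as network monitors and traffic generators.
    The APIs in use are raw sockets, the Berkeley Packet Filter, the AF_PACKET family, and similar; their performance is not sufficient.
    Existing techniques for improving performance tend to depend on special hardware features (of the NIC, for example).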
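  3. Kernel and User API

    The OS keeps copies of the NIC's data structures.
    Buffers are linked to OS-specific, device-independent containers (mbufs, sk_buffs), which carry a large amount of metadata about each packet.
    Driver/OS: the device driver and the OS split packets into fragments, and the overhead of this fragmentation is large.
    Raw packet I/O: the standard APIs for reading and writing raw packets need at least one memory copy between kernel and user space, and one system call per packet.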
  4. Related Work

    Existing techniques for speeding up packet processing:
    Socket APIs: BPF, AF_PACKET, and similar. These are the so-called raw sockets; packets are copied to make them visible to userland.
    Packet filter hooks: Netgraph (FreeBSD), Netfilter (Linux). No packet copy is needed (in kernel); the application is interposed in the packet-processing path (firewalls, for example).
    Direct buffer access: kernel-mode Click (runs the application in the kernel); PF_RING, PACKET_MMAP (expose the packet buffers to userland); NIC DMA engine, NetChannels, PacketShader I/O Engine.
    Hardware solutions (FPGA).
  5. netmap

    In netmap mode, the NIC rings are disconnected from the host stack and exchange packets through the netmap API.
    Two additional netmap rings let the application talk to the host stack.
    The netmap rings are implemented in shared memory.
    The OS is unaware of the disconnection and keeps using and managing the interface as usual.
    select(2)/poll(2) are used for synchronization.
  6. Data Structures (2)

    packet buffers (pkt_buf): fixed-size buffers shared by the NIC and the user process. When an interface switches to netmap mode, the buffers for all netmap rings are allocated up front (they are never reallocated).
    netmap ring: a device-independent replica of a NIC ring.
    ring_size: number of slots.
    cur: current read/write position in the ring.
    avail: number of available buffers.
    buf_ofs, slots.
    netmap_if: read-only information.
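The ring fields listed on this slide can be illustrated with a small model. This is a minimal sketch, not the real netmap headers: `model_ring`, `model_slot`, `ring_next`, `ring_drain`, and the ring size of 8 are hypothetical names and values chosen for illustration. It shows how `cur` and `avail` move together while a receiver consumes packets, with `cur` wrapping at `ring_size`.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u   /* hypothetical; real rings are sized by the driver */

/* Simplified stand-in for one slot: which buffer it uses and the packet length. */
struct model_slot {
    uint32_t buf_index;
    uint16_t len;
};

/* Simplified stand-in for a netmap ring, keeping the fields named on the slide. */
struct model_ring {
    uint32_t ring_size;  /* number of slots */
    uint32_t cur;        /* current read/write position */
    uint32_t avail;      /* number of slots the application may use */
    struct model_slot slots[RING_SIZE];
};

/* Index of the slot after i, wrapping at the end of the ring. */
static uint32_t ring_next(const struct model_ring *r, uint32_t i)
{
    return (i + 1 == r->ring_size) ? 0 : i + 1;
}

/* Consume every available receive slot and return the total bytes seen. */
static uint64_t ring_drain(struct model_ring *r)
{
    uint64_t bytes = 0;

    assert(r->avail <= r->ring_size);
    while (r->avail > 0) {
        bytes += r->slots[r->cur].len;
        r->cur = ring_next(r, r->cur);
        r->avail--;
    }
    return bytes;
}
```

Because the loop only touches memory that the model treats as shared, no call into the kernel is needed per packet; in this model a synchronization call would only be needed to refill `avail`.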
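  7. The netmap API

    Opening /dev/netmap and issuing ioctl(fd, NIOCREG, arg) puts the interface into netmap mode.
    mmap(2) makes the shared memory accessible from the process address space.
    ioctl(2) supports sending and receiving packets:
    ioctl(fd, NIOCTXSYNC) tells the OS about new packets to send.
    ioctl(fd, NIOCRXSYNC) asks the OS how many packets are available to read.
    Both calls are non-blocking, involve no data copy (other than synchronizing the netmap and hardware rings), and can handle multiple packets at once, which reduces per-packet overhead.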
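The batching behaviour described on this slide can be sketched without the real netmap device. In the model below, `model_tx_ring`, `tx_enqueue`, `model_txsync`, the `pending` field, and the ring size are all hypothetical; `model_txsync` merely stands in for `ioctl(fd, NIOCTXSYNC)` and pretends the NIC transmits everything immediately. The point is that queuing packets costs no system call, and one sync call covers the whole batch.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u          /* hypothetical ring size */

struct model_tx_ring {
    uint32_t cur;             /* next slot the application will fill */
    uint32_t avail;           /* free slots the application may fill */
    uint32_t pending;         /* filled but not yet reported (model-only field) */
    uint16_t len[RING_SIZE];  /* packet length per slot */
};

static unsigned sync_calls;   /* counts the stand-in "system calls" */

/* Queue one packet; returns 0 if the ring is full. No system call is made. */
static int tx_enqueue(struct model_tx_ring *r, uint16_t len)
{
    assert(r->cur < RING_SIZE);
    if (r->avail == 0)
        return 0;
    r->len[r->cur] = len;
    r->cur = (r->cur + 1) % RING_SIZE;
    r->avail--;
    r->pending++;
    return 1;
}

/* Stand-in for ioctl(fd, NIOCTXSYNC): report all queued packets in one call.
 * The model assumes the NIC sends them at once, so their slots become free. */
static uint32_t model_txsync(struct model_tx_ring *r)
{
    uint32_t sent = r->pending;

    sync_calls++;
    r->avail += sent;
    r->pending = 0;
    return sent;
}
```

A real program would instead open /dev/netmap, register the interface, and mmap the shared region as the slide describes; those steps are omitted here because they require the netmap kernel module.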
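  8. Talking to the host stack

    A netmap client communicates with the OS stack through two netmap rings.
    Packets the client transmits are handed to the OS stack as if they had arrived from the physical interface.
    Packets coming from the OS stack are queued on a netmap ring.
    The netmap client is responsible for making sure that packets are passed correctly between the netmap rings connected to the OS stack and those connected to the NIC.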
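The client's job of moving packets between a host-stack ring and a NIC ring can be sketched as follows. This is a simplified model under stated assumptions: `model_ring`, `model_slot`, `forward`, and the ring size are hypothetical, and the hand-off is done by swapping buffer indices between slots rather than copying payloads, which is an assumption here since the slide does not say how packets are passed.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 4u   /* hypothetical ring size */

struct model_slot {
    uint32_t buf_index;  /* which shared packet buffer this slot points to */
    uint16_t len;        /* packet length in bytes */
};

struct model_ring {
    uint32_t cur;        /* current read/write position */
    uint32_t avail;      /* rx: packets to read; tx: free slots to fill */
    struct model_slot slots[RING_SIZE];
};

/* Move as many packets as both rings allow from rx to tx.
 * Only slot descriptors change; packet payloads are never copied. */
static uint32_t forward(struct model_ring *rx, struct model_ring *tx)
{
    uint32_t moved = 0;

    assert(rx != tx);
    while (rx->avail > 0 && tx->avail > 0) {
        struct model_slot *in = &rx->slots[rx->cur];
        struct model_slot *out = &tx->slots[tx->cur];
        uint32_t spare = out->buf_index;

        out->buf_index = in->buf_index;  /* tx slot now owns the packet */
        out->len = in->len;
        in->buf_index = spare;           /* rx slot gets an empty buffer back */

        rx->cur = (rx->cur + 1) % RING_SIZE;
        tx->cur = (tx->cur + 1) % RING_SIZE;
        rx->avail--;
        tx->avail--;
        moved++;
    }
    return moved;
}
```

Calling `forward(&host_rx, &nic_tx)` and `forward(&nic_rx, &host_tx)` in a loop would make the model behave like a transparent pass-through between the host stack and the NIC.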
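  9. Performance metrics

    Packet processing involves several subsystems: CPU pipeline, caches, memory, and I/O bus.
    The target applications are CPU-bound, so CPU cost is what is measured.
    The measurement covers moving packets between the application and the NIC.
    Per-byte costs: CPU cycles spent moving data to and from the NIC's buffers.
    Per-packet costs: the NIC ring slot must be updated for every packet; memory allocation, system calls, ...
    Very simple test programs: a packet generator and a packet receiver.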
  10. Transmit speed vs. clock rate

    Transmit performance with 64-byte packets while varying the clock rate and the number of cores.
    With 1 core, throughput scales with the clock rate up to 900 MHz (60-65 cycles/packet).
    In this test the per-packet work consists of only two steps: validating the contents of the netmap ring slot, and updating the corresponding NIC ring slot.
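The figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope aid: `mpps_at` and `line_rate_mpps` are hypothetical helper names, and the 10 Gbit/s link speed is an assumption not stated on the slide. The 20 extra bytes per frame are the standard Ethernet preamble (8 bytes) and inter-frame gap (12 bytes).

```c
#include <assert.h>

/* Millions of packets per second one core can handle when every packet
 * costs a fixed number of CPU cycles. */
static double mpps_at(double clock_hz, double cycles_per_packet)
{
    assert(cycles_per_packet > 0.0);
    return clock_hz / cycles_per_packet / 1e6;
}

/* Ethernet line rate in Mpps for a given frame size. Each frame also
 * occupies 20 bytes on the wire: 8 bytes preamble + 12 bytes inter-frame gap. */
static double line_rate_mpps(double link_bps, double frame_bytes)
{
    assert(frame_bytes > 0.0);
    return link_bps / ((frame_bytes + 20.0) * 8.0) / 1e6;
}
```

For example, `mpps_at(900e6, 60.0)` gives 15 Mpps and `mpps_at(900e6, 65.0)` about 13.8 Mpps, while `line_rate_mpps(10e9, 64.0)` is about 14.88 Mpps. Under these assumptions, a single core at roughly 900 MHz is already close to what a 10 Gbit/s link can carry with 64-byte packets, which would explain why throughput stops scaling with the clock beyond that point.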
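  11. Transmit speed vs. batch size

    This test checks how much performance improves when packets are handled in batches; the cost of system calls and of accesses to NIC registers should be amortized.
    batch size = number of packets per call.
    batch size = 1: 2.45 Mpps (408 ns/pkt).
    batch size = 8: 14.88 Mpps.
    A standard FreeBSD poll(2) call costs 250 ns, so handling multiple packets per call is essential.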
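The effect of batching can be made concrete with a toy cost model. This is a simplified sketch, not the paper's analysis: `ns_to_mpps` and `ns_per_packet` are hypothetical helpers, and the model assumes each call has a fixed cost that is shared evenly by the packets in the batch plus a constant per-packet cost. The 250 ns per-call figure comes from the slide; the 50 ns per-packet figure used in the note below is made up purely for illustration.

```c
#include <assert.h>

/* Convert a per-packet time in nanoseconds to millions of packets per second. */
static double ns_to_mpps(double ns)
{
    assert(ns > 0.0);
    return 1e3 / ns;
}

/* Toy model: a fixed per-call cost is split across the batch, and every
 * packet additionally pays its own constant cost. */
static double ns_per_packet(double per_call_ns, double per_packet_ns,
                            unsigned batch)
{
    assert(batch > 0);
    return per_call_ns / batch + per_packet_ns;
}
```

`ns_to_mpps(408.0)` reproduces the slide's 2.45 Mpps for batch size 1. With the illustrative 50 ns per-packet cost, `ns_per_packet(250.0, 50.0, 1)` is 300 ns while `ns_per_packet(250.0, 50.0, 10)` drops to 75 ns, showing how quickly the per-call cost stops dominating once several packets share one call.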