
vIOMMU implementation in BitVisor

December 12, 2019


@BitVisor Summit 8



More Decks by mmisono



  1. NOTE
     - This talk is mainly about Intel VT-d
     - AMD IOMMU has a different architecture

     2019-12-12 BitVisor Summit 8
  2. What is IOMMU?
     - An MMU for I/O devices, obviously
     - Address Translation Service (ATS) from the PCI-SIG specification point of view
     - Implementations
       - Intel VT-d
       - AMD IOMMU
     Figure: https://en.wikipedia.org/wiki/File:MMU_and_IOMMU.svg
  3. Usage of IOMMU
     1. Intra-OS Protection
        [Figure: Device → IOMMU (IOTLB) → Main Memory. The IOVA (bus address) is translated to a PA inside the mapped region.]
     2. Inter-OS Protection
        [Figure: Host and Guest; Device → IOMMU (IOTLB) → Guest Memory in Main Memory. Pass-through (static mapping).]
        (To limit the DMA-able region in guest memory, vIOMMU is needed)
  4. BitVisor and Intel VT-d
     - BitVisor conceals VT-d information in the ACPI table (DMAR)
     - Reasons
       1. The guest will not configure the IOMMU for devices BitVisor uses
          - Would result in IOMMU #PF
       2. BitVisor uses the IOMMU to protect itself (if VTD_TRANS = 1)
     [Figure: Guest, BitVisor, VT-d, Devices]
  5. BitVisor and AMD IOMMU
     - BitVisor currently passes the AMD IOMMU through to the guest!
       - It will cause problems if the guest uses it…!
       - (Currently, most OSes do not use the IOMMU by default)
  6. Motivation
     - The guest wants to use the IOMMU!
       - DMA attacks are now common
         - Thunderclap [NDSS’19]
       - Protect from buggy device drivers
       - For virtualization
         - BitVisor now supports (unsafe) nested virtualization
       - (For my research)
     - VTD_TRANS only protects BitVisor, not the guest
     Image credit: CC-BY Brett Gutstein 2019
  7. Goal
     - Let the guest use the IOMMU while BitVisor manages some devices!
     - The target is VT-d
       - Because that is what I have
  8. Approach
     - There is no “EPT” for VT-d
       - Actually, “Scalable Mode Address Translation” enables nested translation, but no hardware is currently available
     - We need IOMMU emulation
       - i.e., vIOMMU
     - This is BitVisor. Only emulate the necessary parts!
  9. Intel VT-d
     - VT-d is more than just an “IOMMU”
       - DMA remapping
       - Interrupt remapping
       - Interrupt posting (posted interrupts)
       - Address translation fault reporting
  10. Intel VT-d
      - VT-d is more than just an “IOMMU”
        - DMA remapping (this is today’s topic)
        - Interrupt remapping
        - Interrupt posting (posted interrupts)
        - Address translation fault reporting
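  11. DMA Remapping (Legacy Mode)
      [Figure: The Root Table Address Register points to the Root Table, which has one entry per bus (Bus = 0 … Bus = 255). The entry for Bus = b points to the Context Table for Bus = b, which has one entry per device and function (Dev=0, Fun=0 … Dev=31, Fun=7). The entry for Dev=d, Fun=f points to the Second-Level Page Table Structures.]
      Note: This is what current HW supports.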
  12. DMA Remapping (Legacy Mode)
      [Figure: Root Table Address Register → Root Table (Bus = 0 … Bus = 255) → Context Table for Bus = b (Dev=0, Fun=0 … Dev=31, Fun=7) → Second-Level Page Table Structures. The root entry is marked P=1; the context entry is marked P=1, TT=0, where TT is the Translation Type.]
  13. DMA Remapping (Legacy Mode)
      [Figure: Root Table Address Register → Root Table (Bus = 0 … Bus = 255) → Context Table for Bus = b (Dev=0, Fun=0 … Dev=31, Fun=7) → Second-Level Page Table Structures. The root entry is marked P=1; the context entry is marked P=1, TT=10, where TT is the Translation Type.]
      - If TT=10, untranslated requests are processed as pass-through (i.e., no address translation is performed)
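The lookup drawn on these slides can be sketched in a few lines. This is a toy Python model, not BitVisor or hardware code: tables are plain lists, the second-level page tables are reduced to a dict from IOVA page to PA page, and the names `ContextEntry`, `new_root_table`, `new_context_table`, and `translate` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

TT_TRANSLATE = 0b00    # untranslated requests go through the second-level tables
TT_PASSTHROUGH = 0b10  # untranslated requests bypass translation
PAGE_SHIFT = 12


@dataclass
class ContextEntry:
    present: bool = False
    tt: int = TT_TRANSLATE
    # Stand-in for the second-level page table structures: IOVA page -> PA page
    second_level: Dict[int, int] = field(default_factory=dict)


# A context table has 256 entries, indexed by (dev << 3) | fun.
ContextTable = List[ContextEntry]
# A root table has 256 entries, indexed by bus; None models P=0.
RootTable = List[Optional[ContextTable]]


def new_context_table() -> ContextTable:
    return [ContextEntry() for _ in range(256)]


def new_root_table() -> RootTable:
    return [None] * 256


def translate(root: RootTable, bus: int, dev: int, fun: int, iova: int) -> Optional[int]:
    """Return the PA for an untranslated request, or None if it is blocked."""
    ctx_table = root[bus]
    if ctx_table is None:          # root entry not present
        return None
    entry = ctx_table[(dev << 3) | fun]
    if not entry.present:          # context entry not present
        return None
    if entry.tt == TT_PASSTHROUGH:
        return iova                # no address translation is performed
    if entry.tt != TT_TRANSLATE:
        return None                # other translation types are outside this sketch
    page = entry.second_level.get(iova >> PAGE_SHIFT)
    if page is None:
        return None
    return (page << PAGE_SHIFT) | (iova & ((1 << PAGE_SHIFT) - 1))
```

A device at bus 3, dev 2, fun 1 with a TT=0 entry is translated through its page map, while a device whose entry has TT=10 gets its IOVA back unchanged.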
  14. Relation between Translation Type (TT) and Address Type (AT) in TLP
      (TLP: PCI Transaction Layer Packet. Rows are AT, columns are TT.)

      AT \ TT                  | 00        | 01                                    | 10
      Untranslated (00)        | Translate | Translate? (implementation dependent) | Pass through
      Translation Request (01) | Block     | Allow                                 | Block
      Translated (10)          | Block     | Allow                                 | Block
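The TT/AT relation on this slide can be encoded as a lookup table. The sketch below copies the slide's entries as written. The constant and function names are hypothetical, and blocking any combination not listed (for example TT=11) is an assumption of this sketch, not something the slide states.

```python
# Address Type (AT) values carried in the TLP
AT_UNTRANSLATED = 0b00
AT_TRANSLATION_REQUEST = 0b01
AT_TRANSLATED = 0b10

# (AT, TT) -> outcome, taken from the slide's table
OUTCOME = {
    (AT_UNTRANSLATED, 0b00): "translate",
    (AT_UNTRANSLATED, 0b01): "translate (implementation dependent)",
    (AT_UNTRANSLATED, 0b10): "pass through",
    (AT_TRANSLATION_REQUEST, 0b00): "block",
    (AT_TRANSLATION_REQUEST, 0b01): "allow",
    (AT_TRANSLATION_REQUEST, 0b10): "block",
    (AT_TRANSLATED, 0b00): "block",
    (AT_TRANSLATED, 0b01): "allow",
    (AT_TRANSLATED, 0b10): "block",
}


def handle_request(at: int, tt: int) -> str:
    """Look up how a request with address type `at` is treated under `tt`."""
    # Assumption: anything not in the table is blocked.
    return OUTCOME.get((at, tt), "block")
```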
  15. (Unsafe) vIOMMU Overview
      - Let the guest configure VT-d directly
        - Show DMAR in the ACPI table
        - Allow access to the VT-d registers
      - Shadow the DMA remapping table so that devices managed by BitVisor can DMA
      - If BitVisor needs no protection, simply
        - Copy the root and context tables. No need to copy the second-level page structures.
        - Set TT=10 for devices managed by BitVisor
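  16. Remapping Table Shadowing
      [Figure: Two sets of tables. Guest created: a Root Table and Context Table whose entries (P=1 or P=0; context entries with P=1, TT=0) lead to the Second-Level Page Table Structures. BitVisor managed: a Root Table and Context Table with the same entries, plus an entry for the device BitVisor manages, marked P=1, TT=10. The Root Table Address Register is also shown.]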
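The unsafe shadowing step might look roughly like the sketch below. It is a simplified Python model under stated assumptions: tables are nested dicts rather than in-memory structures, an entry is a dict with hypothetical keys `present`, `tt`, and `slptptr`, and `build_shadow` is an illustrative name, not a BitVisor function.

```python
import copy
from typing import Dict, Set, Tuple

TT_PASSTHROUGH = 0b10

# Hypothetical model: root table = {bus: context table},
# context table = {devfn: entry}, entry = {"present", "tt", "slptptr"}
Entry = Dict[str, int]
ContextTable = Dict[int, Entry]
RootTable = Dict[int, ContextTable]


def build_shadow(guest_root: RootTable,
                 bitvisor_devices: Set[Tuple[int, int]]) -> RootTable:
    """Copy the guest's root and context tables, then install pass-through
    entries for the (bus, devfn) pairs that BitVisor manages."""
    # Only root and context entries are copied. The second-level page
    # structures stay where the guest put them and are reached via slptptr.
    shadow = copy.deepcopy(guest_root)
    for bus, devfn in bitvisor_devices:
        ctx = shadow.setdefault(bus, {})
        ctx[devfn] = {"present": True, "tt": TT_PASSTHROUGH, "slptptr": 0}
    return shadow
```

The guest's own tables are left untouched. In this model, the hardware would be pointed at the shadow copy instead.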
  17. The Problem: Tracking guest modifications of entries
      - The VT-d specification does not require explicit IOMMU TLB invalidation if an entry is not in the cache
        - That is, the guest may add new entries without IOMMU TLB invalidation
      - Configuring EPT for all entry pages is troublesome
      - The Savior
        - Caching Mode (CM)
  18. Caching Mode (CM)
      From the VT-d spec:
      - If CM=1, “Any software updates to the remapping structures […] require explicit invalidation.”
      - “Hardware implementations of this architecture must support a value of 0 in this field.”
  19. Tracking the modification of entries
      - Emulate CM=1 by intercepting the guest’s VT-d register reads
      - Trap IOMMU TLB invalidations by monitoring VT-d register accesses
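A rough sketch of the two interception points follows. The register offsets and bit positions (Capability Register at 0x08 with CM in bit 7, Context Command Register at 0x28 with ICC in bit 63) reflect my reading of the VT-d specification and should be checked against it. The class and callback names are hypothetical, and only the context-cache invalidation path is shown; the IOTLB invalidation register and queued invalidation are left out.

```python
from typing import Callable

CAP_REG = 0x08        # Capability Register offset
CAP_CM = 1 << 7       # Caching Mode bit
CCMD_REG = 0x28       # Context Command Register offset
CCMD_ICC = 1 << 63    # Invalidate Context-Cache bit


class VtdRegisterTrap:
    """Hypothetical handler for trapped guest accesses to VT-d registers."""

    def __init__(self, resync_shadow: Callable[[], None]) -> None:
        # Called whenever the guest requests an invalidation.
        self.resync_shadow = resync_shadow

    def on_read(self, offset: int, hw_value: int) -> int:
        """Return the value the guest should see for a register read."""
        if offset == CAP_REG:
            return hw_value | CAP_CM   # report CM=1 to the guest
        return hw_value

    def on_write(self, offset: int, value: int) -> int:
        """Observe a register write and return the value to forward."""
        if offset == CCMD_REG and value & CCMD_ICC:
            self.resync_shadow()       # guest changed entries; refresh the shadow
        return value
```

Because the guest sees CM=1, it is expected to issue an invalidation after every update to the remapping structures, which gives the hypervisor a point at which to refresh its shadow tables.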
  20. Safe vIOMMU
      - Shadow all entries, including the second-level page structures
        - Ensure that the mappings do not contain BitVisor’s memory region
        - Shadowing is necessary even if the mapping does not contain BitVisor’s memory region, because the guest might ignore caching mode and implicitly update the entry
      - Create a remapping table for BitVisor-managed devices (like what VTD_TRANS does)
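The check that shadowed mappings stay clear of BitVisor's memory could be sketched as below. The page-map representation (a dict from IOVA page number to PA page number), the half-open `protected` range, and the function name are assumptions made for illustration. A real implementation would walk the multi-level second-level page tables instead.

```python
from typing import Dict, Tuple

PAGE_SIZE = 4096


def shadow_mappings(guest_map: Dict[int, int],
                    protected: Tuple[int, int]) -> Dict[int, int]:
    """Copy guest IOVA-page -> PA-page mappings, dropping every page that
    overlaps the protected physical range [start, end)."""
    start, end = protected
    shadow: Dict[int, int] = {}
    for iova_page, pa_page in guest_map.items():
        pa = pa_page * PAGE_SIZE
        if pa < end and pa + PAGE_SIZE > start:
            continue  # would expose protected memory; leave it unmapped
        shadow[iova_page] = pa_page
    return shadow
```

Dropping the entry means a DMA to that IOVA faults instead of reaching the protected region.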
  21. Future work
      - Support other VT-d features
        - Posted interrupts
          - Inject interrupts without VMEXIT
  22. Conclusion
      - Presented a way to use the IOMMU in BitVisor along with the guest
  23. References
      - Intel® Virtualization Technology for Directed I/O Architecture Specification, https://software.intel.com/en-us/download/intel-virtualization-technology-for-directed-io-architecture-specification
      - AMD I/O Virtualization Technology (IOMMU) Specification, https://www.amd.com/system/files/TechDocs/48882_IOMMU.pdf
      - QEMU Wiki, Features/VT-d, https://wiki.qemu.org/Features/VT-d
      - A. Kegel et al., Virtualizing IO through the IO Memory Management Unit (IOMMU), ASPLOS’16 Tutorial, http://pages.cs.wisc.edu/~basu/isca_iommu_tutorial/IOMMU_TUTORIAL_ASPLOS_2016.pdf
  24. References (cont’d)
      - N. Amit et al., vIOMMU: Efficient IOMMU Emulation, ATC’11, https://www.usenix.org/conference/usenixatc11/viommu-efficient-iommu-emulation
      - M. Malka et al., rIOMMU: Efficient IOMMU for I/O Devices that Employ Ring Buffers, ASPLOS’15.
      - O. Peleg et al., Utilizing the IOMMU Scalably, ATC’15.
      - A. Markuze et al., True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy, ASPLOS’16.
      - A. Markuze et al., DAMN: Overhead-free IOMMU protection for networking, ASPLOS’18.
      - B. Morgan et al., IOMMU protection against I/O attacks: a vulnerability and a proof of concept, Journal of the Brazilian Computer Society 2018.
      - A. Theodore Markettos et al., Thunderclap: Exploring Vulnerabilities in Operating System IOMMU Protection via DMA from Untrustworthy Peripherals, NDSS’19, https://thunderclap.io/
  25. References (cont’d)
      - Lan Tianyu, [Xen-devel] Xen virtual IOMMU high level design doc V3, 2016, https://lists.xenproject.org/archives/html/xen-devel/2016-11/msg01391.html
      - Jason Wang and Peter Xu, Vhost and VIOMMU, KVM Forum 2016, https://events.static.linuxfound.org/sites/events/files/slides/Vhost_VIOMMU_merged_0810.pdf
      - Eric Auger, vIOMMU/ARM: full emulation and virtio-iommu approaches, KVM Forum 2017, https://www.linux-kvm.org/images/8/8e/Viommu_arm.pdf
      - Peter Xu, Device Assignment with Nested Guest and DPDK, KVM Forum 2018, https://www.linux-kvm.org/images/a/a6/KVM_Forum_2018_viommu_vfio.pdf
  26. VT-d Root Entry & Context Entry
      [Figure: A Root Entry in the Root Table (P=1) holds the Context Table Pointer. A Context Entry in the Context Table (P=1, TT=0) holds the Second Level Page Translation Pointer.]
      Note: Some fields are omitted.
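For readers who want to see the fields as bits, here is a small decoding sketch. The bit positions are my understanding of the legacy-mode entry layout (P in bit 0, TT in bits 3:2, pointers in bits 63:12 of the low 64 bits; address width in bits 2:0 and domain ID in bits 23:8 of the high 64 bits) and should be verified against the VT-d specification. As on the slide, some fields are omitted, and the function names are hypothetical.

```python
ADDR_MASK = 0xFFFF_FFFF_FFFF_F000  # bits 63:12 hold a 4 KiB-aligned pointer


def decode_root_entry(lo: int) -> dict:
    """Decode the low 64 bits of a root entry."""
    return {
        "present": bool(lo & 1),   # P
        "ctp": lo & ADDR_MASK,     # Context Table Pointer
    }


def decode_context_entry(lo: int, hi: int) -> dict:
    """Decode the low and high 64-bit halves of a context entry."""
    return {
        "present": bool(lo & 1),       # P
        "tt": (lo >> 2) & 0b11,        # Translation Type
        "slptptr": lo & ADDR_MASK,     # Second Level Page Translation Pointer
        "aw": hi & 0b111,              # Address Width
        "did": (hi >> 8) & 0xFFFF,     # Domain Identifier
    }
```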
  27. How to use DMA remapping
      - Configure the root table, context table, and page table entries
      - Set GCMD_REG.TE = 1 (Global Command Register)
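The enable sequence can be illustrated against a toy register file. The offsets and bits used here (GCMD_REG at 0x18, GSTS_REG at 0x1C, RTADDR_REG at 0x20, TE in bit 31, SRTP in bit 30) follow my reading of the VT-d specification. `FakeVtd` and `enable_dma_remapping` are hypothetical names, the fake updates status bits immediately, and the cache invalidations a real driver performs before enabling translation are omitted.

```python
GCMD_REG = 0x18    # Global Command Register
GSTS_REG = 0x1C    # Global Status Register
RTADDR_REG = 0x20  # Root Table Address Register
TE = 1 << 31       # Translation Enable (status bit: TES)
SRTP = 1 << 30     # Set Root Table Pointer (status bit: RTPS)


class FakeVtd:
    """Toy register file standing in for the VT-d MMIO page."""

    def __init__(self) -> None:
        self.regs = {GSTS_REG: 0, RTADDR_REG: 0}

    def read(self, offset: int) -> int:
        return self.regs[offset]

    def write(self, offset: int, value: int) -> None:
        if offset == GCMD_REG:
            status = self.regs[GSTS_REG]
            if value & SRTP:
                status |= SRTP                      # root table pointer latched
            status = (status & ~TE) | (value & TE)  # TES follows TE
            self.regs[GSTS_REG] = status
        else:
            self.regs[offset] = value


def enable_dma_remapping(vtd: FakeVtd, root_table_pa: int) -> None:
    """Point the unit at the root table, then turn translation on."""
    vtd.write(RTADDR_REG, root_table_pa)
    vtd.write(GCMD_REG, SRTP)
    while not (vtd.read(GSTS_REG) & SRTP):
        pass                                        # wait for the pointer to be latched
    vtd.write(GCMD_REG, TE)
    while not (vtd.read(GSTS_REG) & TE):
        pass                                        # wait for translation to be enabled
```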
  28. IOMMU Group in Linux
      - The Linux IOMMU driver creates the same mapping for all devices in the same IOMMU group
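One way to see how Linux has grouped devices is to read the `/sys/kernel/iommu_groups` directory in sysfs, where each group has a `devices` subdirectory. The helper below is a small sketch; the function name is hypothetical, and the `root` parameter exists only so the function can be pointed at another directory.

```python
import os
from typing import Dict, List


def list_iommu_groups(root: str = "/sys/kernel/iommu_groups") -> Dict[str, List[str]]:
    """Map each IOMMU group name to the device names listed under it."""
    groups: Dict[str, List[str]] = {}
    if not os.path.isdir(root):
        return groups
    for group in sorted(os.listdir(root)):
        devices_dir = os.path.join(root, group, "devices")
        if os.path.isdir(devices_dir):
            groups[group] = sorted(os.listdir(devices_dir))
    return groups


if __name__ == "__main__":
    for group, devices in list_iommu_groups().items():
        print(group, " ".join(devices))
```

An empty result typically means that no IOMMU is active on the running kernel.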
  29. How to enable IOMMU in Linux
      - Use a boot option
        - intel_iommu=on iommu=nopt
          - Linux configures the IOMMU for each IOMMU group
        - intel_iommu=on iommu=pt
          - Only VFIO uses the IOMMU, for device pass-through
      - To check whether Linux uses the IOMMU
        - (command illegible in the source)
  30. Mapping table created by VTD_TRANS
      - Devices BitVisor uses
        - Can DMA only to a specific region in BitVisor’s memory
      - Devices the guest uses
        - Cannot DMA to BitVisor’s memory
  31. DISABLE_VTD option in BitVisor
      - Recent Mac firmware enables VT-d at boot time
      - The DISABLE_VTD option disables VT-d by sending commands to the Global Command Register
        - Otherwise, Macs fail to boot: they think there is no VT-d (VT-d is concealed by BitVisor), but VT-d is actually enabled, and something goes wrong