Slide 1

Slide 1 text

vIOMMU Implementation in BitVisor
Masanori Misono, The University of Tokyo
@BitVisor Summit 8, 2019-12-12

Slide 2

Slide 2 text

NOTE
• This talk is mainly about Intel VT-d
• AMD IOMMU has a different architecture

Slide 3

Slide 3 text

Introduction
What is the problem?

Slide 4

Slide 4 text

What is IOMMU?
• MMU for I/O devices, obviously
• Address Translation Service (ATS) from the PCI-SIG specification's point of view
• Implementations
  • Intel VT-d
  • AMD IOMMU

Slide 5

Slide 5 text

Address Translation Service (ATS)

Slide 6

Slide 6 text

Usage of IOMMU
1. Intra-OS Protection
   [Figure: Device, IOMMU with IOTLB, Main Memory; labels: IOVA, PA (bus address), mapped region]
2. Inter-OS Protection
   [Figure: Host, Guest, Device, IOMMU with IOTLB, Main Memory, Guest Memory; the device is passed through with a static mapping]
   (To limit the DMA-able region in guest memory, a vIOMMU is needed)

Slide 7

Slide 7 text

BitVisor and Intel VT-d
• BitVisor conceals VT-d information in the ACPI table (DMAR)
• Reasons
  1. The guest will not configure the IOMMU for devices BitVisor uses
     • Would result in IOMMU #PF
  2. BitVisor uses the IOMMU to protect itself (if VTD_TRANS = 1)
[Figure: BitVisor, VT-d, Guest, Devices]

Slide 8

Slide 8 text

BitVisor and AMD IOMMU
• BitVisor currently passes the AMD IOMMU through to the guest!
• It will cause problems if the guest uses it…
• (Currently, most OSes do not utilize the IOMMU by default)

Slide 9

Slide 9 text

Motivation
• The guest wants to use the IOMMU!
  • DMA attacks are now common
    • Thunderclap [NDSS’19]
  • Protect from buggy device drivers
  • For virtualization
    • BitVisor now supports (unsafe) nested virtualization
  • (For my research)
• VTD_TRANS only protects BitVisor, not the guest
[Image credit: CC-BY Brett Gutstein 2019]

Slide 10

Slide 10 text

Goal
• Let the guest use the IOMMU while BitVisor manages some devices
• Target is VT-d
  • Because that is what I have

Slide 11

Slide 11 text

Approach
• There is no “EPT” for VT-d
  • Actually, “Scalable Mode Address Translation” enables nested translation, but no hardware is currently available
• We need IOMMU emulation
  • i.e., vIOMMU
• This is BitVisor. Only emulate the necessary parts!

Slide 12

Slide 12 text

Design
How does it work?

Slide 13

Slide 13 text

Intel VT-d
• VT-d is more than just “IOMMU”
  • DMA remapping
  • Interrupt remapping
  • Interrupt posting (Posted interrupts)
  • Address translation fault reporting

Slide 14

Slide 14 text

Intel VT-d
• VT-d is more than just “IOMMU”
  • DMA remapping ← This is today’s topic
  • Interrupt remapping
  • Interrupt posting (Posted interrupts)
  • Address translation fault reporting

Slide 15

Slide 15 text

DMA Remapping (Legacy Mode)
[Figure: Root Table Address Register → Root Table (entries Bus = 0 … Bus = 255) → entry Bus = b → Context Table for Bus = b (entries Dev=0, Fun=0 … Dev=31, Fun=7) → entry Dev=d, Fun=f → Second-Level Page Table Structures]
This is what current HW supports
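
The table walk on this slide boils down to index arithmetic on the device's bus/device/function number. The Python snippet below is a minimal illustration, not BitVisor code. The function names are hypothetical, and it assumes the legacy-mode layout in which root entries and context entries are both 128 bits (16 bytes) wide, so that each 256-entry table fits in one 4 KiB page.

```python
ROOT_ENTRY_SIZE = 16     # bytes per root entry (assumed: 128-bit legacy-mode entry)
CONTEXT_ENTRY_SIZE = 16  # bytes per context entry (assumed: 128-bit legacy-mode entry)


def split_source_id(source_id):
    """Split a 16-bit PCI requester ID into (bus, dev, fun)."""
    if not 0 <= source_id <= 0xFFFF:
        raise ValueError("source_id must fit in 16 bits")
    bus = source_id >> 8           # upper 8 bits: bus number, indexes the root table
    dev = (source_id >> 3) & 0x1F  # next 5 bits: device number
    fun = source_id & 0x7          # low 3 bits: function number
    return bus, dev, fun


def entry_offsets(bus, dev, fun):
    """Return byte offsets of the root entry and the context entry.

    The root table is indexed by bus; the context table for that bus
    is indexed by the 8-bit (dev, fun) pair.
    """
    root_offset = bus * ROOT_ENTRY_SIZE
    context_offset = ((dev << 3) | fun) * CONTEXT_ENTRY_SIZE
    return root_offset, context_offset
```

For example, `entry_offsets(*split_source_id(0x00FF))` locates the entries for Bus = 0, Dev=31, Fun=7, the last slot of the first bus's context table.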

Slide 16

Slide 16 text

DMA Remapping (Legacy Mode)
[Figure: Root Table Address Register → Root Table (entries Bus = 0 … Bus = 255) → Context Table for Bus = b (entries Dev=0, Fun=0 … Dev=31, Fun=7) → Second-Level Page Table Structures; root entry: P=1; context entry: P=1, TT=0 (Translation Type)]

Slide 17

Slide 17 text

DMA Remapping (Legacy Mode)
[Figure: Root Table Address Register → Root Table (entries Bus = 0 … Bus = 255) → Context Table for Bus = b (entries Dev=0, Fun=0 … Dev=31, Fun=7) → Second-Level Page Table Structures; root entry: P=1; context entry: P=1, TT=10 (Translation Type)]
If TT=10, untranslated requests are processed as pass-through (i.e., no address translation is performed).

Slide 18

Slide 18 text

Relation between Translation Type (TT) and Address Type (AT) in TLP

AT \ TT                  | 00        | 01                                    | 10
Untranslated (00)        | Translate | Translate? (implementation dependent) | Pass through
Translation Request (01) | Block     | Allow                                 | Block
Translated (10)          | Block     | Allow                                 | Block

TLP: PCI Transaction Layer Packet
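
The TT/AT table on this slide can be read as a lookup keyed by the context entry's TT and the request's AT. The sketch below only encodes the cells shown on the slide. The names `ACTIONS` and `handle_request` are hypothetical, and the TT=01 / untranslated cell keeps the slide's "implementation dependent" caveat rather than resolving it.

```python
# Keys are (TT, AT) pairs; values are the action from the slide's table.
ACTIONS = {
    (0b00, 0b00): "translate",
    (0b00, 0b01): "block",
    (0b00, 0b10): "block",
    (0b01, 0b00): "translate (implementation dependent)",
    (0b01, 0b01): "allow",
    (0b01, 0b10): "allow",
    (0b10, 0b00): "pass-through",
    (0b10, 0b01): "block",
    (0b10, 0b10): "block",
}


def handle_request(tt, at):
    """Return the action for a request with Address Type `at`
    arriving at a context entry whose Translation Type is `tt`."""
    try:
        return ACTIONS[(tt, at)]
    except KeyError:
        raise ValueError("TT/AT combination not covered by the table")
```

With TT=10, only untranslated requests get through (untouched); translation requests and already-translated requests are blocked.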

Slide 19

Slide 19 text

(Unsafe) vIOMMU Overview
• Let the guest configure VT-d directly
  • Show DMAR in the ACPI table
  • Allow access to the VT-d registers
• Shadow the DMA remapping table so that devices managed by BitVisor can DMA
• If BitVisor needs no protection, simply
  • Copy the root and context tables. No need to copy the second-level page structures.
  • Set TT=10 for devices managed by BitVisor

Slide 20

Slide 20 text

Remapping Table Shadowing
[Figure: two sets of tables. Guest created: Root Table, Context Table, Second-Level Page Table Structures. BitVisor managed: Root Table, Context Table. Labels: P=1, P=0, TT=0, TT=10, “Entry for the device BitVisor managed”, Root Table Address Register]
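
The unsafe shadowing scheme (copy the guest's root and context tables, then force pass-through entries for BitVisor's devices) can be modelled with plain dictionaries. This is a sketch under simplifying assumptions, not BitVisor's implementation: tables are dicts instead of 4 KiB pages, `ContextEntry` keeps only three fields, and all names are hypothetical.

```python
from dataclasses import dataclass

TT_PASS_THROUGH = 0b10  # TT=10: untranslated requests bypass translation


@dataclass(frozen=True)
class ContextEntry:
    present: bool  # P bit
    tt: int        # Translation Type
    slptptr: int   # second-level page table pointer (guest-owned, not copied)


def shadow_tables(guest_root, bitvisor_devices):
    """Build the shadow root/context tables.

    guest_root:       {bus: {devfn: ContextEntry}} as written by the guest
    bitvisor_devices: iterable of (bus, devfn) pairs managed by BitVisor
    """
    # Copy the root and context levels only; the entries still point at
    # the guest's own second-level page tables.
    shadow = {bus: dict(ctx) for bus, ctx in guest_root.items()}
    # Override (or add) entries for BitVisor's devices so they can DMA.
    for bus, devfn in bitvisor_devices:
        shadow.setdefault(bus, {})[devfn] = ContextEntry(True, TT_PASS_THROUGH, 0)
    return shadow
```

The guest's tables are left unmodified; only the shadow copy, which the hardware would actually walk, carries the TT=10 entries.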

Slide 21

Slide 21 text

The Problem: Tracking the guest's modifications of entries
• The VT-d specification does not require explicit IOMMU TLB invalidation if an entry is not in the cache
  • That is, the guest may add new entries w/o IOMMU TLB invalidation
• Configuring EPT for all entry pages is troublesome
• The Savior
  • Caching Mode (CM)

Slide 22

Slide 22 text

Caching Mode (CM)
From the VT-d spec:
If CM=1, “Any software updates to the remapping structures […] require explicit invalidation.”
“Hardware implementations of this architecture must support a value of 0 in this field.”

Slide 23

Slide 23 text

Tracking the modification of entries
• Emulate CM=1 by intercepting the guest's reads of the VT-d registers
• Trap IOMMU TLB invalidations by monitoring VT-d register accesses
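
Both bullets amount to a small filter on the guest's MMIO accesses to the VT-d register page. The sketch below is illustrative only. It assumes the Capability Register sits at offset 0x08 with CM in bit 7, and that register-based context-cache invalidation is requested by setting bit 63 (ICC) of the Context Command Register at offset 0x28; these offsets should be checked against the VT-d specification. The IOTLB invalidation register and queued invalidation are deliberately left out, and the class and callback names are hypothetical.

```python
CAP_REG = 0x08       # Capability Register offset (assumed)
CAP_CM = 1 << 7      # Caching Mode bit (assumed)
CCMD_REG = 0x28      # Context Command Register offset (assumed)
CCMD_ICC = 1 << 63   # Invalidate Context-Cache request bit (assumed)


class VtdRegisterFilter:
    """Filters guest accesses to the VT-d registers."""

    def __init__(self, resync):
        # resync: callback that re-reads the guest tables and
        # refreshes the shadow tables.
        self.resync = resync

    def read(self, offset, hw_value):
        """Return the value the guest should see for a register read."""
        if offset == CAP_REG:
            return hw_value | CAP_CM  # pretend the hardware reports CM=1
        return hw_value

    def write(self, offset, value):
        """Observe a guest register write; return the value to forward."""
        if offset == CCMD_REG and value & CCMD_ICC:
            self.resync()  # guest announced a table change
        return value
```

Because the guest sees CM=1, a well-behaved driver issues an invalidation after every table update, which gives the hypervisor a trap point without write-protecting the table pages.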

Slide 24

Slide 24 text

Safe vIOMMU
• Shadow all entries, including the second-level page structures
  • Ensure that the mappings do not contain BitVisor’s memory region
  • Shadowing is necessary even if the mapping does not contain BitVisor’s memory region, because the guest might ignore Caching Mode and implicitly update the entry
• Create a remapping table for BitVisor-managed devices (like what VTD_TRANS does)
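
The key check in the safe variant is a range-overlap test applied while the second-level mappings are copied into the shadow tables. The following is a minimal sketch that flattens the page tables into a list of `(iova, pa, size)` tuples. That representation, the function names, and the choice to silently drop offending mappings (rather than fault or report) are assumptions made for illustration.

```python
def overlaps(start, size, region_start, region_end):
    """True if [start, start + size) intersects [region_start, region_end)."""
    return start < region_end and region_start < start + size


def shadow_mappings(guest_mappings, vmm_start, vmm_end):
    """Copy guest mappings into the shadow, skipping any whose physical
    range touches the hypervisor's memory [vmm_start, vmm_end)."""
    return [
        (iova, pa, size)
        for (iova, pa, size) in guest_mappings
        if not overlaps(pa, size, vmm_start, vmm_end)
    ]
```

Since the hardware only ever walks the shadow copy, a guest that edits its own tables without invalidating cannot make a mapping to BitVisor's memory take effect.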

Slide 25

Slide 25 text

Evaluation
It’s showtime!

Slide 26

Slide 26 text

Performance
• TBE

Slide 27

Slide 27 text

Future work
• Support other VT-d features
  • Posted interrupts
    • Inject interrupts w/o VMEXIT

Slide 28

Slide 28 text

Conclusion
• Presented a way to use the IOMMU in BitVisor along with the guest

Slide 29

Slide 29 text

References
• Intel® Virtualization Technology for Directed I/O Architecture Specification
• AMD I/O Virtualization Technology (IOMMU) Specification
• QEMU Wiki, Features/VT-d
• A. Kegel et al., Virtualizing IO Through the IO Memory Management Unit (IOMMU), ASPLOS’16 Tutorial

Slide 30

Slide 30 text

References (cont’d)
• N. Amit et al., vIOMMU: Efficient IOMMU Emulation, ATC’11.
• M. Malka et al., rIOMMU: Efficient IOMMU for I/O Devices that Employ Ring Buffers, ASPLOS’15.
• O. Peleg et al., Utilizing the IOMMU Scalably, ATC’15.
• A. Markuze et al., True IOMMU Protection from DMA Attacks: When Copy is Faster than Zero Copy, ASPLOS’16.
• A. Markuze et al., DAMN: Overhead-free IOMMU protection for networking, ASPLOS’18.
• B. Morgan et al., IOMMU protection against I/O attacks: a vulnerability and a proof of concept, Journal of the Brazilian Computer Society, 2018.
• A. Theodore Markettos et al., Thunderclap: Exploring Vulnerabilities in Operating System IOMMU Protection via DMA from Untrustworthy Peripherals, NDSS’19.

Slide 31

Slide 31 text

References (cont’d)
• Lan Tianyu, [Xen-devel] Xen virtual IOMMU high level design doc V3, 2016.
• Jason Wang and Peter Xu, Vhost and VIOMMU, KVM Forum 2016.
• Eric Auger, vIOMMU/ARM: full emulation and virtio-iommu approaches, KVM Forum 2017.
• Peter Xu, Device Assignment with Nested Guest and DPDK, KVM Forum 2018.

Slide 32

Slide 32 text

Backups
Anything left?

Slide 33

Slide 33 text

VT-d Root Entry & Context Entry
[Figure: a Root Entry in the Root Table (P=1, Context Table Pointer) and a Context Entry in the Context Table (P=1, TT=0, Second Level Page Translation Pointer)]
Some fields are omitted

Slide 34

Slide 34 text

How to use DMA remapping
• Configure the root table, context tables, and page table entries
• Set GCMD_REG.TE = 1 (Global Command Register)
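
The two steps can be sketched against a fake register file. This is an illustration under stated assumptions, not driver code. The offsets (GCMD_REG at 0x18, GSTS_REG at 0x1C, RTADDR_REG at 0x20) and bit positions (TE/TES in bit 31, SRTP/RTPS in bit 30) follow my reading of the VT-d specification and should be verified. `FakeVtd` is a hypothetical stand-in for MMIO, and the cache invalidations a real driver performs between the two commands are omitted.

```python
GCMD_REG = 0x18    # Global Command Register (assumed offset)
GSTS_REG = 0x1C    # Global Status Register (assumed offset)
RTADDR_REG = 0x20  # Root Table Address Register (assumed offset)
TE = 1 << 31       # Translation Enable (status bit: TES)
SRTP = 1 << 30     # Set Root Table Pointer (status bit: RTPS)


class FakeVtd:
    """Toy model: commands written to GCMD_REG show up in GSTS_REG."""

    def __init__(self):
        self.regs = {GSTS_REG: 0, RTADDR_REG: 0}

    def write(self, offset, value):
        if offset == GCMD_REG:
            status = self.regs[GSTS_REG]
            if value & SRTP:
                status |= SRTP  # root table pointer latched
            # TE is level-sensitive: the status follows the written bit.
            status = (status | TE) if value & TE else (status & ~TE)
            self.regs[GSTS_REG] = status
        else:
            self.regs[offset] = value

    def read(self, offset):
        return self.regs.get(offset, 0)


def enable_dma_remapping(dev, root_table_pa):
    """Point the unit at the root table, then turn translation on."""
    dev.write(RTADDR_REG, root_table_pa)
    dev.write(GCMD_REG, SRTP)
    while not (dev.read(GSTS_REG) & SRTP):
        pass  # wait until the pointer is latched
    dev.write(GCMD_REG, TE)
    while not (dev.read(GSTS_REG) & TE):
        pass  # wait until translation is reported enabled


def disable_dma_remapping(dev):
    """Clear TE and wait for the status bit to drop."""
    dev.write(GCMD_REG, 0)
    while dev.read(GSTS_REG) & TE:
        pass
```

Usage: `dev = FakeVtd(); enable_dma_remapping(dev, 0x12345000)` leaves the TE status bit set in the fake status register.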

Slide 35

Slide 35 text

VT-d Scalable Mode Address Translation
[Figure from the VT-d spec]

Slide 36

Slide 36 text

IOMMU Group in Linux
• The Linux IOMMU driver creates the same mapping for all devices in the same IOMMU group

Slide 37

Slide 37 text

How to enable IOMMU in Linux
• Use boot options
  • intel_iommu=on iommu=nopt
    • Linux configures the IOMMU for each IOMMU group
  • intel_iommu=on iommu=pt
    • Only VFIO uses the IOMMU, for device pass-through
• To check whether Linux uses the IOMMU
  • "- ) (
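
As a small aid for the two boot-option combinations above, the helper below classifies a kernel command line string. It is a hypothetical convenience function, not the check shown on the slide. It only looks at the three options named here, and it reports "default" when neither `iommu=pt` nor `iommu=nopt` is given, because the behaviour in that case depends on the kernel configuration.

```python
def iommu_mode(cmdline):
    """Classify a kernel command line by the options from the slide.

    Returns one of:
      "disabled-or-default": intel_iommu=on is absent
      "translated":          intel_iommu=on iommu=nopt
      "pass-through":        intel_iommu=on iommu=pt
      "default":             intel_iommu=on without pt/nopt
    """
    options = cmdline.split()
    if "intel_iommu=on" not in options:
        return "disabled-or-default"
    if "iommu=nopt" in options:
        return "translated"
    if "iommu=pt" in options:
        return "pass-through"
    return "default"
```

On a running system the string can be taken from `/proc/cmdline`, e.g. `iommu_mode(open("/proc/cmdline").read())`.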

Slide 38

Slide 38 text

Mapping table created by VTD_TRANS
• Devices BitVisor uses
  • Can DMA only to the specific region in BitVisor’s memory
• Devices the guest uses
  • Cannot DMA to BitVisor’s memory

Slide 39

Slide 39 text

DISABLE_VTD option in BitVisor
• Recent Mac firmware enables VT-d at boot time
• The DISABLE_VTD option disables VT-d by sending commands to the Global Command Register
  • Otherwise the Mac will fail to boot, because it assumes there is no VT-d (VT-d is concealed by BitVisor) while VT-d is actually enabled, and something goes wrong