Slide 1

LOW-LEVEL ARKIT Rendering with Metal
Warren Moore
August 26, 2018

Slide 2

AGENDA
• Introduction to AR
• Using ARKit and SceneKit
• Stretch Break
• Introduction to Metal
• Stretch Break
• Metal Rendering for ARKit
• Conclusion, Q&A

Slide 3

SAMPLE CODE
github.com/warrenm/ARKitOnMetal
• Two sample projects: SceneKit and Metal
• Requires Xcode 10 beta
• Requires actual hardware to run; no Simulator

Slide 4

WHAT IS AR?
“[Placing] digital or computer-generated information, whether it be images, audio, video, and touch or haptic sensations [in] a real-time environment”
Kipper, Greg; Rampolla, Joseph. Augmented Reality: An Emerging Technologies Guide to AR

Slide 5

(Image-only slide)

Slide 6

(Image-only slide)

Slide 7

(Image-only slide)

Slide 8

(Image-only slide)

Slide 9

(Image-only slide)

Slide 10

History of ARKit
ARKit 1.0 (WWDC ’17)
• World Tracking
• Horizontal Plane Detection
• Face Tracking (iPhone X)
ARKit 1.5 (iOS 11.3)
• Image Tracking
• Vertical Plane Detection
ARKit 2.0 (WWDC ’18)
• Shared/Persistent World Maps
• Environment Maps
• 3D Object Detection

Slide 11

What is ARKit?
• Tracking
• Scene understanding
• Rendering

Slide 12

TRACKING
Achieved in part with visual-inertial odometry (VIO)

Slide 13

Inertial Odometry
“I’m at (0, 0, 0)!”

Slide 14

Inertial Odometry
“I accelerated at 0.9 m/s² for 0.5 s, so I moved 11 cm! Also, I rotated about 0.3 radians around the Z axis!”
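
As a quick check on that arithmetic, here is a minimal dead-reckoning sketch in Swift; the values are the hypothetical ones from the slide, not ARKit output. Starting from rest, displacement is d = ½at².

// Dead reckoning from a single (hypothetical) acceleration sample.
let a: Float = 0.9       // acceleration, m/s²
let t: Float = 0.5       // duration, s
let d = 0.5 * a * t * t  // 0.5 × 0.9 × 0.25 = 0.1125 m ≈ 11 cm
print("moved \(d * 100) cm")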

Slide 15

Getting Visual
• Integrating acceleration alone is subject to drift
• Fortunately, it’s only half of the visual-inertial odometry puzzle
• The other half, naturally, is visual

Slide 16

Visual Odometry
• The process of correlating features between video frames
• Part of simultaneous localization and mapping (SLAM)
Source: http://rpg.ifi.uzh.ch/docs/VO_Part_I_Scaramuzza.pdf

Slide 17

SCENE UNDERSTANDING
The ability to make sense of the world via object and image recognition
• Horizontal and vertical planes
• Faces
• Registered images and scanned objects

Slide 18

RENDERING
Utilities for interoperating with game/rendering frameworks
• SpriteKit
• SceneKit
• Metal*

Slide 19

ARKIT IN PRACTICE
with SceneKit

Slide 20

SESSIONS AND CONFIGURATIONS
• ARSession: a context object that manages underlying AVFoundation and CoreMotion sessions, and holds the state of the tracked world
• ARConfiguration: a descriptor object containing properties that control which tracking features are active

Slide 21

Acquiring a Session

let session = view.session // where view is an ARSCNView

Slide 22

Configuration Types
Orientation Tracking
• 3DOF device orientation
World Tracking
• 6DOF device position/orientation, plane detection, hit testing
Face Tracking
• Facial feature tracking using the front-facing camera on iPhone X
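
As a sketch of working with these types, a configuration’s availability can be checked at runtime via the isSupported class property before running it (session here is the one acquired earlier):

// Fall back gracefully when a configuration isn’t available on this device.
if ARFaceTrackingConfiguration.isSupported {
    session.run(ARFaceTrackingConfiguration())
} else if ARWorldTrackingConfiguration.isSupported {
    session.run(ARWorldTrackingConfiguration())
}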

Slide 23

Configuration Types (new in iOS 12)
Image Tracking
• Position and orientation tracking for known images
Object Scanning
• Position and orientation of scanned 3D objects

Slide 24

Running a Configuration

let configuration = ARWorldTrackingConfiguration()
configuration.planeDetection = [.horizontal]

let sessionOptions: ARSession.RunOptions = [.resetTracking, .removeExistingAnchors]
session.run(configuration, options: sessionOptions)

Slide 25

ARSessionObserver

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera)
func sessionWasInterrupted(_ session: ARSession)
func sessionInterruptionEnded(_ session: ARSession)
func session(_ session: ARSession, didFailWithError error: Error)
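
A minimal sketch of receiving these callbacks: conform to ARSessionDelegate (which refines ARSessionObserver) and make the object the session’s delegate. The class name and logging here are illustrative only.

import ARKit

class SessionEventLogger: NSObject, ARSessionDelegate {
    func sessionWasInterrupted(_ session: ARSession) {
        print("Session interrupted; camera feed unavailable")
    }
    func session(_ session: ARSession, didFailWithError error: Error) {
        print("Session failed: \(error.localizedDescription)")
    }
}

let logger = SessionEventLogger()  // keep a strong reference; the delegate property is weak
session.delegate = logger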

Slide 26

Tracking Status
States: Not Available, Initializing, Normal, Excessive Motion, Insufficient Features, Relocalizing
Transitions (from the slide’s state diagram):
• session started → Initializing
• features became visible → Normal
• device moving excessively → Excessive Motion; device motion calmed → Normal
• too dark or not enough visible features → Insufficient Features
• session interrupted or restarted → Relocalizing; relocalized successfully → Normal
• session configuration changed → Initializing
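
These states correspond to ARCamera.TrackingState. A hedged sketch of reacting to changes (the messages are illustrative):

func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
    switch camera.trackingState {
    case .normal:
        print("Tracking normally")
    case .notAvailable:
        print("Tracking not available")
    case .limited(.initializing):
        print("Initializing")
    case .limited(.excessiveMotion):
        print("Too much motion; slow down")
    case .limited(.insufficientFeatures):
        print("Too dark or not enough visible features")
    case .limited(.relocalizing):
        print("Relocalizing after an interruption")
    case .limited:
        print("Tracking limited")
    }
}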

Slide 27

Anchors
An anchor is an object with a real-world position. Examples include:
• Planes
• Faces
• Images
• Objects

Slide 28

ARSCNView
Subclass of SCNView
• Responsible for rendering a scene
• Also associates nodes with anchors

Slide 29

SCENE GRAPHS
• A data structure for representing object hierarchies
• Manages nodes, each of which can have:
  • One parent
  • Any number of child nodes
• Nodes can also hold geometry, cameras, lights, etc.
(Diagram: a hierarchy with Root at the top; Light, Camera, and Torso as children; Arm, Head, Leg … under Torso)

Slide 30

Scene Graphs in SceneKit
SceneKit has classes corresponding to common scene graph object types:
• SCNNode
• SCNGeometry
• SCNMaterial
• SCNCamera
• SCNLight
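
To make the hierarchy concrete, here is a small sketch that assembles a few nodes; the geometry choices are illustrative, not from the talk:

import SceneKit

let root = SCNNode()

let cameraNode = SCNNode()
cameraNode.camera = SCNCamera()
root.addChildNode(cameraNode)

let torso = SCNNode(geometry: SCNBox(width: 0.4, height: 0.6, length: 0.2, chamferRadius: 0))
let head = SCNNode(geometry: SCNSphere(radius: 0.15))
head.position = SCNVector3(0, 0.45, 0)  // positioned in the torso’s (parent) space
torso.addChildNode(head)
root.addChildNode(torso)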

Slide 31

ARSCNViewDelegate
A specialization of the ARSessionObserver protocol used by ARSCNView to
• pass through session lifecycle notifications
• inform when nodes have been created for anchors

Slide 32

Adding Geometry to an SCNNode

func renderer(_ renderer: SCNSceneRenderer, didAdd node: SCNNode, for anchor: ARAnchor) {
    if let planeAnchor = anchor as? ARPlaneAnchor {
        // `device` is assumed to be the view's MTLDevice (e.g., MTLCreateSystemDefaultDevice())
        let geometry = ARSCNPlaneGeometry(device: device)!
        geometry.update(from: planeAnchor.geometry)
        node.geometry = geometry
    }
}

Slide 33

Planar Occlusion
• Recognize one or more planes
• Set their rendering order to 0 to draw them first
• Set their color mask to 0 to prevent drawing into the color buffer
• Other objects (appear to) get clipped to the planes
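
A hedged sketch of that setup in SceneKit, assuming planeNode wraps a recognized plane. (One common variant uses a negative renderingOrder so the plane sorts before nodes left at the default order of 0.)

let occlusionMaterial = SCNMaterial()
occlusionMaterial.colorBufferWriteMask = []  // depth is still written; color is not

planeNode.geometry?.firstMaterial = occlusionMaterial
planeNode.renderingOrder = -1                // draw before default-order nodes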

Slide 34

Sample Code
You can experiment further with ARKit + SceneKit using the ARKitOnSCN project in the sample source.

Slide 35

RENDERING WITH METAL

Slide 36

WHAT IS METAL?
• A low-level graphics and compute API
• Introduced in iOS 8 in 2014, and on the Mac in OS X El Capitan
• A spiritual successor to OpenGL (ES)
• Designed to work well with Apple GPUs

Slide 37

Metal as Platform Enabler
Metal now underlies many Apple frameworks:
• Core Graphics
• Core Animation / macOS Window Server
• SceneKit
• SpriteKit
• CoreML / Vision (where applicable)
• OpenGL (!)

Slide 38

FUNDAMENTALS OF 3D GRAPHICS
• Representing Objects with Vertices
• Coordinate Spaces
• Transformations
• The Metal API
• Shaders and Pipelines
• Lighting and Texturing

Slide 39

AN EXAMPLE VERTEX
• position: (x, y, z)
• normal: (x, y, z)
• texture coords: (s, t)

Slide 40

Building a Model from Vertices
• Objects typically contain many vertices
• Vertices are connected by edges into primitives, which are usually triangles
• Vertices and the connectivity information (in the form of indices) are stored in buffers
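
As a sketch of that last point, assuming vertices and indices arrays and a Vertex struct like the one shown later:

import Metal

let device = MTLCreateSystemDefaultDevice()!

// Upload vertex data and connectivity (index) data into GPU-accessible buffers.
let vertexBuffer = device.makeBuffer(bytes: vertices,
                                     length: vertices.count * MemoryLayout<Vertex>.stride,
                                     options: [])!
let indexBuffer = device.makeBuffer(bytes: indices,
                                    length: indices.count * MemoryLayout<UInt16>.stride,
                                    options: [])!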

Slide 41

Coordinate Systems
• When rendering, our aim is to produce an image of a scene from a point of view
• Objects are modeled relative to some natural origin
• We need to transform from this model space into a consistent eye space, the coordinate system of our virtual camera

Slide 42

Model Space to World Space
The model-to-world transformation places objects relative to one another in world space.

Slide 43

World Space to Eye Space
The world-to-eye transformation places all objects in the coordinate space of the point of view, the camera.

Slide 44

Eye Space to Clip Space
The eye-to-clip transformation, called a projection transformation, moves us into a normalized coordinate space (NDC).

Slide 45

Transformations
There are a few essential transformations we want to perform:
• Translation
• Rotation
• Scale

Slide 46

Transformation Matrices
Each transformation corresponds to a matrix of a certain form:

Translation:
| 1 0 0 tx |
| 0 1 0 ty |
| 0 0 1 tz |
| 0 0 0 1  |

Rotation (about Z):
| cosθ -sinθ 0 0 |
| sinθ  cosθ 0 0 |
| 0     0    1 0 |
| 0     0    0 1 |

Scale:
| sx 0  0  0 |
| 0  sy 0  0 |
| 0  0  sz 0 |
| 0  0  0  1 |
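
The float4x4(translationBy:) helper used on the following slides is not part of simd itself; a minimal sketch of how such an extension might look (column-major, with the translation in the fourth column):

import simd

extension float4x4 {
    init(translationBy t: SIMD3<Float>) {
        self = matrix_identity_float4x4
        self.columns.3 = SIMD4<Float>(t.x, t.y, t.z, 1)
    }
}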

Slide 47

Transforming a Vector

let vector = float4(0.5, 0.4, 0.1, 1)
let matrix = float4x4(translationBy: float3(1, 1, 0))
let transformedVector = matrix * vector

Slide 48

Composing a Node’s Transform

func composeTransformComponents() {
    let T = float4x4(translationBy: translation)
    let R = float4x4(rotationFromEulerAngles: eulerAngles)
    let S = float4x4(scaleBy: scale)
    matrix = T * R * S // applies scale first, then rotation, then translation
}

Slide 49

Metal API Essentials
• Devices
• Resources
• Libraries and Functions
• Render Pipeline States
• Command Submission
• Draw Calls

Slide 50

DEVICES
• MTLDevice: an abstraction of a GPU
• Macs can have multiple GPUs (e.g., an Intel integrated GPU and a discrete AMD GPU)
• Used to create various other Metal objects:
  • Resources (buffers & textures)
  • Command queues
  • Render pipeline states
(Diagram: CPU and GPU, each with its own memory; the MTLDevice represents the GPU)

Slide 51

RESOURCES
• All data used for rendering has to exist in GPU-accessible resources
• This includes geometric data like vertex positions, texture coordinates, normals, etc.
• Again, these are stored in buffers
• It also includes image data, stored as textures

Slide 52

SHADERS
• Colloquially, a shader is a small program that runs on the GPU
• Metal doesn’t expressly use this term
• Vertex and fragment functions are written in Metal Shading Language, a dialect of C++

Slide 53

Vertex Functions
A vertex function…
• runs once* for each vertex in each triangle
• reads data from buffers (and potentially textures)
• produces a vertex whose position is in clip space

Slide 54

Example Vertex Function

vertex VertexOut vertex_main(VertexIn in [[stage_in]],
                             constant Uniforms &uniforms [[buffer(0)]])
{
    VertexOut out;
    float4 modelPosition = float4(in.position, 1);
    out.clipPosition = uniforms.modelViewProjectionMatrix * modelPosition;
    out.eyePosition = (uniforms.modelViewMatrix * modelPosition).xyz;
    out.eyeNormal = uniforms.normalMatrix * in.normal;
    out.texCoords = in.texCoords;
    return out;
}

Slide 55

Fragment Functions
A fragment function
• runs once per fragment
• uses interpolated values produced by the rasterizer
• may also sample from one or more textures
• is responsible for producing the color of the fragment
• which may be blended with an existing color

Slide 56

Example Fragment Function

fragment half4 fragment_main(FragmentIn in [[stage_in]],
                             constant Uniforms &uniforms [[buffer(0)]],
                             texture2d<float> diffuseTexture [[texture(0)]])
{
    constexpr sampler linearSampler(filter::linear);
    float4 baseColor = diffuseTexture.sample(linearSampler, in.texCoords);
    float3 diffuseColor = baseColor.rgb;
    float3 L = normalize(uniforms.eyeLightPosition - in.eyePosition);
    float3 N = normalize(in.eyeNormal);
    float diffuse = saturate(dot(N, L));
    float3 color = diffuse * diffuseColor;
    return half4(half3(color), baseColor.a);
}

Slide 57

Libraries
• A library is a collection of functions, stored as a .metallib file
• All .metal files in a project are compiled into the default library
• At runtime, MTLLibrary objects provide MTLFunction objects, which can be linked together into pipeline states
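
A minimal sketch of fetching the default library and its functions at runtime; the names passed to makeFunction(name:) must match functions compiled into the project’s .metal files:

let device = MTLCreateSystemDefaultDevice()!
let library = device.makeDefaultLibrary()!

let vertexFunction = library.makeFunction(name: "vertex_main")
let fragmentFunction = library.makeFunction(name: "fragment_main")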

Slide 58

VERTEX DESCRIPTORS
• Describe how data is laid out in buffers
• Contain:
  • attributes, which are properties specified for a vertex (position, normal, texture coordinates, etc.)
  • layouts, which describe how much space each vertex occupies (the stride)

Slide 59

An Example Vertex
• position: (x, y, z)
• normal: (x, y, z)
• texture coords: (s, t)

Slide 60

A Basic Vertex Struct

// Swift
// packed_float3/packed_float2 are tightly packed float types defined in the sample code;
// Swift itself has no packed simd types.
struct Vertex {
    var position: packed_float3
    var normal: packed_float3
    var texCoords: packed_float2
}

// Metal
struct Vertex {
    float3 position  [[attribute(0)]];
    float3 normal    [[attribute(1)]];
    float2 texCoords [[attribute(2)]];
};

Slide 61

Vertex Descriptor

let descriptor = MTLVertexDescriptor()

// position
descriptor.attributes[0].bufferIndex = 0
descriptor.attributes[0].format = .float3
descriptor.attributes[0].offset = 0

// normal
descriptor.attributes[1].bufferIndex = 0
descriptor.attributes[1].format = .float3
descriptor.attributes[1].offset = MemoryLayout<Float>.stride * 3

// texture coordinates
descriptor.attributes[2].bufferIndex = 0
descriptor.attributes[2].format = .float2
descriptor.attributes[2].offset = MemoryLayout<Float>.stride * 6

descriptor.layouts[0].stepFunction = .perVertex
descriptor.layouts[0].stepRate = 1
descriptor.layouts[0].stride = MemoryLayout<Float>.stride * 8

Slide 62

RENDER PIPELINE STATES
• Render pipeline states gather together configuration and code that describe how Metal should render objects:
  • Shader functions
  • Vertex descriptor
  • Framebuffer configuration
• We provide this information up front because compiling a pipeline state is expensive, so we only want to do it once

Slide 63

Creating a Render Pipeline

let descriptor = MTLRenderPipelineDescriptor()
descriptor.vertexFunction = library.makeFunction(name: "vertex_main")
descriptor.fragmentFunction = library.makeFunction(name: "fragment_main")
descriptor.colorAttachments[0].pixelFormat = .bgra8Unorm
descriptor.depthAttachmentPixelFormat = .depth32Float

let vertexDescriptor = MTLVertexDescriptor()
// …configure vertex descriptor…
descriptor.vertexDescriptor = vertexDescriptor

do {
    return try device.makeRenderPipelineState(descriptor: descriptor)
} catch {
    fatalError("Could not create pipeline state for full-screen quad")
}

Slide 64

COMMAND SUBMISSION
• Commands in Metal are submitted to command queues
• Command queues allow ordered execution of GPU commands
• Commands are written into command buffers, which are then enqueued on a queue
• Commands are written by command encoders, which have API for setting state and performing draw calls

Slide 65

Creating a Command Queue

commandQueue = device.makeCommandQueue()!

Slide 66

Encoding a Frame
• Create a command buffer
• Create a command encoder
• Set state
• Issue commands
• Present the drawable
• Repeat at 60 FPS
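
A consolidated sketch of those steps inside an MTKView delegate, assuming commandQueue and pipelineState were created up front:

func draw(in view: MTKView) {
    guard let pass = view.currentRenderPassDescriptor,
          let drawable = view.currentDrawable,
          let commandBuffer = commandQueue.makeCommandBuffer(),
          let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass)
    else { return }

    encoder.setRenderPipelineState(pipelineState)
    // …set vertex/fragment buffers and textures, issue draw calls…
    encoder.endEncoding()

    commandBuffer.present(drawable)
    commandBuffer.commit()
}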

Slide 67

Creating a Command Buffer

commandBuffer = commandQueue.makeCommandBuffer()!

Slide 68

Creating a Command Encoder

guard let pass = view.currentRenderPassDescriptor else { return }
renderCommandEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass)!

Slide 69

Setting State
• Setting vertex buffers and textures
• Setting fragment buffers and textures

Slide 70

Setting Vertex Buffers

for (index, buffer) in geometry.buffers.enumerated() {
    renderCommandEncoder.setVertexBuffer(buffer, offset: 0, index: index)
}

var uniforms = Uniforms()
uniforms.modelMatrix = node.worldTransform.matrix

let uniformPtr = instanceUniformBuffer.contents().assumingMemoryBound(to: Uniforms.self)
uniformPtr.pointee = uniforms

renderCommandEncoder.setVertexBuffer(instanceUniformBuffer, offset: 0, index: uniformsIndex)

Slide 71

Setting Fragment Textures

renderCommandEncoder.setFragmentTexture(textureForMaterialProperty(material.diffuse),
                                        index: Material.TextureIndex.diffuse)
renderCommandEncoder.setFragmentTexture(textureForMaterialProperty(material.normal),
                                        index: Material.TextureIndex.normal)
renderCommandEncoder.setFragmentTexture(textureForMaterialProperty(material.emissive),
                                        index: Material.TextureIndex.emissive)

Slide 72

Draw Calls
So, we’ve set a lot of state, but how do we actually draw stuff?

Slide 73

Issuing Draw Calls

renderCommandEncoder.drawIndexedPrimitives(type: element.primitiveType,
                                           indexCount: element.indexCount,
                                           indexType: element.indexType,
                                           indexBuffer: element.indexBuffer,
                                           indexBufferOffset: element.indexBufferOffset)

Slide 74

Finishing the Pass

renderCommandEncoder.endEncoding()

Slide 75

Getting on the Screen
• We’ve been tacitly drawing into a texture which can be shown on screen
• This texture is wrapped by an object called a drawable
• This drawable is what we hand off to the compositor

Slide 76

Presenting a Drawable

guard let drawable = view.currentDrawable else { return }
commandBuffer.present(drawable)

Slide 77

Finishing the Frame
Now that all the work is encoded, we need to tell the GPU to execute it.

Slide 78

Committing

commandBuffer.commit()

Slide 79

ARKIT AND METAL

Slide 80

AN ENGINE FOR METAL AR
We want to build an extensible engine like SceneKit. No big deal, right? We just need:
• Scenes, nodes, cameras, materials, lights, shading models, multipass rendering with postprocessing, model loaders, physics, animations, hit-testing, audio…

Slide 81

Hands-On with the Engine
Let’s take a detailed look at various aspects of the source:
• Scenes and Nodes
• Cameras and Lights
• Geometry and Materials
• ARMTKView
• The Renderer

Slide 82

(Image-only slide)

Slide 83

FUTURE DIRECTIONS
Much, much more to do!
• Light and shadow (directional lights, contact shadows, SH)
• Animation, face morph targets and weights
• Object and image recognition
• Integration with CoreLocation, CoreML, Vision, etc.
• Physics
• Audio
…but we’re off to a good start

Slide 84

Q&A
Thank you!