Mix & Augmented Reality

M I X E D & A U G M
E N T E D R E A L I T Y T H E W O R L D B E Y O N D Qian JIN | @bonbonking | [email protected] 2017 EPF Course

A B O U T M E

A G E N D A • Definitions • AR
Market • AR domain knowledges • SDK & Platform (ARKit, ARCore, Tango…) • MR & AR Ecosystem & Use cases • Takeaways

I N T R O D U C T I
O N • What’s the differences between VR, AR & MR? • Timeline of Augmented reality • Augmented reality in 2016

Source: http://www.recode.net/2015/7/27/11615046/whats-the-difference-between-virtual-augmented-and-mixed-reality

W H AT ’ S T H E D I
F F E R E N C E S ?

Source: https://www.wired.com/2016/04/magic-leap-vr/ Virtual Reality VR places the user in another
location entirely. Whether that location is computer-generated or captured by video, it entirely occludes the user’s natural surroundings.

Source: https://www.wired.com/2016/04/magic-leap-vr/ Augmented Reality In augmented reality—like Google Glass or
the Yelp app’s Monocle feature on mobile devices— the visible natural world is overlaid with a layer of digital content.

Source: https://www.wired.com/2016/04/magic-leap-vr/ Mixed Reality In technologies like Magic Leap’s, virtual
objects are integrated into—and responsive to—the natural world. A virtual ball under your desk, for example, would be blocked from view unless you bent down to look at it. In theory, MR could become VR in a dark room.

A U G M E N T E D R
E A L I T Y T I M E L I N E

1968: Ivan Sutherland, ﬁrst HMD (Head Mounted Display) system

1974: Myron Krueger, Videoplace

1990: Tom Caudell, Boeing researcher Coined the term “Augmented Reality”

1992: Louis Rosenberg, “Virtual Fixture”

1994: Julie Martin, “Dancing in Cyberspace”

1998: First virtual yellow ﬁrst down marker during a live
NFL game

1999: Naval reserches on Battleﬁeld Augmented Reality System (BARS)

2000: Hirokazu Kato, ARToolKit

2003: Sportvision, 1st & Ten Computer graphic system

2009: Esquire Magazine AR cover, Feat. Robert Downey JR.

2013: Volkswagen MARTA (Mobile Augmented Reality Technical Assistance)

2013: Google Glass, the trend of wearable AR

2013: Magic Leap, Series A funding

2016: AR & VR investment reach $1.1 billon

2016: Microsoft HoloLens & Meta 2 release Developer Kit

A U G M E N T E D R
E A L I T Y I N 2 0 1 6

A U G M E N T E D R
E A L I T Y I N 2 0 1 7

D E F I N I T I O N
• Augmented reality (AR) is a live direct or indirect view of a physical, real-world environment whose elements are augmented (or supplemented) by computer-generated sensory input such as sound, video, graphics or GPS data.

D E F I N I T I O N
• Enhancing one’s current perception of reality with digital information and media, such as 3D models and videos • Overlaying in real-time the camera view of your smartphone, tablet, PC or connected glasses.

A R M A R K E T / B
U S I N E S S A S P E C T S • AR Market size • Segmentation of things • Business chain

A R I N T H E N E X
T Y E A R S • Focus on entreprise augmented reality apps • 2014: $247 million • 2019: $2.4 billion (prediction) • Consumer products launches expected for 2017 Source: https://www.juniperresearch.com/press/press-releases/enterprise-ar-app-revenues-reach-2-4bn-by-2019

A R D O M A I N K N
O W L E D G E S • Hardware • Software and Algorithms

H A R D WA R E • Eyeglasses •
HMD (Head Mounted Display) • HUD (Head-up Display) • Contact lenses • Virtual Retinal Display

Source: http://spectrum.ieee.org/biomedical/bionics/augmented-reality-in-a-contact-lens/eyesb1

S O F T WA R E • Basics of
Computer Vision • Marker based AR vs Markerless AR • ARML (Augmented Reality Markup Language)

B A S I C S O F C O
M P U T E R V I S I O N

C O M P U T E R V I
S I O N : T Y P I C A L TA S K S • Recognition: Object recognition, Identification, Detection • Motion analysis: Egomotion, Tracking, Optical flow • Scene reconstruction: computing a S3 model of the scene • Image restoration: the removal of noises

C O M P U T E R V I
S I O N : S Y S T E M M E T H O D S • Image acquisition • Pre-processing: re-sampling, noise reduction, constrast enhancement • Feature extraction • Detection / Segmentation • High-level processing: Image recognition / Image registration • Decision making

C O M P U T E R V I
S I O N : I M A G E R E G I S T R AT I O N • First stage can use feature detection methods like corner detection, blob detection, edge detection or thresholding and/or other image processing methods. • The second stage restores a real world coordinate system from the data obtained in the first stage.

M A R K E R B A S E
D A R V S M A R K E R L E S S A R

Source: http://eejournal.com/archives/articles/20140401-augmented/ C A M E R A A C
Q U I S I T I O N T R A C K I N G R E N D E R I N G A U G M E N T E D I M A G E V I RT U A L C O M P O N E N T D I S P L AY

M A R K E R B A S E
D P R O C E S S I N G • Converting the input image to grayscale • Performing binary threshold operations in order to generate a high contrast black and white image • Detecting contours in order to "bound" the marker • Identifying marker candidates, and then • Performing distortion correction in order to enable accurate marker decode

M A R K E R L E S S
P R O C E S S I N G • Without fiducial markers, the camera position must be determined through “natural feature tracking” using feature-based detection, tracking, and matching. This approach is associated with the SLAM (simultaneous localization and mapping) techniques that have been developed in robotic research.

A R M L

A R M L : D E F I N
I T I O N • Augmented Reality Markup Language (ARML) is a data standard developed within the Open Geospatial Consortium (OGC), which consists of an XML grammar to describe the location and appearance of virtual objects in the scene, as well as ECMAScript bindings to allow dynamic access to properties of virtual objects.

A R M L : M A I N C
O N C E P T S • Features represent the physical object that should be augmented. • VisualAssets describe the appearance of the virtual object in the augmented scene. • Anchors describe the spatial relation between the physical and the virtual object.

A R M L : A N C H O
R An Anchor describes the location of the physical object in the real world. Four different Anchor types are defined in ARML: • Geometries • Trackables • RelativeTo • ScreenAnchor

Augmented Reality Computer Vision Surface Estimation Scene Understanding Feature Detection
Bundle Adjustment Sensor Fusion Camera Calibration Visual-inertial Navigation SLAM Feature Matching Light Estimation Camera Intrinsics Optimal Correction Nonlinear Optimization Triangulation

S D K / P L AT F O R
M • Project Tango (Android) • ARKit (iOS) • ARCore (Android) • Platforms (Augment, Layar, Wikitude, Vufuria, ARToolKit) • WebAR: Javascript frameworks

His 10M views Youtube video of Head Tracking for Desktop
VR Displays using the Wii Remote is purely amazing!

P R O J E C T TA N G
O • Smartphones lack the understanding of the environment • Teach phone to see & understand the environment • Augment & improve our own ability to answer the question such as: • How many paints do I need to fill up this wall? • Will this couch fit in my living room? • How do I get from point A to point B?

Motion Tracking Area Learning Depth Perception

Project Tango Prototype

Project Tango Dev Kit

ASUS ZenFone AR Lenovo Phab 2 Pro

W H AT ' S A R E A L
E A R N I N G I N TA N G O ? Area Learning gives the device the ability to see and remember the key visual features of a physical space—the edges, corners, other unique features. • Drift correction (also called loop closures) • Area Description File (ADF) Align the virtual & the physical world.

W H AT ' S D E P T H
P E R C E P T I O N I N TA N G O ? Common depth perception technologies: • Structured-light (require IR projector & IR sensor) • Time of flight (require IR projector & IR sensor) • Stereoscopy Tango's depth perception works best indoors at moderate distances (0.5 to 4 meters). Areas lit with light sources high in IR like sunlight or incandescent bulbs, or objects that do not reflect IR light, cannot be scanned well.

W H AT ' S D E P T H
P E R C E P T I O N I N TA N G O ? Tango API provide a function to get depth data in the form of a point cloud. This format gives (x, y, z) coordinates for as many points in the scene as are possible to calculate. *{X, Y, Z, C}, C here means Conﬁdence.

P R O J E C T TA N G
O T E A R D O W N

H O W D O E S D E P
T H S E N S O R W O R K ? The IR projector projects a pattern of IR light which falls on objects around it like a sea of dots. We can't see the dots because the light is projected in the Infrared color range. The IR camera sees the dots and sends its video feed of this distorted dot pattern to the processor. Processor works out depth from the displacement of the dots: on near objects the pattern is spread out, on far objects the pattern is dense. Ref: https://jahya.net/blog/how-depth-sensor-works-in-5-minutes/

TA N G O D E V I C E
T E A R D O W N The depth-sensing array in Tango prototype includes: • an infrared projector • 4 MP rear-facing RGB/IR camera • 180º field of view fisheye rear-facing camera

T E A R D O W N IR projector: provides infrared light that other (non-RGB) cameras can use to get a sense of an area in 3D space. Quote Google: "The IR projector is from Mantis Vision, and designed specific to our specs for field of view and resolution. It is custom designed to work in partnership with the 4MP RGB-IR camera on the other side."

IR projector from Tango Tablet & Tango's prototype kit

T E A R D O W N • The bright grid of dots shows that Tango works similarly to the original Microsoft Kinect, with a grid of dots to be captured by the IR sensors of the 4 MP camera, building a depth map.

A R K I T • Mobile AR platform •
High-level API • iOS (A9 and up)

World tracking Visual inertial odometer No external setup Plane detection
Hit-testing Light estimation Easy integration AR views Custom rendering Tracking Scene Understanding Rendering

Rendering Application SceneKit SpriteKit Metal Processing ARKit

AVFoundation CoreMotion Capturing ARKit

ARSession

ARSessionConfiguration ARSession

ARSessionConfiguration run(_ configuration) ARSession

ARSessionConfiguration AVCaptureSession CMMotionManager run(_ configuration) ARSession

ARSessionConfiguration ARFrame ARFrame ARFrame AVCaptureSession CMMotionManager run(_ configuration) currentFrame ARSession

A R C O R E ( N O T
T H E C I T Y )

S U P P O R T E D D
E V I C E S • ARCore is designed to work on a wide variety of qualified Android phones running N and later. During the SDK preview, ARCore supports the following devices: • Google Pixel, Pixel XL, Pixel 2, Pixel 2 XL • Samsung Galaxy S8 (SM-G950U, SM-G950N, SM-G950F, SM-G950FD, SM-G950W, SM-G950U1)

M O T I O N T R A C
K I N G • Feature Point, Pose (Position & Orientation) • Concurrent Odometry and Mapping (COM)

E N V I R O N M E N
TA L U N D E R S TA N D I N G • Plane detection (beware of flat surfaces without textures)

L I G H T E S T I M
AT I O N • ARCore provides the average environment lightning intensity

O T H E R F U N D A
M E N TA L C O N C E P T S • User interaction: hit-testing • Anchoring objects

H O W M U C H I S A
R C O R E TA N G O ? atap = (Google's) Advanced Technology and Projects See: https://atap.google.com/

K E Y C O M P O N E
N T S https://developers.google.com/ar/reference/java/com/google/ar/core/package-summary

A R J AVA S C R I P T
F R A M E W O R K S

E C O S Y S T E M •
Wearable products • Mobiles apps • Industrial 4.0 • (Some other) AR startups

W E A R A B L E S •
Google Glass • ODG Smart Glasses • Magic Leap • Microsoft Hololens • Metavision

G O O G L E

Source: https://www.technologyreview.com/s/524576/glass-darkly/

Magic Leap

what if computing could spill outside the computer?

Holograms are real!

A P P S • Game (Ingress, Pokemon Go) •
Education / Medicine • Entertainment (InkHunter, Snapchat) • Consumer (ikea) • Utility (Google Translate)

A R G A M E S

K E Y E L E M E N T
S • Real life location & physical portal mapping • Collecting equipments / gears • Individual challenges (e.g. attack a portal & a stadium) • League / Fraction

A R F O R E D U C AT
I O N

A R F O R M E D I C
I N E

A R F O R E N T E R
TA I N M E N T

Snapchat Lens

How Snapchat ﬁlters really works?

Snapchat Spectacles

I N K H U N T E R

A R C O N S U M E R
A P P S

A R U T I L I T Y A
P P S

I N D U S T R I A L
4 . 0 • DAQRI: Smart Helmet / Smart Glasses • DHL • Caterpillar

( S O M E O T H E R
) A R S TA R T U P S • 8i (Los Angeles, USA / Wellington, New Zealand) • Immersiv (Paris, France) • Wingnut AR (Wellington, New Zealand)

8 I : R E A L H U M
A N H O L O G R A M S F O R A U G M E N T E D , V I R T U A L A N D M I X E D R E A L I T Y

W I N G N U T A R

TA K E A WAY S • Smart Dust •
Visual Positioning Service • Augmented Reality: A Compelling Mobile Embedded Vision Opportunity

Ref: http://www.legitreviews.com/qualcomm-2nd-gen-spectra-depth-sensing-camera-coming-soon_197068

U B I Q U I T O U S
F U T U R E We will have more and more ways to establish the communication between the virtual world and the physical world. When this day comes, technologies will be truly ubiquitous.

Mini Project

A G E N D A • 15min: Team up
(2-3 people per team) + Brainstorming augmented reality use cases • 45min: Elaborate your idea • 30min: Pitch your idea (5min max per team)

S O M E N O T E S •
IMPORTANT: Send your slide (in PDF) to [email protected] with your team members’ names. They will be evaluated along with your presentation. • Any support is allowed (Slides, White board, paper drawing…) • Live demo is welcomed! • Be imaginative :) • Use lean canvas if you need help elaborating your ideas

Mix & Augmented Reality

Mix & Augmented Reality

More Decks by jinqian

Other Decks in Technology

Featured

Transcript