
Extend User Experience of WebRTC with Cool Sensor Devices

mganeko
November 02, 2017

Presented at Cloud Expo 2017; the original title was "Extend User Experience of WebRTC with Unique Sensor Devices".
Covers using WebRTC with a 360 camera, a microphone array, 3D scanning with RealSense, and holographic devices such as the Dreamoc HD3 and HoloLens.

Transcript

  1. Extend User Experience of WebRTC with Unique Sensor Devices

    Masashi Ganeko, INFOCOM CORPORATION
    Nov. 2, 2017, Cloud Expo 2017 @Santa Clara
  2. Extend User Experience of WebRTC with Cool Sensor Devices

    (“Unique” in the original title is replaced by “Cool”)
    Masashi Ganeko, INFOCOM CORPORATION
    Nov. 2, 2017, Cloud Expo 2017 @Santa Clara
  3. About myself

    • Masashi Ganeko / @massie_g
      – Manager of a research team
      – INFOCOM CORPORATION (Tokyo, Japan)
        • http://infocom.co.jp/english/index.html
    • One of the organizers of WebRTC Meetup Tokyo
      – https://atnd.org/groups/webrtc
    • English presentations on WebRTC (2013-2017)
      – https://speakerdeck.com/mganeko
    • Japanese presentations on WebRTC (2013-2017)
      – http://www.slideshare.net/mganeko
  4. What is WebRTC

    • Web Real-Time Communication for
      – Video
      – Audio
      – Data
    • Open standard
      – W3C WebRTC Working Group … API
      – IETF RTCWEB Working Group … Protocol
      – The core library is open source software
    • Designed for web browsers and other web-connected devices
    • Easy to combine with other Web technologies
  5. What I want to talk about today

    • WebRTC is a very useful tool for building your own communication application
    • WebRTC + sensor devices → a more interesting and exciting user experience
    • Two experimental projects that show “the Power of WebRTC”
      – Shotoku-Tamago
      – Virtual Teleport
  6. Shotoku-Tamago

    • First Prize of the RICOH THETA x IoT Developers Contest 2016
      – http://contest.theta360.com/index-en.html
      – http://award.contest.theta360.com/prize1-e.html
    • 347 entries from 33 countries, 54 projects submitted
  7. Problem in Web Meeting

    • Web meetings are very common
      – They work pretty well one-to-one
      – They work for 3 or 4 distributed members
    • But the experience is poor for a meeting between a group and 1 remote member
      – It is hard for the remote member to tell who is speaking
  8. Current Solution

    • Wide camera
      – Faces are too small
    • Swing camera
      – Not automatic
      – Expensive ($1000)
  9. Purpose of Shotoku-Tamago

    • Improve the experience of the remote member in a meeting between a group and 1 person
    • Make it easy to understand:
      – who is speaking
      – their expressions, such as smiling, angry, happy, disappointed, …
    • With inexpensive devices
    • With a fixed camera, without manual operation
  10. Cool sensor devices in Shotoku-Tamago

    • RICOH THETA S (360 camera)
      – Dual fisheye lenses
      – Captures the whole meeting room at once, without swinging or moving
    • SYSTEM IN FRONTIER TAMAGO-03
      – Egg-shaped microphone array
      – Automatically locates and tracks who is speaking
      – http://www.sifi.co.jp/system/modules/pico/index.php?content_id=39&ml_lang=en
  11. Origin of Name: Shotoku Taishi

    • Legendary prince of Japan, AD 600
    • Many episodes are told about him; some might be true, some might not be
    • One of the most famous episodes:
      – When 10 people were talking to him at the same time, he could understand what each one said
      – So he is known as the “prince with multiple ears”
    • Hence the name Shotoku-Tamago
  12. Origin of Name: Shotoku Taishi (continued)

    • Shotoku-Tamago: “Tamago” means “Egg” in Japanese
  13. Whole architecture of Shotoku-Tamago

    [Diagram labels: web browsers; video/audio media over WebRTC (360 video / mono audio, video/audio); direction of speaking member over WebSocket; render with WebGL]
  14. 1. Detecting who is speaking

    • HARK: Robot Audition Software
      – http://www.hark.jp/
      – By Honda Research Institute Japan with Kyoto University
      – Royalty free for research use
    • Microphone array
      – Consists of 8 small microphones
      – Works with HARK
    • The “source tracker” of the HARK tool is used to locate and track speaking members
  15. 2. Connecting HARK tool and Web Browser

    [Diagram labels: web browsers; video/audio media over WebRTC (360 video / mono audio, video/audio); direction of speaking member over WebSocket]
  16. 2. Connecting HARK tool and Web Browser

    • The HARK tool is a standalone command-line native app
    • It cannot send data to a web browser directly
    • So a pipe tool was written in Go to act as a WebSocket server
    [Diagram: USB → HARK tool (standalone native app) → stdout / stdin → convert tool (as WebSocket server) → WebSocket → web browsers]
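
The conversion step of such a pipe tool can be sketched as follows. The tool in the talk was written in Go; this Python sketch only illustrates the idea of turning one line of text from stdout into one JSON message. The input line format (`source_id azimuth_deg`), the JSON field names, and the function names are assumptions made for illustration, not HARK's actual output format.

```python
import json


def parse_source_line(line):
    """Parse one hypothetical 'source_id azimuth_deg' line.

    Returns a dict, or None when the line does not match the assumed format.
    """
    parts = line.split()
    if len(parts) != 2:
        return None
    try:
        return {"id": int(parts[0]), "azimuth": float(parts[1])}
    except ValueError:
        return None


def to_message(line):
    """Convert one stdout line into a JSON text message for browsers."""
    source = parse_source_line(line)
    if source is None:
        return None  # skip lines that are not source-tracking output
    return json.dumps(source, sort_keys=True)
```

In a complete tool, each message would be read from the HARK process's stdout and broadcast over WebSocket to the connected browsers.
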
  17. 3. Sending direction of speaking member

    [Diagram labels: web browsers; direction of speaking member over WebSocket]
  18. 4. Capturing 360 video

    • mediaDevices.getUserMedia() captures the video in dual-fisheye format
    [Diagram labels: web browsers; video/audio media over WebRTC (360 video / mono audio, video/audio); direction of speaking member over WebSocket]
  19. 5. Sending 360 video with WebRTC

    • The dual-fisheye format video is sent over WebRTC
    [Diagram labels: web browsers; video/audio media over WebRTC (360 video / mono audio, video/audio); direction of speaking member over WebSocket]
  20. 6. Rendering 360 video with WebGL

    • RICOH sample: https://github.com/ricohapi/video-streaming-sample-app/tree/master/samples/oneway-watch
    • The dual-fisheye format video is mapped to a sphere with UV mapping
    • Rendered with WebGL (three.js)
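
The mapping from a direction on the sphere to a texture coordinate in the dual-fisheye frame can be sketched as below. This is a minimal sketch, assuming an ideal equidistant fisheye model, a 180-degree field of view per lens, and two image circles placed side by side (front lens in the left half of the frame). The real THETA S layout and the mapping used in the RICOH sample may differ, and the function name is hypothetical.

```python
import math


def dual_fisheye_uv(x, y, z, fov_deg=180.0):
    """Map a unit direction (x, y, z) to (u, v) in a dual-fisheye frame.

    Assumed layout: front lens (looking along +z) in the left half,
    rear lens (looking along -z) in the right half, u and v in [0, 1].
    """
    half_fov = math.radians(fov_deg) / 2.0
    front = z >= 0.0
    axis_z = z if front else -z
    # Angle between the direction and the lens's optical axis.
    theta = math.acos(max(-1.0, min(1.0, axis_z)))
    # Equidistant model: image radius grows linearly with theta.
    r = theta / half_fov
    # The rear lens sees the scene mirrored left to right.
    phi = math.atan2(y, x if front else -x)
    center_u = 0.25 if front else 0.75
    u = center_u + 0.25 * r * math.cos(phi)
    v = 0.5 + 0.5 * r * math.sin(phi)
    return u, v
```

In the browser, UVs computed this way would be assigned to the vertices of the sphere geometry, with the live video used as the texture.
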
  21. 7. Cropping the faces of members who are speaking

    • The areas to crop are decided by the sound direction located with HARK
    • Sphere of 360 video
      – Up to 5 WebGL cameras
      – Up to 5 canvas elements
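
One way to turn sound directions into camera directions is sketched below. It assumes the direction arrives as an azimuth in degrees in the horizontal plane, with 0 degrees along +z and positive angles turning toward +x; this convention and the function name are hypothetical. Each returned vector is a point on the unit sphere that a camera placed at the sphere's center could look at.

```python
import math

MAX_VIEWS = 5  # the slide mentions up to 5 WebGL cameras and canvases


def look_targets(azimuths_deg, max_views=MAX_VIEWS):
    """Return one look-at point on the unit sphere per speaking member."""
    targets = []
    for azimuth in azimuths_deg[:max_views]:
        a = math.radians(azimuth)
        # Speakers are assumed to sit at roughly camera height (y = 0).
        targets.append((math.sin(a), 0.0, math.cos(a)))
    return targets
```

Each target would drive one camera inside the video sphere, and each camera would render into its own canvas element.
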
  22. Whole architecture of Shotoku-Tamago (again)

    [Diagram labels: web browsers; video/audio media over WebRTC (360 video / mono audio, video/audio); direction of speaking member over WebSocket; render with WebGL]
  23. Power of WebRTC in Shotoku-Tamago

    • Easy to handle 360 video with WebGL
      – Use VR technology in a web browser
    • Easy to utilize real-time data from sensor devices with WebSocket
      – Data from a sensor device, such as a microphone array
      – Data processed by signal processing software, such as HARK
    • The combination of all these technologies makes web meetings much more vivid
  24. Virtual Teleport

    • Real-time communication tool with
      – Forward: real-time 3D scanned hologram
      – Backward: 360 video
    • Demonstrated at the AppsJapan exhibition of Interop Tokyo, June 2017 (140,000 visitors / 3 days)
      – More than 800 guests enjoyed the new experience of holographic communication
    • Covered by Japanese web media
      – http://www.watch.impress.co.jp/headline/docs/extra/vr/1064673.html
  25. Challenge in Virtual Teleport

    • Communication with web meetings today
      – Only 2D videos of faces are transferred
    • Try future communication with Virtual Teleport
      – Transfer your presence to a remote place
      – Show your whole body as a 3D hologram, as in “STAR WARS”
  26. Cool devices in Virtual Teleport

    • Real-time 3D scan device
      – Intel RealSense R200
        • https://www.intel.com/content/www/us/en/support/emerging-technologies/intel-realsense-technology/000016214.html
        • Depth camera (IR laser projector, dual IR cameras)
    • Holographic display devices
      – Dreamoc HD3
        • https://www.realfiction.com/solutions/dreamoc-hd3
      – Microsoft HoloLens
        • https://www.microsoft.com/en-us/hololens
    [Figure labels: IR Laser Projector, IR Camera, RGB Camera]
  27. [Overview diagram labels: real-time 3D scan; show in holographic display; show 3D hologram / watch 360 video; render 360 video; capture 360 video]
  28. Whole architecture of Virtual Teleport

    • Forward: show you in 3D hologram
    • Backward: watch remote 360 video
    [Diagram labels: MediaStream, HDMI, WebSocket, DataChannel]
  29. 1. Capturing 3D in point cloud data

    • Capture with RealSense, from 4 directions
      – The data is called a “point cloud”, a set of 3D points
    • Merge the 4 point clouds from the 4 directions
      – Shown in 4 different colors in the figure on the right
  30. Merging 4 directions

    • Using multiple depth cameras is not easy
      – Each camera projects an IR laser pattern
      – Multiple patterns usually collide
    • Intel RealSense R200 can avoid the collision
      – With libRealsense
      – https://github.com/IntelRealSense/librealsense
    [Figure labels: IR Laser Projector 1, IR Laser Projector 2]
  31. Processing point cloud data

    • Reduce points
      – to support HoloLens (not so powerful)
      – to control network bitrate (< 100 Mbps)
    • Remove noise
      – remove scattered points
    • Make 3D mesh object
      – find triangles for polygons
      – connect polygons to make a mesh
      – repair holes in the mesh
      – reduce polygons of the mesh
    • 1.7 M points → 15 K points; 26 K polygons → 5.5 K polygons
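
One common way to reduce points is a voxel grid filter: space is divided into cubes of a fixed edge length, and all points falling into the same cube are replaced by their centroid. The sketch below is a plain-Python illustration of that idea, not the code used in the talk; the function name and the `leaf` parameter are chosen for illustration.

```python
import math
from collections import defaultdict


def voxel_downsample(points, leaf):
    """Replace all points in each cubic voxel of edge `leaf` by their centroid."""
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer voxel index along each axis.
        key = (math.floor(x / leaf), math.floor(y / leaf), math.floor(z / leaf))
        buckets[key].append((x, y, z))
    reduced = []
    for members in buckets.values():
        n = len(members)
        reduced.append((
            sum(p[0] for p in members) / n,
            sum(p[1] for p in members) / n,
            sum(p[2] for p in members) / n,
        ))
    return reduced
```

A larger `leaf` gives fewer output points, which is the knob that trades detail against bitrate and rendering cost.
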
  32. Point Cloud Library

    • A standalone, large-scale, open project for 2D/3D image and point cloud processing
      – http://pointclouds.org/
    • Development has been active since Kinect V1 was released
    • Used together with libRealsense
      – https://github.com/lebronzhang/pcl
      – https://github.com/lebronzhang/pcl/blob/master/visualization/tools/real_sense_viewer.cpp
  33. Using PCL for point cloud processing

    • Reduce points
      – pick up the center area … PCL PassThrough
      – choose 1 from dense points … PCL VoxelGrid
    • Remove noise
      – remove scattered points … PCL OutlierRemoval
      – smooth points … PCL OutlierRemoval
    • Make 3D mesh object
      – find triangles for polygons … GreedyProjectionTriangulation
      – connect polygons to make a mesh
      – repair holes in the mesh … VTK
        • https://www.vtk.org/Wiki/VTK/Examples/Cxx/Meshes/FillHoles
      – reduce polygons of the mesh … Reduction Polygon
        • https://github.com/PointCloudLibrary/pcl/issues/967
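
The noise removal step can be illustrated with a statistical outlier filter. The slide does not say which PCL outlier filter was used, so this is an assumption: the sketch follows the statistical approach, where a point is dropped if its mean distance to its `k` nearest neighbours is far above the average over the whole cloud. It is a brute-force, plain-Python illustration with hypothetical parameter names, not PCL code.

```python
import math


def remove_outliers(points, k=2, std_mul=1.0):
    """Drop points whose mean k-nearest-neighbour distance is unusually large."""
    if len(points) < 2:
        return list(points)
    mean_dists = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        mean_dists.append(sum(nearest) / len(nearest))
    mu = sum(mean_dists) / len(mean_dists)
    sigma = math.sqrt(sum((m - mu) ** 2 for m in mean_dists) / len(mean_dists))
    limit = mu + std_mul * sigma
    return [p for p, m in zip(points, mean_dists) if m <= limit]
```

The brute-force neighbour search is quadratic in the number of points; a real-time pipeline would use a spatial index such as a k-d tree instead.
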
  34. 2. Sending 3D Data

    • Sending the same data to
      – Dreamoc HD3 over WebRTC DataChannel
      – HoloLens over WebSocket
    [Diagram labels: HDMI, WebSocket, DataChannel]
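
The slides do not say how each frame was framed on the wire. Since a single data channel message is usually kept small for interoperability, one possible approach is to split every frame into chunks and reassemble them on the receiver. The sketch below assumes a 16 KiB message size and a hypothetical 8-byte header (frame id, chunk index, chunk count); both are illustration choices, not details from the talk.

```python
import struct

# Hypothetical header: frame id (uint32), chunk index (uint16), chunk count (uint16).
HEADER = struct.Struct("!IHH")
MESSAGE_SIZE = 16 * 1024  # assumed conservative message size
CHUNK_PAYLOAD = MESSAGE_SIZE - HEADER.size


def split_frame(frame_id, data, payload_size=CHUNK_PAYLOAD):
    """Split one frame into header-prefixed messages."""
    count = max(1, -(-len(data) // payload_size))  # ceiling division
    return [
        HEADER.pack(frame_id, i, count) + data[i * payload_size:(i + 1) * payload_size]
        for i in range(count)
    ]


def join_frame(chunks):
    """Reassemble a frame from its messages; None if any chunk is missing."""
    parts = {}
    count = None
    for chunk in chunks:
        _frame_id, index, count = HEADER.unpack_from(chunk)
        parts[index] = chunk[HEADER.size:]
    if count is None or len(parts) != count:
        return None
    return b"".join(parts[i] for i in range(count))
```

With these numbers, a frame of about 1 MB becomes 65 messages, and the receiver can rebuild it even when the chunks arrive out of order.
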
  35. Inside of 3D data

    • Data
      – 3D mesh data
        • built from the point clouds of 4 IR cameras
      – Texture from the RGB camera
        • 4 JPEG images, 640 x 480
      – UV map
        • how to map the texture to the mesh
    • Convert between different coordinate systems
      – PCL … right-handed coordinate system
      – Unity … left-handed coordinate system
    • 2 – 3 frames / sec
      – 1 MB / frame
      – about 20 M bits / sec
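
The coordinate conversion and the bitrate figure can both be checked with a few lines. This is a minimal sketch: a right-handed mesh can be converted to a left-handed one by negating one axis and reversing the winding order of every triangle so that faces keep pointing outward. Negating z is an assumption here; which axis the talk's implementation flipped is not stated, and the function names are hypothetical.

```python
def to_left_handed(vertices, triangles):
    """Convert a right-handed mesh to a left-handed one.

    vertices: list of (x, y, z); triangles: list of (a, b, c) vertex indices.
    """
    flipped = [(x, y, -z) for x, y, z in vertices]  # mirror along z (assumed axis)
    rewound = [(a, c, b) for a, b, c in triangles]  # keep faces front-facing
    return flipped, rewound


def bitrate_mbps(frame_bytes, frames_per_sec):
    """Bitrate in megabits per second for a given frame size and rate."""
    return frame_bytes * 8 * frames_per_sec / 1_000_000
```

With 1 MB per frame, 2 frames per second gives 16 Mbit/s and 3 frames per second gives 24 Mbit/s, which brackets the slide's figure of about 20 M bits / sec.
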
  36. Building Dreamoc HD3 App

    • Build with Unity C# as a Windows app
    • Use 3 cameras and 3 images for the 3 mirrors
      – Front view, left view, right view
    • Unity Asset: WebRTC Network
      – https://www.assetstore.unity3d.com/en/#!/content/47846
    [Diagram labels: HDMI, DataChannel]
  37. Building HoloLens App

    • Use Unity for 3D programming with C#
      – use MixedRealityToolkit-Unity (a.k.a. HoloToolKit-Unity)
        • position detection, gesture detection
      – export a Visual Studio project
    • Use Visual Studio 2017 to build a Windows 10 UWP app
      – UWP: Universal Windows Platform (Store App)
  38. Real-time Hologram is not perfect yet

    • Many holes and bumps in the 3D mesh object
      – Algorithms for real-time point reduction and polygon detection are not mature yet
      – But they will be improved soon with machine learning
    • Motion is not smooth
      – CPU power is not enough to handle a high frame rate
        • But CPUs / GPUs improve year by year
      – Bitrate is too high when transferring many frames per second
        • But 3D data compression methods, such as Draco, are coming
    • I believe that it will be improved in 2 – 3 years.
  39. Backward: 360video with multiple displays

    • 360 camera to capture video
    • Multiple displays to cover a wide view (about 180 degrees)
      – Synchronized scroll, as one large screen
      – No VR headset, so as not to hide the face
    • WebRTC for video / audio
    • WebGL / three.js for rendering
    • 3 (or more) displays act as 1 large screen
      – Direction is synchronized with WebSocket
    [Diagram label: MediaStream]
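
The synchronized scroll can be thought of as one shared viewing direction from which each display derives its own yaw. The sketch below assumes equal-width displays placed side by side that together cover the total field of view; the function and parameter names are hypothetical.

```python
def display_yaws(center_deg, displays=3, total_fov_deg=180.0):
    """Return the yaw (degrees, 0 to 360) at the center of each display.

    center_deg is the shared direction that all displays follow.
    """
    per_display = total_fov_deg / displays
    # Yaw at the center of the leftmost display.
    first = center_deg - total_fov_deg / 2.0 + per_display / 2.0
    return [(first + i * per_display) % 360.0 for i in range(displays)]
```

When the shared direction changes, only `center_deg` needs to be sent over WebSocket; each browser can recompute its own yaw locally from its display index.
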
  40. Whole architecture of Virtual Teleport

    • Forward: show you in 3D hologram
    • Backward: watch remote 360 video
    [Diagram labels: MediaStream, HDMI, WebSocket, DataChannel]
  41. Power of WebRTC in Virtual Teleport

    • Makes “holographic communication” possible today
      – Transfer 3D data in real time over WebRTC DataChannel
      – With a 3D scan camera, such as RealSense
      – With holographic devices, such as Dreamoc HD3 and HoloLens
    • Holographic communication is a very attractive experience
      – even with a rough model and motion that is not smooth
    • Real-time 3D scanning with depth cameras is evolving rapidly
      – There is a lot of useful open source software
      – Machine learning may improve 3D scanning dramatically
    • WebRTC works on many platforms besides web browsers
      – Linux C++ app
      – Windows Unity C# app
  42. Conclusion

    • WebRTC is very powerful because it is easy to combine with
      – many Web technologies, such as WebSocket and WebGL
      – much open source software, such as three.js and PCL
    • It is possible to create exciting user experiences
      – with many cool sensor devices and display devices
    • I hope you will build your own great application with WebRTC!