Ben Sandofsky: Building Periscope Sketches

Realm
June 13, 2016

Presented at AltConf 2016

Transcript

  1. Today

    We take a feature from pitch to production, with a few dead ends and bugs along the way. You’ll walk away with a little more insight into graphics on iOS and leveraging the GPU.
  2. What’s the minimum you need to answer the questions?

    It needs viewers. It needs interactive drawing.
    “Should drawings be baked into the video?”
    “Do we even want the feature?”
  3. Given the Architecture…

    • Video Stream (320x568)
    • JSON stream
    • NTP Embedded for syncing
    • Arbitrary payloads are accepted on staging
  4. The Hacky Design

    • New gesture = new stroke object
    • Each stroke gets its own CAShapeLayer
    • Upload a snapshot once a second
    • Animate strokeStart and strokeEnd
  5. Thoughts on Drawings…

    • They’re essential to the message. What happens when you save to the camera roll?
    • Do we need renderers for every single platform?
    • How do we version the drawing schema?
  6. The Video Stack

    • Powered by GPUImage
    • Filters convert iPhone, GoPro and DJI sources: 2-plane and 3-plane YUV to RGB
    • Scaling to 320x568
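The YUV-to-RGB step above runs in GPUImage filters on the GPU, but the per-pixel arithmetic can be sketched on the CPU in C. This is only an illustration under stated assumptions: the slides do not say which color matrix or plane layout the filters use, so the full-range BT.601 coefficients, the tightly packed NV12-style 2-plane layout, and the names `yuv_to_rgb` and `nv12_pixel_to_rgb` are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } RGBPixel;

/* Clamp to [0, 255] and round to the nearest integer. */
static uint8_t clamp_to_byte(float v) {
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Convert one full-range BT.601 YCbCr sample to RGB (assumed matrix). */
RGBPixel yuv_to_rgb(uint8_t y, uint8_t cb, uint8_t cr) {
    float yf = (float)y;
    float u = (float)cb - 128.0f;  /* chroma is centered on 128 */
    float v = (float)cr - 128.0f;
    RGBPixel out;
    out.r = clamp_to_byte(yf + 1.402f * v);
    out.g = clamp_to_byte(yf - 0.344136f * u - 0.714136f * v);
    out.b = clamp_to_byte(yf + 1.772f * u);
    return out;
}

/* Read pixel (x, y) from a 2-plane image: a full-resolution luma plane
 * plus an interleaved CbCr plane at half resolution in both directions.
 * Assumes rows are tightly packed (stride == width) and width is even. */
RGBPixel nv12_pixel_to_rgb(const uint8_t *y_plane, const uint8_t *cbcr_plane,
                           int width, int height, int x, int y) {
    assert(width % 2 == 0);
    assert(x >= 0 && x < width && y >= 0 && y < height);
    uint8_t luma = y_plane[y * width + x];
    const uint8_t *chroma = &cbcr_plane[(y / 2) * width + (x / 2) * 2];
    return yuv_to_rgb(luma, chroma[0], chroma[1]);
}
```

A 3-plane source differs only in where the two chroma bytes are read from: Cb and Cr each live in their own plane instead of being interleaved.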
  7. Why People Struggle with OpenGL

    • It’s a wacky state machine based on “binding”
    • Multithreading is a battle
    • Legacy support, e.g. client-side vertex arrays
  8. GPUs are great at Parallelism

                   CPU      GPU
    Clock Speed    1,400    450
    Cores          2        4

    Just avoid data transfer.
  9. Vertices are points that make up your 3D object. You can include additional data with each point, for use in rendering later.

    -0.0378297   0.12794    0.00447467  0.850855  0.5
    -0.0447794   0.128887   0.00190497  0.900159  0.5
    -0.0680095   0.151244   0.0371953   0.398443  0.5
    -0.00228741  0.13015    0.0232201   0.85268   0.5
    -0.0226054   0.126675   0.00715587  0.675938  0.5
    -0.0251078   0.125921   0.00624226  0.711533  0.5
    -0.0371209   0.127449   0.0017956   0.888639  0.5
    0.033213     0.112692   0.0276861   0.652757  0.5
    0.0380425    0.109755   0.0161689   0.708171  0.5
    -0.0255083   0.112568   0.0366767   0.454541  0.437538
    -0.0245306   0.112636   0.0373469   0.448754  0.455187
    0.0274031    0.12156    0.0212208   0.533079  0.5
    -0.0628961   0.158419   -0.0175871  0.404517  0.5
    0.0400813    0.104202   0.0221684   0.535542  0.5
    0.0451532    0.0931968  0.0111604   0.579563  0.425995
    -0.0324965   0.174231   -0.00238999 0.365607  0.5
    -0.0804587   0.135827   0.0500319   0.499575  0.5
    -0.0724944   0.126022   0.052902    0.564827  0.5
  10. Vertex shaders are mostly used to go from abstract coordinates into the screen space, -1.0 to +1.0.

    (-1.0, 1.0)     (1.0, 1.0)
    (-1.0, -1.0)    (1.0, -1.0)
  11. Vertex shaders are mostly used to go from abstract coordinates into the screen space, -1.0 to +1.0.

    Fragment shaders actually output pixels. You write shading algorithms for the desired effects.
  12. Vertex shaders are mostly used to go from abstract coordinates into the screen space, -1.0 to +1.0.

    Fragment shaders actually output pixels. You write shading algorithms for the desired effects.

    GPUs are really good at sampling pixels from textures, which are bitmaps uploaded to the GPU.
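As a rough illustration of the "-1.0 to +1.0" screen space described above, the mapping from a touch location in view coordinates to that space can be sketched in C. The function name `pixel_to_ndc` and the top-left origin with y pointing down (as in UIKit) are assumptions for this example; the slides do not show how the app performs this step.

```c
#include <assert.h>

typedef struct { float x; float y; } Vec2;

/* Map a point in view coordinates (origin top-left, y pointing down)
 * to OpenGL normalized device coordinates (origin at the center,
 * y pointing up, both axes spanning -1.0 to +1.0). */
Vec2 pixel_to_ndc(float px, float py, float width, float height) {
    assert(width > 0.0f && height > 0.0f);
    Vec2 ndc;
    ndc.x = (px / width) * 2.0f - 1.0f;
    ndc.y = 1.0f - (py / height) * 2.0f;  /* flip the vertical axis */
    return ndc;
}
```

With the 320x568 stream size mentioned earlier, the top-left corner maps to (-1.0, 1.0) and the center of the view maps to (0.0, 0.0).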
  13. Setup: Vertex Data • Shaders • Other Settings

    glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

    On the GPU: Vertex Shaders Run • Fragment Shaders Run
  14. attribute vec4 position;
      attribute vec4 inputTextureCoordinate;
      varying vec2 textureCoordinate;

      void main() {
          gl_Position = position;
          textureCoordinate = inputTextureCoordinate.xy;
      }
  15. varying highp vec2 textureCoordinate;
      uniform sampler2D inputImageTexture;

      void main() {
          gl_FragColor = texture2D(inputImageTexture, textureCoordinate);
      }
  16. There’s got to be something simpler than that snapshot stuff…

    Hmm… We’ve got a simulation, with realtime graphics, driven by user input.
  17. The Final Design

    • Treat it like a particle system
    • Append new particles to the vertex buffer
    • Every frame increments its age
    • Only render particles where age < max life

    typedef struct {
        GLfloat x;
        GLfloat y;
        GLfloat radius;
        GLfloat age;
        GLfloat dissolveAngle;
        // Color
        GLfloat red;
        GLfloat green;
        GLfloat blue;
        GLfloat alpha;
    } SketchPoint;
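The particle-system bullets above can be sketched on the CPU side roughly as follows. This is not the app's code: `SketchBuffer`, `sketch_append`, `sketch_tick`, `sketch_first_live`, and the fixed capacity are hypothetical names and choices for illustration, and `float` stands in for `GLfloat` so the example compiles without OpenGL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same field layout as the slide's SketchPoint, using float for GLfloat. */
typedef struct {
    float x, y, radius, age, dissolveAngle;
    float red, green, blue, alpha;
} SketchPoint;

#define SKETCH_CAPACITY 256  /* arbitrary size for this sketch */

typedef struct {
    SketchPoint points[SKETCH_CAPACITY];
    size_t count;
} SketchBuffer;

/* Append a new particle; returns false when the buffer is full. */
bool sketch_append(SketchBuffer *buf, SketchPoint p) {
    assert(buf != NULL);
    if (buf->count >= SKETCH_CAPACITY) return false;
    p.age = 0.0f;  /* every particle starts its life at age zero */
    buf->points[buf->count++] = p;
    return true;
}

/* Advance every particle's age by one frame's worth of time. */
void sketch_tick(SketchBuffer *buf, float dt) {
    assert(buf != NULL);
    for (size_t i = 0; i < buf->count; i++) {
        buf->points[i].age += dt;
    }
}

/* Index of the first particle with age < max_age. Particles are appended
 * in order and age together, so expired ones form a prefix; drawing the
 * range [first_live, count) skips them. Returns count if none are alive. */
size_t sketch_first_live(const SketchBuffer *buf, float max_age) {
    assert(buf != NULL);
    size_t i = 0;
    while (i < buf->count && buf->points[i].age >= max_age) i++;
    return i;
}
```

In a real renderer, the live range would presumably be passed as the `first` and `count` arguments of a draw call such as `glDrawArrays(GL_POINTS, first, count)` after uploading the buffer.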
  18. float life = (aAge / uMaxAge);
      lowp float outroValue = smoothstep(0.75, 1.0, life);
      gl_Position.x += (cos(aDissolveAngle) * outroValue * 0.01);
      gl_Position.y += (sin(aDissolveAngle) * outroValue * 0.01);
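To make the timing of the dissolve above easier to see, here is a C sketch of GLSL's `smoothstep` (clamped Hermite interpolation, as the GLSL specification defines it) and of the `outroValue` it produces. The helper name `outro_value` is hypothetical; only the 0.75 and 1.0 edges come from the slide.

```c
#include <assert.h>

/* C equivalent of GLSL smoothstep: 0 below edge0, 1 above edge1,
 * and a smooth Hermite curve in between. */
float smoothstep(float edge0, float edge1, float x) {
    assert(edge0 < edge1);
    float t = (x - edge0) / (edge1 - edge0);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return t * t * (3.0f - 2.0f * t);
}

/* Fraction of the dissolve offset applied to a particle of the given
 * age: zero for the first 75% of its life, then easing up to 1.0. */
float outro_value(float age, float max_age) {
    assert(max_age > 0.0f);
    float life = age / max_age;
    return smoothstep(0.75f, 1.0f, life);
}
```

A particle therefore sits still for most of its life and drifts along `aDissolveAngle` only during the last quarter, by at most 0.01 units in each axis.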
  19. uniform float uMaxAge;
      uniform mat4 uTransform;
      uniform float uPointScale;

      attribute vec4 position;
      attribute float radius;
      attribute vec4 aColor;
      attribute float aAge;
      attribute vec4 inputTextureCoordinate;
      attribute float aDissolveAngle;

      varying lowp vec4 vColor;
      varying lowp float vLife;
      varying vec2 textureCoordinate;

      void main() {
          gl_Position = position;
          gl_PointSize = radius * uPointScale;
          vColor = aColor;

          float life = (aAge / uMaxAge);
          vLife = life;

          lowp float introValue = (1.0 - smoothstep(0.0, 0.05, life));
          lowp float outroValue = smoothstep(0.75, 1.0, life);
          lowp float flairUpValue = smoothstep(0.7, 0.8, life);
          lowp float shrinkValue = 1.0 - smoothstep(0.7, 0.95, life);

          gl_Position.x += (cos(aDissolveAngle) * outroValue * 0.01);
          gl_Position.y += (sin(aDissolveAngle) * outroValue * 0.01);
          gl_Position *= uTransform;

          vColor.rgb = mix(vColor.rgb, vec3(1.0, 1.0, 1.0), (introValue * 0.6));
          vColor.rgb = mix(vColor.rgb, vec3(1.0, 1.0, 1.0), flairUpValue * 0.6);

          gl_PointSize *= (introValue + 1.0) * shrinkValue;
      }
  20. varying highp vec2 textureCoordinate;
      varying lowp vec4 vColor;
      varying lowp float vLife;

      uniform sampler2D uBrushTexture;

      void main() {
          lowp float outroFade = (1.0 - smoothstep(0.95, 1.0, vLife));
          gl_FragColor.a = texture2D(uBrushTexture, gl_PointCoord).r * outroFade;
          gl_FragColor.rgb = vColor.rgb * gl_FragColor.a;
      }
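The fragment shader above writes a premultiplied-alpha color: the brush texture's red channel acts as coverage, it is faded out over the last 5% of the particle's life, and the RGB channels are scaled by the resulting alpha. A CPU-side C sketch of the same arithmetic, with hypothetical names (`Color4`, `shade_brush_fragment`) and the texture lookup replaced by a plain `brush_red` parameter, might look like this:

```c
#include <assert.h>

typedef struct { float r, g, b, a; } Color4;

/* Clamped Hermite interpolation, matching GLSL smoothstep. */
static float smoothstep(float edge0, float edge1, float x) {
    float t = (x - edge0) / (edge1 - edge0);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return t * t * (3.0f - 2.0f * t);
}

/* brush_red: red channel sampled from the brush texture, in [0, 1].
 * life:      particle age divided by maximum age, in [0, 1].
 * Returns a premultiplied-alpha color, as the shader does. */
Color4 shade_brush_fragment(float brush_red, float life, Color4 color) {
    assert(brush_red >= 0.0f && brush_red <= 1.0f);
    float outro_fade = 1.0f - smoothstep(0.95f, 1.0f, life);
    Color4 out;
    out.a = brush_red * outro_fade;
    out.r = color.r * out.a;  /* premultiply RGB by alpha */
    out.g = color.g * out.a;
    out.b = color.b * out.a;
    return out;
}
```

Premultiplied output like this is normally paired with the blend function `glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA)`; the slides do not show which blend mode the app sets, so treat that pairing as an assumption.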
  21. OpenGL Multithreading

    • GCD Serial Queues are not dedicated threads
    • Sometimes GPUImage would dealloc on the main thread
    • Sometimes OpenGL contexts got crossed, messing up state
  22. Performance

    • Test on real hardware: iPod Touch 5th Gen
    • Test under realistic load: Near heart rate limit
    • Don’t guess. Measure.
  23. The GPUImage Codebase

    • 29,744 lines of Objective-C. 16,094 are filters
    • Hardware capability checks
    • Resource pooling
    • Presents rendering on screen
  24. Investigating a Slim Renderer

    • 1,500 lines of code
    • Reduces device utilization from 29% to 4%
    • Smaller surface area to understand
    • Metal is awesome, but not all devices support it

    [Figure: Before / After]
  25. Credits

    • Aaron Wasserman
    • Sara Haider
    • Geraint Davies
    • Pablo Jablonski
    • Tyler Hansen
    • Veronika Hecko Wu
    • Joe Bernstein
    • Kayvon Beykpour