For Butter or Worse

Project Butter was an initiative to make the OS faster and more responsive in Android 4.1. This presentation explains some of the tools and techniques used by the Android team to achieve its goals.

Romain Guy

June 27, 2012

Transcript

  1. For Butter or Worse: Smoothing Out Performance in Android UIs

    Chet Haase, Romain Guy, Android UI Toolkit Engineers
  2. jank, noun

    1. Choppy performance: “Swiping the home screen feels janky”
    2. Discontinuous, surprising experience: “What's with the jank launching that app?”
  3. butter, tasty noun

    1. Smooth performance: “Home screen swiping is very buttery”
    2. Fattening spread and cooking ingredient: “America’s obesity problem is directly proportional to the deliciousness of butter”
  4. jank, noun

    1. Choppy performance: “Swiping the home screen feels janky”
    2. Discontinuous, surprising experience: “What's with the jank launching that app?”

    This session will focus on the first kind of jank.
  5. Recipe

    • Butter for eating is made from cream
    • Butter for Android is made from
      - low latency
      - fast, consistent framerate

    [Diagram labels: Latency, Event, Display]
  7. Recipe

    • Butter for eating is made from cream
    • Butter for Android is made from
      - low latency
      - fast, consistent framerate

    [Diagram labels: Latency, Event, Display; the Event and Display pair repeats along the timeline]
  8. Input Latency

    [Diagram: timeline with rows User Input (events a through x), Dispatcher, and Activity (Proc a..c, Draw, Proc d..j, Draw); batches labeled a..c, d..j, and k..s]
  9. Input Latency

    [Same diagram with Latency markers added (labels a, d, c, j)]
  10. Event Streaming

    [Diagram: timeline with rows User Input (events a through x) and Activity (Proc a..f, Draw, Proc g..o, Draw), with two VSync markers]
  11. Event Streaming

    [Same diagram with Latency markers added (labels a, g, f, o)]
  12. Framerate

    [Diagram label: Drawing]

    Drawing faster is a good start. But that's not quite all there is to it...
  13. Drawing: The Big Picture

    [Diagram: Event → Set Property Value → Invalidate → Measure & Layout → Prepare Draw → Update DisplayList → Draw DisplayList → Swap Buffers → Composite Windows → Post Buffer. Other labels: Display List, Dequeue Buffer, Enqueue Buffer, Activity, SurfaceFlinger]
  14. [Diagram: Something Happens → Draw → Display. Stages shown so far: Measure & Layout, Prepare Draw, Update DisplayList. Other labels: Display List, Dequeue Buffer]
  15. [Same diagram, adding Draw DisplayList]
  16. [Same diagram, adding Swap Buffers and the Enqueue Buffer label]
  17. [Same diagram, completed: Event, Set Property Value, Invalidate, Measure & Layout, Prepare Draw, Update DisplayList, Draw DisplayList, Swap Buffers, Composite Windows, Post Buffer. Other labels: Display List, Dequeue Buffer, Enqueue Buffer, Activity, SurfaceFlinger]
  18. VSync: synchronizing rendering with the display refresh

    60 fps: displays refresh at 60 Hz, allowing apps to render at 60 fps.
    VSync (vertical sync) is a way to synchronize the rendering sub-system with the refresh of the hardware display.
  19. In practice, the display refresh rate will be between 55 and 60 Hz.

    If your app does not hit 60 fps, you might just be maxing out the device’s refresh rate. Running adb shell dumpsys SurfaceFlinger will tell you the exact refresh rate of your display.
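The arithmetic behind this slide can be sketched in a few lines of plain Java. This is only an illustration: the class and method names (FrameBudget, budgetMs, maxVisibleFps) are hypothetical and not part of any Android API. It shows how much time one frame has at a given refresh rate, and why a 55 Hz panel caps the visible frame rate at 55 fps.

```java
// Hypothetical helper for illustration; not part of any Android API.
class FrameBudget {
    /** Time available to produce one frame, in milliseconds, at the given refresh rate. */
    static double budgetMs(double refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    /** The display cannot show more frames per second than its refresh rate. */
    static double maxVisibleFps(double renderedFps, double refreshRateHz) {
        return Math.min(renderedFps, refreshRateHz);
    }

    public static void main(String[] args) {
        System.out.printf("60 Hz budget: %.2f ms%n", budgetMs(60.0));
        System.out.printf("55 Hz budget: %.2f ms%n", budgetMs(55.0));
        System.out.printf("Visible fps on a 55 Hz panel: %.0f%n", maxVisibleFps(60.0, 55.0));
    }
}
```

At 60 Hz the budget is roughly 16.7 ms per frame, which is where the 16 ms figure used later in the deck comes from.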
  21. Tearing

    Android has always used vsync to avoid tearing. Tearing happens when buffers are swapped at a time other than the vsync. When this occurs, part of the screen shows the previous buffer and part shows the new buffer. This effect can be seen in many console games, for instance. Prior to Android 4.1 Jelly Bean, however, animations and drawing requests were not vsync’d.
  22. Getting the Pixels to the Screen

    [Diagram: timeline with rows CPU, GPU, and Visible; stages Update DisplayList, Draw DisplayList, Display]
  23. Drawing without VSync

    [Diagram: timeline with Display, GPU, and CPU rows showing frames 0 through 4 against four VSync markers]
  24. Drawing without VSync

    [Same diagram with a “Jank!” marker]
  25. Drawing with VSync

    [Diagram: timeline with Display, GPU, and CPU rows showing frames 0 through 4 against four VSync markers]
  26. Drawing with DisplayLists

    [Diagram: four “set property, invalidate” pairs, then Update DisplayList and Draw DisplayList. Other labels: Display List, VSync]
  27. Drawing with DisplayList Properties

    [Diagram: four “set property” calls, then Draw DisplayList. Other labels: Display List, VSync, DLProps]
  28. DisplayList Properties

    • Free with ViewPropertyAnimator
    • Or use ObjectAnimator with View properties
      - alpha
      - translationX/Y
      - scaleX/Y
      - rotation/X/Y
  29. [Chart: time in ms (0 to 3) per frame, with and without DisplayList properties]

    Comparison of drawing the transition from the launcher to the list of all apps with and without display list properties. You can see that display list properties cost almost nothing.
  30. Parallel Processing and Double Buffering

    [Diagram: timeline with Display, GPU, and CPU rows alternating between buffers A and B, against four VSync markers]
  31. Parallel Processing and Double Buffering

    [Same layout with two “Jank!” markers]
  32. Parallel Processing and Triple Buffering

    [Diagram: timeline with Display, GPU, and CPU rows using buffers A, B, and C, against four VSync markers, with one “Jank!” marker]
  33. Window Composition

    [Diagram labels: GPU, Draw DisplayList, Swap Buffers, Composite Windows, Post Buffer, Activity, SurfaceFlinger, Buffer, Overlay, FB]
  34. android:sdk $ cd platform-tools/
      android:platform-tools $ adb shell dumpsys gfxinfo

    dumpsys gfxinfo is a useful tool to analyze hardware accelerated apps. It shows memory usage, the number of views and, with the right setting, frame profiling data.
  35. In Settings > Developer options, enable Profile GPU rendering

    You must destroy your activity before you’ll see the results.
  37. Demo

    Demo doing a dumpsys on com.android.settings. It shows the array of values:
      Draw = update display lists
      Process = execute display lists
      Execute = swap buffers
    Copy the array and paste it in a spreadsheet to visualize the data. You’ll see the last 128 frames at most. Every time you capture a series of values, the buffer is cleared.
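As an alternative to a spreadsheet, the captured rows can be summed in code. The sketch below is plain Java and rests on assumptions: it supposes each row holds three whitespace-separated millisecond values in the order Draw, Process, Execute (the exact layout of real dumpsys output may differ), and it uses a 16 ms threshold. GfxInfoFrames and its methods are hypothetical names, and the sample rows are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; the three-column row layout is an assumption.
class GfxInfoFrames {
    static final double BUDGET_MS = 16.0;

    /** Parses one "draw process execute" row and returns the total frame time in ms. */
    static double frameTotalMs(String row) {
        String[] cols = row.trim().split("\\s+");
        if (cols.length != 3) {
            throw new IllegalArgumentException("expected 3 columns: " + row);
        }
        double total = 0;
        for (String col : cols) {
            total += Double.parseDouble(col);
        }
        return total;
    }

    /** Returns the indices of the frames whose total time exceeds the budget. */
    static List<Integer> slowFrames(List<String> rows) {
        List<Integer> slow = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (frameTotalMs(rows.get(i)) > BUDGET_MS) {
                slow.add(i);
            }
        }
        return slow;
    }

    public static void main(String[] args) {
        // Made-up sample values, not real measurements.
        List<String> rows = Arrays.asList("1.0 5.0 2.0", "4.0 12.5 3.0", "0.5 3.0 1.0");
        System.out.println("Frames over budget: " + slowFrames(rows));
    }
}
```

Summing the three columns per frame is a design choice: a frame can miss its deadline even when no single stage looks expensive on its own.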
  38. [Chart: time in ms (0 to 15) per frame, split into Update display lists, Process display lists, and Swap buffers]

    Scrolling the main screen in Settings on a Nexus 7. Everything is below 16 ms. You can see that processing the display lists (turning Canvas commands into OpenGL primitives) is usually what takes time.
  39. android:sdk $ cd tools/systrace
      android:systrace $ ./systrace

    systrace is a new, system-wide instrumentation tool. It is useful to see how your app interacts with the rest of the world, and it can be used for optimization work, for instance when performance looks bad in gfxinfo, or when gfxinfo looks good but performance is still bad.
  40. Select the traces you want.

    Most of the time you’ll want to use Graphics and View together, but you should experiment with other traces.
  43. Demo

    Let’s see how it works. Run systrace while scrolling Settings, open the result and explain what we see: performTraversals/draw/etc. Open trace-slow.html and show what a bad app looks like and what to do with it. Go into traceview to see what the app does.
  44. Captures 5 seconds

    A systrace will capture only 5 seconds of data. You’ll usually run it to analyze an animation, a scroll, etc. You can change this behavior with an argument (-t).
  45. Stand-alone HTML output

    The output is a stand-alone HTML file you can attach to bug reports, email to co-workers, etc. Open the help menu to see all the keys you can use. Most useful: WASD (just like in FPS games on PC) to move around and zoom in/out.
  47. type      | ... | name
      ----------+ ... +-------
      OVERLAY   | ... | com...SlowListActivity  ✔
      OVERLAY   | ... | StatusBar               ✔
      OVERLAY   | ... | NavigationBar           ✔

    Here’s a sample output. You’ll find this table at the end of the dump. You want as many windows as possible to go into overlays. Here we have 3 windows on screen and everything is composited with overlays, which is great.
  49. type      | ... | name
      ----------+ ... +------
      OVERLAY   | ... | com...SlowListActivity  ✔
      FB        | ... | PopupWindow:424d4cc8    ✘
      FB        | ... | StatusBar               ✘
      FB        | ... | NavigationBar           ✘

    In this case our application creates an extra window, the PopupWindow. The windows now cannot all go in overlays and 3 of them must be flattened using GPU composition. There is a cost: the GPU has more work to do, burns fillrate/bandwidth, etc. You want to avoid this.
  50. Caveats

    The number of overlays depends on the device. The opacity of the window may impact whether it goes in an overlay or not. Always use dumpsys while the app is drawing, or shortly after it’s done. Android has an optimization that moves everything to FB (GPU) after a short period of time (this saves battery when the app doesn’t update).
  51. traceview

    We briefly talked about traceview; you can refer to the official documentation in the SDK. It’s a tracing profiler that helps you see what’s going on in your app and how much time it takes. It’s one of the most important tools at your disposal, and you can invoke it from ADT or DDMS.
  52. HierarchyViewer

    HierarchyViewer is your best friend when it comes to optimizing your UI. Use it stand-alone or as part of ADT. It shows you the tree of views straight from your app. You can inspect numerous properties, capture partial screenshots, force layouts/repaints, export PSD files and much more. Custom properties are supported with an annotation.
  53. https://github.com/romainguy/ViewServer

    Note however that HierarchyViewer will not work out of the box on consumer devices. You must use an emulator, a userdebug or eng build, or use ViewServer.java, available on GitHub. Apache 2.0, easy to use and secure. We’re thinking of ways to integrate this directly into ADT.
  54. Tracer for OpenGL ES

    Our newest tool; it helps you debug and profile GL apps. It can be useful for hardware accelerated apps to see exactly what’s going on behind the scenes. It shows you how many commands your app generates, how much overdraw they cause, etc. You can also see the time each command takes to execute. Note that the commands are grouped by view, making it easy to read.
  55. Allocation Tracker

    Allocation Tracker is part of DDMS and the easiest way to track down and remove unnecessary allocations from your app. Each allocation has a type, size and stack trace. Anecdote: as I was preparing this talk, I decided to take a screenshot of Allocation Tracker. This led me to discover numerous unnecessary allocations caused by the GL rendering pipeline that affected most apps.
  56. ✔ Consistent frame-rate
      ✔ Lower latency
      ✔ Faster display list drawing
      ✔ GPU-free window composition
      ✔ Faster display list updates

    Here are the various things you can improve by applying the following tips.
  57. new

    Let me tell you how to use this keyword: don’t use it. Use DDMS’ allocation tracker to ensure you allocate only what you need in performance sensitive code paths (onDraw, onMeasure, onLayout, touch events, Adapter.getView, etc.)
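One common way to follow this advice is to allocate scratch objects once and reuse them on every call. The sketch below is plain Java rather than Android code so that it runs anywhere; CardRenderer and its draw() method are hypothetical stand-ins for a view and its onDraw().

```java
// Hypothetical stand-in for a view; plain Java so it runs without the Android SDK.
class CardRenderer {
    // Allocated once when the renderer is created, then reused by every draw() call.
    private final int[] scratchBounds = new int[4];

    /**
     * Computes {left, top, right, bottom} into the reusable buffer
     * instead of allocating a new array on each call.
     */
    int[] draw(int left, int top, int width, int height) {
        scratchBounds[0] = left;
        scratchBounds[1] = top;
        scratchBounds[2] = left + width;
        scratchBounds[3] = top + height;
        return scratchBounds;
    }

    public static void main(String[] args) {
        CardRenderer renderer = new CardRenderer();
        int[] first = renderer.draw(0, 0, 10, 20);
        int[] second = renderer.draw(5, 5, 10, 20);
        // Both calls hand back the same array instance: nothing was allocated per frame.
        System.out.println("same buffer reused: " + (first == second));
    }
}
```

The trade-off is that callers must not hold on to the returned buffer across calls, since the next call overwrites it.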
  59. public void bindView(View view, Context context, Cursor c) {
          // Reuse the holder attached to the view
          BookViewHolder holder = (BookViewHolder) view.getTag();
          String bookId = c.getString(mInternalIdIndex);
          holder.bookId = bookId;
          holder.sortTitle = c.getString(mSortTitleIndex);

          final ShelvesActivity activity = mActivity;
          if (activity.getScrollState() == SCROLL_STATE_FLING
                  || activity.isPendingCoversUpdate()) {
              // While flinging, show the default cover and mark the real one as pending
              holder.title.setCompoundDrawablesWithIntrinsicBounds(
                      null, null, null, mDefaultCover);
              holder.queryCover = true;
          } else {
              holder.title.setCompoundDrawablesWithIntrinsicBounds(
                      null, null, null, ImageUtilities.getCachedCover(bookId, mDefaultCover));
              holder.queryCover = false;
          }

          // Copy the title into the holder's reusable buffer
          final CharArrayBuffer buffer = holder.buffer;
          c.copyStringToBuffer(mTitleIndex, buffer);
          final int size = buffer.sizeCopied;
          if (size != 0) {
              holder.title.setText(buffer.data, 0, size);
          }
      }

    In performance sensitive code paths, do less work. Use caches, avoid I/O, don’t rescale images on the fly, use background tasks, etc. See Jeff Sharkey’s talk.
  60. // Do less!

    In performance sensitive code paths, do less work. Use caches, avoid I/O, don’t rescale images on the fly, use background tasks, etc. See Jeff Sharkey’s talk.
  61. android.view.Choreographer

    Choreographer is a new, very simple API in Android 4.1 Jelly Bean. It lets you schedule callbacks on vsync. This is the API used internally by the framework to schedule animations, redraws, etc. This API is great if your app does its own animations; games, for instance.
  62. // Invalidates at the next v-sync event
      myView.postInvalidateOnAnimation();

    Use this method when implementing animations in your view. This is how the UI toolkit animates scrolls and flings in ListView, ScrollView, etc. It can be invoked from any thread.
  63. callback = new Runnable() {
          public void run() {
              setupAndStartAnimation();
          }
      };
      myView.postOnAnimation(callback);

    Runs an arbitrary action at the next v-sync. Useful to start your custom animations.
  64. callback = new Choreographer.FrameCallback() {
          public void doFrame(long frameTime) {
              render(frameTime);
          }
      };
      Choreographer c = Choreographer.getInstance();
      c.postFrameCallback(callback);

    If you write a game, you might not have a View, or you might not be using the UI thread, so use Choreographer directly. Simply create a FrameCallback and pass it to Choreographer. Note that there is one Choreographer instance per Looper thread. You can post a callback from any thread, but it will run on the Choreographer’s looper. The frame time, in nanoseconds, is very useful to synchronize animations on a single time base. It’s the time at which the vsync event occurred.
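To illustrate why a shared frame time matters, here is a minimal plain-Java sketch of what a render(frameTime) method might compute. FrameClock, progress, and lerp are hypothetical names, not Android APIs; the only thing taken from the slide is that the frame time is expressed in nanoseconds. Because every animation derives its position from the same timestamp, two animations advanced in the same frame stay in lockstep.

```java
// Hypothetical helper; shows one way to drive animations from a shared frame time.
class FrameClock {
    /** Fraction of the animation completed at the given frame time, clamped to [0, 1]. */
    static double progress(long startNanos, long frameTimeNanos, long durationNanos) {
        if (durationNanos <= 0) {
            return 1.0;
        }
        double elapsed = (double) (frameTimeNanos - startNanos);
        return Math.max(0.0, Math.min(1.0, elapsed / durationNanos));
    }

    /** Linear interpolation between two values. */
    static double lerp(double from, double to, double t) {
        return from + (to - from) * t;
    }

    public static void main(String[] args) {
        long start = 0L;
        long duration = 300_000_000L;   // 300 ms, expressed in nanoseconds
        long frameTime = 150_000_000L;  // made-up vsync timestamp
        double t = progress(start, frameTime, duration);
        // Both values are derived from the same frame time, so they stay in sync.
        System.out.println("alpha = " + lerp(0.0, 1.0, t));
        System.out.println("x     = " + lerp(0.0, 200.0, t));
    }
}
```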
  65. view.setLayerType(View.LAYER_TYPE_HARDWARE, null);

    Complex views should be animated with layers enabled. A layer is a GPU snapshot of your view, and it’s extremely efficient to render. For more information, please refer to our Google I/O 2011 talk called “Android Accelerated Rendering”, available on YouTube.
  66. view.animate().alpha(0).withLayer();

    Starting with Android 4.1 Jelly Bean, an easier way to use layers is to use the new withLayer() API on ViewPropertyAnimator. Easy to write, easy to read, very efficient.
  67. ✂ CLIPPING

    Clipping is handled by the UI toolkit to ensure the system only draws what needs to be drawn. Usually you don’t have to do anything, but if you have custom drawing code or custom views you should make sure you don’t draw more than what you really need. Here are a couple of things you can do to help.
  68. view.invalidate();

    Whenever you call View.invalidate(), ensure you need to redraw the entire view. In many situations a partial invalidate is enough and will help performance. Simply specify the region of the view that needs to be redrawn. A great example of this is EditText and its blinking cursor.
  69. view.invalidate(left, top, right, bottom);

    Whenever you call View.invalidate(), ensure you need to redraw the entire view. In many situations a partial invalidate is enough and will help performance. Simply specify the region of the view that needs to be redrawn. A great example of this is EditText and its blinking cursor.
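The blinking-cursor case can be sketched in plain Java. This is an illustration under assumptions: DirtyRect, cursorBounds, and area are hypothetical names, and the cursor position and view size are made up. The four values written into the output array are what you would pass to invalidate(left, top, right, bottom).

```java
// Plain-Java sketch; DirtyRect and its methods are hypothetical names.
class DirtyRect {
    /** Writes {left, top, right, bottom} for a cursor of the given width into out. */
    static void cursorBounds(int cursorX, int lineTop, int lineBottom,
                             int cursorWidth, int[] out) {
        out[0] = cursorX;
        out[1] = lineTop;
        out[2] = cursorX + cursorWidth;
        out[3] = lineBottom;
    }

    /** Number of pixels covered by a {left, top, right, bottom} rectangle. */
    static long area(int[] r) {
        long w = Math.max(0, r[2] - r[0]);
        long h = Math.max(0, r[3] - r[1]);
        return w * h;
    }

    public static void main(String[] args) {
        int[] dirty = new int[4];            // allocated once, outside the hot path
        cursorBounds(100, 40, 60, 2, dirty); // made-up cursor position
        int[] whole = {0, 0, 480, 200};      // made-up view size
        System.out.println("partial: " + area(dirty) + " px, full: " + area(whole) + " px");
    }
}
```

Comparing the two areas gives a rough sense of how much redraw work a partial invalidate can save for a small, frequently updated region.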
  70. [Diagram: eight DisplayList nodes]

    Calling invalidate() on a View will cause that entire View (and any view it intersects) to redraw. In this example, a complex view causes all of its children to redraw. Each display list must be executed.
  71. [Same diagram, with the invalidate() label added]
  72. [Same diagram, with the Draw display lists label added]
  73. [Same diagram; sixteen DisplayList labels are shown in total]
  74. [Diagram: eight DisplayList nodes]

    With invalidate(l, t, r, b) the rendering pipeline can cull (i.e. ignore) entire parts of the render tree. This leads to significant performance improvements when used properly.
  75. [Same diagram, with the invalidate(l, t, r, b) label added]
  76. [Same diagram, with the Draw display lists label added]
  77. [Same diagram; twelve DisplayList labels are shown in total]
  78. In developer settings, use “Show GPU view updates” to enable a tool that flashes the region of the screen you invalidate.

    This is very useful to ensure you only invalidate what’s needed.
  80. Don’t draw invisible items

    invalidate(l, t, r, b) is useful to tell the system what parts of the screen to redraw. It is equally important to never draw pieces of the UI that will never be visible to the user. In this section we’ll focus on a real-world example, using the Google application, new in Android 4.1 Jelly Bean. When the app shows you suggestion cards, they can appear stacked. If you use standard views, overdraw will occur.
  81. If you just stack the two cards using regular Views, here is how they will be drawn.

    First the bottom view, then the top one. There is quite a lot of overdraw happening here, highlighted in red. What can we do to get rid of that overdraw?
  85. @Override
      public void onDraw(Canvas c) {
          c.save();
          if (stacked) {
              c.clipRect(headerLeft, headerTop, headerRight, headerBottom);
          }
          drawHeader(c);
          drawContent(c);
          c.restore();
      }

    The trick is to set the clip rect to include only the “header” of the stacked card. This way the content won’t be drawn at all.
  86. Here’s what the render pass looks like now that we properly clip the content of the bottom card.

    No more wasted cycles on drawing hidden content.
  87. clipRect

    Here’s what the render pass looks like now that we properly clip the content of the bottom card. No more wasted cycles on drawing hidden content.
  90. ✔ Faster display list drawing

    Properly clipping your custom views will help reduce the time spent drawing display lists. But we can do better.
  91. private void drawContent(Canvas c) {
          for (Item item : itemsList) {
              // Skip items that fall entirely outside the current clip
              if (!c.quickReject(item.l, item.t, item.r, item.b, Canvas.EdgeType.BW)) {
                  item.draw(c);
              }
          }
      }

    If quickReject() returns true, the specified rectangle is outside of the current clip region, which means whatever you draw within the bounds of that rectangle will not show up on screen. In this case you can simply avoid calling your drawing code. This is what the UI toolkit does automatically with views.
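The idea behind the test can be modeled in plain Java. The sketch below is not Android's implementation of Canvas.quickReject(); ClipCulling and its methods are hypothetical names, and the model assumes a simple axis-aligned rectangular clip with no transformation. It shows a rejection test and a loop that keeps only the items that could be visible.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a clip rejection test; not Android's implementation.
class ClipCulling {
    /** True when the rectangle is empty or lies entirely outside the clip. */
    static boolean quickReject(float l, float t, float r, float b,
                               float clipL, float clipT, float clipR, float clipB) {
        if (l >= r || t >= b) {
            return true; // an empty rectangle can never be visible
        }
        return r <= clipL || l >= clipR || b <= clipT || t >= clipB;
    }

    /** Returns the indices of the items (each {l, t, r, b}) that intersect the clip. */
    static List<Integer> visibleItems(float[][] items, float[] clip) {
        List<Integer> visible = new ArrayList<>();
        for (int i = 0; i < items.length; i++) {
            float[] it = items[i];
            if (!quickReject(it[0], it[1], it[2], it[3],
                    clip[0], clip[1], clip[2], clip[3])) {
                visible.add(i);
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        float[] clip = {0, 0, 100, 100};          // made-up clip region
        float[][] items = {
            {10, 10, 20, 20},                     // inside the clip
            {150, 10, 200, 20},                   // entirely to the right of the clip
            {90, 90, 120, 120},                   // partially overlapping the clip
        };
        System.out.println("Items to draw: " + visibleItems(items, clip));
    }
}
```

Note that a partially overlapping item is kept: the test only rejects what is certainly invisible, and the clip itself trims the rest.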
  92. ✔ Faster display list drawing
      ✔ Faster display list updates

    Any piece of code you can avoid running is a win for performance, and quickReject() can help you.
  93. Dim layer

    The dim layer is a special layer on Android, managed by the window manager and SurfaceFlinger. Any window can request the dim layer to draw attention to the foreground. It helps the user by giving context (you are still performing the same task) but removes distractions.
  94. getWindow().addFlags(WindowManager.LayoutParams.FLAG_DIM_BEHIND);

    Requesting a dim layer is very easy. Just set the appropriate flag on your activity’s or dialog’s window. You can also change the dim amount.
  96. type      | ... | name
      ----------+ ... +------
      OVERLAY   | ... | com...MyActivity  ✔
      FB        | ... | DimAnimator       ✘
      FB        | ... | StatusBar         ✘
      FB        | ... | NavigationBar     ✘

    Using a dim layer will unfortunately introduce a new surface that SurfaceFlinger must composite. This is usually enough to make apps fall out of the optimized all-overlays case. In addition, the dim layer is *always* composited by the GPU. Even if there are enough overlays available to do the composition, the dim layer will force a GPU composition.
  98. // Instead of: getWindow().addFlags(WindowManager.LayoutParams.FLAG_DIM_BEHIND);
      getWindow().setBackgroundDrawable(new ColorDrawable(0x7f000000));

    If you need the extra resources, use a background drawable instead of the dim layer. This is what Android 4.1 Jelly Bean does in the quick contact window to ensure the opening animation is smooth, with a consistent frame-rate.
  99. ✔ Faster display list drawing
      ✔ GPU-free window composition

    If you avoid using the dim layer, you can avoid spending GPU resources compositing the special dim layer. This will help improve the time spent drawing display lists. The fewer layers that need to be composited, the better.