Slide 1

Slide 1 text

Talking Trash @chethaase & @romainguy

Slide 2

Slide 2 text

“Garbage in, garbage out”

Slide 3

Slide 3 text

“Garbage in, garbage out” But how fast?

Slide 4

Slide 4 text

Modern Android Development Chet Haase @chethaase Romain Guy @romainguy

Slide 6

Slide 6 text

Modern Android Development Chet Haase @chethaase Romain Guy @romainguy Blah blah blah memory blah blah...

Slide 7

Slide 7 text

Dalvik
Optimized for size
 |
 +- JIT optimizations not as powerful
 |
 +- Allocation/collection slow
 |
 +- Heap fragmentation

Slide 8

Slide 8 text

So:
Avoid allocation whenever possible
 For example: enums
Primitive types are cool (autoboxing is not)

Slide 9

Slide 9 text

ART
Optimized for performance
JIT + AOT
Faster allocation/collection
Heap defragmentation
Large object heap

Slide 10

Slide 10 text

ART
Optimized for performance
JIT + AOT
Faster allocation/collection
Heap defragmentation
Large object heap
So:

Slide 11

Slide 11 text

ART
Optimized for performance
JIT + AOT
Faster allocation/collection
Heap defragmentation
Large object heap
So:
Allocate as necessary (yes, even enums)
Use appropriate types
But:
phones are still constrained
batteries are, too
be aware of inner-loop bottlenecks

Slide 12

Slide 12 text

Memories int foo = 5; MyObject thing = new MyObject();

Slide 13

Slide 13 text

Memories Stack int foo = 5; MyObject thing = new MyObject();

Slide 14

Slide 14 text

Memories Stack Registers int foo = 5; MyObject thing = new MyObject();

Slide 15

Slide 15 text

Memories Stack Registers int foo = 5; MyObject thing = new MyObject(); foo

Slide 17

Slide 17 text

Memories Stack Registers int foo = 5; MyObject thing = new MyObject(); Heap foo

Slide 18

Slide 18 text

Memories Stack Registers int foo = 5; MyObject thing = new MyObject(); Heap foo thing

Slide 19

Slide 19 text

Manual Garbage Collection
C++
MyObject* thing = new MyObject;
// code using thing...
delete thing;
// code using thing

Slide 20

Slide 20 text

Manual Garbage Collection
C++
MyObject* thing = new MyObject;
// code using thing...
delete thing;
// code using thing
Leak!

Slide 23

Slide 23 text

Manual Garbage Collection
C++
MyObject* thing = new MyObject;
// code using thing...
delete thing;
// code using thing
Crash!

Slide 24

Slide 24 text

Manual Garbage Collection
C++
MyObject* thing = new MyObject;
// code using thing...
delete thing;
// code using thing
Crash! (maybe)

Slide 25

Slide 25 text

Automatic Garbage Collection
Java
MyObject thing = new MyObject();
// code using thing…

Slide 26

Slide 26 text

Automatic Garbage Collection
Java
MyObject thing = new MyObject();
// code using thing…
// eventually freed
No leak!

Slide 27

Slide 27 text

Automatic Garbage Collection
Java
MyObject thing = new MyObject();
// code using thing…
// eventually freed
// code using thing…
No leak!

Slide 28

Slide 28 text

Automatic Garbage Collection
Java
MyObject thing = new MyObject();
// code using thing…
// eventually freed
// code using thing…
No leak! No crash!

Slide 29

Slide 29 text

Runtime GC Concerns

Slide 30

Slide 30 text

Runtime GC Concerns How long does allocation take?

Slide 31

Slide 31 text

Runtime GC Concerns How long does allocation take? 
 How long does collection take?

Slide 32

Slide 32 text

Runtime GC Concerns How long does allocation take? 
 How long does collection take? What impact does this have across all threads?

Slide 33

Slide 33 text

Runtime GC Concerns How long does allocation take? 
 How long does collection take? What impact does this have across all threads? When do collections happen?

Slide 34

Slide 34 text

Runtime GC Concerns How long does allocation take? 
 How long does collection take? What impact does this have across all threads? When do collections happen? 
 How efficient is heap usage?

Slide 35

Slide 35 text

Dalvik GC ? Allocation

Slide 38

Slide 38 text

Dalvik GC Allocation

Slide 39

Slide 39 text

Dalvik GC Collection

Slide 40

Slide 40 text

Dalvik GC Collection Mark root set (pause)

Slide 41

Slide 41 text

Dalvik GC Collection Mark root set (pause) Mark reachable I (concurrent)

Slide 43

Slide 43 text

Dalvik GC Collection Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause)

Slide 45

Slide 45 text

Dalvik GC Collection Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent)
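
The phases listed above follow the general mark-and-sweep shape. As a rough illustration only, here is a single-threaded toy in Java. It is not Dalvik's implementation: the `Node` and `ToyHeap` names are invented for this sketch, and the pause versus concurrent distinction is not modeled at all.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical toy object: holds references to other objects plus a mark bit.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    boolean marked;

    Node(String name) { this.name = name; }
}

// Hypothetical toy heap; single-threaded, so pauses are not modeled.
final class ToyHeap {
    final List<Node> objects = new ArrayList<>();
    final List<Node> roots = new ArrayList<>();

    Node allocate(String name) {
        Node n = new Node(name);
        objects.add(n);
        return n;
    }

    // Runs one collection and returns the number of objects freed.
    int collect() {
        // Mark root set: start the traversal from the roots.
        Deque<Node> pending = new ArrayDeque<>(roots);
        // Mark reachable: follow references until nothing new is found.
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) continue; // already visited (also handles cycles)
            n.marked = true;
            pending.addAll(n.refs);
        }
        // Collect: drop unmarked objects, then clear marks on the survivors.
        int before = objects.size();
        objects.removeIf(n -> !n.marked);
        for (Node n : objects) n.marked = false;
        return before - objects.size();
    }
}
```

In this toy, an object is freed only when no chain of references from a root leads to it.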

Slide 46

Slide 46 text

Dalvik GC ? Allocation, Take II

Slide 49

Slide 49 text

Collection: GC_FOR_ALLOC Dalvik GC ?

Slide 51

Slide 51 text

Allocation Dalvik GC ?

Slide 52

Slide 52 text

Allocation Dalvik GC

Slide 53

Slide 53 text

Dalvik GC ? Allocation, Take III

Slide 55

Slide 55 text

Dalvik GC ? Grow the Heap

Slide 56

Slide 56 text

Dalvik GC ? Out of Memory Error Grow the Heap or…

Slide 57

Slide 57 text

Fragmentation! (Kitkat) Heap

Slide 63

Slide 63 text

Fragmentation! (Kitkat) Heap ?

Slide 65

Slide 65 text

Sad Logcat
D/dalvikvm: GC_FOR_ALLOC freed 2K, 50% free 197613K/392536K, paused 6ms, total 6ms
I/dalvikvm-heap: Forcing collection of SoftReferences for 2000012-byte allocation
D/dalvikvm: GC_BEFORE_OOM freed 0K, 50% free 197613K/392536K, paused 6ms, total 6ms
E/dalvikvm-heap: Out of memory on a 2000012-byte allocation.
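
The log reports 50% of the heap free, yet a 2000012-byte allocation fails. One reading consistent with the preceding fragmentation slides is that a non-moving heap needs a single contiguous free block, and none was large enough. The sketch below illustrates that effect with a toy first-fit heap of fixed-size slots; `FragmentedHeap` and its methods are invented for the illustration and do not mirror dlmalloc.

```java
// Hypothetical toy heap of fixed-size slots; true means the slot is in use.
final class FragmentedHeap {
    final boolean[] used;

    FragmentedHeap(int slots) { used = new boolean[slots]; }

    int freeSlots() {
        int free = 0;
        for (boolean u : used) if (!u) free++;
        return free;
    }

    // First-fit search for `size` contiguous free slots.
    // Returns the start index, or -1 when no contiguous run is large enough.
    int allocate(int size) {
        if (size <= 0) throw new IllegalArgumentException("size must be positive");
        int run = 0;
        for (int i = 0; i < used.length; i++) {
            run = used[i] ? 0 : run + 1;
            if (run == size) {
                int start = i - size + 1;
                for (int j = start; j <= i; j++) used[j] = true;
                return start;
            }
        }
        return -1;
    }

    void free(int start, int size) {
        for (int j = start; j < start + size; j++) used[j] = false;
    }
}
```

Filling eight slots and then freeing every other one leaves half the heap free, but a two-slot request still fails because no two free slots are adjacent.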

Slide 70

Slide 70 text

ART (Lollipop) Faster allocation! 
 Faster collection! 
 Faster runtime!

Slide 71

Slide 71 text

ART Allocation

Slide 72

Slide 72 text

ART Allocation RosAlloc

Slide 73

Slide 73 text

ART Allocation RosAlloc Replacement for dlmalloc

Slide 74

Slide 74 text

ART Allocation RosAlloc Replacement for dlmalloc Thread-local allocations

Slide 75

Slide 75 text

ART Allocation RosAlloc Replacement for dlmalloc Thread-local allocations Grouped small allocations, page-aligned large allocations

Slide 76

Slide 76 text

ART Allocation RosAlloc Replacement for dlmalloc Thread-local allocations Grouped small allocations, page-aligned large allocations Finer-grained locks

Slide 77

Slide 77 text

ART Allocation RosAlloc Replacement for dlmalloc Thread-local allocations Grouped small allocations, page-aligned large allocations Finer-grained locks 4-5x faster than Dalvik!

Slide 79

Slide 79 text

ART Allocation

Slide 80

Slide 80 text

ART Allocation Large object space

Slide 84

Slide 84 text

ART Allocation Large object space In Dalvik

Slide 85

Slide 85 text

ART Allocation Large object space ? In Dalvik

Slide 86

Slide 86 text

ART Allocation Large object space In ART

Slide 88

Slide 88 text

ART Allocation Large object space Moving collector! In ART

Slide 89

Slide 89 text

ART Allocation Large object space Moving collector! No more fragmentation! In ART

Slide 90

Slide 90 text

ART Allocation Large object space Moving collector! No more fragmentation! In ART * * Eventually

Slide 92

Slide 92 text

Fragmentation! (L+) Heap

Slide 95

Slide 95 text

Fragmentation! (L+) Heap ?

Slide 97

Slide 97 text

Dalvik GC Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent)

Slide 98

Slide 98 text

~10ms Dalvik GC Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent)

Slide 99

Slide 99 text

~10ms Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent) ART GC

Slide 100

Slide 100 text

Mark root set (pause) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent) ART GC

Slide 101

Slide 101 text

Mark root set (concurrent) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent) ART GC

Slide 102

Slide 102 text

Mark root set (concurrent) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent) ART GC Faster!

Slide 103

Slide 103 text

~3ms Mark root set (concurrent) Mark reachable I (concurrent) Mark reachable II (pause) Collect (concurrent) ART GC Faster!

Slide 104

Slide 104 text

ART Collection

Slide 105

Slide 105 text

ART Collection Minor GC

Slide 106

Slide 106 text

ART Collection Minor GC Fast collection of “young generation”

Slide 107

Slide 107 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive

Slide 108

Slide 108 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive 
 Large object heap

Slide 109

Slide 109 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive 
 Large object heap Less fragmentation

Slide 110

Slide 110 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive 
 Large object heap Less fragmentation Less heap resizing

Slide 111

Slide 111 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive 
 Large object heap Less fragmentation Less heap resizing Fewer GC_FOR_ALLOC pauses

Slide 112

Slide 112 text

ART Collection Minor GC Fast collection of “young generation” Temporary objects less expensive 
 Large object heap Less fragmentation Less heap resizing Fewer GC_FOR_ALLOC pauses 
 Faster runtime

Slide 113

Slide 113 text

ART in Marshmallow Optimizing compiler Allocation optimizations

Slide 114

Slide 114 text

ART in Nougat

Slide 115

Slide 115 text

ART in Nougat More inlining and optimizations

Slide 117

Slide 117 text

ART in Nougat More inlining and optimizations Allocation

Slide 118

Slide 118 text

ART in Nougat More inlining and optimizations Allocation Rewritten in assembly

Slide 119

Slide 119 text

ART in Nougat More inlining and optimizations Allocation Rewritten in assembly 10x faster than Dalvik (Kitkat)

Slide 120

Slide 120 text

ART in Oreo

Slide 121

Slide 121 text

ART in Oreo Concurrent heap compaction

Slide 122

Slide 122 text

ART in Oreo Concurrent heap compaction Defragmentation in foreground!

Slide 123

Slide 123 text

ART in Oreo Concurrent heap compaction Defragmentation in foreground! Less heap resizing, GC_FOR_ALLOC

Slide 125

Slide 125 text

ART in Oreo Concurrent heap compaction Defragmentation in foreground! Less heap resizing, GC_FOR_ALLOC Device-wide memory savings

Slide 126

Slide 126 text

ART in Oreo Concurrent heap compaction Defragmentation in foreground! Less heap resizing, GC_FOR_ALLOC Device-wide memory savings System and Google Play Services

Slide 127

Slide 127 text

ART in Oreo Concurrent heap compaction Defragmentation in foreground! Less heap resizing, GC_FOR_ALLOC Device-wide memory savings System and Google Play Services Smaller heaps for all

Slide 129

Slide 129 text

Concurrent Compaction Heap ... T0 Region T1 Region T2 Region T3 Region Tn Region ... ...

Slide 130

Slide 130 text

Concurrent Compaction Heap ... T0 Region T1 Region T2 Region T3 Region Tn Region ... ... Compaction Phase

Slide 134

Slide 134 text

ART in Oreo Thread-local bump allocator 70% faster allocations than Nougat 18x faster than Dalvik (Kitkat)

Slide 135

Slide 135 text

Concurrent Compaction — Allocation Heap ... T0 Region T1 Region T2 Region T3 Region Tn Region All-thread Heap ... ...

Slide 137

Slide 137 text

Concurrent Compaction — Allocation Heap ... T0 Region T1 Region T2 Region T3 Region Tn Region All-thread Heap ... ... T1 Region Free Pointer
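
The per-thread regions and the free pointer shown here suggest why bump allocation is cheap: an allocation is a bounds check plus an addition, and a thread that owns its region needs no lock on that path. The Java sketch below is an illustration under assumptions, not ART's allocator. The `BumpRegion` and `ThreadLocalRegions` names, the 8-byte alignment, and the region size are all invented for the example.

```java
// Hypothetical sketch of a bump-pointer region.
final class BumpRegion {
    private final int capacity;
    private int top; // offset of the next free byte (the "free pointer")

    BumpRegion(int capacity) { this.capacity = capacity; }

    // Returns the offset of the new block, or -1 if the region cannot fit it.
    // `bytes` is assumed to be positive.
    int allocate(int bytes) {
        int aligned = (bytes + 7) & ~7; // round up to 8 bytes (assumed alignment)
        if (aligned > capacity - top) return -1;
        int start = top;
        top += aligned; // "bump" the free pointer past the new block
        return start;
    }

    int used() { return top; }
}

final class ThreadLocalRegions {
    // One region per thread, so allocation touches no shared state.
    // The 256 KiB size is arbitrary and chosen only for this sketch.
    private static final ThreadLocal<BumpRegion> REGION =
            ThreadLocal.withInitial(() -> new BumpRegion(256 * 1024));

    static int allocate(int bytes) { return REGION.get().allocate(bytes); }
}
```

A real runtime would also hand a thread a fresh region when its current one fills up; the sketch simply returns -1.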

Slide 143

Slide 143 text

Allocation Improvements

Slide 144

Slide 144 text

ART in O+

Slide 145

Slide 145 text

ART in O+ Young generation collections gone in O

Slide 146

Slide 146 text

ART in O+ Young generation collections gone in O Enabled in AOSP

Slide 147

Slide 147 text

ART in O+ Young generation collections gone in O Enabled in AOSP Watch for that future release…

Slide 148

Slide 148 text

Object Pools

Slide 149

Slide 149 text

Object Pools Conventional wisdom

Slide 150

Slide 150 text

Object Pools Conventional wisdom Reusing objects is faster (saves on allocation/collection time)


Slide 151

Slide 151 text

Object Pools Conventional wisdom Reusing objects is faster (saves on allocation/collection time)
 Actual wisdom

Slide 152

Slide 152 text

Object Pools
Conventional wisdom
 Reusing objects is faster (saves on allocation/collection time)
Actual wisdom
 As of Oreo, synchronized object pools are generally slower
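
For reference, this is roughly the kind of pool the comparison is about. The sketch is hypothetical (`LockedPool` is an invented name, not an Android class) and it is not a benchmark. It only shows that every acquire and release goes through a lock, whereas a plain `new` does not.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical synchronized pool: every acquire and release takes the pool's lock.
final class LockedPool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxSize;

    LockedPool(int maxSize, Supplier<T> factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    // Reuses a pooled instance when one is available, otherwise creates a new one.
    synchronized T acquire() {
        T item = free.poll(); // null when the pool is empty
        return item != null ? item : factory.get();
    }

    // Returns false when the pool is full and the instance is left to the GC.
    synchronized boolean release(T item) {
        if (free.size() >= maxSize) return false;
        free.push(item);
        return true;
    }
}
```

Whether a pool like this helps or hurts depends on the object and the workload, so it is worth measuring on the target release before adopting one.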

Slide 153

Slide 153 text

Soooo… What Now?
Creating garbage is okay (and so is collecting it)
Use the types and objects you need
 Even enums
GC is still overhead
 But not as critical to avoid as it was in Dalvik
Make the right choices for your architecture
Avoid overhead in critical sections when possible

Slide 154

Slide 154 text

Jank Test Autoboxing

Slide 155

Slide 155 text

Jank Test Autoboxing private Float[] mHolder = new Float[100_000];

Slide 156

Slide 156 text

Jank Test Autoboxing
private Float[] mHolder = new Float[100_000];
public void run() {
    long startTime = System.currentTimeMillis();
    float f = 0f;
    for (int i = 0; i < mHolder.length; ++i, f += 1.0f) {
        mHolder[i] = f;
    }
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}

Slide 157

Slide 157 text

Jank Test Autoboxing
private Float[] mHolder = new Float[100_000];
public void run() {
    long startTime = System.currentTimeMillis();
    float f = 0f;
    for (int i = 0; i < mHolder.length; ++i, f += 1.0f) {
        mHolder[i] = f;
    }
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}
I/System.out: Alloc time = 28
D/dalvikvm: GC_FOR_ALLOC freed 2047K, 1% free 337371K/339492K, paused 10ms, total 10ms
I/System.out: Alloc time = 29

Slide 158

Slide 158 text

Jank Test Autoboxing
private Float[] mHolder = new Float[100_000];
public void run() {
    long startTime = System.currentTimeMillis();
    float f = 0f;
    for (int i = 0; i < mHolder.length; ++i, f += 1.0f) {
        mHolder[i] = f;
    }
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}
I/System.out: Alloc time = 28
D/dalvikvm: GC_FOR_ALLOC freed 2047K, 1% free 337371K/339492K, paused 10ms, total 10ms
I/System.out: Alloc time = 29
I/System.out: Alloc time = 3
I/System.out: Alloc time = 2
I/System.out: Alloc time = 4
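
For comparison, here is a sketch of the same fill loop against a primitive `float[]`. In the test above, each store into the `Float[]` autoboxes the value through `Float.valueOf`, which typically creates one `Float` object per element. The primitive version makes a single array allocation. The `BoxingComparison` class and its method names are illustrative, and no timings are claimed for it.

```java
// Sketch for comparison with the Float[] test; names are illustrative.
final class BoxingComparison {
    // Each store autoboxes via Float.valueOf(f), typically one Float object per element.
    static Float[] fillBoxed(int n) {
        Float[] holder = new Float[n];
        float f = 0f;
        for (int i = 0; i < n; ++i, f += 1.0f) {
            holder[i] = f;
        }
        return holder;
    }

    // Primitive storage: one array allocation, no per-element objects.
    static float[] fillPrimitive(int n) {
        float[] holder = new float[n];
        float f = 0f;
        for (int i = 0; i < n; ++i, f += 1.0f) {
            holder[i] = f;
        }
        return holder;
    }
}
```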

Slide 159

Slide 159 text

Jank Test Bitmaps

Slide 160

Slide 160 text

Jank Test Bitmaps
public void run() {
    long startTime = System.currentTimeMillis();
    mBitmap = Bitmap.createBitmap(1_000, 1_000, Bitmap.Config.ARGB_8888);
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}

Slide 161

Slide 161 text

Jank Test Bitmaps
public void run() {
    long startTime = System.currentTimeMillis();
    mBitmap = Bitmap.createBitmap(1_000, 1_000, Bitmap.Config.ARGB_8888);
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}
I/System.out: Alloc time = 16
D/dalvikvm: GC_FOR_ALLOC freed 3907K, 2% free 341280K/347244K, paused 7ms, total 7ms
I/dalvikvm-heap: Grow heap (frag case) to 337.165MB for 4000012-byte allocation
D/dalvikvm: GC_FOR_ALLOC freed <1K, 1% free 345186K/347244K, paused 7ms, total 7ms

Slide 162

Slide 162 text

Jank Test Bitmaps
public void run() {
    long startTime = System.currentTimeMillis();
    mBitmap = Bitmap.createBitmap(1_000, 1_000, Bitmap.Config.ARGB_8888);
    System.out.println("Alloc time = " + (System.currentTimeMillis() - startTime));
}
I/System.out: Alloc time = 16
D/dalvikvm: GC_FOR_ALLOC freed 3907K, 2% free 341280K/347244K, paused 7ms, total 7ms
I/dalvikvm-heap: Grow heap (frag case) to 337.165MB for 4000012-byte allocation
D/dalvikvm: GC_FOR_ALLOC freed <1K, 1% free 345186K/347244K, paused 7ms, total 7ms
I/System.out: Alloc time = 1
I/System.out: Alloc time = 0
I/System.out: Alloc time = 0
I/System.out: Alloc time = 1

Slide 163

Slide 163 text

No content

Slide 164

Slide 164 text

data class Float3(val x: Float, val y: Float, val z: Float)
fun Tonemap_ACES(x: Float3): Float3 {
    val a = 2.51f
    val b = 0.03f
    val c = 2.43f
    val d = 0.59f
    val e = 0.14f
    return (x * (a * x + b)) / (x * (c * x + d) + e)
}

Slide 165

Slide 165 text

inline operator fun Float.plus(v: Float3) = Float3(this + v.x, this + v.y, this + v.z)
inline operator fun Float.times(v: Float3) = Float3(this * v.x, this * v.y, this * v.z)
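
Each of these operators returns a new Float3, so a single tone-mapping call creates several short-lived temporaries. As a point of comparison, the sketch below evaluates the same curve per channel on primitives and writes into a caller-supplied array, so the function itself allocates nothing. It is written in Java to match the jank tests, and the `TonemapScalar` name and in-place signature are assumptions made for the illustration.

```java
// Sketch: the same ACES curve on primitives; names are illustrative.
final class TonemapScalar {
    // Applies the curve to one channel. Constants are those from the slide.
    static float aces(float x) {
        final float a = 2.51f, b = 0.03f, c = 2.43f, d = 0.59f, e = 0.14f;
        return (x * (a * x + b)) / (x * (c * x + d) + e);
    }

    // Applies the curve to an RGB triple in place, with no temporary objects.
    static void acesInPlace(float[] rgb) {
        for (int i = 0; i < 3; i++) {
            rgb[i] = aces(rgb[i]);
        }
    }
}
```

The trade-off is readability: the operator version reads like the math, while the scalar version gives up that notation to avoid temporaries.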

Slide 166

Slide 166 text

No content

Slide 167

Slide 167 text

No content

Slide 168

Slide 168 text

No content

Slide 169

Slide 169 text

Android O, 1 tile = 00.1~00.5s

Slide 170

Slide 170 text

Android O, 1 tile = 00.1~00.5s Android K, 1 tile = 40.0~50.0s

Slide 171

Slide 171 text

10-23 14:40:04.997 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:05.067 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.147 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.177 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:05.207 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:05.277 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.307 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5125K/5912K, paused 3ms, total 3ms 10-23 14:40:05.357 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 713K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:05.397 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:05.447 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.517 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.607 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.677 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.707 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:05.767 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 
10-23 14:40:05.837 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.867 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.897 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:05.957 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:05.997 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.037 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.107 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.157 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:06.227 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:06.267 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:06.337 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.367 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.437 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:06.527 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.597 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 
10-23 14:40:06.617 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:06.697 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.717 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.767 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:06.817 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 713K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:06.857 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 711K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:06.937 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:06.957 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.027 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.117 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.157 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.197 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:07.257 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.307 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:07.357 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 
10-23 14:40:07.397 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms 10-23 14:40:07.447 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.477 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.557 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.617 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.647 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.677 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:07.707 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.727 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:07.767 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.797 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:07.847 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:07.927 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5125K/5912K, paused 5ms, total 5ms 10-23 14:40:08.057 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 713K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:08.077 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 5ms, total 5ms 
10-23 14:40:08.107 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:08.147 3885-3908/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:08.207 3885-3901/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms 10-23 14:40:08.267 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 713K, 14% free 5124K/5912K, paused 4ms, total 4ms 10-23 14:40:08.287 3885-3907/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 711K, 14% free 5124K/5912K, paused 5ms, total 5ms 10-23 14:40:08.337 3885-3902/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 4ms, total 4ms

Slide 172

Slide 172 text

10-23 14:40:24.847 3885-3909/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 3ms, total 3ms
10-23 14:40:24.917 3885-3903/com.google.ray_trasher D/dalvikvm: GC_FOR_ALLOC freed 712K, 14% free 5124K/5912K, paused 2ms, total 2ms

Slide 173

Slide 173 text

No content

Slide 174

Slide 174 text

Benchmarks, GC and Caches

Slide 175

Slide 175 text

Kryo 385 (Pixel 3) “Gold” cores
Core Core Core Core

Slide 176

Slide 176 text

Kryo 385 (Pixel 3) “Gold” cores
Core Core Core Core
L1 L1 L1 L1 (4x 32 KiB)

Slide 177

Slide 177 text

Kryo 385 (Pixel 3) “Gold” cores
Core Core Core Core
L1 L1 L1 L1 (4x 32 KiB)
L2 L2 L2 L2 (4x 256 KiB)

Slide 178

Slide 178 text

Kryo 385 (Pixel 3) “Gold” cores
Core Core Core Core
L1 L1 L1 L1 (4x 32 KiB)
L2 L2 L2 L2 (4x 256 KiB)
L3 (1x 2 MiB)

Slide 179

Slide 179 text

L1 L2 L3 RAM
private val data = FloatArray(16)
// …
val a = data[0]

Slide 184

Slide 184 text

64 bytes

Slide 185

Slide 185 text

val m = ArrayList<FloatArray>(n)
64 bytes

Slide 186

Slide 186 text

val m = ArrayList<FloatArray>(n)
m[0] = FloatArray(4)
m[1] = FloatArray(4)
m[2] = FloatArray(4)
m[n] = FloatArray(4)
64 bytes

Slide 187

Slide 187 text

val m = ArrayList<FloatArray>(n)
m[0] = FloatArray(4)
m[1] = FloatArray(4)
m[2] = FloatArray(4)
m[n] = FloatArray(4)
RAM
64 bytes

Slide 188

Slide 188 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 189

Slide 189 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 190

Slide 190 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 191

Slide 191 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 192

Slide 192 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 193

Slide 193 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 194

Slide 194 text

m[0] = FloatArray(4) m[1] = FloatArray(4) m[2] = FloatArray(4) m[n] = FloatArray(4) val m = ArrayList(n) RAM 64 bytes

Slide 195

Slide 195 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 196

Slide 196 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 197

Slide 197 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 198

Slide 198 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 199

Slide 199 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 200

Slide 200 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 201

Slide 201 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 202

Slide 202 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes

Slide 203

Slide 203 text

m[0] FloatArray(4) m[1] FloatArray(4) m[2] FloatArray(4) m[n] FloatArray(4) val m = ArrayList(n) for (i in 0 until m.size - 3) { val a = m[i ] val b = m[i + 1] val c = m[i + 2] val d = m[i + 3] computeStuff(a, b, c, d) } L1 64 bytes
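
The loop on these slides can be made runnable roughly as follows. This is a sketch, not the speakers' code: computeStuff is never defined on the slides, so the version here is a hypothetical stand-in, and buildVectors is an assumed helper that fills the list with separately allocated 4-float arrays. Whether those arrays end up next to each other in memory depends on the allocator and the GC, which is the point the slides are making.

```kotlin
// Hypothetical stand-in for the slide's computeStuff(): sums the first
// component of each of the four arrays.
fun computeStuff(a: FloatArray, b: FloatArray, c: FloatArray, d: FloatArray): Float =
    a[0] + b[0] + c[0] + d[0]

// Assumed helper: builds n separately allocated 4-float arrays,
// each filled with its own index so results are easy to verify.
fun buildVectors(n: Int): ArrayList<FloatArray> {
    val m = ArrayList<FloatArray>(n)
    for (i in 0 until n) {
        m.add(FloatArray(4) { i.toFloat() })
    }
    return m
}

// The sliding-window loop from the slides, accumulating a total so the
// work has an observable result.
fun slidingWindowSum(m: List<FloatArray>): Float {
    var total = 0f
    for (i in 0 until m.size - 3) {
        val a = m[i]
        val b = m[i + 1]
        val c = m[i + 2]
        val d = m[i + 3]
        total += computeStuff(a, b, c, d)
    }
    return total
}
```

Each iteration touches four different heap objects, so the cost of the loop depends on where those objects live, not only on the arithmetic inside computeStuff.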

Slide 204

Slide 204 text

Bar chart: Relative computation times (Pixel 3), scale 0.0 to 6.0, for three cases: No thrash, L1 thrash, L2 thrash

Slide 205

Slide 205 text

On some workloads, the work of the GC will affect performance

Slide 206

Slide 206 text

You might be benchmarking perfect memory access patterns
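
One way to see how much a benchmark depends on its access pattern is to run the same work over the same data in two different orders. The sketch below is an assumption-laden illustration, not the speakers' benchmark: it walks one large FloatArray through an index array that is either sequential or shuffled, so the arithmetic is identical and only the memory access order changes. All names are hypothetical, and timeNanos is a crude timer with no warm-up or repetition.

```kotlin
// Indices 0, 1, 2, ... n-1: the "perfect" access pattern.
fun sequentialOrder(n: Int): IntArray = IntArray(n) { it }

// The same indices in a pseudo-random order (fixed seed for repeatability).
fun shuffledOrder(n: Int, seed: Int = 42): IntArray {
    val order = IntArray(n) { it }
    order.shuffle(kotlin.random.Random(seed))
    return order
}

// Sums data[] by visiting elements in the given order.
fun sumInOrder(data: FloatArray, order: IntArray): Float {
    var total = 0f
    for (index in order) {
        total += data[index]
    }
    return total
}

// Crude wall-clock timer; a real measurement needs warm-up and many runs.
fun timeNanos(block: () -> Unit): Long {
    val start = System.nanoTime()
    block()
    return System.nanoTime() - start
}
```

To try it, build both index arrays outside the timed region for a data array much larger than the caches, then compare timeNanos { sumInOrder(data, sequential) } with timeNanos { sumInOrder(data, shuffled) }. The size of the gap will vary by device and runtime.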

Slide 207

Slide 207 text

Questions?