Dissecting Andy: The Dalvik VM under the microscope

Takipi
October 22, 2013

An in-depth look at the Dalvik VM -- the state-of-the-art mobile virtual machine at the core of the Android platform.

Key points:
* The major differences between Dalvik and other prominent VMs such as HotSpot.
* How Dalvik was designed for optimal performance under strict memory, energy consumption and processing power constraints.
* How the Dalvik VM is built into the Android OS.
* Important things every developer should know when writing for the Android platform.

Transcript

  1. Previously
    • Led development of AutoCAD WS
      ◦ 10M users on Android & iOS
    • Principal Eng. at VisualTao
      ◦ Acquired by Autodesk Inc., 2009
    • Researcher in elite IDF tech unit
  2. Agenda
    • Introduction
    • Dalvik is a unique VM
    • Life on a mobile device
    • Development tips
  3. Dalvik (the VM)
    • The VM which powers Android
    • Optimized for mobile devices
    • Very different from HotSpot
    • Integrated into the OS
  4. Agenda
    • Introduction
    • Dalvik is a unique VM
    • Life on a mobile device
    • Development tips
  5. What makes Dalvik different?
    • Optimized for mobile devices
      ◦ Memory constraints
        ▪ Small RAM
        ▪ No swap file!
      ◦ CPU constraints
      ◦ Storage constraints
  6. What makes Dalvik different?
    • Optimized for mobile devices
      ◦ Memory constraints
      ◦ CPU constraints
        ▪ Relatively slow processor
        ▪ Small cache
      ◦ Storage constraints
  7. What makes Dalvik different?
    • Optimized for mobile devices
      ◦ Memory constraints
      ◦ CPU constraints
      ◦ Storage constraints
        ▪ Small internal storage
        ▪ External storage not always available
  8. What makes Dalvik different?
    • Optimized for mobile devices
      ◦ Memory constraints
      ◦ CPU constraints
      ◦ Storage constraints
      ◦ Power constraints
        ▪ CPU drains battery
  9. What makes Dalvik different?
    • Designed to be able to run efficiently on an extremely wide variety of hardware specs
  10. What makes Dalvik different?
    • Designed to be able to run efficiently on an extremely wide variety of hardware specs
      ◦ RAM: 32 MB ~ 2 GB
      ◦ CPU: 200 MHz ~ 2 GHz multi-core
      ◦ Storage: 32 MB ~ 100s of GBs
  11. What makes Dalvik different?
    • Hosts all application processes
      ◦ Thus, it is integrated into the OS
      ◦ Device runs many VMs concurrently
      ◦ Even internal applications
      ◦ Apps must be responsive
  12. Agenda
    • Introduction
    • Dalvik is a unique VM
    • Life on a mobile device
    • Development tips
  13. What's in a JVM?
    • Runtime libraries
    • Garbage collector
    • Class loading mechanism
    • Java bytecode interpreter
      ◦ Optionally: JIT compiler, multiple GCs, debugger...
  14. Class life
    Java / Scala / Clojure → Compiler → .class, .class, .class → Class Loader → Bytecode Interpreter
  15. Class life
    Java / Scala / Clojure → Compiler → .class, .class, .class → Class Loader → Bytecode Interpreter → JIT Compiler
  16. Class file
    • Constant pool
      ◦ String literals
      ◦ Number constants
      ◦ Identifiers
    • Method code
    • Fields
    • More...
  17. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }
  18. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN
  19. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN

    Stack: *a
  20. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN

    Stack: *b *a
  22. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN
  23. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN

    Stack: *b
  24. Java bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

        ILOAD 0
        ILOAD 1
        IF_ICMPGE L1
        ILOAD 0
        IRETURN
    L1: ILOAD 1
        IRETURN

    Stack: *b (Return value)
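The stack walkthrough on these slides can be replayed in plain Java. What follows is a minimal sketch of a stack interpreter hard-wired to the seven-instruction listing for `min(a, b)`. It is not how HotSpot or any real JVM is implemented. The class name `StackMinDemo`, the use of instruction indices for `pc` (rather than byte offsets), and the `dispatched` counter are assumptions made for illustration.

```java
final class StackMinDemo {
    // Number of instructions executed by the most recent call (illustrative only).
    static int dispatched;

    // Interprets the listing from the slides: locals hold a and b,
    // every operand travels through the operand stack.
    static int min(int a, int b) {
        int[] locals = {a, b};
        int[] stack = new int[2]; // deepest the listing ever gets is two slots
        int sp = 0;               // next free stack slot
        int pc = 0;               // index of the instruction to execute
        dispatched = 0;
        while (true) {
            dispatched++;
            switch (pc) {
                case 0: stack[sp++] = locals[0]; pc = 1; break; // ILOAD 0
                case 1: stack[sp++] = locals[1]; pc = 2; break; // ILOAD 1
                case 2: {                                       // IF_ICMPGE L1
                    int right = stack[--sp];
                    int left = stack[--sp];
                    pc = (left >= right) ? 5 : 3;
                    break;
                }
                case 3: stack[sp++] = locals[0]; pc = 4; break; // ILOAD 0
                case 4: return stack[--sp];                     // IRETURN
                case 5: stack[sp++] = locals[1]; pc = 6; break; // L1: ILOAD 1
                case 6: return stack[--sp];                     // IRETURN
                default: throw new IllegalStateException("bad pc " + pc);
            }
        }
    }
}
```

Calling `StackMinDemo.min(3, 5)` pushes both arguments, pops them for the comparison, then pushes the winner again just to return it, which is the traffic the `*a` / `*b` markers on the slides trace.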
  25. What's in Dalvik?
    • Runtime libraries
    • Garbage collector
    • Class loading mechanism
    • Java bytecode interpreter
      ◦ Also: JIT compiler, debugger support
  26. What's in Dalvik?
    • Runtime libraries ✓
    • Garbage collector
    • Class loading mechanism
    • Java bytecode interpreter
      ◦ Also: JIT compiler, debugger support
  27. What's in Dalvik?
    • Runtime libraries ✓
    • Garbage collector ✓
    • Class loading mechanism
    • Java bytecode interpreter
      ◦ Also: JIT compiler, debugger support
  28. Class life
    Java / Scala / Clojure → Compiler → .class, .class, .class → dx → classes.dex → Optimizer → odex → Cache
  29. Dalvik bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }
  30. Dalvik bytecode
    public static int min(int a, int b) {
        if (a < b) { return a; } else { return b; }
    }

    0000: if-ge v0, v1, 0003
    0002: return v0
    0003: move v0, v1
    0004: goto 0002
  36. Stack vs. Registers
    Registers (Dalvik):
        0000: if-ge v0, v1, 0003
        0002: return v0
        0003: move v0, v1
        0004: goto 0002
    Stack (JVM):
            ILOAD 0
            ILOAD 1
            IF_ICMPGE L1
            ILOAD 0
            IRETURN
        L1: ILOAD 1
            IRETURN
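To make the comparison concrete, here is a minimal sketch of a register-style interpreter for the four-instruction Dalvik listing. It is an illustration only, not Dalvik's actual interpreter. The class name `RegisterMinDemo` and the `dispatched` counter are assumptions; `pc` uses the code-unit addresses printed in the listing.

```java
final class RegisterMinDemo {
    // Number of instructions executed by the most recent call (illustrative only).
    static int dispatched;

    // Interprets the Dalvik listing: operands are read from and written to
    // virtual registers directly, with no operand stack in between.
    static int min(int a, int b) {
        int[] v = {a, b}; // v0 and v1 hold the two arguments
        int pc = 0;       // address as printed in the listing
        dispatched = 0;
        while (true) {
            dispatched++;
            switch (pc) {
                case 0: pc = (v[0] >= v[1]) ? 3 : 2; break; // 0000: if-ge v0, v1, 0003
                case 2: return v[0];                        // 0002: return v0
                case 3: v[0] = v[1]; pc = 4; break;         // 0003: move v0, v1
                case 4: pc = 2; break;                      // 0004: goto 0002
                default: throw new IllegalStateException("bad pc " + pc);
            }
        }
    }
}
```

With this sketch, `min(3, 5)` executes two instructions and `min(5, 3)` executes four. Tracing the seven-instruction stack listing by hand gives five executed instructions on either branch, which is the kind of difference the "fewer opcodes" claim refers to.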
  37. Register (Dalvik) advantages
    • Smaller memory footprint
      ◦ Does not use auxiliary stack structure
    • Code is shorter
      ◦ ~43% fewer opcodes (bytecode “lines”)
  38. Stack (JVM) advantages
    • Compilation is simpler and faster
    • Overall code size is smaller
      ◦ ~30% smaller than equiv. register bytecode
  39. Stack (JVM) advantages
    • Compilation is simpler and faster
    • Overall code size is smaller
      ◦ ~30% smaller than equiv. register bytecode
        ▪ 205 JVM opcodes → 1 byte
        ▪ 276 Dalvik opcodes → 2 bytes + specialized opcodes + register specification
  40. Trace-based JIT advantages
    • Optimization takes effect faster
    • Low memory usage
      ◦ Compiling traces requires less RAM
      ◦ Resulting code is more granular
      ◦ “Luggage” code is never compiled
  41. The DEX file
    .class: Constant Pool, Java bytecode
    .class: Constant Pool, Java bytecode
    .class: Constant Pool, Java bytecode
    classes.dex: Unified Constant Pools, Dalvik bytecode, Dalvik bytecode
  42. The DEX file
    • Constants take ~60% of class file size
    • One big constant pool for all classes
    • This big constant pool is divided into sections for even better space efficiency
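The saving from a unified constant pool comes from storing each repeated constant once and referring to it by index. The following is a minimal sketch of that idea only; it does not reproduce the real DEX layout or its sections. The class name `SharedPoolDemo`, its methods, and the sample constants in the usage note are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SharedPoolDemo {
    // Maps each distinct constant to the index it was first assigned.
    private final Map<String, Integer> indexByConstant = new LinkedHashMap<>();

    // Returns the existing index for a constant, or assigns the next free one.
    int intern(String constant) {
        Integer existing = indexByConstant.get(constant);
        if (existing != null) {
            return existing;
        }
        int index = indexByConstant.size();
        indexByConstant.put(constant, index);
        return index;
    }

    int size() {
        return indexByConstant.size();
    }

    // Merges per-class pools into one shared pool and reports how many entries remain.
    static int mergedSize(List<List<String>> perClassPools) {
        SharedPoolDemo pool = new SharedPoolDemo();
        for (List<String> classPool : perClassPools) {
            for (String constant : classPool) {
                pool.intern(constant);
            }
        }
        return pool.size();
    }
}
```

For two hypothetical classes whose pools are `["java/lang/String", "toString", "Foo"]` and `["java/lang/String", "toString", "Bar"]`, the separate pools hold six entries in total while the merged pool holds four.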
  43. The DEX file
    System libraries
      Classes: 21.4 MB
      JAR: 10.7 MB (50%)
    Browser app
      Classes: 470.3 KB
      JAR: 232.1 KB (49%)
  44. The DEX file
    System libraries
      Classes: 21.4 MB
      JAR: 10.7 MB (50%)
      DEX: 10.3 MB (48%)
    Browser app
      Classes: 470.3 KB
      JAR: 232.1 KB (49%)
      DEX: 209.2 KB (44%)
  45. The ODEX file
    • When an app is installed, its DEX file is preprocessed.
      ◦ Word alignment, padding, endianness
      ◦ Bytecode verification
  46. The ODEX file
    • When an app is installed, its DEX file is preprocessed.
      ◦ Word alignment, padding, endianness
      ◦ Bytecode verification
      ◦ Native library call inlining
      ◦ Method references → vtable indices
      ◦ Field references → internal byte offsets
  47. The ODEX file
    • When an app is installed, its DEX file is preprocessed.
      ◦ Word alignment, padding, endianness
      ◦ Bytecode verification
      ◦ Native library call inlining
      ◦ Method references → vtable indices
      ◦ Field references → internal byte offsets
        ▪ More: Dead code removal, integral type coalescing...
  48. The ODEX file
    • It is then cached to internal storage.
    • From now on, it can be quickly loaded into a VM's memory with minimum overhead.
  49. The Zygote
    Premises:
    • Most apps use the same core libraries
    • Starting a VM instance is costly
    • RAM is scarce
  50. The Zygote
    The Zygote is a special VM process.
    • Born shortly after Android boots up
    • A "warmed up" VM
      ◦ System libraries are loaded and initialized
    • Ready to be forked on demand
  51. The Zygote
    The Zygote: Runtime library DEX, Shared Heap (loaded libs, lib structures)
    Angry Birds: Application DEX, Application Heap
  52. The Zygote
    The Zygote: Runtime library DEX, Shared Heap (loaded libs, lib structures)
    Angry Birds: Application DEX, Application Heap
    Chrome: Application DEX, Application Heap
  56. The Zygote
    • Quick VM startup
      ◦ Improved app startup time
    • Preloaded and initialized libraries
    • Sharing of memory across VMs
    • Apps are segregated
  57. The Zygote
    • Quick VM startup
    • Preloaded and initialized libraries
      ◦ Improved overall app responsiveness
    • Sharing of memory across VMs
    • Apps are segregated
  58. The Zygote
    • Quick VM startup
    • Preloaded and initialized libraries
    • Sharing of memory across VMs
      ◦ Smaller VM memory footprint
    • Apps are segregated
  59. The Zygote
    • Quick VM startup
    • Preloaded and initialized libraries
    • Sharing of memory across VMs
    • Apps are segregated
      ◦ Utilize Linux kernel security model
  60. Summary
    Conserve RAM by...
    • Using register-based bytecode
    • Using trace-based JIT compilation
    • Merging class files into a single .dex file
    • Sharing memory between processes
    • Mapping loaded .dex bytecode to files
  61. Summary
    Conserve CPU (and battery) by...
    • Using register-based bytecode
    • Optimizing .dex files
      ◦ Perform platform optimization
      ◦ Optimize during installation, instead of at runtime
    • Forking the Zygote, reducing startup overhead
  62. Agenda
    • Introduction
    • Dalvik is a unique VM
    • Life on a mobile device
    • Development tips
  63. Don’t grind water
    • Prefer signaling mechanisms over polling.
    • Try to do minimum work when there's no user, network or sensor input.
  64. Don’t grind water
    • Prefer signaling mechanisms over polling.
    • Try to do minimum work when there's no user, network or sensor input.
    • Monitor the state of the battery
      ◦ Lengthen polling cycles if necessary
      ◦ Turn off background services
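As one way to picture "signaling over polling", the sketch below has a consumer block on a `java.util.concurrent.BlockingQueue` until a producer hands it an event, instead of waking up on a timer to check for work. It uses plain Java rather than any Android API. The class name `SignalDemo`, its methods, and the `"network-reply"` event string are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class SignalDemo {
    // Blocks the calling thread until an event is available.
    // While blocked, the thread is parked and does no periodic wake-ups.
    static String awaitEvent(BlockingQueue<String> events) {
        try {
            return events.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    // Starts a producer thread that signals one event, then waits for it.
    static String demo() {
        BlockingQueue<String> events = new LinkedBlockingQueue<>();
        Thread producer = new Thread(() -> events.add("network-reply"));
        producer.start();
        return awaitEvent(events);
    }
}
```

A polling version would loop on `Thread.sleep(...)` and a queue check, waking the CPU on every cycle even when nothing has arrived.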
  65. Efficient looping
    List<Item> list = new ArrayList<Item>();
    ...
    for (Item item : list) { ... }

    List<Item> list = new ArrayList<Item>();
    ...
    int size = list.size();
    for (int i = 0; i < size; ++i) { ... }

    3x faster!
  66. Efficient looping
    Item[] array = new Item[...];
    ...
    for (int i = 0; i < array.length; ++i) { ... }
  67. Efficient looping
    Item[] array = new Item[...];
    ...
    for (int i = 0; i < array.length; ++i) { ... }

    Item[] array = new Item[...];
    ...
    for (Item item : array) { ... }

    JIT can't yet optimize this!
  68. Efficient looping
    Using an Iterator is still the best (and sometimes only) way to iterate over non-ArrayList collections.
  69. Garbage collections are bad!
    • They cause performance hiccups
    • They are heavy on the CPU
    • They drain the battery
  70. Garbage collections are bad!
    • They cause performance hiccups
    • They are heavy on the CPU
    • They drain the battery
    Avoid short-lived allocations
  71. Avoid short-lived allocations
    • Dalvik’s GC is not generational
    • Not optimized for stack allocations as in HotSpot
  72. Avoid short-lived allocations
    • Try to avoid boxing and unboxing
      ◦ int → Integer → int, etc.
    • If you need to aggregate, consider passing the aggregator as an argument.
      ◦ List
      ◦ Set
      ◦ StringBuilder
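A small sketch of the "pass the aggregator as an argument" tip, using `StringBuilder`. The first method returns a fresh `String` on every call. The second appends into a builder the caller already owns, so formatting many items creates no intermediate strings. The class `AggregatorDemo` and its method names are hypothetical.

```java
final class AggregatorDemo {
    // Allocating style: every call builds and returns a new String.
    static String describe(int x, int y) {
        return "(" + x + ", " + y + ")";
    }

    // Aggregating style: appends into the caller's StringBuilder.
    // Primitive ints are appended directly, so nothing is boxed.
    static void describeInto(StringBuilder out, int x, int y) {
        out.append('(').append(x).append(", ").append(y).append(')');
    }

    // Formats all points with one builder and one final String.
    static String describeAll(int[][] points) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < points.length; ++i) {
            describeInto(out, points[i][0], points[i][1]);
        }
        return out.toString();
    }
}
```

The same shape applies to `List` and `Set`: a method that takes the collection to fill avoids allocating and then copying a temporary one per call.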
  73. RAM is scarce!
    • Caching wisely can reduce allocations
      ◦ Recycle views
      ◦ Recycle bitmaps
    • ...but be wary of large caches
  74. RAM is scarce!
    • Caching wisely can reduce allocations
      ◦ Recycle views
      ◦ Recycle bitmaps
    • ...but be wary of large caches
    • Persist whatever you can
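One way to cache "wisely" is to cap the cache so it cannot grow without bound. The sketch below is a minimal size-bounded, least-recently-used cache built on the standard `LinkedHashMap`. The class name `BoundedCache` and the entry limit are assumptions for illustration; on Android, `android.util.LruCache` serves a similar purpose.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of two entries, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`, because `a` was touched more recently.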
  75. RAM is scarce!
    • Use streams instead of in-memory buffers
      ◦ Decode files directly from file streams
      ◦ Deserialize structures directly from HTTP streams
  76. RAM is scarce!
    • Use streams instead of in-memory buffers
      ◦ Decode files directly from file streams
      ◦ Deserialize structures directly from HTTP streams
    • For example
      ◦ BitmapFactory.decodeStream(InputStream is)
      ◦ MyProtoBuffer.parseFrom(InputStream is)
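The same idea in plain Java, as a hedged sketch: the method below consumes an `InputStream` value by value and keeps only a running total, instead of first reading the whole payload into a byte array. The class name `StreamDemo`, the payload format (big-endian 32-bit integers), and the in-memory stream in `demo()` are assumptions; the `ByteArrayInputStream` merely stands in for a file or HTTP stream.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StreamDemo {
    // Sums big-endian 32-bit integers straight from the stream.
    // Memory use stays constant regardless of how long the stream is.
    static long sumInts(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        long sum = 0;
        while (true) {
            try {
                sum += data.readInt();
            } catch (EOFException endOfStream) {
                return sum; // readInt signals end of input with EOFException
            }
        }
    }

    // Feeds three integers (1, 2, 3) through sumInts.
    static long demo() {
        byte[] payload = {0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3};
        try (InputStream in = new ByteArrayInputStream(payload)) {
            return sumInts(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file or socket, wrapping the source in a `BufferedInputStream` keeps the per-read cost low while still bounding memory to the buffer size.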