Slide 1

Slide 1 text

@alblue ©2020 Alex Blewitt Bite Sized Bytecode Classes and ClassLoaders

Slide 2

Slide 2 text

Overview
• ClassLoaders
• Classes
• Bytecode

Slide 3

Slide 3 text

Class Loaders and Class files

Slide 4

Slide 4 text

Loading and defining classes
• The JVM builds Class instances from .class files
• A class loader is responsible for finding (or generating) the bytes
• Class.forName("an.example") triggers a lookup if the class is not already loaded
• A ClassLoader can be chained to a parent class loader
• Most app servers (e.g. Tomcat, Jetty) have one ClassLoader per app
• ClassLoaders also load resources!
[Diagram: App 1 CL, App 2 CL, App 3 CL and App 4 CL, each chained to Tomcat CL]
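
The parent chain and resource lookup described above can be inspected from plain Java. This is a minimal sketch: the class name ClassLoaderChain and the "<bootstrap>" label are illustrative, and the loader names printed depend on the JVM in use.

```java
import java.util.ArrayList;
import java.util.List;

class ClassLoaderChain {
    /** Walks from the given loader up through its parents; a null loader stands for the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("<bootstrap>"); // illustrative label, not a real class name
        return names;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Class.forName triggers a lookup if the class has not been loaded yet.
        Class<?> list = Class.forName("java.util.ArrayList");
        // Core classes report a null loader, meaning the bootstrap loader.
        System.out.println(list.getClassLoader());
        System.out.println(chain(ClassLoaderChain.class.getClassLoader()));
        // Loaders also find resources; a class file is itself a resource.
        System.out.println(ClassLoader.getSystemResource("java/lang/Object.class"));
    }
}
```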

Slide 5

Slide 5 text

ClassLoaders and Classes
• A Class is owned by the ClassLoader that loaded it
• A Class must be uniquely named within a ClassLoader
• Two Class objects with the same name can exist in one JVM
• A name alone is not unique; a Class+ClassLoader pair is unique
    a.getClass().getName().equals("ClassName") == true;
    (ClassName) a → ClassCastException
• This can be caused by multiple web apps storing thread local variables
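
To see that a name alone does not identify a class, the sketch below defines the same bytes in two separate loaders. Everything here is an assumption made for illustration: the names SameNameTwoLoaders, Definer and Twin are hypothetical, and the class file is hand-assembled as an empty class (major version 52, so that Java 8 and later accept it) to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class SameNameTwoLoaders {
    /** Hand-assembles a minimal class file: an empty public class in the default package. */
    static byte[] emptyClass(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8)
            out.writeShort(5);                                   // constant pool count = entries + 1
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);                // #2 Utf8 class name
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8 superclass name
            out.writeShort(0x0021);                              // flags: ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(0);                                   // methods count
            out.writeShort(0);                                   // attributes count
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** A loader whose only job is to expose the protected defineClass method. */
    static final class Definer extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = emptyClass("Twin");
        Class<?> a = new Definer().define("Twin", bytes);
        Class<?> b = new Definer().define("Twin", bytes);
        System.out.println(a.getName().equals(b.getName())); // the names match
        System.out.println(a == b);                          // but the Class objects differ
    }
}
```

An instance created from one of these classes could not be cast to the other, which is the ClassCastException scenario on the slide.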

Slide 6

Slide 6 text

Loading a class
• The mechanisms to bring a class into existence vary
• URLClassLoader can load classes from a URL
• AppletClassLoader was used to load applets into a browser
• ASM, ByteBuddy, Mockito etc. generate classes on the fly
• In essence, custom class loaders boil down to:
  1. Load or generate bytes from somewhere
  2. Call defineClass()

Slide 7

Slide 7 text

Dynamic class creation
• Classes can be created at runtime, from Java 1.3 onwards:
    final Runnable r = (Runnable) Proxy.newProxyInstance(
        getClass().getClassLoader(),
        new Class<?>[] { Runnable.class },
        (InvocationHandler) (instance, method, args) -> {
            System.out.println("Hello World!");
            return null;
        });
• Easier to use a lambda for Java 8 and above:
    Runnable r = () -> { System.out.println("Hello World!"); };

Slide 8

Slide 8 text

Dynamic class creation
[Diagram: Tool Provider, System JavaC, File Manager, Source File, Class File, Class Loader]

Slide 9

Slide 9 text

Dynamic class creation
• Can also compile and load classes programmatically:
    import java.nio.file.Files;
    import java.nio.file.Path;

    var javac = javax.tools.ToolProvider.getSystemJavaCompiler();
    var fileMgr = javac.getStandardFileManager(null, null, null);
    var srcs = fileMgr.getJavaFileObjects("/tmp/Test.java");
    // Compile the source; the class file is written next to it
    javac.getTask(null, fileMgr, null, null, null, srcs).call();
    var cl = new ClassLoader() {
        public Class<?> load(final byte[] bytes) {
            // A null name lets defineClass take the name from the class file
            return defineClass(null, bytes, 0, bytes.length);
        }
    };
    final var bytes = Files.readAllBytes(Path.of("/tmp/Test.class"));
    ((Runnable) cl.load(bytes).getDeclaredConstructor().newInstance()).run();
• Contents of /tmp/Test.java:
    public class Test implements Runnable {
        public void run() { System.out.println("Hello World!"); }
    }
https://github.com/alblue/jvmulator/blob/master/src/main/java/com/bandlem/jvm/jvmulator/compiler/

Slide 10

Slide 10 text

Class file format
[Diagram: class file layout. Header: Magic 0xcafebabe, Minor 0, Major 55. Then: Constant Pool count and entries, Flags (public), This, Super, Fields count, Methods count, Class Attributes count. Constant pool entry types shown: UTF8 (Count, then bytes; e.g. Count 2, "[Z"), Int 0x48656c6f, Float 0x7f800000, Long 0x416c2042_6c756521, Double 0xfff00000_00000000, Class (→ UTF8), String (→ UTF8), Field / Method / IMethod (→ Class, NaT), NaT (→ Name, Type). Fields and methods carry Flags, Name, Type and an Attributes count; each Attribute has a Name, a Length and Data.]
• Attributes allow for extensions
• The file format hasn't changed for decades
• Few constant types have been added
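
The fixed header at the start of the layout can be read back with a DataInputStream, since class files are big-endian. This is a rough sketch that stops after the constant pool count; the class name ClassFileHeader is illustrative, and it assumes the class file is reachable as a system resource (which is the case for java/lang/Object.class on current JDKs).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    /** Returns {magic, minor, major, constantPoolCount} read from the start of a class file. */
    static int[] read(String resourceName) {
        try (InputStream raw = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (raw == null) {
                throw new IllegalArgumentException("not found: " + resourceName);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();              // 4 bytes: 0xcafebabe
            int minor = in.readUnsignedShort();    // 2 bytes
            int major = in.readUnsignedShort();    // 2 bytes: 55 means Java 11
            int cpCount = in.readUnsignedShort();  // 2 bytes: number of entries + 1
            return new int[] { magic, minor, major, cpCount };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] h = read("java/lang/Object.class");
        System.out.printf("magic=%08x minor=%d major=%d constants=%d%n", h[0], h[1], h[2], h[3]);
    }
}
```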

Slide 11

Slide 11 text

Attributes
• Class files can be adorned with many attributes
• Extensible, stringly-typed array of bytes
• Code → contains bytecode for a method's execution
• Exceptions → list of exceptions that can be thrown from a method
• Runtime(In)visibleAnnotations → set of key/value pairs
• NestHost/NestMembers → new support for nest mates in Java 11 (JEP 181)
• Attributes are optional; e.g. native/abstract methods have no "Code" attribute

Slide 12

Slide 12 text

Special methods
• Classes can have 'special' or synthetic methods
• <clinit> – method that runs when the class is initialized, on first access
• <init> – special name for constructors
• Accessor methods are generated for inner classes
• These have ACC_SYNTHETIC set, so they don't show up in tools
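
The timing of the class initializer can be observed by loading a class without initializing it and then using it. This is a small sketch with hypothetical names (ClinitDemo, Lazy, events); it relies on the three-argument Class.forName, whose second argument controls initialization.

```java
import java.util.ArrayList;
import java.util.List;

class ClinitDemo {
    static final List<String> events = new ArrayList<>();

    static class Lazy {
        static { events.add("Lazy.<clinit>"); }  // static initializers are compiled into <clinit>
        Lazy() { events.add("Lazy.<init>"); }    // constructors are compiled into <init>
    }

    /** Loads Lazy without initializing it, then creates two instances, and returns the event log. */
    static List<String> run() {
        events.clear();
        try {
            // initialize=false: the class is loaded but its initializer does not run yet.
            Class.forName(Lazy.class.getName(), false, ClinitDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        events.add("loaded");
        new Lazy(); // first use triggers the class initializer, then the constructor
        new Lazy(); // the class initializer does not run a second time
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call the class initializer entry appears after "loaded" and only once, while the constructor entry appears once per instance.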

Slide 13

Slide 13 text

Displaying bytecode
• Java has a built-in disassembler for class files
    javap -c [-p[rivate]] [-v[erbose]] [-cp classpath] com.Example[.class]
    -c                  Disassemble bytecode
    -p[rivate]          Show all members, including private (cf. protected, package, public)
    -v[erbose]          Display constant pool and attributes
    -cp classpath       Classpath (or --module-path) of the class name
    com.Example[.class] Class name (optional .class extension)
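
The same disassembler can also be driven from Java code through java.util.spi.ToolProvider (Java 9 and above). This is a sketch under the assumption that the runtime is a full JDK; a trimmed runtime image may not include javap, in which case the lookup returns nothing. The class name JavapRunner is illustrative.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.Optional;
import java.util.spi.ToolProvider;

class JavapRunner {
    /** Runs javap with the given arguments; returns empty if this runtime has no javap tool. */
    static Optional<String> javap(String... args) {
        Optional<ToolProvider> tool = ToolProvider.findFirst("javap");
        if (!tool.isPresent()) {
            return Optional.empty();
        }
        StringWriter out = new StringWriter();
        PrintWriter writer = new PrintWriter(out);
        tool.get().run(writer, writer, args); // standard output and errors share one writer
        writer.flush();
        return Optional.of(out.toString());
    }

    public static void main(String[] args) {
        System.out.println(javap("-c", "java.lang.Object")
                .orElse("javap is not available in this runtime"));
    }
}
```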

Slide 14

Slide 14 text

JavaP – displaying bytecode
    $ javap -v -c java.lang.Object
    public class java.lang.Object
      minor version: 0
      major version: 55
      flags: (0x0021) ACC_PUBLIC, ACC_SUPER
      this_class: #17                         // java/lang/Object
      super_class: #0
      interfaces: 0, fields: 0, methods: 14, attributes: 1
    Constant pool:
       #1 = Class       #63       // java/lang/StringBuilder
       #2 = Methodref   #1.#64    // java/lang/StringBuilder."<init>":()V
       #3 = Methodref   #17.#65   // java/lang/Object.getClass:()Ljava/lang/Class;
       #4 = Methodref   #66.#67   // java/lang/Class.getName:()Ljava/lang/String;
       ...
       #6 = String      #69       // @
       ...
      #17 = Class       #80       // java/lang/Object
       ...
      #34 = Utf8        equals
      #35 = Utf8        (Ljava/lang/Object;)Z
       ...
      #80 = Utf8        java/lang/Object
• super_class: #0 – all other classes will have a super_class which is not 0
• major version: 55 – compiled against Java 11
• #6 – constant used in the default 'toString' method
• #34, #35 – used to define the equals(Object) method

Slide 15

Slide 15 text

JavaP – displaying bytecode
    $ javap -v -c java.lang.Object
    public class java.lang.Object
      public boolean equals(java.lang.Object);
        descriptor: (Ljava/lang/Object;)Z
        flags: (0x0001) ACC_PUBLIC
        Code:
          stack=2, locals=2, args_size=2
             0: aload_0
             1: aload_1
             2: if_acmpne     9
             5: iconst_1
             6: goto          10
             9: iconst_0
            10: ireturn
          LineNumberTable:
            line 158: 0
          LocalVariableTable:
            Start  Length  Slot  Name   Signature
                0      11     0  this   Ljava/lang/Object;
                0      11     1   obj   Ljava/lang/Object;
    SourceFile: Object.java
• descriptor – used to define the equals(Object) method
• Code – Code attribute
• LineNumberTable – (nested) attribute; line number 158 of Object.java
• LocalVariableTable – (nested) attribute
• Slot 0 usually contains 'this'
• Slot 1 contains the first argument obj
• SourceFile – Source attribute

Slide 16

Slide 16 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Locals: 0 = this, 1 = other
Stack: (empty)

Slide 17

Slide 17 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Locals: 0 = this, 1 = other
Stack: this

Slide 18

Slide 18 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Locals: 0 = this, 1 = other
Stack: this, other

Slide 19

Slide 19 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Locals: 0 = this, 1 = other
Stack: (empty)

Slide 20

Slide 20 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Locals: 0 = this, 1 = other
Stack: 0

Slide 21

Slide 21 text

Bytecode for equals()
    Code: stack=2, locals=2, args_size=2
     0: aload_0
     1: aload_1
     2: if_acmpne 9
     5: iconst_1
     6: goto 10
     9: iconst_0
    10: ireturn
Result: 0
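
The walkthrough above can be replayed with a toy interpreter that handles only these seven instructions. This is an illustrative sketch, not how a JVM (or JVMulator) is implemented: the class name EqualsBytecode is hypothetical, and the program counter values are simply the bytecode offsets from the listing.

```java
class EqualsBytecode {
    /** Interprets the bytecode of Object.equals for the given 'this' and argument. */
    static int run(Object self, Object other) {
        Object[] locals = { self, other }; // locals=2: slot 0 is 'this', slot 1 is the argument
        Object[] stack = new Object[2];    // stack=2: maximum operand stack depth
        int sp = 0;                        // index of the next free stack slot
        int pc = 0;                        // offset of the instruction to execute
        while (true) {
            switch (pc) {
                case 0: stack[sp++] = locals[0]; pc = 1; break;   // aload_0
                case 1: stack[sp++] = locals[1]; pc = 2; break;   // aload_1
                case 2: {                                         // if_acmpne 9
                    Object b = stack[--sp];
                    Object a = stack[--sp];
                    pc = (a != b) ? 9 : 5;                        // reference comparison
                    break;
                }
                case 5: stack[sp++] = 1; pc = 6; break;           // iconst_1
                case 6: pc = 10; break;                           // goto 10
                case 9: stack[sp++] = 0; pc = 10; break;          // iconst_0
                case 10: return (Integer) stack[--sp];            // ireturn
                default: throw new IllegalStateException("unexpected offset " + pc);
            }
        }
    }

    public static void main(String[] args) {
        Object x = new Object();
        System.out.println(run(x, x));            // same reference
        System.out.println(run(x, new Object())); // different references
    }
}
```

The slides trace the case where the two references differ, so the branch at offset 2 is taken and 0 is returned.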

Slide 22

Slide 22 text

Bytecode

Slide 23

Slide 23 text

Bytecode
• Most bytecodes are encoded as a single byte (hence the name)
• Some bytecodes take additional operands, but most operate on the stack
• Bytecodes can:
  • Consume values from the stack
  • Push a value onto the stack
  • Transfer from the stack to a local variable (and vice versa)
  • Load constants from the class' constant pool

Slide 24

Slide 24 text

Reference and object bytecodes
• new – push a new instance of the class from the constant pool
• newarray – push a new array with a primitive type (Z B C S I J F D)
• anewarray – push a new array of reference types
• multianewarray – push a multi-dimensional array
• arraylength – push the length of the array
• checkcast – throw if top of stack is not of the specified type
• instanceof – push 1 (true) if top of stack is of the specified type, otherwise 0
• Array of booleans is [Z
• Array of array of char is [[C

Slide 25

Slide 25 text

Calling methods
• invokestatic – call a static method (constant contains class)
• invokevirtual – call instance methods of ToS (with inheritance)
• invokespecial – call constructors and super methods of ToS
• invokeinterface – call an interface method on ToS
• invokedynamic – invokes a dynamic method (since Java 7)
  → Used for implementing lambda expressions
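
As a rough guide to where each instruction comes from, the sketch below annotates ordinary Java calls with the invoke instruction javac typically emits for them; running javap -c on the compiled class shows the actual output. The class and method names (InvokeKinds, Greeter, Base, Derived) are hypothetical.

```java
import java.util.function.Supplier;

class InvokeKinds {
    interface Greeter { String greet(); }

    static class Base implements Greeter {
        public String greet() { return "base"; }
    }

    static class Derived extends Base {
        @Override public String greet() {
            return "derived+" + super.greet();      // invokespecial Base.greet
        }
    }

    static String describe() {
        Base viaClass = new Derived();              // invokespecial Derived.<init>
        Greeter viaInterface = viaClass;
        Supplier<String> lambda = () -> "lambda";   // invokedynamic creates the Supplier
        return String.join(",",                     // invokestatic String.join
                viaClass.greet(),                   // invokevirtual Base.greet
                viaInterface.greet(),               // invokeinterface Greeter.greet
                lambda.get());                      // invokeinterface Supplier.get
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Both greet() calls end up in Derived.greet because virtual and interface dispatch select the method from the runtime class of the receiver.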

Slide 26

Slide 26 text

Mathematics
• {i,l,f,d}neg – negates the top of stack (consumes and pushes a single element)
• {i,l,f,d}add/sub – adds/subtracts two numbers
• {i,l,f,d}mul/div – multiplies/divides two numbers
• {i,l,f,d}rem – remainder after division (modulus)
• {i,l}and/or/xor – performs bitwise and/or/xor on two numbers
• {i,l}shl/shr/ushr – shift left, arithmetic (signed) shift right, or unsigned (logical) shift right
• The two-operand instructions consume the top two stack items and push the result onto the stack
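
A few edge cases make the differences between these instructions concrete. This is a minimal sketch; the class name ArithmeticOps is illustrative, and the opcode named in each comment is what javac typically emits for that operator on int operands.

```java
class ArithmeticOps {
    static int rem(int a, int b)  { return a % b; }    // irem: result takes the sign of the dividend
    static int shr(int a, int n)  { return a >> n; }   // ishr: copies of the sign bit are shifted in
    static int ushr(int a, int n) { return a >>> n; }  // iushr: zeros are shifted in
    static int neg(int a)         { return -a; }       // ineg: wraps around for Integer.MIN_VALUE

    public static void main(String[] args) {
        System.out.println(rem(-7, 3));
        System.out.println(shr(-8, 1));
        System.out.println(ushr(-8, 1));
        System.out.println(neg(Integer.MIN_VALUE));
    }
}
```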

Slide 27

Slide 27 text

Constants
• {i,l,f,d}const_{0,1} – push 0 or 1 onto the stack as integer/long/float/double
• iconst_{2,3,4,5,m1} – push 2, 3, 4, 5 or -1 onto the stack as an integer
• {b,s}ipush – push the next byte/short onto the stack
• ldc{,_w,2_w} – push a constant from the pool onto the stack
• aconst_null – push 'null' onto the stack

Slide 28

Slide 28 text

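Conversions
[Diagram: conversion opcodes between primitive types, with widths in bits: byte (8), short (16), char (16), int (32), float (32), long (64), double (64); boolean is shown without any conversions]
• From int: i2b, i2s, i2c, i2l, i2f, i2d
• From long: l2i, l2f, l2d
• From float: f2i, f2l, f2d
• From double: d2i, d2l, d2f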
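
Each Java cast between primitive types maps onto one of these conversion opcodes, and the narrowing ones can lose information. This is a small sketch with an illustrative class name (Conversions); the method names simply mirror the opcode javac typically emits for the cast.

```java
class Conversions {
    static byte  i2b(int v)    { return (byte) v; }   // keeps the low 8 bits, sign-extended
    static short i2s(int v)    { return (short) v; }  // keeps the low 16 bits, sign-extended
    static char  i2c(int v)    { return (char) v; }   // keeps the low 16 bits, zero-extended
    static int   l2i(long v)   { return (int) v; }    // keeps the low 32 bits
    static int   d2i(double v) { return (int) v; }    // NaN becomes 0; out-of-range values saturate
    static float i2f(int v)    { return (float) v; }  // may round: float has a 24-bit significand

    public static void main(String[] args) {
        System.out.println(i2b(200));
        System.out.println(i2s(70000));
        System.out.println((int) i2c(-1));
        System.out.println(l2i(0x1_0000_0001L));
        System.out.println(d2i(Double.NaN));
        System.out.println(d2i(1e20));
        System.out.println(i2f(16777217));
    }
}
```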

Slide 29

Slide 29 text

Loading and storing
• {b,s,c,i,f,l,d,a}aload/astore – load/store an array element at index
• {i,l,f,d,a}load/store{,_0,_1,_2,_3} – load/store a local variable at index
• iinc – increment local variable by constant byte
• getfield/putfield <field> – get/put a field in an instance on ToS
• getstatic/putstatic <field> – get/put a static field in a class

Slide 30

Slide 30 text

Comparisons
• {f,d}cmpg – compares two floats/doubles, pushes 1 on NaN
• {f,d}cmpl – compares two floats/doubles, pushes -1 on NaN
• lcmp – compares two longs, pushes 1, 0 or -1
• if{eq,ne,gt,ge,lt,le} <±jump> – branch if =, ≠, >, ≥, <, ≤ 0
• if_icmp{eq,ne,gt,ge,lt,le} <±jump> – branch if =, ≠, >, ≥, <, ≤ other number
• if_acmp{eq,ne} <±jump> – branch if references are equal or not equal
• if{,non}null <±jump> – branch if (non) null
• The IEEE 754 floating point spec uses 'Not a Number' (NaN) to represent the result of operations such as 0.0/0.0 or sqrt(-1)
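
The reason there are two float comparison opcodes is that every ordered comparison involving NaN has to come out false. This is a brief sketch; the class name NanComparisons is illustrative, and the opcode choice in the comments describes what javac typically emits rather than a requirement.

```java
class NanComparisons {
    static float nan() { return 0.0f / 0.0f; }                  // 0/0 is NaN
    static float infinity() { return 1.0f / 0.0f; }             // 1/0 is Infinity, not NaN
    static boolean less(float a, float b)    { return a < b; }  // typically fcmpg, then ifge
    static boolean greater(float a, float b) { return a > b; }  // typically fcmpl, then ifle

    public static void main(String[] args) {
        float nan = nan();
        System.out.println(less(nan, 1f));     // false
        System.out.println(greater(nan, 1f));  // false
        System.out.println(nan == nan);        // false: NaN is not equal to itself
        System.out.println(infinity());
    }
}
```

With fcmpg pushing 1 for NaN, the following "branch if ≥ 0" skips the true case of a < b; fcmpl pushing -1 does the same for a > b.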

Slide 31

Slide 31 text

Control flow
• {lookup,table}switch – continue execution from table (switch)
• {,i,l,f,d,a}return – return void/int/long/float/double/reference
• goto{,_w} <±jump> – jump to another bytecode (do not push address)
• athrow – throw the (Throwable) reference on top of the stack
• jsr{,_w} <±jump> – jump to another part of the method (push address)
• ret – return (from a jsr) to an address specified in a local variable

Slide 32

Slide 32 text

Stack manipulation
• swap – swap the top two single-slot values (e.g. int/float) on the stack
• pop{,2} – pop (drop) one or two slots from the stack
• dup – duplicate the top single-slot value (e.g. int/float) on the stack
• dup_x{1,2} – duplicate the top single-slot value on the stack, put it 1 or 2 below
• dup2 – duplicate the top two slots (e.g. a long/double) on the stack
• dup2_x{1,2} – duplicate the top two slots (e.g. a long/double) on the stack, put them 1 or 2 below

Slide 33

Slide 33 text

Miscellaneous
• nop – no operation
• monitor{enter,exit} – synchronized blocks
• breakpoint – breakpoint for debuggers
• impdep{1,2} – reserved for implementation dependent operations
• wide – treat the next bytecode as having a wider argument
  • iinc → wide iinc
  • *load/*store/ret → wide *load/*store/ret

Slide 34

Slide 34 text

Demo

Slide 35

Slide 35 text

JVMulator https://github.com/alblue/jvmulator

Slide 36

Slide 36 text

JVMulator https://github.com/alblue/jvmulator

Slide 37

Slide 37 text

Summary
• Java class files define a class, along with its methods and fields
• A ClassLoader instance loads bytes from somewhere (disk, URL, …) and defines them as a Class
• Method implementations are bytecodes stored in Code attributes
• Bytecode operates on a stack, with a number of 'local' variables
• The stack and locals operate on int/long/float/double/reference types
• Conversions between data types are handled with opcodes
• Some opcodes take operands but the majority do not

Slide 38

Slide 38 text

Thank you
https://alblue.bandlem.com
https://twitter.com/alblue
https://github.com/alblue
https://vimeo.com/alblue
https://speakerdeck.com/alblue