Lukas Stadler on Co-routines for Hotspot

Transcript

  1. Introduction

    * Lukas Stadler
      - Institute for System Software, Johannes Kepler University Linz, Austria
      - Research on Compilers and VMs, Continuations, Coroutines, …
      - In close collaboration with Oracle Labs
  2. Agenda

    * The Institute for System Software
      - Areas of Research
      - Current Projects
    * Coroutines for HotSpot
      - Introduction
      - Features
      - API
      - Results
  3. SSW - Institute for System Software

    * Part of the TNF at the Johannes Kepler University Linz
    * Head of the Institute: Prof. Dr. Dr. h.c. Hanspeter Mössenböck
    * Main research topics:
      - Languages, Compilers, Virtual Machines
      - Automated Software Engineering
      - Optimization Systems - OptLet framework
    * 30 employees, only 1/3 paid by government
    * 700,000 € third-party funds
    * More information: http://ssw.jku.at/, http://ssw.jku.at/Research/Projects/
  4. Automated Software Engineering

    * Christian Doppler Laboratory for Automated Software Engineering
      - Software Product Lines
        • Siemens VAI and Siemens AG - Corporate Technology, Germany
      - Capture-Replay-Analysis of Real-Time Systems
        • KEBA AG, Linz
      - Dynamic Plugin-Architectures
        • BMD Systemhaus GmbH, Steyr
    * New research topics planned for 2013, together with Dynatrace and others...
  5. Languages, Compilers, Virtual Machines

    * Mix of pure and applied research
    * In close collaboration with, and mostly funded by, Oracle (Sun Microsystems)
      - Since 2001
      - 2.5 PhD students
      - 2 - 3 part-time master students
      - Oracle Labs full-time employee in Linz (Thomas Würthinger)
    * “Generalization of Just-in-Time Trace Compilation for Java” funded by FWF
    * Coco/R, Visualization of Compiler Data Structures, ...
  6. Current topics in Java Research

    * Trace Compilation for Java
      - JavaScript technique applied to Java
    * Garbage Collection for High-End Mobile Devices (G1 lite)
      - Multi-tasking client Java environments for high-end mobile devices
    * Graal (for HotSpot and Maxine)
      - Just-in-Time compiler for Java and other languages, written in Java
      - Modular, extensible, high peak performance, dynamic
      - More optimization potential due to novel Intermediate Representation (IR)
      - Competitive JavaScript, Python, ... performance
  7. Impact

    * Tools (Coco/R, ...), Projects with partners (Catalysts, Engel, Fujitsu, ...), Startups (OptLet)
    * Projects developed in Linz for the HotSpot JVM:
      - Static Single Assignment form IR for HotSpot client compiler
      - Escape Analysis (Thomas Kotzmann)
      - Linear Scan Register Allocation (J2SE 1.6, Christian Wimmer)
      - Dynamic Code Evolution (Thomas Würthinger, ongoing, http://ssw.jku.at/dcevm/)
      - Coroutines? (maybe...)
  8. Short Introduction: Coroutines

    * Lightweight threads
    * Generalization of subroutines
      - Multiple entry/exit points
      - Remembers position and local variables
    * Available in different shapes:
      - Stackful / stackless
      - First-class coroutines
      - Symmetric / asymmetric
  9. Short Introduction: Coroutines

    * Somewhat confusing:
      - Many different names (generators, coexpressions, fibers, iterators, green threads, greenlets, tasklets, ...)
      - No consistent naming scheme (yield, transfer, detach, call, ...)
  10. Coroutines in a JVM – why?

    * Very light-weight alternative to threads
    * Natural control abstraction for numerous problems
      - Scanner/parser (Conway 1963)
      - Non-parallel problems
      - Do not expose parallelism where there is none!
    * Inversion of algorithms
      - Converting callback-based algorithms (e.g. XML parser) into iterative algorithms
      - Can be done manually, but coroutines do it for free!
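
To make the "can be done manually" point concrete, here is a minimal sketch of inverting a callback-based algorithm by hand, using only standard JDK classes. The tree type and all method names are hypothetical illustrations and do not come from the talk. The callback version is a plain recursion; the iterator version has to rebuild the recursion's call stack as an explicit `Deque`, which is the bookkeeping a coroutine would keep on its own stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Consumer;

// Hypothetical example types; none of these names come from the talk.
class InversionDemo {

    /** A minimal immutable binary tree node. */
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    /** Callback style: the traversal drives and calls the consumer back. */
    static void visitInOrder(Node node, Consumer<Integer> callback) {
        if (node == null) return;
        visitInOrder(node.left, callback);
        callback.accept(node.value);
        visitInOrder(node.right, callback);
    }

    /** Iterative style: the caller drives; the call stack becomes an explicit Deque. */
    static Iterator<Integer> inOrderIterator(Node root) {
        return new Iterator<Integer>() {
            private final Deque<Node> pending = new ArrayDeque<>();

            { pushLeftSpine(root); }

            // Pushes n and all of its left descendants, deepest node on top.
            private void pushLeftSpine(Node n) {
                for (; n != null; n = n.left) pending.push(n);
            }

            @Override
            public boolean hasNext() {
                return !pending.isEmpty();
            }

            @Override
            public Integer next() {
                if (pending.isEmpty()) throw new NoSuchElementException();
                Node n = pending.pop();
                pushLeftSpine(n.right);
                return n.value;
            }
        };
    }

    /** Builds a small balanced tree holding the values 1 to 7. */
    static Node sampleTree() {
        return new Node(4,
            new Node(2, new Node(1, null, null), new Node(3, null, null)),
            new Node(6, new Node(5, null, null), new Node(7, null, null)));
    }

    public static void main(String[] args) {
        List<Integer> viaCallback = new ArrayList<>();
        visitInOrder(sampleTree(), viaCallback::add);

        List<Integer> viaIterator = new ArrayList<>();
        inOrderIterator(sampleTree()).forEachRemaining(viaIterator::add);

        System.out.println(viaCallback); // [1, 2, 3, 4, 5, 6, 7]
        System.out.println(viaIterator); // [1, 2, 3, 4, 5, 6, 7]
    }
}
```

For a three-line recursion the manual rewrite is manageable; for something like an XML parser with many nested callbacks it quickly is not, which is the case the slide argues for.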
  11. Coroutines in a JVM – why?

    * Server applications: easier programming and more performance
      - Threads are too expensive, especially if you need lots of them
        • Standard solution: use thread pools to process work units
        • Requires slicing the program into small pieces
      - Hide the complexity of asynchronous I/O
      - Move running applications between Virtual Machines
  12. Coroutines in a JVM – why?

    * Dynamic language implementations need to emulate coroutines
      - Using threads
        • Synchronize multiple threads so that they look like coroutines
      - Compile-time transformations
        • Complex compilers
        • Methods split into smaller chunks – fewer optimization opportunities
        • Local variables on heap
      - Continuation-passing style
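
The "using threads" emulation can be sketched with standard JDK classes as follows. This is an illustrative stand-in, not the coroutine implementation from the talk: the class `Coro` and its `call`/`ret` methods are hypothetical names that only loosely mirror the asymmetric API shown later in the deck. Two semaphores pass a baton back and forth so that exactly one of the two threads runs at any time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.function.Consumer;

// Hypothetical thread-backed stand-in for an asymmetric coroutine.
class ThreadCoroutineDemo {

    static final class Coro<T> {
        private final Semaphore bodyMayRun = new Semaphore(0);   // caller -> body
        private final Semaphore callerMayRun = new Semaphore(0); // body -> caller
        private T slot;            // value handed from body to caller
        private boolean finished;  // set once the body has returned

        Coro(Consumer<Consumer<T>> body) {
            Thread worker = new Thread(() -> {
                bodyMayRun.acquireUninterruptibly(); // wait for the first call()
                try {
                    body.accept(this::ret);
                } finally {
                    finished = true;
                    callerMayRun.release();          // wake the caller one last time
                }
            });
            worker.setDaemon(true); // an abandoned coroutine must not keep the JVM alive
            worker.start();
        }

        /** Runs the body until it hands back a value or finishes (then returns null). */
        T call() {
            if (finished) throw new IllegalStateException("coroutine already finished");
            slot = null;
            bodyMayRun.release();
            callerMayRun.acquireUninterruptibly();
            return slot;
        }

        /** Called by the body: hands a value to the caller and parks the body. */
        private void ret(T value) {
            slot = value;
            callerMayRun.release();
            bodyMayRun.acquireUninterruptibly();
        }

        boolean isFinished() {
            return finished;
        }
    }

    /** Collects the first n square numbers produced by a thread-backed generator. */
    static List<Integer> squares(int n) {
        Coro<Integer> gen = new Coro<Integer>(emit -> {
            for (int i = 1; i <= n; i++) emit.accept(i * i);
        });
        List<Integer> out = new ArrayList<>();
        for (Integer v = gen.call(); !gen.isFinished(); v = gen.call()) {
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(squares(3)); // [1, 4, 9]
    }
}
```

Every switch in this sketch is a full thread handoff through the OS scheduler, and every coroutine pins a whole thread, which is the cost the deck's native coroutines are meant to avoid.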
  13. Coroutine Features

    * JVM coroutines:
      - No new language features / bytecode
      - Stackful, first-class, symmetric and asymmetric
      - Possibility to serialize coroutines (although unsafe)
    * Performance and feature trade-off
      - Fast switching
      - Many coroutines
      - Low implementation complexity
      - Fully featured
  14. JVM Coroutines

    * Our first project: Continuations
      - Hard to do in a safe manner
    * => Coroutines for the HotSpot™ JVM
      - Coroutines: part of the Multi-Language Virtual Machine (MLVM) effort
      - Source available in the MLVM repositories
      - Binaries available from http://ssw.jku.at/General/Staff/LS/coro/
    * Prototype!
    * JSR work is ongoing...
  15. Coroutine API

    * Coroutines always tied to Thread
    * Thread-like (subclass or Runnable, ...)
    * Symmetric: java.dyn.Coroutine
      - “Ring” of coroutines, fair scheduling
    * Asymmetric: java.dyn.AsymCoroutine
      - Caller / callee relationship
      - Input and output value: generic type parameters
  16. Symmetric Coroutines

    Java API:

        public class Coroutine {
            public Coroutine();
            public Coroutine(Runnable target);
            // some constructors omitted..

            public boolean isFinished();

            public static void yield();
            public static void yieldTo(Coroutine target);

            protected void run();
        }
  17. Symmetric Coroutines

    Example Usage:

        public class CoroutineTest extends Coroutine {
            public void run() {
                System.out.println("Coroutine running 1");
                yield();
                System.out.println("Coroutine running 2");
            }
        }

        public static void main(String[] args) {
            new CoroutineTest();
            System.out.println("start");
            yield();
            System.out.println("middle");
            yield();
            System.out.println("end");
        }

    Example Output:

        start
        Coroutine running 1
        middle
        Coroutine running 2
        end
  18. Asymmetric Coroutines

    Java API:

        public class AsymCoroutine<InT, OutT> implements Iterable<OutT> {
            public AsymCoroutine();
            // some constructors omitted..

            public boolean isFinished();

            public InT ret(OutT value);
            public InT ret();              // short for ret(null)

            public OutT call(InT input);
            public OutT call();            // short for call(null)

            protected OutT run(InT value);

            @Override
            public Iterator<OutT> iterator();
        }
  19. Asymmetric Coroutines

    Java API - with AsymRunnable:

        public class AsymCoroutine<InT, OutT> implements Iterable<OutT> {
            public AsymCoroutine(AsymRunnable<? super InT, ? extends OutT> target);
            // ...
        }

        public interface AsymRunnable<InT, OutT> {
            public OutT run(AsymCoroutine<? extends InT, ? super OutT> coro, InT value);
        }
  20. Web Application Example

    * Task:
      - Pose a series of questions
      - Display all answers
      - One coroutine per session

    Template:

        <html>
        <body>
          <h1>${message}</h1>
          <form>
            <input type="text" name="answer"/>
            <input type="submit" value="Send"/>
          </form>
        </body>
        </html>
  21. Web Application Example (code)

        public class WebApp extends AsymCoroutine<String, String> {

            // helper method used to acquire numerical user input
            private int readNumber(String message) {
                int tries = 0;
                while (true) {
                    try {
                        return Integer.valueOf(readString(message));
                    } catch (NumberFormatException e) {
                        if (tries++ == 0)
                            message += "<br/>Please enter a valid number!";
                    }
                }
            }

            // helper method used to acquire user input
            private String readString(String message) {
                Template template = Template.getInstance("Question");
                template.set("message", message);
                String response = ret(template.renderHTML());
                return template.parseResponse(response, "answer");
            }

            public String run(String value) {
                String name = readString("Please enter your name: ");
                String country = readString("Please enter your country:");
                int age = readNumber("Please enter your age:");
                if (age < 70) {
                    String email = readString("Please enter your email address:");
                    return "Name: "+name+", Country: "+country+", Age: "+age+", Email: "+email;
                } else {
                    String telephone = readString("Please enter your telephone number:");
                    return "Name: "+name+", Country: "+country+", Age: "+age+", Tel: "+telephone;
                }
            }
        }
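
For contrast, here is roughly what the same dialogue looks like without coroutines, written as an explicit per-session state machine that a request handler would call once per submitted form. This is a hedged sketch: the class name, the state names, and the `start`/`step` methods are all hypothetical, and the HTML templating from the slide is left out. The position in the dialogue becomes an enum and the local variables become fields.

```java
// Hypothetical hand-written equivalent of the WebApp dialogue, without coroutines.
class SessionStateMachine {

    private enum State { ASK_NAME, ASK_COUNTRY, ASK_AGE, ASK_EMAIL, ASK_TELEPHONE, DONE }

    // What were local variables in run() now live on the heap as fields.
    private State state = State.ASK_NAME;
    private String name;
    private String country;
    private int age;

    /** Returns the first question of the dialogue. */
    String start() {
        return "Please enter your name:";
    }

    /** Consumes one answer and returns the next question or the final summary. */
    String step(String answer) {
        switch (state) {
            case ASK_NAME:
                name = answer;
                state = State.ASK_COUNTRY;
                return "Please enter your country:";
            case ASK_COUNTRY:
                country = answer;
                state = State.ASK_AGE;
                return "Please enter your age:";
            case ASK_AGE:
                try {
                    age = Integer.parseInt(answer);
                } catch (NumberFormatException e) {
                    // Stay in ASK_AGE and ask again.
                    return "Please enter your age:<br/>Please enter a valid number!";
                }
                if (age < 70) {
                    state = State.ASK_EMAIL;
                    return "Please enter your email address:";
                }
                state = State.ASK_TELEPHONE;
                return "Please enter your telephone number:";
            case ASK_EMAIL:
                state = State.DONE;
                return summary("Email", answer);
            case ASK_TELEPHONE:
                state = State.DONE;
                return summary("Tel", answer);
            default:
                throw new IllegalStateException("dialogue already finished");
        }
    }

    boolean isDone() {
        return state == State.DONE;
    }

    private String summary(String label, String contact) {
        return "Name: " + name + ", Country: " + country + ", Age: " + age
                + ", " + label + ": " + contact;
    }

    public static void main(String[] args) {
        SessionStateMachine session = new SessionStateMachine();
        System.out.println(session.start());
        for (String answer : new String[] {"Ada", "UK", "36", "ada@example.org"}) {
            System.out.println(session.step(answer));
        }
    }
}
```

The control flow that reads top to bottom in `WebApp.run` is scattered over `case` labels here, and every new question or branch means another state.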
  24. Web Application Example

    * Very elegant programming style
    * One coroutine per session: perfect fit!
    * Many coroutines running on one thread
    * Coroutines consume fewer resources, are much faster than threads
    * But what if a coroutine blocks?
      - Due to IO: disk, DB, …
      - Other coroutines on thread are blocked!
    * => Other thread needs to take over
  25. Thread-to-Thread Migration

    * Resuming a coroutine on a different thread
    * Problems: locking and native code
      - Not allowed when migrating coroutines
    * Best-effort: no migration if coroutine is running or about to run
    * “Coroutine stealing” implementation

    API (symmetric and asymmetric):

        boolean result = coroutine.steal();
  26. Thread-to-Thread Migration

    * Again, perfect fit:
      - One coroutine per session
      - Easy to specify user interaction
      - Very fast switching
      - Blocking sessions solved by migration
    * What to do with inactive sessions?
      - Memory starts to fill up – sessions shouldn't be thrown away!
    * Solution: serialize session/coroutine to disk, DB, ...
  27. Serialization

    * Equivalent to full continuations (modulo performance characteristics)
    * Restrictions: similar to migration
    * Capture Java / bytecode stack frames:
      - Method, bci, local variables, expressions
    * Returns CoroutineFrame[] data structure

    API (symmetric and asymmetric):

        CoroutineFrame[] frames = coroutine.serialize();
  28. Deserialization

    * From CoroutineFrame[] data structure
    * Very unsafe operation...
    * Replace a new coroutine's content
      - Creates interpreter frames
      - Replaced with compiled frames by OSR

    API (symmetric and asymmetric):

        coroutine.deserialize(frames);
  29. Helper Classes

    * Serialization via CoroutineOutputStream

        // fos is not declared on the slide; writing to "data.bin" is an assumption
        FileOutputStream fos = new FileOutputStream("data.bin");
        CoroutineOutputStream<String, String> cos;
        cos = new CoroutineOutputStream<String, String>(fos);
        cos.writeAsymCoroutine(coroutine);
        cos.close();

    * Deserialization via CoroutineInputStream

        FileInputStream fis = new FileInputStream("data.bin");
        CoroutineInputStream<String, String> cis;
        cis = new CoroutineInputStream<String, String>(fis);
        coro = cis.readAsymCoroutine();
        cis.close();
  30. Serialization / Deserialization

    * Not shown here:
      - Complex applications will need custom logic
      - Extra logic for java.lang.reflect.Method objects
      - Business logic: how to serialize DB references, etc.
    * Lots of other use cases: portable agents, compilers, ...
  31. Implementation Details

    * Each coroutine needs a stack to run
    * Stacks are expensive!
      - Memory and address space (>64 KB)
      - OS resources
    * “Sharing” of stacks
      - Expensive context switch (copying)
    * Compromise: allow a certain number of stacks per thread (currently: 100)
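
A quick back-of-the-envelope calculation shows why the cap matters. The sketch below takes the two figures from the slide (at least 64 KB per stack, 100 stacks per thread) and applies them to a coroutine count of 100,000, which is a hypothetical number chosen only for illustration. It ignores the heap storage needed for coroutines whose frames have been copied out of a shared stack.

```java
// Rough stack-space arithmetic; the coroutine count is a made-up example value.
class StackBudget {
    static final long STACK_BYTES = 64L * 1024;  // lower bound per stack, from the slide
    static final int STACKS_PER_THREAD = 100;    // current cap, from the slide

    /** Stack space if every coroutine kept a dedicated stack. */
    static long dedicatedStackBytes(long coroutines) {
        return coroutines * STACK_BYTES;
    }

    /** Stack space on one thread when at most STACKS_PER_THREAD stacks exist. */
    static long cappedStackBytes(long coroutines) {
        return Math.min(coroutines, STACKS_PER_THREAD) * STACK_BYTES;
    }

    public static void main(String[] args) {
        long coroutines = 100_000; // hypothetical
        System.out.printf("dedicated stacks: %d MiB%n", dedicatedStackBytes(coroutines) >> 20); // 6250 MiB
        System.out.printf("capped stacks:    %d KiB%n", cappedStackBytes(coroutines) >> 10);    // 6400 KiB
    }
}
```

With dedicated stacks the address space alone runs to several gigabytes, whereas the capped variant stays in the low megabytes per thread and pays for it with copying on some switches.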
  32. Performance

    * Memory:
      - Stacks + storage for rescued coroutines
    * Run time: (Intel i5 750)
      - Create: 0.3 µs - 1.5 µs (Thread: 2.5 µs)
      - Start: 3 µs (Thread: 60 µs)
      - Switch: 20 ns (best case)
        • Difficult to get reliable thread switch time
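
The thread-side baseline can be reproduced roughly with a plain JDK. The probe below is a naive sketch, not the benchmark used in the talk: it has no proper warm-up or statistical treatment, the class and method names are hypothetical, and its second measurement covers start plus join, which is more than the "Start" figure on the slide. Expect numbers that differ by machine, OS, and JVM version.

```java
// Naive timing probe for thread creation and start-up; results vary widely.
class ThreadCostProbe {

    /** Average nanoseconds to construct (not start) a Thread object. */
    static double avgCreateNanos(int n) {
        Thread[] threads = new Thread[n]; // keep references so the objects stay reachable
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> { });
        }
        long t1 = System.nanoTime();
        return (t1 - t0) / (double) n;
    }

    /** Average nanoseconds to start an empty thread and wait for it to finish. */
    static double avgStartJoinNanos(int n) throws InterruptedException {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> { });
            long t0 = System.nanoTime();
            t.start();
            t.join();
            total += System.nanoTime() - t0;
        }
        return total / (double) n;
    }

    public static void main(String[] args) throws InterruptedException {
        // One throwaway round so class loading and early JIT work are not counted.
        avgCreateNanos(1_000);
        avgStartJoinNanos(1_000);

        System.out.printf("create:     %.2f us%n", avgCreateNanos(10_000) / 1_000.0);
        System.out.printf("start+join: %.2f us%n", avgStartJoinNanos(10_000) / 1_000.0);
    }
}
```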
  33. Performance

    * Time per switch, depending on coroutine count
      - Working set outgrows L1 and L2 caches
      - 500 ns even for large numbers of coroutines
    * 100 coroutine stacks per thread
  34. JRuby performance

    [Charts: 5 fiber chain, 5000 fiber chain]

    * Small contrived benchmark
    * Passes messages through a chain of fibers (Ruby coroutines)
    * Improvement from 3 x slower to 2 x faster!
  35. JRuby and JVM coroutines

    * Available today in JRuby git repository
    * When coroutines are available: “-Xfiber.coroutines=true” to enable coroutine support
    * “--1.9” required for fiber support
  36. Conclusions & Current Status

    * Implementation in the MLVM repository
      - Patch to the OpenJDK source code
    * Binary available from http://ssw.jku.at/General/Staff/LS/coro/
    * Efforts to create Coroutine JSR are underway
  37. Q&A!

    * More Information: Contact me at [email protected]
      - Lukas Stadler: Serializable Coroutines for the HotSpot™ Java Virtual Machine. Master's thesis, Johannes Kepler University Linz, February 2011. http://ssw.jku.at/Research/Papers/Stadler11Master/
      - Lukas Stadler, Thomas Würthinger, Christian Wimmer: Efficient Coroutines for the Java Platform. 8th International Conference on Principles and Practice of Programming in Java, September 2010