Java Memory Consistency Model - context and concepts

Slide 1

Slide 1 text

Java Memory Consistency Model @LAFK_pl Consultant @ Tomasz Borek

Slide 2

Slide 2 text

GeeCON 2015 Kraków! http://2015.geecon.org/register/

Slide 3

Slide 3 text

I'll take ALL feedback I can get: @LAFK_pl, #JMCM

Slide 4

Slide 4 text

Symentis

Slide 5

Slide 5 text

Symentis: Jarosław Pałka, Kuba Marchwicki

Slide 6

Slide 6 text

Today...
● A little bragging
● Fallacy correction
● Memory model
● The Java one, mostly
● Short advice on what to do about it
● Lots of links for later reading
● Yeah, 45 minutes only :P

Slide 7

Slide 7 text

Who knows (heard of)?
● Gene Amdahl?
● Gordon Moore?
● Leslie Lamport?
● Bill Pugh?
● Sarita Adve?
● Hans Boehm?
● Martin Thompson?
● Aleksey Shipilev?

Slide 8

Slide 8 text

Hands up, who...
● Doesn't program?
● Knows Moore's law?
● Knows Amdahl's law?
● Can explain concurrency vs parallelism?
● Codes with mechanical sympathy?
● Tries to?
● Knows what mechanical sympathy is?

Slide 9

Slide 9 text

Fallacy #0: Java Memory Model is about GC, memory management, Eden, Survivors etc.

Slide 10

Slide 10 text

Not true at all! The claim "Java Memory Model is about GC, memory management, Eden, Survivors, etc." is wrong.

Slide 11

Slide 11 text


Slide 12

Slide 12 text

MANAGEMENT!!

Slide 13

Slide 13 text

Written in 2012...

Slide 14

Slide 14 text

… and still there

Slide 15

Slide 15 text

Correction!

Slide 16

Slide 16 text

JLS, section 17.4: "A memory model describes, given a program and an execution trace of that program, whether the execution trace is a legal execution of the program. The Java programming language memory model works by examining each read in an execution trace and checking that the write observed by that read is valid according to certain rules."

Slide 17

Slide 17 text

Slide 18

Slide 18 text

Sarita Adve: memory consistency model

Slide 19

Slide 19 text

Aleksey Shipilev: Memory model answers one simple question: What values can a particular read in a program return?
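One way to make that question concrete is a two-thread litmus test. The sketch below is an illustration and is not taken from the talk; the class name `StoreBufferLitmus` and its fields are made up for this example. Each thread writes one variable and then reads the other. Because `x` and `y` are declared `volatile`, the Java memory model forbids both reads from returning 0. If the fields were plain, that outcome would also be a legal answer to "what can this read return?".

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical litmus test: which (r1, r2) pairs can the two reads return?
class StoreBufferLitmus {
    // Volatile accesses are totally ordered, so "0,0" is a forbidden outcome.
    // With plain (non-volatile) fields, "0,0" would be allowed as well.
    static volatile int x, y;
    static int r1, r2; // written by the workers, read only after join()

    /** Runs the litmus test `runs` times and collects the observed outcomes. */
    public static Set<String> observedOutcomes(int runs) {
        Set<String> outcomes = new TreeSet<>();
        for (int i = 0; i < runs; i++) {
            x = 0;
            y = 0;
            Thread a = new Thread(() -> { x = 1; r1 = y; });
            Thread b = new Thread(() -> { y = 1; r2 = x; });
            a.start();
            b.start();
            try {
                a.join();
                b.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            outcomes.add(r1 + "," + r2);
        }
        return outcomes;
    }

    public static void main(String[] args) {
        // Which of "0,1", "1,0", "1,1" actually show up depends on timing.
        System.out.println(observedOutcomes(2_000));
    }
}
```

Which of the three permitted outcomes appear in a given run depends on scheduling; the memory model only says which ones are legal.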

Slide 20

Slide 20 text

Bill Pugh, Jeremy Manson
● Most cores have many cache layers
● What if 2 cores look at the same value?
● Memory model defines when and who sees what
● There are strong and weak models
● Strong models guarantee that the whole system sees the same values
● Weak models guarantee it only at certain points, via barriers / fences

Slide 21

Slide 21 text

Bill Pugh, Jeremy Manson: What is a memory model, anyway? "At the processor level, a memory model defines necessary and sufficient conditions for knowing that writes to memory by other processors are visible to the current processor, and writes by the current processor are visible to other processors."
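At the Java level, the same visibility question can be sketched with a `volatile` flag. This is a minimal illustration under assumptions, not code from the talk; `VisibilityDemo`, `data`, and `ready` are hypothetical names. The writer stores a plain field and then sets the volatile flag; a reader that sees the flag set is guaranteed to see the earlier plain write too.

```java
// Hypothetical example: publishing a plain write through a volatile flag.
class VisibilityDemo {
    static int data;               // plain field, no guarantees on its own
    static volatile boolean ready; // volatile write happens-before a read that sees it

    /** Publishes `value` from a writer thread and returns what the reader saw. */
    public static int publishAndRead(int value) {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin; every iteration performs a volatile read
            }
            seen[0] = data; // guaranteed to observe `value`
        });
        Thread writer = new Thread(() -> {
            data = value;   // 1. plain write
            ready = true;   // 2. volatile write publishes step 1
        });
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead(42)); // prints 42
    }
}
```

Without `volatile` on `ready`, the reader would have no guarantee of ever seeing the flag change, or of seeing `data` updated when it does.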

Slide 22

Slide 22 text

So?
● Memory CONSISTENCY
● Allowed optimisations
● Possible executions of a (possibly multithreaded!) program
● Which cores / threads see which values
● How to make it consistent for programmers
● What you're allowed to assume

Slide 23

Slide 23 text

Fallacy #1: I don't need to worry about JMCM since REALLY smart engineers crafted it.

Slide 24

Slide 24 text

Half-true: I don't need to worry about JMCM since REALLY smart engineers crafted it.

Slide 25

Slide 25 text

Smart, sure! But still:
● Smart people are still people
● JMCM is damn hard! Yeah, they botched it.
● Java != JVM
● JMCM is for JVM... but with Java in mind
● NO tech is a talisman of functionality!

Slide 26

Slide 26 text

JSR-133?
● Messed up final
● Spec not for humans
● Messed up double-checked locking
● Messed up volatile
● Each implementation on its own
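For the double-checked locking item, here is a hedged sketch of the idiom as it is commonly written for the revised (JSR-133, Java 5 and later) model. It is not code from the talk, and `LazyConfig` is a hypothetical class. The important detail is the `volatile` field: without it, another thread could see a non-null reference to an object whose non-final fields are not yet visible.

```java
// Hypothetical lazily initialized singleton using double-checked locking.
class LazyConfig {
    // volatile is what makes the idiom correct under the JSR-133 model
    private static volatile LazyConfig instance;

    private final int value; // final fields get extra initialization-safety guarantees

    private LazyConfig() {
        this.value = 42;
    }

    public static LazyConfig getInstance() {
        LazyConfig local = instance;          // first check, no lock
        if (local == null) {
            synchronized (LazyConfig.class) {
                local = instance;             // second check, under the lock
                if (local == null) {
                    local = new LazyConfig();
                    instance = local;         // volatile write publishes the object
                }
            }
        }
        return local;
    }

    public int value() {
        return value;
    }
}
```

A static holder class or an enum singleton reaches the same goal with less room for error, which fits the later advice about keeping a small repertoire of known-good constructions.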

Slide 27

Slide 27 text

More? JEP 188 and JMM9?

Slide 28

Slide 28 text

Fallacy #2: JMCM is irrelevant for me.

Slide 29

Slide 29 text

Depends!
● What you write and with what
● Java? Not?
● Need performance?
● What platforms?
● Multicore era...

Slide 30

Slide 30 text

Fallacy #3: JMCM is for fanatics. I have frameworks for that.

Slide 31

Slide 31 text

Moore's “law”?

Slide 32

Slide 32 text

Moore's “law”: "I see Moore’s law dying here in the next decade or so." (Gordon Moore, 2015)

Slide 33

Slide 33 text

Amdahl's law?

Slide 34

Slide 34 text

Amdahl's law: The speedup of a program using multiple processors in parallel computing is limited by the sequential fraction of the program. For example, if 95% of the program can be parallelized, the theoretical maximum speedup using parallel computing would be 20× as shown in the diagram, no matter how many processors are used.
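The 20× figure follows from the usual statement of Amdahl's law: with parallel fraction p and n processors, speedup is 1 / ((1 - p) + p / n), which approaches 1 / (1 - p) as n grows. A small sketch of that arithmetic follows; the class and method names are hypothetical and not part of the talk.

```java
// Hypothetical helper that evaluates Amdahl's law.
class Amdahl {
    /** Speedup on n processors when fraction p of the work can run in parallel. */
    public static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    /** Upper bound on speedup as the processor count grows without limit. */
    public static double limit(double p) {
        return 1.0 / (1.0 - p);
    }

    public static void main(String[] args) {
        double p = 0.95;
        for (int n : new int[] {1, 8, 64, 4096}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(p, n));
        }
        // For p = 0.95 the bound is approximately 20, matching the slide.
        System.out.printf("limit=%.2f%n", limit(p));
    }
}
```

Even at thousands of processors the result stays below the bound, because the 5% sequential part never gets faster.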

Slide 35

Slide 35 text

In 1967, Gene Amdahl states: "For over a decade prophets have voiced the contention that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnection of a multiplicity of computers in such a manner as to permit cooperative solution."

Slide 36

Slide 36 text

Slide 37

Slide 37 text

Cores... MOAR!!

Slide 38

Slide 38 text

Data distance ● http://i.imgur.com/k0t1e.png

Slide 39

Slide 39 text

So, data distance varies. Know thy cache lines!
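A common way to feel the effect of cache lines is to sum the same matrix in two traversal orders. The sketch below is an illustration under assumptions: `TraversalOrder` and its methods are hypothetical, and the timing in `main` is a crude measurement, not a proper benchmark (a harness such as JMH would be the tool for real numbers). Both methods compute the same sum; the row-by-row version walks memory sequentially, while the column-by-column version jumps between rows on every access.

```java
// Hypothetical demo: same work, different memory access pattern.
class TraversalOrder {
    /** Builds an n x n matrix where cell (r, c) holds r + c. */
    public static int[][] fill(int n) {
        int[][] m = new int[n][n];
        for (int r = 0; r < n; r++) {
            for (int c = 0; c < n; c++) {
                m[r][c] = r + c;
            }
        }
        return m;
    }

    /** Walks each row left to right: neighbouring elements share cache lines. */
    public static long sumRowMajor(int[][] m) {
        long sum = 0;
        for (int r = 0; r < m.length; r++) {
            for (int c = 0; c < m[r].length; c++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    /** Walks each column top to bottom: every access lands in a different row array. */
    public static long sumColumnMajor(int[][] m) {
        long sum = 0;
        int cols = m.length == 0 ? 0 : m[0].length;
        for (int c = 0; c < cols; c++) {
            for (int r = 0; r < m.length; r++) {
                sum += m[r][c];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] m = fill(2048);
        long t0 = System.nanoTime();
        long a = sumRowMajor(m);
        long t1 = System.nanoTime();
        long b = sumColumnMajor(m);
        long t2 = System.nanoTime();
        System.out.println("same sum: " + (a == b));
        System.out.println("row-major ns:    " + (t1 - t0));
        System.out.println("column-major ns: " + (t2 - t1));
    }
}
```

The actual difference depends on the hardware, the matrix size, and JIT warm-up, so treat any single run as a hint only.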

Slide 40

Slide 40 text

So if it matters, what then?

Slide 41

Slide 41 text

Where it matters
● Javac / Jython / ...
● JIT
● Hardware, duh!
● Each time: another team

Slide 42

Slide 42 text

Hardware
● Various ISA CPUs
● Number of registers
● Cache sizes and types, bus implementations
● Cache protocols (MESI, AMD's MOESI, Intel's...)
● How many functional units per CPU
● How many CPUs
● Pipeline:
  ● Instruction decode > address decode > memory fetch > register fetch > compute ...

Slide 43

Slide 43 text

Program / code optimizations?
● Reorder (e.g. prescient store)
● Remove what's unnecessary (e.g. synchronization)
● Replace instructions / shorten machine code
● Function optimizations (e.g. inlining)
● ...

Slide 44

Slide 44 text

Example CPU

Slide 45

Slide 45 text

Barriers / fences
"once memory has been pushed to the cache then a protocol of messages will occur to ensure all caches are coherent for any shared data. The techniques for making memory visible from a processor core are known as memory barriers or fences." (Martin Thompson, Mechanical Sympathy)
Differs per architecture / CPU / cache type!

Slide 46

Slide 46 text

Barriers / Fences
● CPU instruction
● Means “flush BUFFER now!”
● CMPXCHG (may be lacking!)
● Forces update
● Starts cache coherency protocols
● Read / Write / Full
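From Java, compare-and-swap is usually reached through the atomic classes and not written by hand. The sketch below is an illustration, not code from the talk; `CasCounter` is a hypothetical class. It uses `AtomicInteger.compareAndSet` in a retry loop. How that call is compiled (for example to a CMPXCHG-style instruction) is up to the JVM and the platform. Java 9 and later also expose explicit fences as static methods on `java.lang.invoke.VarHandle` (`fullFence`, `acquireFence`, `releaseFence`, `loadLoadFence`, `storeStoreFence`), which postdates this 2015 talk.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical lock-free counter built on a compare-and-set retry loop.
class CasCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Increments atomically; retries when another thread won the race. */
    public int increment() {
        while (true) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // CAS failed: someone else changed the value, so read it again
        }
    }

    public int get() {
        return value.get();
    }

    /** Starts `threads` workers that each increment `perThread` times. */
    public static int countWithThreads(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000)); // prints 40000
    }
}
```

In real code `AtomicInteger.incrementAndGet()` does the same job; the explicit loop is shown only to make the compare-and-swap step visible.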

Slide 47

Slide 47 text

Words of summary and gratitude

Slide 48

Slide 48 text

Doug Lea says: "The best way is to build up a small repertoire of constructions that you know the answers for and then never think about the JMM rules again unless you are forced to do so! Literally nobody likes figuring things out of JMM rules as stated, or can even routinely do so correctly. This is one of the many reasons we need to overhaul JMM someday."

Slide 49

Slide 49 text

Doug Lea's advice: "The best way is to build up a small repertoire of constructions that you know the answers for and then never think about the JMM rules again unless you are forced to do so! Literally nobody likes figuring things out of JMM rules as stated, or can even routinely do so correctly. This is one of the many reasons we need to overhaul JMM someday."

Slide 50

Slide 50 text

Mechanical sympathy:
● Cache line misses hurt
● Going to main memory hurts
● Cycles are important
● L1, L2 caches are cheap but require cache coherency protocols and memory barriers
● Not every hardware has all barriers

Slide 51

Slide 51 text

Gordon Moore
● Fairchild Semiconductor co-founder
● “Law author”
● Intel co-founder

Slide 52

Slide 52 text

Gene Amdahl
● IBM fellow
● IBM & Amdahl mainframes
● Coined law in 1967

Slide 53

Slide 53 text

Leslie Lamport
● Distributed system clocks
● Happens before
● Sequential consistency

Slide 54

Slide 54 text

Bill Pugh
● FindBugs
● Java Memory Model is broken
● Final - Volatile
● Double-checked locking
● “New” JMM

Slide 55

Slide 55 text

Sarita Adve
● Java Memory Model is broken
● Great many MCM papers
● Best MCM definition I found

Slide 56

Slide 56 text

Martin Thompson
● Mechanical sympathy blog & mailing list
● Aeron protocol
● Mechanical sympathy proponent

Slide 57

Slide 57 text

This wouldn't have happened if not for:
● Jarek Pałka, who kicked me out here some time ago
● Those folks who said “make more” after the lightning talk I gave
● Java Day Kiev 2014

Slide 58

Slide 58 text

Not possible without:
● Leslie Lamport's works on distributed systems
● Bill Pugh's work on JSR-133! http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html
● Sarita Adve's papers, especially the shared MCM tutorial: http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-7.pdf

Slide 59

Slide 59 text

Terrific (and tough) reading
● Martin Thompson: Mechanical Sympathy (mailing list & blog)
● JEP 188: http://openjdk.java.net/jeps/188
● Goetz et al., "Java Concurrency in Practice"
● Herlihy, Shavit, "The Art of Multiprocessor Programming"
● Adve, "Shared Memory Consistency Models: A Tutorial"
● Manson, "Special PoPL Issue: The Java Memory Model"
● Huisman, Petri, "JMM: A Formal Explanation"
● Aleksey Shipilev blog post: http://shipilev.net/blog/2014/jmm-pragmatics/

Slide 60

Slide 60 text

Laws and related:
● Moore's “law”: http://www.cs.utexas.edu/~fussell/courses/cs352h/papers/moore.pdf
● Rock's law: http://en.wikipedia.org/wiki/Rock's_law
● Amdahl's law:
  ● http://en.wikipedia.org/wiki/Amdahl%27s_law
  ● "Validity of the Single-Processor Approach to Achieving Large-Scale Computing Capabilities", AFIPS Press, 1967
  ● J.L. Gustafson, “Reevaluating Amdahl’s Law,” Comm. ACM, May 1988
● Pleasantly parallel problems: http://en.wikipedia.org/wiki/Embarrassingly_parallel

Slide 61

Slide 61 text

Special thanks
● Konrad Malawski and Tomek Kowalczewski, these guys really dig that stuff
● Bartosz Milewski, who helped me rediscover Hans Boehm

Slide 62

Slide 62 text

YOU! You persevered through!