Concurrency on the JVM and Elsewhere

Concurrency on the JVM (and Elsewhere)
November 28, 2011, ObjektForum Stuttgart
Johannes Link, @johanneslink

From Herb Sutter: "The Free Lunch Is Over"

We will find it harder and harder to dodge the topic of concurrency

What is the problem?
• Safety: Nothing bad ever happens
• Liveness: Something good eventually happens
• Speed & Responsiveness: Things (seem to) happen faster

Safety: Everything Is Correct
public class Counter {
    private long count = 0;
    public long value() { return count; }
    public void inc() { count++; }
}
Counter counter = new Counter();
// in Thread A:
counter.inc();
// in Thread B:
counter.inc();
// afterwards:
assert counter.value() == 2;
Possible interleaving (lost update: both threads read 0, so count ends up as 1):
Thread A: temp = 0, temp = 0 + 1, count = 1
Thread B: temp = 0, temp = 0 + 1, count = 1

[Figure: Thread A runs Steps A.1 to A.5 and Thread B runs Steps B.1 to B.4; three slides show different interleavings of these steps]

What does "consistent" mean in a concurrent program?

Quiescent Consistency
During a "quiescent period", an object is in a state that corresponds to some sequential execution of all preceding method calls

Synchronizing concurrent objects
Locks force certain program regions into a defined sequential order

Locks
public class LockedCounter {
    private long count = 0;
    private Lock lock = new ReentrantLock();
    public long value() {
        lock.lock();
        long value = count;
        lock.unlock();
        return value;
    }
    public void inc() {
        lock.lock();
        count++;
        lock.unlock();
    }
}
Thread A: lock.lock(), temp = 0, lock.unlock()
Thread B: lock.lock() (has to wait), temp = 1, count = 2, lock.unlock()
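The locked counter above can be exercised with a small harness. The following is a minimal sketch, not the speaker's code: the class name LockedCounterDemo and the helper countInParallel are made up for illustration, and unlock() is moved into a finally block so the lock is released even if the guarded code throws. It assumes Java 8 or later for the lambda syntax.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class; names are not taken from the slides.
class LockedCounterDemo {

    // Same idea as the LockedCounter slide, with unlock() in finally.
    static class LockedCounter {
        private long count = 0;
        private final Lock lock = new ReentrantLock();

        long value() {
            lock.lock();
            try {
                return count;
            } finally {
                lock.unlock();
            }
        }

        void inc() {
            lock.lock();
            try {
                count++;
            } finally {
                lock.unlock();
            }
        }
    }

    // Starts `threads` threads that each call inc() `incrementsPerThread` times
    // and returns the final counter value.
    static long countInParallel(int threads, int incrementsPerThread) {
        LockedCounter counter = new LockedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.inc();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.value();
    }

    public static void main(String[] args) {
        System.out.println(countInParallel(4, 10_000));
    }
}
```

Because every increment runs under the lock, no update is lost, whichever way the threads interleave.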

Locks are not free
• They carry the risk of deadlocks
• They serialize program execution
• They cost additional processor cycles
  ‣ They are based on primitive operations (TAS or CAS) and memory barriers, which (often) have to access "real" memory

Caching, Memory & Co
[Figure: processors with cores, per-core L1/L2 caches, an L3 cache, and main memory; access costs of 1-5, 20-30, and 300-400 cycles; unshared vs. shared objects]

Principles for using locks
Hold a lock exactly when
• you access shared mutable state
• you perform atomic operations
  ‣ check then act
  ‣ read-modify-write
Do not hold the lock longer than necessary
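The "check then act" case can be sketched as follows. This is an illustrative example under stated assumptions: BoundedBox, offer, and fillInParallel are hypothetical names, and Java 8 streams are used only as a convenient way to call the method from several threads. The point is that the capacity check and the add happen under the same lock, so the bound can never be exceeded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Hypothetical demo class; names are not taken from the slides.
class BoundedBoxDemo {

    static class BoundedBox {
        private final int capacity;
        private final List<String> items = new ArrayList<>();

        BoundedBox(int capacity) {
            this.capacity = capacity;
        }

        // check-then-act: the size check and the add share one lock,
        // so no other thread can slip in between them.
        synchronized boolean offer(String item) {
            if (items.size() >= capacity) {
                return false;
            }
            items.add(item);
            return true;
        }

        synchronized int size() {
            return items.size();
        }
    }

    // Tries `attempts` concurrent offers; returns {final size, accepted offers}.
    static int[] fillInParallel(int capacity, int attempts) {
        BoundedBox box = new BoundedBox(capacity);
        AtomicInteger accepted = new AtomicInteger();
        IntStream.range(0, attempts).parallel().forEach(i -> {
            if (box.offer("item-" + i)) {
                accepted.incrementAndGet();
            }
        });
        return new int[] { box.size(), accepted.get() };
    }
}
```

Without the synchronized keyword on offer, two threads could both pass the check when one slot is left and overfill the box.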

Liveness: Everyone should get a turn
• Danger 1: Deadlocks
• Danger 2: Starvation

Lock-ordering Deadlock
public class SimpleDeadlock...
    private final Object left = new Object();
    private final Object right = new Object();
    public void leftRight() {
        synchronized (left) {
            synchronized (right) { doSomething(); }
        }
    }
    public void rightLeft() {
        synchronized (right) {
            synchronized (left) { doSomething(); }
        }
    }
Thread A: lock left, try to lock right, wait forever
Thread B: lock right, try to lock left, wait forever

Speed
• More processors/cores should lead to better performance
  ‣ Responsiveness
  ‣ Throughput

Parallel != Concurrent

Responsiveness
• Better response times by moving long-running tasks into parallel threads
• Asynchronous notification on completion / progress of the computation

Throughput
• Starting several tasks at the same time to reduce the total running time
  ‣ CPU-bound tasks
  ‣ I/O-bound tasks

Amdahl's Law
F: the non-parallelizable fraction of a program
N: number of cores / processors
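The formula itself did not survive in the transcript. In its standard form, Amdahl's Law gives the speedup as 1 / (F + (1 - F) / N). The sketch below evaluates it for F = 0.2; the class and method names are made up for illustration.

```java
// Hypothetical demo class; names are not taken from the slides.
class AmdahlDemo {

    // Speedup according to Amdahl's Law:
    // serialFraction is F, the non-parallelizable share; cores is N.
    static double speedup(double serialFraction, int cores) {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / cores);
    }

    public static void main(String[] args) {
        for (int n : new int[] { 1, 2, 3, 4, 5, 10, 20, 100 }) {
            System.out.printf("N = %3d  speedup = %.2f%n", n, speedup(0.2, n));
        }
    }
}
```

With F = 0.2 the speedup can never reach 1 / F = 5, no matter how many cores are added.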

[Chart: speedup (0 to 5) plotted against the number of cores / processors (1 to 100) for F = 0.2]

How do you parallelize a program?
• Embarrassingly parallel?
• Common strategies
  ‣ Parallelizing the control flow
  ‣ Parallelizing the data
  ‣ A combination of both

Parallelizing the control flow
Sequential: Step 1, Step 2, Step 3, Step 4, Step 5, Step 6
Parallelized: Step 1, Step 2, Step 3, then Steps 4a, 4b, and 4c in parallel, then Step 5, Step 6

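Recursive decomposition of the control flow (Fork/Join)
Sequential: Step 1, Step 2, Step 3, Step 4, Step 5, Step 6
Decomposed: Step 1, then Steps 2-5 split into 2-5.1, 2-5.2, ..., 2-5.n, each of which can be split again (2-5.1.1, 2-5.1.2, ..., 2-5.1.n), then Step 6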
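Recursive decomposition maps directly onto the Fork/Join framework in java.util.concurrent. The following sketch sums an array by splitting it in half until the pieces are small. The class names, the threshold of 1,000 elements, and the use of ForkJoinPool.commonPool() (Java 8 or later) are assumptions for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class; names and threshold are not taken from the slides.
class ForkJoinSumDemo {

    static class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // below this, sum sequentially
        private final long[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(long[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // run the left half asynchronously
            long rightSum = right.compute(); // compute the right half in this thread
            return left.join() + rightSum;   // wait for the left half and combine
        }
    }

    static long parallelSum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new SumTask(data, 0, data.length));
    }

    // Convenience helper: sums the numbers 1..n with the task above.
    static long sumOneTo(int n) {
        long[] data = new long[n];
        for (int i = 0; i < n; i++) {
            data[i] = i + 1;
        }
        return parallelSum(data);
    }
}
```

The subtasks share no mutable state, so no locking is needed; the only coordination is fork() and join().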

Parallelizing the data
Sequential: for 1..n do: Step 1, Step 2, Step 3, Step 4, Step 5
Parallel: for each element do in parallel: Step 1, Step 2, Step 3, Step 4, Step 5

java.util.concurrent [.atomic|locks]

Locks and Conditions
<<Interface>> Lock: lock(), lockInterruptibly(), tryLock(timeout), unlock()
<<Interface>> Condition: await(), await(timeout), signal(), signalAll()
A Lock creates Conditions; ReentrantLock implements Lock

With an explicit Lock:
public class MyNewClass...
    private String myProp = "";
    private Lock lock = new ReentrantLock();
    public String getMyProp() {
        lock.lock();
        try { return myProp; }
        finally { lock.unlock(); }
    }
    public void setMyProp(String value) {
        lock.lock();
        try { myProp = value; }
        finally { lock.unlock(); }
    }
With synchronized:
public class MyNewClass...
    private String myProp = "";
    public synchronized String getMyProp() { return myProp; }
    public synchronized void setMyProp(String value) { myProp = value; }

The Executor Framework: never again new Thread(...).start()!
<<Interface>> Executor: execute(Runnable)
<<Interface>> ExecutorService: submit(Runnable): Future<?>, submit(Callable<T>): Future<T>, shutdown(), awaitTermination(timeout)
<<Interface>> Future<T>: cancel(mayInterrupt), get(): T, get(timeout): T
<<Interface>> Callable<T>: call(): T

The Executor Framework: never again new Thread(...).start()!
ExecutorService executor = Executors.newCachedThreadPool();
Callable<MyType> task = new Callable<MyType>() {
    @Override
    public MyType call() {
        ... // perform long-running task
        return new MyType(...);
    }
};
Future<MyType> result = executor.submit(task);
... // do something else
result.get(); // wait for end of task

Atomics
public class SynchronizedCounter...
    private long count = 0;
    public synchronized long value() { return count; }
    public synchronized void inc() { count++; }
public class AtomicCounter...
    private AtomicLong count = new AtomicLong(0);
    public long value() { return count.longValue(); }
    public void inc() { count.incrementAndGet(); }
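What an atomic increment does under the hood can be sketched as a compare-and-set (CAS) retry loop. This is an illustration of the technique, not how AtomicLong.incrementAndGet is actually implemented on a given JVM; CasCounter and countInParallel are hypothetical names, and Java 8 streams are used only to drive the counter from several threads.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.IntStream;

// Hypothetical demo class; names are not taken from the slides.
class CasCounterDemo {

    static class CasCounter {
        private final AtomicLong count = new AtomicLong(0);

        long value() {
            return count.get();
        }

        // Optimistic read-modify-write: read the current value, compute the new
        // one, and retry if another thread changed the value in the meantime.
        void inc() {
            long current;
            do {
                current = count.get();
            } while (!count.compareAndSet(current, current + 1));
        }
    }

    // Calls inc() `increments` times from a parallel stream.
    static long countInParallel(int increments) {
        CasCounter counter = new CasCounter();
        IntStream.range(0, increments).parallel().forEach(i -> counter.inc());
        return counter.value();
    }
}
```

No thread ever blocks here; a thread that loses the race simply repeats its attempt.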

And even more...
• Concurrent Collections
• BlockingQueue
• Latches, Barriers, Exchangers & Co
• Fork/Join (Java 7)

[Class diagram]
Shelf: capacity: int, products: List<Product>, putIn(product), takeOut(product): boolean, isFull(): boolean
Product: type: String
Storehouse: shelves: Map<String, Shelf> (keyed by name), newShelf(name, size): Shelf, getShelf(name): Shelf, move(product, from, to): boolean

public class Shelf {
    private int capacity;
    private List<Product> products = new ArrayList<Product>();
    public Shelf(int capacity) { this.capacity = capacity; }
    public List<Product> getProducts() { return products; }
    public boolean isFull() { return products.size() == capacity; }
    public void putIn(Product product) {
        if (isFull()) throw new StorageException("shelf is full.");
        products.add(product);
    }
    public boolean takeOut(Product aBook) { return products.remove(aBook); }
}

public class Storehouse {
    private Map<String, Shelf> shelves = new HashMap<String, Shelf>();
    public Shelf newShelf(String name, int capacity) {
        Shelf newShelf = new Shelf(capacity);
        shelves.put(name, newShelf);
        return newShelf;
    }
    public Shelf getShelf(String name) { return shelves.get(name); }
}

public class ConcurrentShelfCreationTest extends ParallelTestCase {
    private volatile Storehouse store = new Storehouse();
    @Test
    public void createAccountsInManyThreads() throws Exception {
        final int numberOfShelves = 100;
        final AtomicInteger shelf = new AtomicInteger(0);
        Runnable createShelfTask = new Runnable() {
            @Override
            public void run() {
                String shelfName = "n" + shelf.incrementAndGet();
                Shelf shelf = store.newShelf(shelfName, 1);
                assertNotNull("No shelf: " + shelfName, store.getShelf(shelfName));
            }
        };
        runInParallelThreads(numberOfShelves, createShelfTask);
    }
}

public class Storehouse { ... } (the same HashMap-based Storehouse as before)
    private Map<String, Shelf> shelves = new HashMap<String, Shelf>();
Test result: "No shelf: n28"

import java.util.concurrent.ConcurrentHashMap;
public class Storehouse...
    private final Map<String, Shelf> shelves = new ConcurrentHashMap<String, Shelf>();
Test result: ok

public class Shelf...
    public synchronized int getCapacity() { return capacity; }
    public synchronized List<Product> getProducts() { return products; }
    public synchronized boolean isEmpty() { return products.isEmpty(); }
    public synchronized boolean isFull() { return products.size() == capacity; }
    public synchronized void putIn(Product product) {
        if (isFull()) throw new StorageException("shelf is full.");
        products.add(product);
    }
    public synchronized boolean takeOut(Product aBook) { return products.remove(aBook); }

public class Storehouse...
    public boolean move(Product product, String from, String to) {
        if (!shelves.get(from).takeOut(product)) {
            return false;
        }
        try {
            shelves.get(to).putIn(product);
            return true;
        } catch (StorageException se) {
            shelves.get(from).putIn(product);
            return false;
        }
    }
Annotation: public synchronized boolean move(...)

public class Storehouse...
    public boolean move(Product product, String from, String to) {
        Shelf source = shelves.get(from);
        Shelf target = shelves.get(to);
        synchronized (source) {
            synchronized (target) {
                return doMove(product, source, target);
            }
        }
    }
    private boolean doMove(Product product, Shelf src, Shelf trg) { ... }
// Thread A:
storehouse.move(aBook, "shelf a", "shelf b");
// Thread B:
storehouse.move(anIPod, "shelf b", "shelf a");

public class Storehouse...
    public boolean move(Product product, String from, String to) {
        Shelf source = shelves.get(from);
        Shelf target = shelves.get(to);
        Object[] locks = new Object[] { source, target };
        Arrays.sort(locks); // always acquire the locks in the same global order
        synchronized (locks[0]) {
            synchronized (locks[1]) {
                return doMove(product, source, target);
            }
        }
    }
public class Shelf implements Comparable<Shelf>...
    public int compareTo(Shelf other) {
        // Integer.compare avoids the overflow a plain subtraction can cause;
        // note that identity hash codes are not guaranteed to be unique
        return Integer.compare(System.identityHashCode(this), System.identityHashCode(other));
    }

For mere mortal developers, writing correct multi-threaded programs is beyond their abilities

Problems not mentioned
• Safe Publication, Java Memory Model
• Cache Lines, False Sharing
• Symmetric / Asymmetric Memory
• Blocking Locks vs. Spin Locks vs. Non-blocking Synchronization
• Fair vs. Unfair Locking
• Livelocks / Starvation
• etc.

public class ValueHolder {
    private List<Listener> listeners = new LinkedList<Listener>();
    private int value;
    public static interface Listener {
        public void valueChanged(int newValue);
    }
    public void addListener(Listener listener) { listeners.add(listener); }
    public void setValue(int newValue) {
        value = newValue;
        for (Listener each : listeners) {
            each.valueChanged(newValue);
        }
    }
}
The Problem with Threads: http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf

The underlying programming model
Shared mutable state (aka objects) is accessed in multiple concurrent threads, and we use locks to synchronize / sequentialize access to the state

The object becomes a "leaky" abstraction
To build more complex thread-safe objects out of individual thread-safe objects, the locking mechanism of the individual objects must be known
  ‣ Violates the principle of encapsulation
  ‣ Violates the principle of modularization

Can other programming paradigms save us?

Alternative paradigms
• Immutability
• Transactional Memory
• Actors
• Agents
• Fork/Join
• Map / Filter / Reduce
• Dataflow Concurrency

Immutability to the rescue?
• Without mutable state, changes do not have to be synchronized either
• A modifying operation returns a new object

Immutable Java object
public class ImmutableName {
    private final String first, last;
    public ImmutableName(String first, String last) {
        this.first = first;
        this.last = last;
    }
    public String getFirst() { return first; }
    public String getLast() { return last; }
    @Override public int hashCode() { final int prime = 31; ... return result; }
    @Override public boolean equals(Object obj) { if (this == obj) ... return true; }
    @Override public String toString() { return ...; }
}

Immutable Groovy object
@Immutable class ImmutableName {
    String first
    String last
    ImmutableName setLast(newLast) {
        new ImmutableName(first, newLast)
    }
}

What happens to the state?
• Trick: we distinguish between the identity of an object and its current state
  ‣ State is represented by immutable values
  ‣ The object offers a safe method for updating the state
• Most of the program works with "thread-safe" immutables
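The split between identity and state can be sketched in Java with an AtomicReference that always points to an immutable value. This is one possible reading of the idea, not the speaker's implementation: Name, Person, and rename are hypothetical names, and AtomicReference.updateAndGet requires Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

// Hypothetical demo class; names are not taken from the slides.
class IdentityAndStateDemo {

    // State: an immutable value. "Changing" it yields a new object.
    static final class Name {
        final String first;
        final String last;

        Name(String first, String last) {
            this.first = first;
            this.last = last;
        }

        Name withLast(String newLast) {
            return new Name(first, newLast);
        }
    }

    // Identity: a stable object whose current state is swapped atomically.
    static final class Person {
        private final AtomicReference<Name> name;

        Person(Name initial) {
            this.name = new AtomicReference<>(initial);
        }

        Name currentName() {
            return name.get();
        }

        // The safe update method: applies the change with a CAS retry loop.
        void rename(UnaryOperator<Name> change) {
            name.updateAndGet(change);
        }
    }

    static String demo() {
        Person person = new Person(new Name("Ada", "Byron"));
        Name before = person.currentName(); // a snapshot that never changes
        person.rename(n -> n.withLast("Lovelace"));
        return before.last + " -> " + person.currentName().last;
    }
}
```

Any thread holding the snapshot `before` keeps seeing a consistent value, even while the identity moves on to a new state.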

@Immutable class Shelf...
    int capacity
    List products
    Shelf putIn(Product product) {
        if (isFull()) throw new StorageException("shelf is full.")
        return cloneWith(products: new ArrayList(products) << product)
    }
    Shelf takeOut(Product product) {
        if (!products.contains(product)) return this
        return cloneWith(products: new ArrayList(products).minus(product))
    }
    private Shelf cloneWith(changes) {
        def newProps = ...
        return new Shelf(newProps)
    }

@Immutable class Shelf...
    long version
    Shelf incrementVersion() {
        cloneWith(version: version + 1)
    }

class Storehouse...
    final shelves = [:] as ConcurrentHashMap
    Shelf newShelf(String name, int size) {
        def newShelf = new Shelf(size, [], 0)
        shelves[name] = newShelf
        return newShelf
    }
    Shelf getAt(String name) { return shelves[name] }
    synchronized boolean update(Map shelvesToUpdate) {
        // reject the update if any shelf has changed since it was read
        if (shelvesToUpdate.any { name, shelf -> shelves[name].version != shelf.version })
            return false
        shelvesToUpdate.each { name, shelf -> shelves[name] = shelf.incrementVersion() }
        return true
    }

class Storehouse...
    boolean move(Product product, String from, String to) {
        while (true) {
            Shelf shelfTo = shelves[to]
            Shelf shelfFrom = shelves[from]
            if (shelfTo.isFull()) return false
            def newShelfFrom = shelfFrom.takeOut(product)
            if (shelfFrom == newShelfFrom) return false
            shelfTo = shelfTo.putIn(product)
            def updates = [(from): newShelfFrom, (to): shelfTo]
            if (update(updates)) return true
        }
    }

Immutability at large
• Efficient implementations of "immutable data types" are possible
• In functional programming languages, immutable values are the norm and mutable state is the exception
  ‣ Example Clojure: focus on immutable values and explicit mechanisms for manipulating the current value of an entity

It's the coordination, stupid!
Paradigm                 Coordination
Immutability             No coordination
Fork/Join, Map/Reduce    Working on collections with fixed coordination
Locking, Actors          Explicit coordination
Agent, STM               Delegated coordination
Dataflow                 Implicit coordination
(c) Dierk König: Groovy in Action, 2nd ed, ch. 17

Transactional Memory
• The heap as a transactional data set
• Transaction properties similar to those of a database (ACID)
• Optimistic transactions
  ‣ Transactions are retried automatically when they collide
• Transactions can be nested

public class Shelf...
    void putIn(Product product) {
        atomic {
            if (isFull()) throw new StorageException("...")
            products << product
        }
    }
    boolean takeOut(Product product) {
        atomic { return products.remove(product); }
    }
public class Storehouse...
    boolean move(Product product, String from, String to) {
        atomic {
            if (!shelves[to].isFull() && shelves[from].takeOut(product)) {
                shelves[to].putIn(product)
                return true
            }
            return false
        }
    }

Advantages of TM
• We can keep thinking in terms of "shared state" and transactions
• Correct use is much easier than with locks: deadlocks are ruled out!
• We regain the composability of objects

Disadvantages of TM
• No help at all with parallelizing our sequential program
• Progress is not guaranteed; livelocks are possible
• A program with transactions remains non-deterministic
• No side effects allowed inside a transaction
• An efficient and semantically intuitive implementation is still a research topic

TM implementations
• Hardware Transactional Memory
• Software Transactional Memory
  ‣ Clojure: a programming language on the JVM with built-in STM
  ‣ Java STM frameworks: Akka, Deuce, Multiverse, GPars

STM in Clojure
• Only explicitly marked state (references) takes part in a transaction
• All other data / objects are immutable

Clojure example: Shelf
(defn empty-shelf [capacity]
  {:products '() :capacity capacity})
(defn put-in [shelf product]
  (if (= (count (shelf :products)) (shelf :capacity))
    (throw (Exception. "Shelf is full."))
    (assoc shelf :products (conj (shelf :products) product))))
(defn take-out [shelf product]
  (assoc shelf :products (remove #(= % product) (shelf :products))))

Clojure example: Storehouse
(defn empty-storehouse [] (ref {}))
(defn add-shelf [map name shelf] (assoc map name shelf))
(defn replace-shelf [map name shelf] (add-shelf map name shelf))
(defn new-shelf [store name capacity]
  (dosync (alter store add-shelf name (empty-shelf capacity))))
(defn update-shelf [store shelf-map] (dosync ...))
(defn put-in-shelf [store shelf-name product] (dosync ...))
(defn take-from-shelf [store shelf-name product] (dosync ...))
(defn move [store product from-name to-name]
  (dosync
    (put-in-shelf store to-name product)
    (take-from-shelf store from-name product)))

class Storehouse... (the optimistic move(product, from, to) retry loop of the Groovy Storehouse, shown again)

Where do we log successful transactions?

Fixed Coordination: Parallel Collections
• We work on all elements of a collection at the same time
• Prerequisite: the individual operations are independent of each other
• Typical parallel actions:
  ‣ Transform
  ‣ Filter
  ‣ Combine (reduce)

Fork/Join on collections
import static groovyx.gpars.GParsPool.withPool
def numbers = [1, 2, 3, 4, 5, 6]
withPool {
    assert 91 == numbers
        .collectParallel { it * it }
        .sumParallel()
}

More such methods
any { ... }, collect { ... }, count(filter), each { ... }, eachWithIndex { ... }, every { ... }, find { ... }, findAll { ... }, findAny { ... }, fold { ... }, fold(seed) { ... }, grep(filter), groupBy { ... }, max { ... }, max(), min { ... }, min(), split { ... }, sum()

Map/Filter/Reduce on collections
import static groovyx.gpars.GParsPool.withPool
withPool {
    assert 84 == [0, 1, 2, 3, 4, 5, 6].parallel
        .filter { it % 2 == 0 }
        .map { it + 1 }
        .map { it ** 2 }
        .reduce { a, b -> a + b }
}
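For comparison, the two GPars computations above can be mirrored with Java's parallel streams. This is a sketch under the assumption of Java 8 or later, which postdates the talk; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are not taken from the slides.
class ParallelCollectionsDemo {

    // Transform every element in parallel, then combine: sum of squares.
    static int sumOfSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToInt(n -> n * n)
                .sum();
    }

    // Filter the even numbers, add one, square, and reduce to a sum.
    static int filterMapReduce(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .map(n -> n + 1)
                .map(n -> n * n)
                .reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3, 4, 5, 6)));
        System.out.println(filterMapReduce(Arrays.asList(0, 1, 2, 3, 4, 5, 6)));
    }
}
```

As in the GPars versions, the operations on the individual elements are independent and free of side effects, which is what makes the fixed coordination safe.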

Fork/Join vs Map/Filter/Reduce: fixed coordination (c) Dierk König

Explicit Coordination: Message Passing with Actors
• Complete renunciation of "shared state"
• Actors receive and send messages
  ‣ Asynchronous and non-blocking
  ‣ Every actor has its own "mailbox"
• Only one active thread per actor at any time

Actors in GPars
import static groovyx.gpars.actor.Actors.*
def printer = reactor { println it }
def decryptor = reactor { reply it.reverse() }
actor {
    decryptor << 'lellarap si yvoorG'
    react { answer ->
        printer << 'Decrypted message: ' + answer
        decryptor.stop()
        printer.stop()
    }
}.join()
(c) Dierk König
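The core mechanics of an actor (a mailbox plus a single thread that processes one message at a time) can be sketched in plain Java. This is a deliberately minimal illustration and not how GPars implements actors: ReverseActor, Request, and decrypt are hypothetical names, the reply travels back through a CompletableFuture (Java 8 or later), and a real actor library would share threads between actors instead of dedicating one to each.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class; names are not taken from the slides.
class MiniActorDemo {

    // A message: the payload plus a future through which the reply is sent.
    static final class Request {
        final String text;
        final CompletableFuture<String> reply = new CompletableFuture<>();

        Request(String text) {
            this.text = text;
        }
    }

    // An actor that reverses every string it receives.
    static final class ReverseActor {
        private static final Request STOP = new Request(""); // poison pill
        private final BlockingQueue<Request> mailbox = new LinkedBlockingQueue<>();

        ReverseActor() {
            // Exactly one thread ever runs the message loop of this actor.
            new Thread(this::loop).start();
        }

        private void loop() {
            try {
                while (true) {
                    Request request = mailbox.take(); // wait for the next message
                    if (request == STOP) {
                        return;
                    }
                    String reversed = new StringBuilder(request.text).reverse().toString();
                    request.reply.complete(reversed);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Asynchronous send: returns immediately with a future for the reply.
        CompletableFuture<String> send(String text) {
            Request request = new Request(text);
            mailbox.add(request);
            return request.reply;
        }

        void stop() {
            mailbox.add(STOP);
        }
    }

    static String decrypt(String text) {
        ReverseActor decryptor = new ReverseActor();
        try {
            return decryptor.send(text).join(); // block only here, for the demo
        } finally {
            decryptor.stop();
        }
    }
}
```

Calling decrypt with the message from the GPars example returns the reversed text; the sender and the actor never share mutable state, only messages.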

Actors: Advantages
• Independent actors
• Scalable
• Distributable
• Easier to avoid
  ‣ Race conditions
  ‣ Deadlocks
  ‣ Starvation

Actors: Disadvantages
• Explicit coordination required!
• For many problems, communicating via asynchronous messages is a complication
• Not suitable for problems that require genuine consensus on shared objects

Actors in Erlang
• Actors: processes + module
• Selective message reception
• "Waiting for messages" as a continuation
• Tail-recursive functions for state ...

loop(KeyValues) ->
    receive
        {key, Sender, Key} ->
            receive
                {value, Sender, Value} ->
                    NewElement = {Key, Value},
                    loop([NewElement | KeyValues])
            end;
        {get, Sender, Key} ->
            reply(Sender, getValue(KeyValues, Key)),
            loop(KeyValues);
        stop -> ok
    end.
reply(Sender, ReplyMsg) -> Sender ! {self(), ReplyMsg}.
getValue(..,..) -> ...

Erlang Process Patterns
• Process should encapsulate an activity, not a task
• Archetypes of processes
  ‣ Client / Server
  ‣ Finite State Machine
  ‣ Event Manager / Event Handler
• Supervisor hierarchies for robustness and fault tolerance

JVM problems for concurrent programming
• Short and medium term
  ‣ No closures
  ‣ No native persistent data structures
  ‣ Most Java libraries are not thread-safe
• Long term
  ‣ No support for coroutines / continuations
  ‣ Late binding prevents safety
  ‣ Task-based concurrency does not scale well
  ‣ No efficient communication between tasks
  ‣ No tail-call optimization

Delegate to an Agent
import groovyx.gpars.agent.Agent
def safe = new Agent<Storehouse>( storehouse )
safe << { move ... }
println safe.val

Agents
• They encapsulate (Java) components that are not thread-safe
• Conceptually similar to agents in Clojure
• Implementations differ considerably in efficiency

DataFlow for implicit coordination
import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task
final dichte = new DataflowVariable()   // density
final gewicht = new DataflowVariable()  // weight
final volumen = new DataflowVariable()  // volume
task { dichte << gewicht.val / volumen.val }
task { gewicht << 10.6 }
task { volumen << 5.0 }
assert dichte.val == 2.12

DataFlow
• Write-Once, Read-Many (non-blocking)
• Flavors: variables, streams, operators, tasks, flows
• Model the flow of data, not the control flow!
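The write-once, read-many idea can be approximated in Java with CompletableFuture, which can be completed exactly once and read any number of times. This is a rough analogue of a dataflow variable rather than an equivalent of the GPars API; it assumes Java 8 or later, and the class and method names are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical demo class; names are not taken from the slides.
class DataflowDemo {

    static double density() {
        // Write-once variables: each future is completed exactly once.
        CompletableFuture<Double> weight = new CompletableFuture<>();
        CompletableFuture<Double> volume = new CompletableFuture<>();

        // Declares the data dependency; runs as soon as both inputs are bound,
        // regardless of the order in which the tasks below execute.
        CompletableFuture<Double> density = weight.thenCombine(volume, (w, v) -> w / v);

        CompletableFuture.runAsync(() -> weight.complete(10.6));
        CompletableFuture.runAsync(() -> volume.complete(5.0));

        return density.join(); // blocks until the result is available
    }

    public static void main(String[] args) {
        System.out.println(density());
    }
}
```

The code states only which value depends on which; the ordering of the three computations follows from the data, not from explicit coordination. The result is a double, so it should be compared with a tolerance rather than with ==.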

Dataflow operators
final leftAddend = new DataflowQueue()
final rightAddend = new DataflowQueue()
final sum = new DataflowQueue()
operator(inputs: [leftAddend, rightAddend], outputs: [sum], { left, right -> sum << left + right })
task { [10, 20, 30].each { leftAddend << it } }
task { [100, 200, 300].each { rightAddend << it } }
[110, 220, 330].each { assert it == sum.val }

Advantages of dataflows
• No locking required
• No race conditions
• Deterministic
  ‣ Deadlocks are possible, but then they happen every time!
• No difference between sequential and concurrent code
• Scale well

More about dataflows
• Limitations
  ‣ Not suitable for every problem
  ‣ Computations must not have side effects
• Extensions
  ‣ Dataflow streams, dataflow operators
• Other JVM implementations
  ‣ Scala Dataflow (Akka), FlowJava (academic & dead)

Takeaways
• Shared mutable state is evil
• Other paradigms are easier to understand but often require rethinking design / architecture
• Java as a language has too much ceremony to express alternative approaches concisely and readably
  ‣ Groovy / GPars are very good for experimenting
• The JVM as a platform is not ideally suited to some approaches

Links
Code examples: https://github.com/jlink/ConcurrencyOnJvmAndElsewhere
Pointers: http://jax.de, http://www.parallel2012.de, iX special issue "Multicore" (Dec. 2011)
To try out: http://gpars.codehaus.org/, http://clojure.org/, http://www.erlang.org
