@anildigital Concurrency vs. parallelism 13 Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once. http://blog.golang.org/concurrency-is-not-parallelism Rob Pike
@anildigital Concurrency vs. parallelism 15 Lady Gaga on concurrency: "I will put down this drink to text you, then put my phone away and continue drinking, eh"
@anildigital Three walls https://www.technologyreview.com/s/421186/why-cpus-arent-getting-any-faster/ Power wall: faster processors get really hot. Memory wall: memory buses are not fast enough to keep up with increased clock speeds. ILP wall (instruction-level parallelism): pushing the instruction pipeline deeper just digs a deeper power hole. 17 Dr. David Patterson, Berkeley
@anildigital Power wall + Memory wall + ILP wall Taken together, they mean computers will stop getting faster. Furthermore, if an engineer optimizes for one wall, he aggravates the other two. Solution: multi-core processors 18
@anildigital Computer performance Latency: the amount of time it takes to complete a particular program on given hardware. Throughput: the number of operations completed per second. Utilisation: how well the cores of a multicore system are kept busy. Speedup: specific to parallel programming; how much faster your algorithm runs on parallel hardware. Power consumption (no one cares) 19
@anildigital Scheduling - Preemptive vs Non-preemptive A scheduling algorithm is Preemptive - if the active process/task/thread can be temporarily suspended to execute a more important one Non-preemptive - if the active process/task/thread cannot be suspended, i.e. it runs to completion 20
@anildigital Scheduling - Cooperative A scheduling algorithm is Cooperative - when the currently running process voluntarily gives up execution to allow another process to run, e.g. via yield 21
@anildigital Java Concurrency - Threads Pros No scheduling needed by program (preemptive) Operating system does it for you Commonly used Cons Context switching / Scheduling overhead Deadlocks / Race conditions Synchronization / Locking issues 26
@anildigital Locks - java.util.concurrent ReentrantReadWriteLock 31
ReadWriteLock readWriteLock = new ReentrantReadWriteLock();

readWriteLock.readLock().lock();
// multiple readers can enter this section,
// as long as it is not locked for writing and
// no writers are waiting to lock for writing
readWriteLock.readLock().unlock();

readWriteLock.writeLock().lock();
// only one writer can enter this section,
// and only if no threads are currently reading
readWriteLock.writeLock().unlock();
@anildigital Semaphores - java.util.concurrent Semaphores A thread synchronization construct, used either to avoid missed signals between threads or to guard a critical section, like a lock 36
@anildigital Semaphores - java.util.concurrent Semaphores 37
class Pool {
  private static final int MAX_AVAILABLE = 100;
  private final Semaphore available = new Semaphore(MAX_AVAILABLE, true);

  public Object getItem() throws InterruptedException {
    available.acquire();
    return getNextAvailableItem();
  }

  public void putItem(Object x) {
    if (markAsUnused(x))
      available.release();
  }
  ...
@anildigital Barrier - java.util.concurrent CyclicBarrier (multi-party synchronization) Can synchronize threads progressing through some algorithm. It is a barrier that all threads must wait at, until all threads reach it, before any of the threads can continue. 43
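A minimal CyclicBarrier sketch (the three-worker setup and phase names are illustrative, not from the slides):

import java.util.concurrent.CyclicBarrier;

public class CyclicBarrierExample {
    public static void main(String[] args) {
        // Barrier for 3 parties; the optional action runs once all of them arrive.
        CyclicBarrier barrier = new CyclicBarrier(3,
                () -> System.out.println("All workers reached the barrier"));

        for (int i = 0; i < 3; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    System.out.println("Worker " + id + " doing phase 1");
                    barrier.await();   // blocks until all 3 workers arrive
                    System.out.println("Worker " + id + " doing phase 2");
                } catch (Exception e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}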
@anildigital Atomic variables First, it is not possible to forget to acquire the lock when necessary. Second, because no locks are involved, it is impossible for an operation on an atomic variable to deadlock. 50
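A hedged sketch of a lock-free counter built on AtomicInteger (class and method names are illustrative):

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    public void increment() {
        count.incrementAndGet();   // atomic: no lock to forget, nothing to deadlock on
    }

    public int current() {
        return count.get();
    }
}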
@anildigital Queues - java.util.concurrent BlockingQueue
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueExample {
  public static void main(String[] args) throws Exception {
    BlockingQueue<String> queue = new ArrayBlockingQueue<>(1024);
    Producer producer = new Producer(queue);
    Consumer consumer = new Consumer(queue);
    new Thread(producer).start();
    new Thread(consumer).start();
    Thread.sleep(4000);
  }
}
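The Producer and Consumer classes are not shown on the slide; illustrative sketches might look like this:

import java.util.concurrent.BlockingQueue;

class Producer implements Runnable {
    private final BlockingQueue<String> queue;
    Producer(BlockingQueue<String> queue) { this.queue = queue; }
    public void run() {
        try {
            queue.put("1");   // put() blocks if the queue is full
            Thread.sleep(1000);
            queue.put("2");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class Consumer implements Runnable {
    private final BlockingQueue<String> queue;
    Consumer(BlockingQueue<String> queue) { this.queue = queue; }
    public void run() {
        try {
            while (true) {
                System.out.println(queue.take());   // take() blocks if the queue is empty
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}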
@anildigital ThreadPool - java.util.concurrent Instead of creating a new thread per task, tasks can be passed to a thread pool. A thread pool reuses its threads, and internally uses a BlockingQueue to hold pending tasks. ExecutorService is java.util.concurrent's thread pool abstraction (ThreadPoolExecutor is the main implementation). 66
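A minimal ExecutorService sketch (pool size and task bodies are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ThreadPoolExample {
    public static void main(String[] args) throws InterruptedException {
        // Fixed pool of 4 worker threads backed by an internal task queue.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 10; i++) {
            final int taskId = i;
            pool.submit(() -> System.out.println(
                    "Task " + taskId + " on " + Thread.currentThread().getName()));
        }

        pool.shutdown();                              // stop accepting new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);  // wait for submitted tasks to finish
    }
}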
@anildigital ForkJoinPool - java.util.concurrent ForkJoinPool was added in Java 7. Similar to ExecutorService, with one difference: it makes it easy for tasks to split their work up into smaller subtasks, which are then submitted to the ForkJoinPool too. Uses a work-stealing algorithm. The next level of ExecutorService. 69
@anildigital ForkJoinPool - java.util.concurrent Basic fork & join algorithm 70
Result solve(Problem problem) {
  if (problem is small)
    directly solve problem
  else {
    split problem into independent parts
    fork new subtasks to solve each part
    join all subtasks
    compose result from subresults
  }
}
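As a hedged, concrete instance of the pseudocode above, a RecursiveTask that sums an array (threshold and data are illustrative):

import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums a range of an array by splitting it into subtasks.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000;
    private final long[] numbers;
    private final int from, to;

    SumTask(long[] numbers, int from, int to) {
        this.numbers = numbers;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {            // small enough: solve directly
            long sum = 0;
            for (int i = from; i < to; i++) sum += numbers[i];
            return sum;
        }
        int mid = (from + to) / 2;               // split into two independent parts
        SumTask left = new SumTask(numbers, from, mid);
        SumTask right = new SumTask(numbers, mid, to);
        left.fork();                             // run the left half asynchronously
        return right.compute() + left.join();    // compute right here, then join left
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        java.util.Arrays.fill(data, 1L);
        long total = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
        System.out.println(total);               // prints 1000000
    }
}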
@anildigital Real world usage Threads & mutexes are still widely used by programmers. They provide very low-level APIs for handling concurrency: fine-grained control with more power. java.util.concurrent is robust, but there is still plenty of room for bugs. 72
@anildigital Amdahl’s Law Predicts the theoretical maximum speedup of a program when using multiple processors. The speedup is limited by the time needed for the sequential fraction of the program. If N is the number of processors, s is the fraction of time spent in the serial part of the program, and p is the fraction spent in the parallelizable part (s + p = 1), then the maximum possible speedup is 1 / (s + p/N). Synchronization & communication overhead reduce it further. 75
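A short worked example (the 10% serial fraction is illustrative): with s = 0.1, eight processors give roughly a 4.7x speedup, and no number of processors can exceed 10x.

S(N) = \frac{1}{s + p/N}, \qquad s + p = 1

S(8) = \frac{1}{0.1 + 0.9/8} \approx 4.7, \qquad \lim_{N \to \infty} S(N) = \frac{1}{s} = 10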
@anildigital Clojure STM https://en.wikipedia.org/wiki/Software_transactional_memory “…completing an entire transaction verifies that other threads have not concurrently made changes to memory that it accessed in the past. This final operation, in which the changes of a transaction are validated and, if validation is successful, made permanent, is called a commit…” 77
@anildigital Clojure STM Software transactional memory (STM): any method of coordinating multiple concurrent modifications to a shared set of storage locations. Just as garbage collection has displaced the need for manual memory management, STM is characterized as providing the same kind of systematic simplification of another error-prone practice: manual lock management. 78
@anildigital Clojure STM “Don’t wait on a lock, just check when we’re ready to commit” 79
# Thread 1
atomic {
  - read a variable
  - increment a variable
  - write a variable
}

# Thread 2
atomic {
  - read variable
  - increment variable
  # about to write, but Thread 1 has already written the variable...
  # notices Thread 1 changed the data, so ROLLS BACK
  - write variable
}
@anildigital State, Identity and Value http://clojure.org/about/state State, Identity, and Value 72, {1, 2, 3}, “cat”, [4, 5, 7] (def sarah {:name "Sarah" :age 25 :wears-glasses? false}) An identity is an entity that has state. State is a value at a point in time. A value is something that doesn’t change, e.g. the set of my favourite foods: I will prefer different foods in the future, but that will be a different set. 80
@anildigital Issues with state-based OOP languages Imperative programming and state: OOP complects identity and state. There is no way to obtain the state independent of the identity without copying it, and no way to associate the identity’s state with a different value other than in-place memory mutation. Objects are not treated as values. Clojure says OO doesn’t have to be this way. 81
@anildigital Clojure STM Immutable data New values are functions of old values An identity’s state at any point in time is an immutable value Values can always be observed and new values calculated from old values without coordination Values will never change “in hand” Values can be safely shared between threads 83
@anildigital Clojure STM - Atomic references Atomic references to values Defines mutable data structures which are concurrency aware Models Identity by way of references to immutable data Dereferencing a reference type gives you its (immutable) value Changes to an identity’s state are controlled / coordinated by the system 84
@anildigital Clojure STM - Atomic references These mutable references work in concert with Clojure’s persistent data structures to separate identity from state, allowing them to be accessed from multiple threads without the dangers of deadlocks or race conditions. In all cases the program will see stable views of the values in the world. 85
@anildigital Clojure STM - Clojure Reference Containers 86
Model    Usage                                Functions
Atoms    Synchronous, independent updates     pure
Refs     Synchronous, coordinated updates     pure
Agents   Asynchronous, independent updates    any
Vars     Thread-local updates                 any
@anildigital Clojure STM - Reference Types Isolate the state and constrain the ways in which that state can be changed. (Diagram: a reference type is dereferenced (deref) to obtain an immutable value; a function (λ) applied to the old value produces a new value.) 88
@anildigital Clojure STM - Atom Atomic references, very similar to java.util.concurrent atomic variables. Persistent data structures separate identity from state. 89
(def my-atom (atom 42))
;=> #'user/my-atom
(deref my-atom)
;=> 42
@my-atom
;=> 42
@anildigital Clojure STM - Atom Watchers 92
(def a (atom 0))
(add-watch a :print #(println "Changed from " %3 " to " %4))
(swap! a + 2)
; Changed from 0 to 2
@anildigital Clojure STM - Agents Agents are an uncoordinated, asynchronous, concurrency-aware reference type. Agents work in concert with persistent data structures to maintain the separation of identity and state. Changes to an agent’s state are independent of changes to other agents’ states, and all such changes are made away from the thread of execution that schedules them. I/O and other side-effecting functions may be safely used in conjunction with agents. Agents are STM-aware, so they may be safely used in the context of retrying transactions. 93
@anildigital Clojure STM - Agents vs. Actors 95
An agent’s value can be retrieved with deref; an actor encapsulates state and provides no direct means to access it.
An agent does not encapsulate behaviour - the function is provided by the sender; actors can encapsulate behaviour.
An agent’s error reporting is more primitive; an actor’s error detection and recovery is more sophisticated.
Agents provide no support for distribution; actors can be remote.
Composing agents cannot deadlock; composing actors can deadlock.
@anildigital Clojure STM - Agents Unlike refs and atoms, it is perfectly safe to use agents to coordinate I/O and perform other blocking operations. Agents use their own thread pools: send uses a fixed thread pool and should never be used for actions that might perform I/O or otherwise block; send-off uses a growing thread pool and is ideal for guarding a contested resource or doing blocking I/O. 96
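A minimal sketch of send vs. send-off on an agent (the counter and the sleep are illustrative):

(def counter (agent 0))

;; send uses the fixed thread pool: good for CPU-bound updates
(send counter inc)
(send counter + 10)

;; send-off uses the growing thread pool: safe for blocking work
(send-off counter (fn [n] (Thread/sleep 100) (inc n)))

(await counter)   ; block until the actions dispatched so far have run
@counter          ;=> 12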
@anildigital Clojure STM - Refs Refs enable coordinated, synchronous changes to multiple values. There is no possibility of the involved refs ever being in an observably inconsistent state, no race conditions among the involved refs, no deadlocks, and no manual use of locks, monitors, or other low-level synchronization primitives. 97
@anildigital Clojure STM - Refs Ideal for coordinating changes to multiple pieces of state. Operations are atomic, consistent, and isolated from other transactions. Manual transaction control with (dosync). Operations: alter, ref-set, commute, ensure. 98
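A hedged sketch of coordinating two refs in one transaction (account names and amounts are illustrative):

(def checking (ref 100))
(def savings  (ref 500))

(defn transfer! [from to amount]
  (dosync                        ; all or nothing: both refs change together
    (alter from - amount)
    (alter to   + amount)))

(transfer! savings checking 50)
[@checking @savings]             ;=> [150 450]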
@anildigital Clojure STM - Pros & Cons Pros: increased concurrency - no thread needs to wait to access a resource, and only a smaller scope needs synchronizing when modifying disjoint parts of a data structure. Cons: transactions may be aborted and retried, which places limitations on their behaviour - they cannot perform any operation that cannot be undone, including most I/O. 102
@anildigital Clojure - Futures Futures are evaluated within a thread pool that is shared with potentially blocking agent actions. This pooling of resources can make futures more efficient than creating native threads as needed. Using future is much more concise than setting up and starting a native thread. Clojure futures (the value returned by future) are instances of java.util.concurrent.Future, which can make it easier to interoperate with Java APIs that expect them. Dereferencing a future blocks until its value is available. 103
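A minimal future sketch (the sleep stands in for real work):

(def f (future
         (Thread/sleep 1000)   ; runs on a pooled thread
         (+ 1 2)))

@f   ; dereferencing blocks until the value is ready, then returns 3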
@anildigital Clojure - Delays A delay suspends some body of code, evaluating it only on demand, when it is dereferenced. It evaluates its body only once, caching the return value. 105
(def d (delay (println "Running...") :done!))
;=> #'user/d
(deref d)
; Running...
;=> :done!
@anildigital Clojure - Promises Used in a similar way to delays and futures. Dereferencing a promise blocks until it has a value. You don’t give a promise its value immediately; you provide one later by calling deliver. 106
(def result (promise))
(future (println "The result is: " @result))   ; blocks until the promise is delivered
(Thread/sleep 2000)
(deliver result 42)                            ; example value; the future now prints the result
@anildigital Clojure - Vars The identities above (atoms, agents, refs) each refer (eventually) to a single value, but the real world needs names that refer to different values at different points in time. With vars you can override the value, and they are concurrency aware. Limitation: we can override their value only within the scope of a particular function call, and nowhere else. 107
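A minimal sketch of rebinding a dynamic var (the *timeout-ms* var is illustrative):

(def ^:dynamic *timeout-ms* 1000)   ; root value

(defn fetch []
  ;; sees the thread-local binding if one is in effect, else the root value
  *timeout-ms*)

(fetch)                              ;=> 1000
(binding [*timeout-ms* 50]
  (fetch))                           ;=> 50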
@anildigital Node.js Single threaded. Cooperative multitasking (Node.js waits for some asynchronous I/O to yield to the next waiting task). For slow network operations, event loops are preferred: you can serve a larger number of connections and you don’t have to spend a process/thread per connection. libuv abstracts the best async I/O mechanism of each platform for Node.js, and internally uses a thread pool (e.g. for file IO). 111
@anildigital Node.js Evented servers are really good for very light requests, but with long-running requests they fall on their face. Pros: avoids polling; for IO-bound (vs CPU-bound) work, scales well compared to spawning many threads; no deadlocks. Cons: if you block the event loop, all goes bad; program flow is “spaghettish” (callback hell); hard to debug, since you lose the stack. 114
@anildigital coroutines with asyncio in Python A new trend in asynchronous programming. Coroutines allow asynchronous code to be paused and resumed: functions that cooperatively multitask with other functions. The program puts a coroutine on hold while it waits for an async operation, and control goes back to the event loop. 116
@anildigital coroutines with asyncio in Python Coroutines are a language construct; the event loop is a scheduler for coroutines. Coroutines are the best of both worlds: they can wait, as callbacks do, and they yield to each other, so there is less chance of race conditions. Coroutines are extremely lightweight and never preempted. They wait on a socket and become runnable when something happens on it. Compared to multi-threaded code, coroutines can run a sequence of operations in a single function, and exception handling works too. Limitation: coroutines execute on a single thread only. 117
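A minimal asyncio sketch (asyncio.sleep stands in for real I/O; names are illustrative):

import asyncio

async def fetch(name, delay):
    # await suspends this coroutine and hands control back to the event loop
    await asyncio.sleep(delay)
    return f"{name} done after {delay}s"

async def main():
    # run both coroutines concurrently on a single thread
    results = await asyncio.gather(fetch("a", 1), fetch("b", 2))
    print(results)

asyncio.run(main())   # finishes in ~2s, not 3s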
@anildigital Actor model Mathematical model of concurrent computation. The Actor model treats “Actors” as the universal primitives of concurrent digital computation Actors communicate with messages In response to a message that it receives, Actor can Make local decisions Create more Actors Send more messages Determine how to respond to the next message received 120
@anildigital CSP Practically applied in industry as a tool for specifying and verifying the concurrent aspects of a variety of different systems. Processes - no threads, no shared memory, a fixed number of processes. Channels - communication is synchronous (unlike the Actor model). Influenced the design of Go and Limbo. 123
@anildigital CSP Pros: uses message passing and channels heavily, as an alternative to locks. Cons: handling very big messages, or a lot of messages, requires unbounded buffers; messaging is essentially a copy of the shared data. 125
@anildigital Elixir Elixir is built on top of Erlang: it shares the Erlang runtime and can use Erlang libraries. A functional programming language with immutable data structures. 128
@anildigital Elixir - Processes Everything is a process. “Green” processes: an Elixir process != an operating system process, and an Elixir process != an operating system thread. Completely isolated in execution time and space. 130
@anildigital Elixir/Erlang VM (Diagram: the BEAM runs as a single OS process; one scheduler OS thread per CPU core, each scheduler running many lightweight Elixir processes.)
@anildigital Elixir - Processes Each Elixir process has its own local GC - no stop-the-world garbage collection (unlike Ruby/Java). Very little overhead, much easier for the process scheduler to schedule, and cheap to create thousands of Elixir processes. Concurrency is built in and not difficult in Erlang / Elixir (unlike conventional languages). Crashes are isolated: a process crash doesn’t affect other processes. 132
@anildigital Elixir - Actor model Implements Actor model (similar to Postal service) Processes are actors Every actor has its address Actors communicate via messages asynchronously “Do not communicate by sharing memory; instead, share memory by communicating” 133
@anildigital Elixir - Actor model An actor does one of the following three things: create more actors; send messages to other actors whose addresses it knows; define how it reacts to a message. Actors may receive messages in random order. 134
@anildigital Elixir - Actor model 135
defmodule Talker do
  def loop do
    receive do
      {:greet, name} -> IO.puts("Hello #{name}")
      {:praise, name} -> IO.puts("#{name}, you're amazing")
      {:celebrate, name, age} -> IO.puts("Here's to another #{age} years, #{name}")
    end
    loop
  end
end

pid = spawn(&Talker.loop/0)
send(pid, {:greet, "Huey"})
send(pid, {:praise, "Dewey"})
send(pid, {:celebrate, "Louie", 16})

# Output:
# Hello Huey
# Dewey, you're amazing
# Here's to another 16 years, Louie
@anildigital Elixir - Actor model vs. CSP 136
CSP                                         Actor model
Send & receive may block (synchronous)      Only receive blocks
Messages are delivered when they are sent   No guarantee of message delivery
Synchronous                                 Send a message and forget
Works on one machine                        Works on multiple machines (distributed by default)
Lacks fault tolerance                       Fault tolerance
@anildigital Elixir - Actor model Pros: uses message passing heavily; no shared state (avoids locks, easier to scale); easier-to-maintain, declarative code. Cons: doesn’t fit well when shared state is required; handling very big messages, or a lot of messages; messaging is essentially a copy of data. 137
@anildigital Elixir / Erlang advanced tools OTP - defines systems in terms of hierarchies of applications. Agents - background processes that maintain state; the state can be accessed at different places within a process or node, or across multiple nodes; good at dealing with very specific background activities. 138
@anildigital Elixir / Erlang advanced tools Nodes - instances of the Erlang VM; can be connected to other nodes for remote execution. Supervisors - have one purpose: managing one or more worker processes; use the process-linking and monitoring facilities of the Erlang VM; the heart of reliability. 139
@anildigital Go Built-in concurrency based on Tony Hoare’s CSP paper. goroutines are lightweight processes that run concurrently and in parallel, and communicate using channels. A channel is something you create and pass around as a value. Read and write operations on (unbuffered) channels block, e.g. a send from goroutine A will not complete until goroutine B receives. 142
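A minimal sketch of a goroutine communicating over an unbuffered channel (the message is illustrative):

package main

import "fmt"

func main() {
	ch := make(chan string) // unbuffered: a send blocks until someone receives

	go func() {
		ch <- "hello from goroutine" // blocks until main receives
	}()

	msg := <-ch // blocks until the goroutine sends
	fmt.Println(msg)
}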
@anildigital Go - Channels Channels are a means of synchronization. For multiple channels we use the select statement: select allows you to listen on multiple channels. There are also default and timeout cases, e.g. for network IO, if nothing happens a timeout fires. The timeout case is important in a highly concurrent system with network calls. 145
@anildigital Go - Channels select 146
package main

import (
    "fmt"
    "time"
)

func main() {
    c1 := make(chan string)
    c2 := make(chan string)

    go func() {
        time.Sleep(time.Second * 1)
        c1 <- "one"
    }()
    go func() {
        time.Sleep(time.Second * 2)
        c2 <- "two"
    }()

    for i := 0; i < 2; i++ {
        select {
        case msg1 := <-c1:
            fmt.Println("received", msg1)
        case msg2 := <-c2:
            fmt.Println("received", msg2)
        }
    }
}

Output
$ time go run select.go
received one
received two
real 0m2.245s
@anildigital Go - goroutines goroutines run in user space and are scheduled by the user-space runtime in Go. They start with a small stack that can be grown efficiently, and many goroutines are multiplexed onto one operating system thread. The scheduler has a well-defined scheduling order: for IO, it will preempt the goroutine and resume it once the IO completes. Java/Ruby lack the concept of a goroutine / lightweight thread and of select (important for CSP); without cheap threads communicating over channels, Go-like concurrency isn’t possible - in Java/Ruby it would be a heavyweight thread, and not as performant. 147
@anildigital Ruby - Fibers A Fiber is a lightweight thread that uses cooperative multitasking instead of preemptive multitasking. A running fiber must explicitly "yield" to allow another fiber to run, which makes their implementation much simpler than kernel or user threads. A fiber is a unit of execution that must be manually scheduled by the application; fibers run in the context of the threads that schedule them, and each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application. However, using fibers can make it easier to port applications that were designed to schedule their own threads. 149
@anildigital Ruby - Fibers 150 (Diagram: two preemptively scheduled threads interleaving blocking IO (red) and CPU time (orange) take 10 quanta * 10 ms = 100 ms; two cooperatively scheduled fibers doing the same work finish in 60 ms.) Cooperative: handing execution rights from one fiber to another, saving local state.
@anildigital Ruby - Fibers Pros: expressive state - state-based computations are much easier to understand and implement; no need for locks (cooperative scheduling); scales vertically (add more CPU power). Cons: single thread - harder to parallelize/scale horizontally (use more processes, add more nodes); all components are constrained to work together symbiotically. Fibers are more of a code-organization pattern than a way of doing concurrent tasks. 151
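A minimal Fiber sketch showing cooperative hand-off (the step names are illustrative):

# Control passes back and forth via Fiber.yield / resume.
fiber = Fiber.new do
  puts "step 1"
  Fiber.yield          # give control back to the caller
  puts "step 2"
end

fiber.resume           # prints "step 1", then pauses at Fiber.yield
puts "back in the caller"
fiber.resume           # prints "step 2"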
@anildigital Ruby - GVL What is the GVL? The Global VM Lock (aka GIL - Global Interpreter Lock). What happens with the GVL? Only one thread executes at a time: a thread must request the lock; if the lock is available it is acquired, and if not, the thread blocks and waits for the lock to become available. Ruby’s runtime guarantees its own thread safety, but it makes no guarantees about your code. 153
@anildigital Ruby - GVL Blocking or long-running operations happen outside of the GVL. You can still write performant concurrent code in a Ruby app (as good as Java or Node.js) if it only does heavy IO. For multithreaded CPU-bound requests the GVL is still an issue. Ruby is fast enough for IO (network) heavy applications in most cases. 154
@anildigital Ruby - Why GVL? It makes the developer’s life easier (it’s harder to corrupt data) and avoids race conditions inside the interpreter. C extensions: it makes C extension development easier, and most C libraries are not thread safe. Parts of Ruby’s own implementation aren’t thread safe (Hash, for instance). 155
@anildigital Ruby - Multiprocess vs. Multithreading 156 fork do puts "Hello Kaigi" end Thread.new do puts "Hello Kaigi" end Multiprocess Multithreading
@anildigital Ruby - Multiprocess vs. Multithreading 157
Processes: use more memory. Threads: use less memory.
Processes: if the parent dies before the children have exited, children can become zombie processes. Threads: all threads die when the process dies (no chance of zombies).
Processes: more expensive to context-switch, since the OS needs to save and reload everything. Threads: considerably less overhead, since they share address space and memory.
Processes: forked processes get a new virtual memory space (process isolation). Threads: share the same memory, so you need to control and deal with concurrent memory issues.
Processes: require inter-process communication. Threads: can "communicate" via queues and shared memory.
Processes: slower to create and destroy. Threads: faster to create and destroy.
Processes: easier to code and debug. Threads: can be significantly more complex to code and debug.
@anildigital Ruby lacked better concurrency abstractions Java has java.util.concurrent; Ruby did not have an actor model, did not have STM, and did not have better concurrency abstractions. Ruby has the concurrent-ruby gem now, which provides concurrency-aware abstractions inspired by other languages. 161
@anildigital concurrent-ruby General-purpose Concurrency Abstractions Async - A mixin module that provides simple asynchronous behavior to a class. Loosely based on Erlang's gen_server. Future - An asynchronous operation that produces a value. Promise - Similar to Futures, with more features. ScheduledTask - Like a Future scheduled for a specific future time. TimerTask - A Thread that periodically wakes up to perform work at regular intervals. 165
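A minimal sketch using one of the abstractions listed above (assuming the classic Concurrent::Future API from the concurrent-ruby README; the computation is illustrative):

require 'concurrent'

# Runs the block on the gem's global thread pool.
future = Concurrent::Future.execute do
  sleep 1
  6 * 7
end

future.value   # blocks until the result is ready, then returns 42
future.state   #=> :fulfilled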
@anildigital concurrent-ruby Thread-safe Value Objects, Structures, and Collections Array - A thread-safe subclass of Ruby's standard Array. Hash - A thread-safe subclass of Ruby's standard Hash. Map - A hash-like object that should have much better performance characteristics, especially under high concurrency. Tuple - A fixed size array with volatile (synchronized, thread safe) getters/setters. 166
@anildigital concurrent-ruby Value objects Maybe - A thread-safe, immutable object representing an optional value. Delay - Lazy evaluation of a block yielding an immutable result. Based on Clojure's delay. 167
@anildigital concurrent-ruby Thread-safe variables Agent - A way to manage shared, mutable, asynchronous, independent, state. Based on Clojure's Agent. Atom - A way to manage shared, mutable, synchronous, independent state. Based on Clojure's Atom. AtomicBoolean - A boolean value that can be updated atomically. AtomicFixnum - A numeric value that can be updated atomically. AtomicReference - An object reference that may be updated atomically. Exchanger - A synchronization point at which threads can pair and swap elements within pairs. Based on Java's Exchanger. Java-inspired ThreadPools and Other Executors 168
@anildigital concurrent-ruby Thread Synchronization Classes and Algorithms CountDownLatch - A synchronization object that allows one thread to wait on multiple other threads. CyclicBarrier - A synchronization aid that allows a set of threads to all wait for each other to reach a common barrier point. Event - Old school kernel-style event. ReadWriteLock - A lock that supports multiple readers but only one writer. ReentrantReadWriteLock - A read/write lock with reentrant and upgrade features. Semaphore - A counting-based locking mechanism that uses permits. 169
@anildigital concurrent-ruby Edge features Actor - Implements the Actor Model, where concurrent actors exchange messages. New Future Framework Channel: Communicating Sequential Processes (CSP) - Functionally equivalent to Go channels with additional inspiration from Clojure core.async. 170