Slide 1

Slide 1 text

Concurrent programming with Python and my little experiment
Python Fosdem - 2014/02/02
by Benoît Chesneau

Slide 2

Slide 2 text

concurrent programming?

• Collections of interacting computational processes that *may* run in parallel
• Can be executed on a single processor by interleaving the execution steps of each process in a time-slicing way
• or run on multiple CPUs or machines
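
To make the single-processor case concrete, here is a minimal sketch of interleaving on one thread. It is not taken from the talk: `worker` and `run_interleaved` are hypothetical names, and plain generators stand in for the processes.

```python
from collections import deque


def worker(name, steps):
    """A task that hands control back after every step."""
    for i in range(steps):
        yield "{}:{}".format(name, i)


def run_interleaved(tasks):
    """Round-robin over the tasks on a single thread, one step at a time."""
    queue = deque(tasks)
    trace = []
    while queue:
        task = queue.popleft()
        try:
            trace.append(next(task))
        except StopIteration:
            continue  # task finished, drop it
        queue.append(task)  # not finished, reschedule at the back
    return trace


trace = run_interleaved([worker("a", 2), worker("b", 3)])
print(trace)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Nothing runs in parallel here, yet both tasks make progress: that is concurrency without parallelism.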

Slide 3

Slide 3 text

concurrent programming?

• shared memory
• message passing
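
The two styles can be contrasted with the standard library alone. This is an illustrative sketch, not code from the talk; the function names are made up for the example.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """Shared memory: every thread updates one variable, guarded by a lock."""
    total = {"value": 0}
    lock = threading.Lock()

    def add(chunk):
        for v in chunk:
            with lock:  # without the lock, updates could be lost
                total["value"] += v

    threads = [threading.Thread(target=add, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total["value"]


def message_passing_sum(values, workers=4):
    """Message passing: threads share nothing and send partial sums."""
    results = queue.Queue()

    def add(chunk):
        results.put(sum(chunk))

    threads = [threading.Thread(target=add, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    total = sum(results.get() for _ in range(workers))
    for t in threads:
        t.join()
    return total


numbers = list(range(1000))
print(shared_memory_sum(numbers), message_passing_sum(numbers))
```

In the second version the queue is the only point of contact between threads, which is the style the Go model builds on.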

Slide 4

Slide 4 text

nowadays in Python

• Asyncore
• Threads & Futures
• Asyncio

Slide 5

Slide 5 text

asyncio

• Asynchronous and evented programs
• Event loop
• Explicit yielding: no need to decorate or patch the standard library
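
A small sketch of what explicit yielding looks like. It uses today's `async`/`await` syntax; at the time of the talk the same yield points were written with `yield from` in generator-based coroutines. The `ticker` and `main` names are illustrative only.

```python
import asyncio


async def ticker(name, count, log):
    for i in range(count):
        log.append((name, i))
        # Explicit yield point: control goes back to the event loop here
        # and nowhere else in this coroutine.
        await asyncio.sleep(0)


async def main():
    log = []
    await asyncio.gather(ticker("a", 2, log), ticker("b", 2, log))
    return log


log = asyncio.run(main())
print(log)
```

Every place where another coroutine may run is visible in the source as an `await`, which is the contrast with the implicit-yielding libraries.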

Slide 6

Slide 6 text

External libs

• 2 main groups
• implicit yielding, asynchronous handling is hidden: gevent, eventlet, evergreen
• evented libraries: twisted, …
• All are using an event loop

Slide 7

Slide 7 text

gevent echo server

    from __future__ import print_function
    from gevent.server import StreamServer


    def echo(socket, address):
        print('New connection from %s:%s' % address)
        fileobj = socket.makefile()
        while True:
            line = fileobj.readline()
            if not line:
                print("client disconnected")
                break
            if line.strip().lower() == 'quit':
                print("client quit")
                break
            fileobj.write(line)
            fileobj.flush()
            print("echoed %r" % line)


    if __name__ == '__main__':
        server = StreamServer(('0.0.0.0', 6000), echo)
        server.serve_forever()

Slide 8

Slide 8 text

evergreen echo server

    import sys

    import evergreen
    from evergreen.io import tcp

    loop = evergreen.EventLoop()


    class EchoServer(tcp.TCPServer):

        @evergreen.task
        def handle_connection(self, connection):
            print('client connected from {}'.format(connection.peername))
            while True:
                data = connection.read_until('\n')
                if not data:
                    break
                connection.write(data)
            print('connection closed')


    def main():
        server = EchoServer()
        port = int(sys.argv[1] if len(sys.argv) > 1 else 1234)
        server.bind(('0.0.0.0', port))
        server.serve()

    evergreen.spawn(main)
    loop.run()

Slide 9

Slide 9 text

twisted echo server

    from twisted.internet.protocol import Protocol, Factory
    from twisted.internet import reactor


    class Echo(Protocol):
        def dataReceived(self, data):
            """
            As soon as any data is received, write it back.
            """
            self.transport.write(data)


    def main():
        f = Factory()
        f.protocol = Echo
        reactor.listenTCP(8000, f)
        reactor.run()

    if __name__ == '__main__':
        main()

Slide 10

Slide 10 text

asyncio echo server

    import asyncio


    class EchoServer(asyncio.Protocol):

        def connection_made(self, transport):
            print('connection made')
            self.transport = transport

        def data_received(self, data):
            print('data received: ', data.decode())
            self.transport.write(b'Re: ' + data)


    # Create the loop, start listening, then serve until interrupted.
    loop = asyncio.new_event_loop()
    f = loop.create_server(EchoServer, "localhost", 7000)
    server = loop.run_until_complete(f)
    loop.run_forever()

Slide 11

Slide 11 text

my little experiment

Implementing the Go concurrency model in Python

Slide 12

Slide 12 text

an offset is a small, virtually complete daughter plant that has been naturally and asexually produced on the mother plant.

Slide 13

Slide 13 text

the go memory model

• goroutines (coroutines)
• a goroutine doesn’t know anything about the others
• share nothing: channels are the only way to communicate
• select to wait on multiple channels

Slide 14

Slide 14 text

Drawbacks of Python

• GIL
• 1 OS thread is executed at a time
• works well for I/O-bound operations since they can work in the background
• no built-in implementation of coroutines
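
The point about I/O-bound work can be checked with a short experiment. This is a sketch under the assumption that `time.sleep` is a fair stand-in for a blocking I/O call (it releases the GIL in the same way); `fake_io` and the other names are illustrative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Stand-in for a blocking read: sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def sequential(delays):
    return [fake_io(d) for d in delays]


def threaded(delays):
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        return list(pool.map(fake_io, delays))


def timed(fn, *args):
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


delays = [0.1] * 4
_, seq_time = timed(sequential, delays)
_, thr_time = timed(threaded, delays)
print("sequential: %.2fs, threaded: %.2fs" % (seq_time, thr_time))
```

The threaded run finishes in roughly the time of one call because the waits overlap. A CPU-bound function in place of `fake_io` would not see that gain, since only one thread executes Python bytecode at a time.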

Slide 15

Slide 15 text

choices

• implicit yielding
• 1 context / Python VM
• Coroutines are always scheduled in the main thread
• the main thread holds the context
• Blocking call operations (mostly I/O) are executed in their own thread

Slide 16

Slide 16 text

offset

• offset, a library implementing the Go concurrency model in Python
• compatible with Python 2.7, Python 3.3 and later
• ...and PyPy
• http://github.com/benoitc/offset

Slide 17

Slide 17 text

goroutines

• python-fibers on CPython: https://github.com/saghul/python-fibers
• continulets on PyPy
• Atomic locking using atomic operations implemented using CFFI
• abstracted in a Proc class

Slide 18

Slide 18 text

offset scheduler: 3 entities

• OS thread (F)
• Scheduler context (P)
• A goroutine (G)

Slide 19

Slide 19 text

the offset scheduler

[Diagram: an OS thread (F), a scheduler context (P) and four goroutines (G)]

Slide 20

Slide 20 text

goroutines

• The context lives in the main thread
• 1 context / Python VM
• a context holds the coroutines in a run queue

Slide 21

Slide 21 text

the offset scheduler

[Diagram: a scheduler context (P) and four goroutines (G), with the label "running goroutine"]

Slide 22

Slide 22 text

syscalls

• patched functions and objects from the stdlib are maintained in the syscall module
• a system call is executed in its own thread
• uses the Futures from the concurrent module
• the number of I/O threads can be set

Slide 23

Slide 23 text

[Diagram, two panels. "blocking call": P with G, G, G, G0. "syscall": P with G, G, G, and F with G0]

Slide 24

Slide 24 text

entering in syscall

• the goroutine is removed from the run queue
• a Future is launched and will execute the blocking function (syscall)
• the link between the future and the goroutine is stored in the scheduler
• the scheduler always waits for syscalls if no goroutines are scheduled

Slide 25

Slide 25 text

syscall exit

• the goroutine is put back on the top of the run queue
• the result is returned
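
The syscall entry and exit steps can be sketched with a toy scheduler. This is not offset's implementation: generators stand in for goroutines, and `Syscall` and `run` are hypothetical names introduced for the illustration.

```python
import time
from collections import deque
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class Syscall:
    """Request to run a blocking function outside the scheduler thread."""

    def __init__(self, fn, *args):
        self.fn, self.args = fn, args


def run(goroutines, io_threads=2):
    runq = deque((g, None) for g in goroutines)  # (generator, value to send)
    pending = {}                                 # Future -> parked generator
    with ThreadPoolExecutor(max_workers=io_threads) as pool:
        while runq or pending:
            if not runq:
                # Nothing runnable: wait for at least one syscall to finish.
                wait(list(pending), return_when=FIRST_COMPLETED)
            # Syscall exit: the goroutine goes back on top of the run queue
            # together with the result of the blocking call.
            for fut in [f for f in pending if f.done()]:
                runq.appendleft((pending.pop(fut), fut.result()))
            if not runq:
                continue
            gen, value = runq.popleft()
            try:
                request = gen.send(value)
            except StopIteration:
                continue  # goroutine finished
            if isinstance(request, Syscall):
                # Syscall entry: park the goroutine and remember which
                # future it is waiting on.
                pending[pool.submit(request.fn, *request.args)] = gen
            else:
                runq.append((gen, None))  # plain yield: reschedule


def slow_double(x):
    time.sleep(0.05)  # stands in for blocking I/O
    return 2 * x


def io_task(log):
    result = yield Syscall(slow_double, 21)
    log.append(("io", result))


def cpu_task(log):
    for i in range(3):
        log.append(("cpu", i))
        yield


log = []
run([io_task(log), cpu_task(log)])
print(log)
```

While `io_task` is parked on its future, `cpu_task` keeps running in the main thread; the scheduler only blocks once the run queue is empty.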

Slide 26

Slide 26 text

Goroutine example

Python:

    from offset import go, maintask, run
    from offset import time


    def say(s):
        for i in range(5):
            time.sleep(100 * time.MILLISECOND)
            print(s)


    @maintask
    def main():
        go(say, "world")
        say("hello")

    if __name__ == "__main__":
        run()

Go:

    package main

    import (
        "fmt"
        "time"
    )

    func say(s string) {
        for i := 0; i < 5; i++ {
            time.Sleep(100 * time.Millisecond)
            fmt.Println(s)
        }
    }

    func main() {
        go say("world")
        say("hello")
    }

Slide 27

Slide 27 text

implementing the channel

• channels are fully implemented in offset, even select
• when a sender or a receiver needs to wait, the goroutine is removed from the run queue
• when a message can be sent or received, the target is put back in the run queue

Slide 28

Slide 28 text

channel example

Python:

    from offset import makechan, go, maintask, run


    def sum(a, c):
        s = 0
        for v in a:
            s += v
        c.send(s)


    @maintask
    def main():
        a = [7, 2, 8, -9, 4, 0]

        c = makechan()
        go(sum, a[:int(len(a)/2)], c)
        go(sum, a[int(len(a)/2):], c)
        x, y = c.recv(), c.recv()

        print(x, y, x+y)

    if __name__ == "__main__":
        run()

Go:

    package main

    import "fmt"

    func sum(a []int, c chan int) {
        sum := 0
        for _, v := range a {
            sum += v
        }
        c <- sum // send sum to c
    }

    func main() {
        a := []int{7, 2, 8, -9, 4, 0}

        c := make(chan int)
        go sum(a[:len(a)/2], c)
        go sum(a[len(a)/2:], c)
        x, y := <-c, <-c // receive from c

        fmt.Println(x, y, x+y)
    }

Slide 29

Slide 29 text

buffered channels

• a buffer to maintain new messages
• the sender can send until the buffer is full
• the receiver always blocks waiting for a message and eventually wakes up the sender
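
The buffer semantics can be observed with the standard library's bounded `queue.Queue`, used here only as a stand-in for a buffered channel; offset's own channels are implemented differently. The `demo` function is illustrative.

```python
import queue


def demo(capacity=2):
    """Record what happens around a full buffer."""
    chan = queue.Queue(maxsize=capacity)
    events = []
    # The sender can send without waiting until the buffer is full.
    for v in range(capacity):
        chan.put_nowait(v)
        events.append(("sent", v))
    # One more send does not fit; a blocking put() would wait here.
    try:
        chan.put_nowait(capacity)
    except queue.Full:
        events.append(("full", capacity))
    # A receive frees one slot, which is what lets a waiting sender go on.
    events.append(("received", chan.get()))
    chan.put_nowait(capacity)
    events.append(("sent", capacity))
    return events


for event in demo():
    print(event)
```

With a capacity of 2 the third send is refused until one message has been received.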

Slide 30

Slide 30 text

buffered channel example

Python:

    from offset import makechan, maintask, run


    @maintask
    def main():
        c = makechan(2)
        c.send(1)
        c.send(2)
        print(c.recv())
        print(c.recv())


    if __name__ == "__main__":
        run()

Go:

    package main

    import "fmt"

    func main() {
        c := make(chan int, 2)
        c <- 1
        c <- 2
        fmt.Println(<-c)
        fmt.Println(<-c)
    }

Slide 31

Slide 31 text

example of select

Python:

    from offset import makechan, select, go, run, maintask


    def fibonacci(c, quit):
        x, y = 0, 1
        while True:
            ret = select(c.if_send(x), quit.if_recv())
            if ret == c.if_send(x):
                x, y = y, x+y
            elif ret == quit.if_recv():
                print("quit")
                return


    @maintask
    def main():
        c = makechan()
        quit = makechan()

        def f():
            for i in range(10):
                print(c.recv())

            quit.send(0)

        go(f)
        fibonacci(c, quit)

    if __name__ == "__main__":
        run()

Go:

    package main

    import "fmt"

    func fibonacci(c, quit chan int) {
        x, y := 0, 1
        for {
            select {
            case c <- x:
                x, y = y, x+y
            case <-quit:
                fmt.Println("quit")
                return
            }
        }
    }

    func main() {
        c := make(chan int)
        quit := make(chan int)
        go func() {
            for i := 0; i < 10; i++ {
                fmt.Println(<-c)
            }
            quit <- 0
        }()
        fibonacci(c, quit)
    }

Slide 32

Slide 32 text

other modules implemented

• sync: basic synchronization primitives, 100%
• time: timer, ticker and sleep implemented
• net and io modules (to release)

Slide 33

Slide 33 text

changes

• channels have been rewritten to use mmap and shared memory; full implementation using CFFI
• Serialisation is done using Dill: https://github.com/uqfoundation/dill
• spawn on different processes and machines
• move the API to be more similar to the Julia language
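
As a rough illustration of passing a serialised message through a memory-mapped buffer, here is a standard-library sketch. It swaps in `pickle` for Dill and uses an anonymous `mmap` within one process, so it shows only the idea, not offset's CFFI implementation; `write_message` and `read_message` are hypothetical helpers.

```python
import mmap
import pickle
import struct

HEADER = struct.Struct("<I")  # 4-byte little-endian payload length


def write_message(buf, obj):
    """Serialise obj and store it, length-prefixed, at the start of buf."""
    payload = pickle.dumps(obj)  # pickle stands in for Dill here
    if HEADER.size + len(payload) > len(buf):
        raise ValueError("message does not fit in the buffer")
    buf.seek(0)
    buf.write(HEADER.pack(len(payload)))
    buf.write(payload)


def read_message(buf):
    """Read back the length-prefixed message stored at the start of buf."""
    buf.seek(0)
    (size,) = HEADER.unpack(buf.read(HEADER.size))
    return pickle.loads(buf.read(size))


buf = mmap.mmap(-1, 4096)  # anonymous mapping, not backed by a file
write_message(buf, {"op": "send", "value": 42})
message = read_message(buf)
buf.close()
print(message)
```

A real channel shared between processes would also need synchronisation around the buffer, which is where the atomic operations mentioned for the CFFI layer would come in.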

Slide 34

Slide 34 text

? @benoitc