Concurrent programming with Python and my little experiment

Presentation of offset given at the Python Fosdem

Benoit Chesneau

March 19, 2014

Transcript

  1. concurrent programming?

    • Collections of interacting computational processes that *may* run in parallel
    • Can be executed on a single processor by interleaving the execution steps of each in a time-slicing way
    • or run on multiple CPUs or machines
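The single-processor case on this slide can be illustrated with plain generators. This is only a minimal sketch: the names `worker` and `run_round_robin` are made up for the illustration and are not part of offset or of any library mentioned in the talk.

```python
from collections import deque


def worker(name, steps, log):
    """A task that performs `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run_round_robin(tasks):
    """Interleave generator-based tasks on one thread until all are done."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)  # run the task up to its next yield
        except StopIteration:
            continue  # task finished, do not reschedule it
        queue.append(task)  # still alive: back to the end of the queue


log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)
```

Both tasks make progress in turn although nothing runs in parallel, which is the difference between concurrency and parallelism that the slide points at.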
  2. asyncio

    • Asynchronous and evented programs
    • Event loop
    • Explicit yielding: doesn't require decorating or patching the standard lib
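Explicit yielding can be shown with a small sketch. It assumes a recent Python with `async`/`await` and `asyncio.run`, which are newer than the `yield from` style available when the talk was given; the `tick` coroutine is a made-up example.

```python
import asyncio


async def tick(name, count, log):
    """Record `count` steps, yielding to the event loop after each one."""
    for i in range(count):
        log.append((name, i))
        # Explicit yield point: the event loop may run another task here.
        await asyncio.sleep(0)


async def main():
    log = []
    # Run both coroutines concurrently on the same event loop.
    await asyncio.gather(tick("a", 2, log), tick("b", 2, log))
    return log


result = asyncio.run(main())
print(result)
```

Every point where a task can be suspended is visible in the source as an `await`, which is what distinguishes this model from the implicit yielding of gevent-style libraries.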
  3. External libs

    • 2 main groups
    • implicit yielding, asynchronous handling is hidden: gevent, eventlet, evergreen
    • evented libraries: twisted,…
    • All are using an event loop
  4. gevent echo server

        from __future__ import print_function
        from gevent.server import StreamServer


        def echo(socket, address):
            print('New connection from %s:%s' % address)
            fileobj = socket.makefile()
            while True:
                line = fileobj.readline()
                if not line:
                    print("client disconnected")
                    break
                if line.strip().lower() == 'quit':
                    print("client quit")
                    break
                fileobj.write(line)
                fileobj.flush()
                print("echoed %r" % line)


        if __name__ == '__main__':
            server = StreamServer(('0.0.0.0', 6000), echo)
            server.serve_forever()
  5. evergreen echo server

        import sys

        import evergreen
        from evergreen.io import tcp

        loop = evergreen.EventLoop()

        class EchoServer(tcp.TCPServer):

            @evergreen.task
            def handle_connection(self, connection):
                print('client connected from {}'.format(connection.peername))
                while True:
                    data = connection.read_until('\n')
                    if not data:
                        break
                    connection.write(data)
                print('connection closed')


        def main():
            server = EchoServer()
            port = int(sys.argv[1] if len(sys.argv) > 1 else 1234)
            server.bind(('0.0.0.0', port))
            server.serve()

        evergreen.spawn(main)
        loop.run()
  6. twisted echo server

        from twisted.internet.protocol import Protocol, Factory
        from twisted.internet import reactor

        class Echo(Protocol):
            def dataReceived(self, data):
                """
                As soon as any data is received, write it back.
                """
                self.transport.write(data)


        def main():
            f = Factory()
            f.protocol = Echo
            reactor.listenTCP(8000, f)
            reactor.run()

        if __name__ == '__main__':
            main()
  7. asyncio echo server

        import asyncio

        class EchoServer(asyncio.Protocol):

            def connection_made(self, transport):
                print('connection made')
                self.transport = transport

            def data_received(self, data):
                print('data received: ', data.decode())
                self.transport.write(b'Re: ' + data)

        # The slide omits the loop setup; get the event loop explicitly.
        loop = asyncio.get_event_loop()
        f = loop.create_server(EchoServer, "localhost", 7000)
        server = loop.run_until_complete(f)
        # Keep serving until interrupted.
        loop.run_forever()
  8. an offset is a small, virtually complete daughter plant that has been naturally and asexually produced on the mother plant.
  9. the go memory model

    • goroutines (coroutines)
    • a goroutine doesn't know anything about others
    • share nothing: channels are the only way to communicate
    • select to wait on multiple channels
  10. Drawbacks of Python

    • GIL
    • only 1 OS thread executes Python code at a time
    • works well for I/O-bound operations since they can work in the background
    • no built-in implementation of coroutines
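The point about I/O-bound work can be checked with a rough sketch. Here `time.sleep` stands in for a blocking I/O call (an assumption made for the example); like real blocking I/O in CPython, it releases the GIL while waiting, so the threads overlap.

```python
import threading
import time


def fake_io(delay, results, index):
    """Pretend to do blocking I/O; the GIL is released while sleeping."""
    time.sleep(delay)
    results[index] = index


results = [None] * 4
threads = [
    threading.Thread(target=fake_io, args=(0.2, results, i)) for i in range(4)
]

start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

# Four 0.2 s waits overlap instead of adding up to 0.8 s.
print(results, round(elapsed, 2))
```

A CPU-bound function in place of `fake_io` would not show the same speed-up, because only one thread can execute Python bytecode at a time.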
  11. choices

    • implicit yielding
    • 1 context / Python VM
    • Coroutines are always scheduled in the main thread.
    • the main thread holds the context
    • Blocking call operations (mostly I/O) are executed in their own thread
  12. offset

    • offset, a library implementing the Go concurrency model in Python
    • compatible with Python 2.7 and Python 3.3 and above
    • ...and PyPy
    • http://github.com/benoitc/offset
  13. goroutines

    • python-fibers on CPython: https://github.com/saghul/python-fibers
    • continulets on PyPy
    • Atomic locking using atomic operations implemented using CFFI.
    • abstracted in a Proc class
  14. offset scheduler: 3 entities

    • OS Thread (F)
    • Scheduler context (P)
    • A goroutine (G)
  15. goroutines

    • The context lives in the main thread
    • 1 context / Python VM
    • a context holds the coroutines in a run queue.
  16. syscalls

    • patched functions and objects from the stdlib are maintained in the syscall module
    • a system call is executed in its own thread
    • uses the Futures from the concurrent.futures module
    • the number of I/O threads can be set
  17. entering a syscall

    • the goroutine is taken out of the run queue
    • a Future is launched and will execute the blocking function (syscall)
    • the link between the future and the goroutine is stored in the scheduler
    • the scheduler always waits for syscalls if no goroutines are scheduled
  18. syscall exit

    • the goroutine is put back on the top of the run queue
    • the result is returned
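The syscall entry and exit steps described on the last three slides can be sketched with generators and `concurrent.futures`. This is not offset's code: the `Syscall` class, the `run` function and the use of generators in place of fibers are assumptions made for the illustration.

```python
import time
from collections import deque
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


class Syscall:
    """Request yielded by a task: run a blocking function in an I/O thread."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def run(tasks, max_io_threads=2):
    run_queue = deque((task, None) for task in tasks)  # (task, value to send)
    pending = {}  # Future -> parked task (the link kept by the scheduler)
    with ThreadPoolExecutor(max_workers=max_io_threads) as pool:
        while run_queue or pending:
            if pending:
                # Block only when nothing else is runnable.
                timeout = 0 if run_queue else None
                done, _ = wait(list(pending), timeout=timeout,
                               return_when=FIRST_COMPLETED)
                for future in done:
                    task = pending.pop(future)
                    # Syscall exit: back on top of the run queue with the result.
                    run_queue.appendleft((task, future.result()))
            task, value = run_queue.popleft()
            try:
                request = task.send(value)
            except StopIteration:
                continue
            if isinstance(request, Syscall):
                # Syscall entry: park the task and launch a Future.
                pending[pool.submit(request.func, *request.args)] = task
            else:
                run_queue.append((task, None))


def blocking_read(delay, value):
    time.sleep(delay)  # stands in for a blocking system call
    return value


def task(name, log):
    log.append((name, "start"))
    result = yield Syscall(blocking_read, 0.05, name.upper())
    log.append((name, result))


log = []
run([task("a", log), task("b", log)])
print(log)
```

While one task waits on its Future, the scheduler keeps running the other tasks, and it blocks on the pending Futures only when the run queue is empty.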
  19. Goroutine example

    Python:

        from offset import go, maintask, run
        from offset import time

        def say(s):
            for i in range(5):
                time.sleep(100 * time.MILLISECOND)
                print(s)

        @maintask
        def main():
            go(say, "world")
            say("hello")

        if __name__ == "__main__":
            run()

    Go:

        package main

        import (
            "fmt"
            "time"
        )

        func say(s string) {
            for i := 0; i < 5; i++ {
                time.Sleep(100 * time.Millisecond)
                fmt.Println(s)
            }
        }

        func main() {
            go say("world")
            say("hello")
        }
  20. implementing the channel

    • channels are fully implemented in offset, even select
    • when a sender or a receiver needs to wait, the goroutine is taken out of the run queue
    • when a message can be sent or received, the target is put back in the run queue
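The parking and waking behaviour described above can be sketched with generators and a run queue. This is an illustration only, not offset's implementation: `Channel`, `Send`, `Recv` and `run` are hypothetical names, and only unbuffered channels are covered.

```python
from collections import deque


class Channel:
    """Unbuffered channel: holds the tasks parked on it."""

    def __init__(self):
        self.senders = deque()    # (task, value) waiting for a receiver
        self.receivers = deque()  # tasks waiting for a sender


class Send:
    def __init__(self, chan, value):
        self.chan, self.value = chan, value


class Recv:
    def __init__(self, chan):
        self.chan = chan


def run(tasks):
    run_queue = deque((task, None) for task in tasks)
    while run_queue:
        task, value = run_queue.popleft()
        try:
            op = task.send(value)
        except StopIteration:
            continue
        if isinstance(op, Send):
            if op.chan.receivers:
                # A receiver is parked: wake it with the value.
                run_queue.append((op.chan.receivers.popleft(), op.value))
                run_queue.append((task, None))
            else:
                # Nobody to receive: the sender leaves the run queue.
                op.chan.senders.append((task, op.value))
        elif isinstance(op, Recv):
            if op.chan.senders:
                # A sender is parked: take its value and wake it.
                sender, sent = op.chan.senders.popleft()
                run_queue.append((task, sent))
                run_queue.append((sender, None))
            else:
                # Nothing to receive: the receiver leaves the run queue.
                op.chan.receivers.append(task)
        else:
            run_queue.append((task, None))


def total(values, chan):
    yield Send(chan, sum(values))


def collector(chan, results):
    x = yield Recv(chan)
    y = yield Recv(chan)
    results.append((x, y, x + y))


a = [7, 2, 8, -9, 4, 0]
chan = Channel()
results = []
run([total(a[:3], chan), total(a[3:], chan), collector(chan, results)])
print(results)
```

A task that cannot complete its operation is simply absent from the run queue until its counterpart arrives, so waiting costs nothing in the scheduler loop.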
  21. channel example

    Python:

        from offset import (
            makechan, go, maintask, run
        )

        def sum(a, c):
            s = 0
            for v in a:
                s += v
            c.send(s)

        @maintask
        def main():
            a = [7, 2, 8, -9, 4, 0]

            c = makechan()
            go(sum, a[:int(len(a)/2)], c)
            go(sum, a[int(len(a)/2):], c)
            x, y = c.recv(), c.recv()

            print(x, y, x+y)

        if __name__ == "__main__":
            run()

    Go:

        package main

        import "fmt"

        func sum(a []int, c chan int) {
            sum := 0
            for _, v := range a {
                sum += v
            }
            c <- sum // send sum to c
        }

        func main() {
            a := []int{7, 2, 8, -9, 4, 0}

            c := make(chan int)
            go sum(a[:len(a)/2], c)
            go sum(a[len(a)/2:], c)
            x, y := <-c, <-c // receive from c

            fmt.Println(x, y, x+y)
        }
  22. buffered channels

    • a buffer to maintain new messages
    • the sender can send until the buffer is full
    • the receiver always blocks waiting for a message and eventually wakes up the sender
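The "send until the buffer is full" rule can be illustrated with the standard library. In this sketch `queue.Queue` is only a thread-oriented stand-in for a buffered channel, not how offset implements it.

```python
import queue

# A bounded queue of size 2 plays the role of makechan(2).
chan = queue.Queue(maxsize=2)
chan.put_nowait(1)
chan.put_nowait(2)

try:
    chan.put_nowait(3)  # buffer is full: a real sender would block here
    overflowed = False
except queue.Full:
    overflowed = True

first = chan.get_nowait()  # receiving frees one slot in the buffer
chan.put_nowait(3)         # so the pending send can now go through
received = [first, chan.get_nowait(), chan.get_nowait()]
print(overflowed, received)
```

In a channel implementation the failed send would park the sender instead of raising, and the receive would put it back in the run queue.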
  23. buffered channel example

    Python:

        from offset import (
            makechan, maintask, run
        )

        @maintask
        def main():
            c = makechan(2)
            c.send(1)
            c.send(2)
            print(c.recv())
            print(c.recv())


        if __name__ == "__main__":
            run()

    Go:

        package main

        import "fmt"

        func main() {
            c := make(chan int, 2)
            c <- 1
            c <- 2
            fmt.Println(<-c)
            fmt.Println(<-c)
        }
  24. example of select

    Python:

        from offset import (
            makechan, select, go, run, maintask
        )

        def fibonacci(c, quit):
            x, y = 0, 1
            while True:
                ret = select(c.if_send(x), quit.if_recv())
                if ret == c.if_send(x):
                    x, y = y, x+y
                elif ret == quit.if_recv():
                    print("quit")
                    return

        @maintask
        def main():
            c = makechan()
            quit = makechan()
            def f():
                for i in range(10):
                    print(c.recv())

                quit.send(0)

            go(f)
            fibonacci(c, quit)

        if __name__ == "__main__":
            run()

    Go:

        package main

        import "fmt"

        func fibonacci(c, quit chan int) {
            x, y := 0, 1
            for {
                select {
                case c <- x:
                    x, y = y, x+y
                case <-quit:
                    fmt.Println("quit")
                    return
                }
            }
        }

        func main() {
            c := make(chan int)
            quit := make(chan int)
            go func() {
                for i := 0; i < 10; i++ {
                    fmt.Println(<-c)
                }
                quit <- 0
            }()
            fibonacci(c, quit)
        }
  25. other modules implemented

    • sync: basic synchronization primitives 100%
    • time: timer, ticker and sleep implemented
    • net and io modules (to release)
  26. changes

    • channels have been rewritten to use mmap and share memory, with a full implementation using CFFI.
    • Serialisation is done using Dill: https://github.com/uqfoundation/dill
    • spawn on different processes and machines
    • move to be more similar to the Julia language in the API
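The idea of passing serialised messages through a memory-mapped buffer can be sketched as follows. The assumptions are significant: `pickle` from the standard library stands in for Dill, the buffer is an anonymous map used inside a single process, there is no locking, and `write_message` and `read_message` are hypothetical helpers rather than offset functions.

```python
import mmap
import pickle
import struct

# Each message is stored as a 4-byte little-endian length plus the payload.
HEADER = struct.Struct("<I")


def write_message(buf, obj):
    """Serialise `obj` and store it at the start of the mapped buffer."""
    payload = pickle.dumps(obj)
    if HEADER.size + len(payload) > len(buf):
        raise ValueError("message too large for buffer")
    buf.seek(0)
    buf.write(HEADER.pack(len(payload)))
    buf.write(payload)


def read_message(buf):
    """Read back the length-prefixed message stored in the buffer."""
    buf.seek(0)
    (length,) = HEADER.unpack(buf.read(HEADER.size))
    return pickle.loads(buf.read(length))


buf = mmap.mmap(-1, 4096)  # anonymous shared mapping, 4 KiB
write_message(buf, {"op": "send", "value": [1, 2, 3]})
message = read_message(buf)
buf.close()
print(message)
```

A version shared between processes would additionally need synchronisation around the buffer, which is where the atomic operations mentioned earlier in the talk would come in.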