
Elegant Concurrency


Writing a concurrent program is hard; maintaining one is even more of a nightmare. Fortunately, a pattern that helps us write good concurrent code is available: using “channels” to communicate.

This talk shares the channel concept and applies it with common libraries, like threading and multiprocessing, to make concurrent code elegant.

This talk was given at PyCon TW 2017 [1] and PyCon APAC/MY 2017 [2].

[1]: https://tw.pycon.org/2017
[2]: https://pycon.my/pycon-apac-2017-program-schedule/

Mosky Liu

June 11, 2017

Transcript

  1. Elegant
    Concurrency


  2. Why
    Concurrency?


  3. (image-only slide)

  4. Why Concurrency?
    Be a Good Machine Tamer!
    © Eduardo Woo


  5. As a Good Machine Tamer
    Why Concurrency?
    • Get the machine into full play!
    • The capacities:
    • CPU
    • IO
    • Disk
    • Network bandwidth
    • Network connections
    • etc.


  6. Concurrency

    Is Hard?


  7. ∵ The Various Ways?
    Concurrency Is Hard?
    • threading
    • queue
    • multiprocessing
    • concurrent.futures
    • asyncio
    • thread
    • process
    • coroutine
    • gevent
    • lock
    • rlock
    • condition
    • semaphore
    • event
    • barrier
    • manager
    • …
    • ???


  8. With Today's Sharing
    Concurrency Is Hard?
    ★ queue
    ★ thread


  9. Plus Some
    Concurrency Is Hard?
    ★ queue
    ★ thread
    ★ process
    ★ coroutine
    ★ gevent


  10. ❤ Python & open source
    Mosky
    • Python Charmer at Pinkoi.
    • Has spoken at
    • PyCons in TW, KR, JP, SG, HK
    • COSCUPs & TEDx, etc.
    • Countless hours of teaching Python.
    • Has several Python packages:
    • ZIPCodeTW, MoSQL, Clime, etc.
    • http://mosky.tw/


  11. Frontend & Backend Engineers
    We're looking for


  12. Outline
    • Why Concurrency?
    • Concurrency Is Hard?
    ★ Communicating Sequential Processes (CSP)
    ★ Channel-Based Concurrency
    ★ Concurrent Units
    ★ CSP vs. X


  13. Communicating
    Sequential Processes


  14. Communicating Sequential Processes
    Is a Formal Language


  15. Communicating Sequential Processes
    • A formal language for describing concurrent systems.
    • The main ideas:
    • “Processes” that
    • interact with each other solely through channels.
    • But why CSP?


  16. — Effective Go
    Do not communicate by sharing memory;
    instead, share memory by communicating.



  17. — Effective Go
    Using channels to control access
    makes it easier to write clear, correct programs.



  18. — The Python Wiki
    Use locks and shared memory to
    shoot yourself in the foot in parallel.



  19. In Python
    Communicating Sequential Processes
    • “Processes”
    • → threads, processes, coroutines, etc.
    • → concurrent units
    • Interact with each other solely through channels.
    • → concurrent units' channels
    • → usually the queues


  20. Channel-Based
    Concurrency


  21. Channel-Based Concurrency
    • Not going to talk about the exact CSP.
    • Just adapt the concepts.
    • → Use channels to communicate between concurrent units.
    • Will continue with the code: http://bit.ly/econcurrency.


  22. But the Traditional Way
    NOT Channel-Based Concurrency
    def consume(url_q):
        while True:
            url = url_q.get()
            content = requests.get(url).content
            print('Queried', url)
            # mark a task as done
            url_q.task_done()


  23. url_q = Queue()
    for url in urls:
        url_q.put(url)
    for _ in range(2):
        # this “daemon” is not the Unix daemon;
        # daemon threads are stopped at shutdown
        call_in_daemon_thread(consume, url_q)
    # block and unblock when all tasks are done
    url_q.join()
    # when the main thread exits, Python shuts down
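    The call_in_daemon_thread helper is not defined on the slides (it presumably lives in the companion code at http://bit.ly/econcurrency); a minimal sketch, assuming it only wraps threading.Thread with daemon=True:

    import threading

    def call_in_daemon_thread(func, *args):
        # Daemon threads are stopped abruptly when the interpreter shuts down,
        # which is why the snippet above still waits with url_q.join() first.
        thread = threading.Thread(target=func, args=args, daemon=True)
        thread.start()
        return thread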


  24. But the Traditional Way
    NOT Channel-Based Concurrency
    • The queue is a thread-safe queue.
    • .task_done()
    • If the unfinished count reaches 0, notify all by releasing the locks.
    • .join()
    • Blocks on a doubly-acquired lock.
    • Daemon threads are stopped abruptly at shutdown.
    • How do I know? The unclear docs & the Python source code.
    • Let's make it simpler.


  25. The Channel-Based Concurrency
    def consume(url_q):
        while True:
            url = url_q.get()
            if url is TO_RETURN:
                return
            content = requests.get(url).content
            print('Queried', url)


  26. url_q = Queue()
    for url in urls:
        url_q.put(url)
    for _ in range(N):
        url_q.put(TO_RETURN)
    for _ in range(N):
        call_in_thread(consume, url_q)
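    Neither TO_RETURN nor call_in_thread is defined on the slides; a self-contained sketch of this channel-based version, assuming TO_RETURN is just a unique sentinel object and call_in_thread wraps a plain (non-daemon) threading.Thread:

    import threading
    from queue import Queue

    import requests

    TO_RETURN = object()  # a unique sentinel telling a consumer to return
    N = 2                 # number of consumer threads

    def call_in_thread(func, *args):
        thread = threading.Thread(target=func, args=args)
        thread.start()
        return thread

    def consume(url_q):
        while True:
            url = url_q.get()
            if url is TO_RETURN:
                return
            content = requests.get(url).content
            print('Queried', url)

    urls = ['https://tw.pycon.org/2017/']  # any URL list works here

    url_q = Queue()
    for url in urls:
        url_q.put(url)
    for _ in range(N):
        url_q.put(TO_RETURN)  # one sentinel per consumer
    for _ in range(N):
        call_in_thread(consume, url_q)
    # non-daemon threads keep the process alive until every consumer returns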


  27. Much easier!


  28. Layered Channel-Based Concurrency
    • Model more complex concurrent systems.
    • Use 3 layers:
    • Atomic Utils
    • Each function must be concurrency-safe.
    • Channel Operators
    • Functions that interact with each other solely through channels.
    • Graph Initializer
    • A function that initializes the whole graph.


  29. Concurrency-Safe?
    Layered Channel-Based Concurrency
    • Depends on the concurrent unit, e.g., thread-safe.
    • Tips for staying atomic:
    • Access only its own frame.
    • Use atomic operations – http://bit.ly/aoperations.
    • Redesign with channels.
    • Use a lock – the last option.
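    For the “Use atomic operations” tip above, a minimal sketch (not from the slides): CPython's GIL makes a single operation like list.append safe across threads, while a read-modify-write like counter += 1 is not:

    import threading

    results = []  # list.append is a single atomic operation under the GIL
    # a bare counter with "counter += 1" would NOT be safe:
    # the load, add, and store can interleave between threads

    def worker(n):
        for i in range(n):
            results.append(i)  # safe without a lock

    threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(results))  # always 4000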


  30. The Crawler
    Layered Channel-Based Concurrency
    • A crawler that crawls all of the PyCon TW website's pages.
    • f1: url → text via a channel
    • f2: text → url via a channel
    • Plus a channel to break the loop at the end.
    • And they run concurrently, of course!


  31. Atomic Utils
    Layered Channel-Based Concurrency
    # conforms by accessing only its own frame
    def query_text(url):
        return requests.get(url).text

    def parse_out_href_gen(text):
        soup = BeautifulSoup(text, 'html.parser')
        return (a_tag.get('href', '')
                for a_tag in soup.find_all('a'))

    def is_relative_href(url):
        return (not url.startswith('http') and
                not url.startswith('mailto:'))


  32. # conforms by using atomic operations
    url_visted_map = {}

    def is_visited_or_mark(url):
        visited = url_visted_map.get(url, False)
        if not visited:
            url_visted_map[url] = True
        return visited
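    If the check-then-mark above ever needs to happen as one step, the “last option” from the Concurrency-Safe slide (a lock) applies; a minimal sketch of the same helper guarded by a lock (the names here are mine, not from the slides):

    import threading

    url_visited_map = {}
    url_visited_lock = threading.Lock()

    def is_visited_or_mark(url):
        # the check and the mark happen inside one critical section
        with url_visited_lock:
            visited = url_visited_map.get(url, False)
            if not visited:
                url_visited_map[url] = True
            return visited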


  33. Channel Operators
    Layered Channel-Based Concurrency
    • Function put_text_q operates on
    • url_q → text_q
    • run_q
    • Function put_url_q operates on
    • text_q → url_q
    • run_q


  34. def put_text_q(url_q, text_q, run_q):
          while True:
              url = url_q.get()
              run_q.put(RUNNING)
              if url is TO_RETURN:
                  url_q.put(TO_RETURN)  # broadcast
                  return
              text = query_text(url)
              text_q.put(text)
              run_q.get()


  35. def put_url_q(text_q, url_q, run_q):
          while True:
              text = text_q.get()
              run_q.put(RUNNING)
              if text is TO_RETURN:
                  text_q.put(TO_RETURN)
                  return
              href_gen = parse_out_href_gen(text)
              # continued on the next slide


  36. for href in href_gen:
          if not is_relative_href(href):
              continue
          url = urljoin(PYCON_TW_ROOT_URL, href)
          if is_visited_or_mark(url):
              continue
          url_q.put(url)
      if run_q.qsize() == 1 and url_q.qsize() == 0:
          url_q.put(TO_RETURN)
          text_q.put(TO_RETURN)
      run_q.get()


  37. Graph Initializer
    Layered Channel-Based Concurrency
    url_q = Queue()
    text_q = Queue()
    run_q = Queue()
    init_url_q(url_q)
    for _ in range(8):
        call_in_thread(put_text_q, url_q, text_q, run_q)
    for _ in range(4):
        call_in_thread(put_url_q, text_q, url_q, run_q)
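    init_url_q is not shown on the slides; a plausible minimal sketch, assuming it only seeds the channel with the site's root page (the exact URL is my guess) and marks it as visited:

    PYCON_TW_ROOT_URL = 'https://tw.pycon.org/2017/'  # assumed value

    def init_url_q(url_q):
        # seed the graph with the root page, marked so it is crawled only once
        is_visited_or_mark(PYCON_TW_ROOT_URL)
        url_q.put(PYCON_TW_ROOT_URL)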


  38. The Output
    Layered Channel-Based Concurrency
    $ py3 graph_initializer.py 2 1 # even 1 1 when debug
    Thread-1put_text_q:52 url_q.get() -> https://P/a
    Thread-1put_text_q:54 run_q.put(RUNNING) # query
    Thread-1put_text_q:65 run_q.get() # done
    ...
    Thread-3put_url_q:75 len(text_q.get()) -> 12314
    Thread-3put_url_q:78 run_q.put(RUNNING) # parse
    Thread-3put_url_q:98 url_q: 14 # more url -> not the end
    Thread-3put_url_q:99 run_q: 1
    Thread-3put_url_q:104 run_q.get() # done
    ...
    Thread-2put_text_q:49 url_q.get() -> https://P/b
    ...
    Thread-3put_url_q:98 url_q: 0 # no more url and
    Thread-3put_url_q:99 run_q: 1 # only 1 running -> end
    Thread-3put_url_q:103 url_q.put(TO_RETURN) # signal to return
    Thread-3put_url_q:104 text_q.put(TO_RETURN)


  39. Not so easy,
    but clear.


  40. The Crawler With Error Handling
    Layered Channel-Based Concurrency
    • A new function gets the errors from a channel for further handling.
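    The error-handling version is not shown here; a minimal sketch of the idea, assuming a dedicated error channel (the error_q name is hypothetical) that a modified operator feeds and a new function drains:

    def put_text_q(url_q, text_q, error_q, run_q):
        while True:
            url = url_q.get()
            run_q.put(RUNNING)
            if url is TO_RETURN:
                url_q.put(TO_RETURN)  # broadcast, as before
                return
            try:
                text = query_text(url)
            except Exception as error:
                # forward the error through the channel instead of crashing the thread
                error_q.put((url, error))
            else:
                text_q.put(text)
            run_q.get()

    def handle_error_q(error_q):
        while True:
            item = error_q.get()
            if item is TO_RETURN:
                return
            url, error = item
            print('Failed to query', url, '->', error)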


  41. Concurrent Units


  42. The Standard Options
    Concurrent Units
    • threading.Thread
    • queue.Queue
    • multiprocessing.Process
    • multiprocessing.Queue
    • @asyncio.coroutine ≡ async def
    • asyncio.Queue
    • gevent.Greenlet
    • gevent.queue.Queue
    Pro Tip: DO NOT mix them!
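    For example, the same consumer from the earlier slides with the multiprocessing pair above; a minimal sketch (not from the slides), where the sentinel becomes a plain value because an object() identity would not survive pickling across processes:

    from multiprocessing import Process, Queue

    import requests

    TO_RETURN = 'TO_RETURN'  # compared with ==, not is

    def consume(url_q):
        while True:
            url = url_q.get()
            if url == TO_RETURN:
                return
            content = requests.get(url).content
            print('Queried', url)

    if __name__ == '__main__':
        url_q = Queue()
        for url in ['https://tw.pycon.org/2017/']:
            url_q.put(url)
        for _ in range(2):
            url_q.put(TO_RETURN)
        processes = [Process(target=consume, args=(url_q,)) for _ in range(2)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()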


  43. threading: CPU ❌, IO ⭐. Note: easy!
    multiprocessing: CPU ⭐, IO ⭐. Note: processes' memories are isolated.
    asyncio: CPU ❌, IO ⭐. Note: IMO, the API is too basic.
    gevent: CPU ❌, IO ⭐. Note: the API is rich and similar to threading.
    Run-Time Cost: ⚡ ⚡


  44. Scale Out
    • The channel can also be
    • RabbitMQ.
    • Redis.
    • Apache Kafka.
    • Scale out from a single machine with a similar design.
    Concurrent Units
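    For instance, a Redis list can play the role of the channel across machines; a minimal sketch with the redis-py client (not from the slides, and it assumes a Redis server on localhost):

    import redis
    import requests

    r = redis.Redis()
    TO_RETURN = b'TO_RETURN'

    def produce(urls, n_consumers):
        for url in urls:
            r.lpush('url_q', url)
        for _ in range(n_consumers):
            r.lpush('url_q', TO_RETURN)

    def consume():
        while True:
            # BRPOP blocks like Queue.get() and returns a (key, value) pair of bytes
            _, url = r.brpop('url_q')
            if url == TO_RETURN:
                return
            content = requests.get(url.decode()).content
            print('Queried', url.decode())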


  45. CSP vs. X


  46. CSP vs. X
    • X:
    • Lock
    • Parallel Map
    • Actor Model
    • Reactive Programming
    • MapReduce


  47. CSP vs. Lock
    • A channel is just a lock plus message passing.
    • Locks and their variants cause complexity.
    • Channels provide a better abstraction to control the complexity.
    • Just like Python vs. C.
    • Design with channels first, and then transform to locks if needed.


  48. CSP vs. Parallel Map
    • Level: Lock < CSP < Parallel Map
    • If your problem fits a parallel map, just use it,
    • e.g., concurrent.futures.
    • i.e., if you don't need to share memory, why communicate?
    • If it can't fit perfectly, consider using CSP to model it.
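    For the pure parallel-map case, e.g. fetching a fixed list of URLs with no shared state, a minimal sketch with concurrent.futures:

    from concurrent.futures import ThreadPoolExecutor

    import requests

    def query_content(url):
        return requests.get(url).content

    urls = ['https://tw.pycon.org/2017/', 'https://pycon.my/']

    with ThreadPoolExecutor(max_workers=4) as executor:
        # map() yields the results in input order; no channel is needed
        for url, content in zip(urls, executor.map(query_content, urls)):
            print('Queried', url, '->', len(content), 'bytes')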


  49. CSP vs. Actor Model
    • Both are mathematical models.
    • Model CSP with the Actor model? Yes.
    • Model the Actor model with CSP? Yes.
    • The major differences are
    • the Actor model emphasizes the “worker”.
    • CSP emphasizes the “channel”.


  50. • When implementing, the Actor model tends toward
    • classes.
    • private state.
    • CSP tends toward
    • functions
    • which implies more functional code, so simpler testing.
    • explicit channels
    • which implies easier visualizing, so simpler optimizing.
    • IMO, I prefer CSP.


  51. print('Testing query_text ... ', end='')
    text = query_text('https://tw.pycon.org')
    print(repr(text[:40]))
    print('Testing parse_out_href_gen ... ', end='')
    href_gen = parse_out_href_gen(text)
    print(repr(list(href_gen)[:3]))
    print('Testing is_relative_href ...')
    assert is_relative_href('2017/en-us')
    assert is_relative_href('/2017/en-us')
    assert not is_relative_href('https://tw.pycon.org')
    assert not is_relative_href('mailto:[email protected]')
    print('Testing is_visited_or_mark ...')
    assert not is_visited_or_mark('/')
    assert is_visited_or_mark('/')


  52. $ py3 atomic_utils.py
    ...
    Benchmarking query_text ... 0.7407s
    Benchmarking parse_out_href_gen ... 0.01298s
    # optimize by the ratio
    $ py3 graph_initializer.py 40 1
    ...
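    With the numbers above, querying is roughly 0.7407 / 0.01298 ≈ 57 times slower than parsing, so on the order of tens of querying threads per parsing thread keeps both sides busy; that is presumably why the run above uses 40 1.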


  53. CSP vs. Reactive Programming
    • Both support multiple concurrent units.
    • CSP can build flexible data flows more easily.
    • In reactive, the default one-way stream may limit you.
    • CSP can use concurrency more easily ∵ it's old-school.
    • In reactive, you have to understand its flat_map and/or schedulers.
    • Reactive has richer APIs, especially for UI events.


  54. CSP vs. MapReduce
    • CSP is more lightweight.
    • CSP is more flexible.
    • In MapReduce,
    • the algorithm must fit the MapReduce form, and
    • the data flow is even more fixed than in reactive.
    • MapReduce systems are designed for PB-level data in the first place.


  55. At the End


  56. At the End
    • Channel-based concurrency from CSP consists of
    • Concurrent units.
    • Channels.
    • CSP helps to avoid the pitfalls, but not all of the pitfalls.
    • Logging and visualizing help debugging.
    • When your problem fits a higher-level model, use it!
    • But you can always model it with CSP.


  57. Notes
    At the End
    • The crawler is for showing the flexibility of using channels.
    • It looks good, but is not perfect, since
    • url_q.get() and
    • run_q.put(RUNNING)
    • must be synced.
    • The issue occurs at a high thread ratio.
    • Keep your graph simple!
