Elegant Concurrency

Mosky
June 11, 2017

Writing concurrent programs is hard, and maintaining them is a nightmare. Fortunately, there is a pattern that helps us write good concurrent code: using “channels” to communicate.

This talk applies the channel concept to common libraries, such as threading and multiprocessing, to make concurrent code elegant.

This talk was given at PyCon TW 2017 [1] and PyCon APAC/MY 2017 [2].

[1]: https://tw.pycon.org/2017
[2]: https://pycon.my/pycon-apac-2017-program-schedule/

Transcript

  1. Elegant Concurrency

  2. Why Concurrency?

  3. None
  4. Why Concurrency? Be a Good Machine Tamer! © Eduardo Woo

  5. Why Concurrency? As a Good Machine Tamer

    • Get the machine into full play!
    • The capacities:
      • CPU
      • IO
      • Disk
      • Network bandwidth
      • Network connections
      • etc.
  6. Concurrency Is Hard?

  7. Concurrency Is Hard? ∵ The Various Ways?

    • threading • queue • multiprocessing • concurrent.futures • asyncio • thread • process • coroutine • gevent • lock • rlock • condition • semaphore • event • barrier • manager • … • ???
  8. Concurrency Is Hard? With Today's Sharing

    ★ queue ★ thread
  9. Concurrency Is Hard? Plus Some

    ★ queue ★ thread ★ process ★ coroutine ★ gevent
  10. Mosky ❤ Python & open source

    • Python Charmer at Pinkoi.
    • Has spoken at
      • PyCons in TW, KR, JP, SG, HK
      • COSCUPs & TEDx, etc.
    • Countless hours teaching Python.
    • Has several Python packages:
      • ZIPCodeTW, MoSQL, Clime, etc.
    • http://mosky.tw/
  11. We're looking for Frontend & Backend Engineers

  12. Outline

    • Why Concurrency?
    • Concurrency Is Hard?
    ★ Communicating Sequential Processes (CSP)
    ★ Channel-Based Concurrency
    ★ Concurrent Units
    ★ CSP vs. X
  13. Communicating Sequential Processes

  14. Communicating Sequential Processes Is a Formal Language

  15. Communicating Sequential Processes

    • A formal language for describing concurrent systems.
    • The main ideas:
      • “Processes” and
      • Interact with each other solely through channels.
    • But why CSP?
  16. Effective Go: “Do not communicate by sharing memory; instead, share memory by communicating.”
  17. Effective Go: “Using channels to control access makes it easier to write clear, correct programs.”
  18. The Python Wiki: “Use locks and shared memory to shoot yourself in the foot in parallel.”
  19. Communicating Sequential Processes: In Python

    • “Processes”
      • → threads, processes, coroutines, etc.
      • → concurrent units
    • Interact with each other solely through channels.
      • → concurrent units' channels
      • → usually the queues
  20. Channel-Based Concurrency

  21. Channel-Based Concurrency

    • Not going to cover the exact CSP.
    • Just adapt the concepts.
      • → Use channels to communicate between concurrent units.
    • Will continue with the code: http://bit.ly/econcurrency.
  22. NOT Channel-Based Concurrency: But the Traditional Way

    def consume(url_q):
        while True:
            url = url_q.get()
            content = requests.get(url).content
            print('Queried', url)
            # mark the task as done
            url_q.task_done()
  23.

    url_q = Queue()
    for url in urls:
        url_q.put(url)

    for _ in range(2):
        # “daemon” here does not mean a Unix daemon;
        # daemon threads are stopped at shutdown
        call_in_daemon_thread(consume, url_q)

    # block, and unblock when all tasks are done
    url_q.join()
    # when the main thread exits, Python shuts down
  24. NOT Channel-Based Concurrency: But the Traditional Way

    • The queue is a thread-safe queue.
    • .task_done()
      • If the count of unfinished tasks reaches 0, notify all by releasing the locks.
    • .join()
      • Block by a double acquired lock.
    • Daemon threads are stopped abruptly at shutdown.
    • How do I know? The unclear docs & the Python source code.
    • Let's make it simpler.
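
For comparison, here is a minimal self-contained sketch of the traditional pattern from slides 22 to 24. The slides use `call_in_daemon_thread` without showing its body, so the version below is an assumption. `fake_fetch`, `crawl_traditional`, and the `results` list are hypothetical names added so the script runs offline and returns something that can be checked.

```python
import threading
from queue import Queue


def call_in_daemon_thread(func, *args):
    # Assumed helper: the slides name it but do not show its body.
    thread = threading.Thread(target=func, args=args, daemon=True)
    thread.start()
    return thread


def fake_fetch(url):
    # Hypothetical stand-in for requests.get(url).content.
    return ('content of ' + url).encode()


def consume(url_q, results):
    while True:
        url = url_q.get()
        try:
            results.append((url, fake_fetch(url)))
        finally:
            # Every get() must be paired with task_done(),
            # otherwise join() below never returns.
            url_q.task_done()


def crawl_traditional(urls, n_workers=2):
    url_q = Queue()
    results = []
    for url in urls:
        url_q.put(url)
    for _ in range(n_workers):
        call_in_daemon_thread(consume, url_q, results)
    # Blocks until task_done() has been called once per queued item.
    url_q.join()
    # The workers are still blocked in get(); as daemon threads
    # they are simply dropped when the interpreter exits.
    return results


fetched = crawl_traditional(['https://example.com/a', 'https://example.com/b'])
print(sorted(url for url, _ in fetched))
```

Note how shutdown depends on two implicit mechanisms, the unfinished-task counter and the daemon flag, which is what the channel-based version removes.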
  25. The Channel-Based Concurrency

    def consume(url_q):
        while True:
            url = url_q.get()
            if url is TO_RETURN:
                return
            content = requests.get(url).content
            print('Queried', url)
  26.

    url_q = Queue()
    for url in urls:
        url_q.put(url)
    for _ in range(N):
        url_q.put(TO_RETURN)

    for _ in range(N):
        call_in_thread(consume, url_q)
  27. Much easier!
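
The sentinel-based pattern from slides 25 and 26 can be sketched as a self-contained script. This is a sketch under stated assumptions: the slides do not show how `TO_RETURN` or `call_in_thread` are defined, so both definitions below are guesses. `fake_fetch`, `out_q`, and `crawl_with_channels` are hypothetical additions that let the example run without a network.

```python
import threading
from queue import Queue

# Assumed definition: a unique object, so `is` can never match a real URL.
TO_RETURN = object()


def call_in_thread(func, *args):
    # Assumed helper: the slides name it but do not show its body.
    thread = threading.Thread(target=func, args=args)
    thread.start()
    return thread


def fake_fetch(url):
    # Hypothetical stand-in for requests.get(url).content.
    return ('content of ' + url).encode()


def consume(url_q, out_q):
    while True:
        url = url_q.get()
        if url is TO_RETURN:
            return  # the channel itself tells the worker to stop
        out_q.put((url, fake_fetch(url)))


def crawl_with_channels(urls, n_workers=2):
    url_q, out_q = Queue(), Queue()
    for url in urls:
        url_q.put(url)
    for _ in range(n_workers):
        url_q.put(TO_RETURN)  # one sentinel per worker
    threads = [call_in_thread(consume, url_q, out_q)
               for _ in range(n_workers)]
    for thread in threads:
        thread.join()
    # Each URL produced exactly one result before its worker returned.
    return [out_q.get() for _ in range(len(urls))]


results = crawl_with_channels(['https://example.com/a', 'https://example.com/b'])
print(sorted(url for url, _ in results))
```

No `task_done()`, `join()` on the queue, or daemon threads are needed: the workers return normally, so ordinary `Thread.join()` is enough.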

  28. Layered Channel-Based Concurrency

    • Model more complex concurrent systems.
    • Use 3 layers:
      • Atomic Utils
        • Each function must be concurrency-safe.
      • Channel Operators
        • Functions interact with each other solely through channels.
      • Graph Initializer
        • A function initializes the whole graph.
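
The three layers can be illustrated with a deliberately simplified, linear two-stage graph (not the cyclic crawler graph that the talk builds). All names in this sketch are hypothetical, including `to_text`, `count_chars`, `put_size_q`, and `run_graph`, and `TO_RETURN` is again assumed to be a unique sentinel object.

```python
import threading
from queue import Queue

TO_RETURN = object()  # assumed sentinel definition


# Layer 1: atomic utils. Each function touches only its own frame.
def to_text(url):
    # Hypothetical stand-in for a network fetch.
    return 'page at ' + url


def count_chars(text):
    return len(text)


# Layer 2: channel operators. They interact only through queues.
def put_text_q(url_q, text_q):
    while True:
        url = url_q.get()
        if url is TO_RETURN:
            url_q.put(TO_RETURN)  # broadcast to sibling workers
            return
        text_q.put(to_text(url))


def put_size_q(text_q, size_q):
    while True:
        text = text_q.get()
        if text is TO_RETURN:
            text_q.put(TO_RETURN)  # broadcast to sibling workers
            return
        size_q.put(count_chars(text))


# Layer 3: graph initializer. It wires the queues and starts the units.
def run_graph(urls, n_fetchers=3, n_counters=2):
    url_q, text_q, size_q = Queue(), Queue(), Queue()
    fetchers = [threading.Thread(target=put_text_q, args=(url_q, text_q))
                for _ in range(n_fetchers)]
    counters = [threading.Thread(target=put_size_q, args=(text_q, size_q))
                for _ in range(n_counters)]
    for thread in fetchers + counters:
        thread.start()
    for url in urls:
        url_q.put(url)
    url_q.put(TO_RETURN)
    for thread in fetchers:
        thread.join()
    # Stage 1 has finished, so every text is already queued ahead of this.
    text_q.put(TO_RETURN)
    for thread in counters:
        thread.join()
    return [size_q.get() for _ in range(len(urls))]


sizes = run_graph(['a', 'bb', 'ccc'])
print(sorted(sizes))
```

Because the atomic utils are plain functions, they can be tested and benchmarked without starting any thread; only the channel operators know about queues.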
  29. Layered Channel-Based Concurrency: Concurrency-Safe?

    • Depends on the concurrent unit, e.g., thread-safe.
    • Tips for keeping atomic:
      • Access only its frame.
      • Use atomic operations – http://bit.ly/aoperations.
      • Redesign with channels.
      • Use lock – the last option.
  30. Layered Channel-Based Concurrency: The Crawler

    • A crawler crawls all the PyCon TW website's pages.
    • f1: url → text via channel
    • f2: text → url via channel
    • Plus a channel to break the loop at the end.
    • And run concurrently, of course!
  31. Layered Channel-Based Concurrency: Atomic Utils

    # conforms to accessing only its own frame

    def query_text(url):
        return requests.get(url).text

    def parse_out_href_gen(text):
        soup = BeautifulSoup(text, 'html.parser')
        return (a_tag.get('href', '') for a_tag in soup.find_all('a'))

    def is_relative_href(url):
        return (not url.startswith('http') and
                not url.startswith('mailto:'))
  32.

    # conforms to using atomic operations

    url_visted_map = {}

    def is_visited_or_mark(url):
        visited = url_visted_map.get(url, False)
        if not visited:
            url_visted_map[url] = True
        return visited
  33. Layered Channel-Based Concurrency: Channel Operators

    • Function put_text_q operates
      • url_q → text_q
      • run_q
    • Function put_url_q operates
      • text_q → url_q
      • run_q
  34.

    def put_text_q(url_q, text_q, run_q):
        while True:
            url = url_q.get()
            run_q.put(RUNNING)
            if url is TO_RETURN:
                url_q.put(TO_RETURN)  # broadcast
                return
            text = query_text(url)
            text_q.put(text)
            run_q.get()
  35.

    def put_url_q(text_q, url_q, run_q):
        while True:
            text = text_q.get()
            run_q.put(RUNNING)
            if text is TO_RETURN:
                text_q.put(TO_RETURN)
                return
            href_gen = parse_out_href_gen(text)
            # continued on the next slide
  36.

            for href in href_gen:
                if not is_relative_href(href):
                    continue
                url = urljoin(PYCON_TW_ROOT_URL, href)
                if is_visited_or_mark(url):
                    continue
                url_q.put(url)
            if run_q.qsize() == 1 and url_q.qsize() == 0:
                url_q.put(TO_RETURN)
                text_q.put(TO_RETURN)
            run_q.get()
  37. Layered Channel-Based Concurrency: Graph Initializer

    url_q = Queue()
    text_q = Queue()
    run_q = Queue()

    init_url_q(url_q)

    for _ in range(8):
        call_in_thread(put_text_q, url_q, text_q, run_q)
    for _ in range(4):
        call_in_thread(put_url_q, text_q, url_q, run_q)
  38. Layered Channel-Based Concurrency: The Output

    $ py3 graph_initializer.py 2 1
    # even 1 1 when debug
    Thread-1put_text_q:52 url_q.get() -> https://P/a
    Thread-1put_text_q:54 run_q.put(RUNNING)  # query
    Thread-1put_text_q:65 run_q.get()  # done
    ...
    Thread-3put_url_q:75 len(text_q.get()) -> 12314
    Thread-3put_url_q:78 run_q.put(RUNNING)  # parse
    Thread-3put_url_q:98 url_q: 14  # more url -> not the end
    Thread-3put_url_q:99 run_q: 1
    Thread-3put_url_q:104 run_q.get()  # done
    ...
    Thread-2put_text_q:49 url_q.get() -> https://P/b
    ...
    Thread-3put_url_q:98 url_q: 0  # no more url and
    Thread-3put_url_q:99 run_q: 1  # only 1 running -> end
    Thread-3put_url_q:103 url_q.put(TO_RETURN)  # signal to return
    Thread-3put_url_q:104 text_q.put(TO_RETURN)
  39. Not so easy, but clear.

  40. Layered Channel-Based Concurrency: The Crawler With Error Handling

    • A new function: get errors for further handling
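
The slide does not show the error-handling code, so the following is only one plausible shape for it: workers catch exceptions and put them on a dedicated error channel, and a separate function collects them afterwards. `err_q`, `risky_fetch`, `drain`, and `crawl_collecting_errors` are hypothetical names, not taken from the talk's repository.

```python
import threading
from queue import Queue

TO_RETURN = object()  # assumed sentinel definition


def risky_fetch(url):
    # Hypothetical stand-in for a network call that may fail.
    if 'bad' in url:
        raise ValueError('cannot fetch ' + url)
    return 'text of ' + url


def put_text_q(url_q, text_q, err_q):
    while True:
        url = url_q.get()
        if url is TO_RETURN:
            url_q.put(TO_RETURN)  # broadcast to sibling workers
            return
        try:
            text_q.put(risky_fetch(url))
        except Exception as exc:
            # Report the failure through a channel instead of
            # letting the exception kill the worker thread.
            err_q.put((url, exc))


def drain(q):
    # Only safe once all producers have finished.
    items = []
    while not q.empty():
        items.append(q.get())
    return items


def crawl_collecting_errors(urls, n_workers=2):
    url_q, text_q, err_q = Queue(), Queue(), Queue()
    for url in urls:
        url_q.put(url)
    url_q.put(TO_RETURN)
    threads = [threading.Thread(target=put_text_q,
                                args=(url_q, text_q, err_q))
               for _ in range(n_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return drain(text_q), drain(err_q)


texts, errors = crawl_collecting_errors(['ok1', 'bad', 'ok2'])
print(sorted(texts), [(url, type(exc).__name__) for url, exc in errors])
```

In a long-running graph the error channel would more likely be consumed by its own concurrent unit; draining after `join()` keeps this sketch short.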
  41. Concurrent Units

  42. Concurrent Units: The Standard Options

    • threading.Thread • queue.Queue
    • multiprocessing.Process • multiprocessing.Queue
    • @asyncio.coroutine ≡ async def • asyncio.Queue
    • gevent.Greenlet • gevent.queue.Queue

    Pro Tip: DO NOT mix them!
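
To show that the same channel design carries over to another concurrent unit, here is a minimal sketch of the sentinel pattern with coroutines and `asyncio.Queue`. It assumes Python 3.7 or later for `asyncio.run`. `fake_fetch` and `crawl` are hypothetical names, and `TO_RETURN` is again an assumed sentinel definition.

```python
import asyncio

TO_RETURN = object()  # assumed sentinel definition


async def fake_fetch(url):
    # Hypothetical stand-in for an async HTTP request.
    await asyncio.sleep(0)  # yield to the event loop, as real IO would
    return 'text of ' + url


async def consume(url_q, out_q):
    while True:
        url = await url_q.get()
        if url is TO_RETURN:
            return
        await out_q.put(await fake_fetch(url))


async def crawl(urls, n_workers=2):
    # The coroutines pair with asyncio.Queue, not queue.Queue.
    url_q, out_q = asyncio.Queue(), asyncio.Queue()
    for url in urls:
        url_q.put_nowait(url)
    for _ in range(n_workers):
        url_q.put_nowait(TO_RETURN)  # one sentinel per worker
    await asyncio.gather(*(consume(url_q, out_q) for _ in range(n_workers)))
    return [out_q.get_nowait() for _ in range(len(urls))]


texts = asyncio.run(crawl(['a', 'b', 'c']))
print(sorted(texts))
```

Each unit type is paired with its own queue type here, in line with the tip not to mix them.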
  43. threading | multiprocessing | asyncio | gevent

    • CPU: ❌ | ⭐ | ❌ | ❌
    • IO: ⭐ | ⭐ | ⭐ | ⭐
    • Run-Time Cost: ⚡ ⚡
    • Note: Easy! | Note processes' memories are isolated. | IMO, the API is too basic. | The API is rich and similar to threading.
  44. Concurrent Units: Scale Out

    • The channel can also be
      • RabbitMQ.
      • Redis.
      • Apache Kafka.
    • Scale out from a single machine with a similar design.
  45. CSP vs. X

  46. CSP vs. X

    • X:
      • Lock
      • Parallel Map
      • Actor Model
      • Reactive Programming
      • MapReduce
  47. CSP vs. Lock

    • A channel is just a lock plus message passing.
    • Locks and their variants cause complexity.
    • Channels provide a better abstraction to control the complexity.
    • Just like Python vs. C.
    • Design with channels first, and then transform to locks if needed.
  48. CSP vs. Parallel Map

    • Level: Lock < CSP < Parallel Map
    • If your problem fits parallel map, just use it,
      • e.g., concurrent.futures.
      • i.e., if you don't need to share memory, why communicate?
    • If it can't fit perfectly, consider using CSP to model it.
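
When the work really is an independent function applied to each input, the parallel map mentioned above needs no explicit channels at all. A minimal sketch with `concurrent.futures`, where `fake_fetch` and `fetch_all` are hypothetical names and the fetch is simulated so the example runs offline:

```python
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url):
    # Hypothetical stand-in for requests.get(url).text.
    return 'text of ' + url


def fetch_all(urls, max_workers=4):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Executor.map yields results in the order of the inputs.
        return list(executor.map(fake_fetch, urls))


pages = fetch_all(['a', 'b', 'c'])
print(pages)
```

The crawler does not fit this shape because parsing one page produces new URLs to fetch, which is why the talk models it with channels instead.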
  49. CSP vs. Actor Model

    • Both are mathematical models.
    • Model CSP with Actor model? Yes.
    • Model Actor model with CSP? Yes.
    • The major differences are
      • Actor model emphasizes the “worker”.
      • CSP emphasizes the “channel”.
  50.

    • When implementing, the Actor model tends toward
      • a class.
      • private state.
    • CSP tends toward
      • a function
        • implies more functional, so simpler testing.
      • an explicit channel
        • implies easier to visualize, so simpler optimizing.
    • IMO, I prefer CSP.
  51.

    print('Testing query_text ... ', end='')
    text = query_text('https://tw.pycon.org')
    print(repr(text[:40]))

    print('Testing parse_out_href_gen ... ', end='')
    href_gen = parse_out_href_gen(text)
    print(repr(list(href_gen)[:3]))

    print('Testing is_relative_href ...')
    assert is_relative_href('2017/en-us')
    assert is_relative_href('/2017/en-us')
    assert not is_relative_href('https://tw.pycon.org')
    assert not is_relative_href('mailto:organizers@pycon.tw')

    print('Testing is_visited_or_mark ...')
    assert not is_visited_or_mark('/')
    assert is_visited_or_mark('/')
  52.

    $ py3 atomic_utils.py
    ...
    Benchmarking query_text ... 0.7407s
    Benchmarking parse_out_href_gen ... 0.01298s

    # optimize by the ratio
    $ py3 graph_initializer.py 40 1
    ...
  53. CSP vs. Reactive Programming

    • Both support multiple concurrent units.
    • CSP can build flexible data flows more easily.
      • In reactive, the default one-way stream may limit you.
    • CSP can use concurrency more easily ∵ old-school.
      • In reactive, you have to understand its flat_map and/or schedulers.
    • Reactive has richer APIs, especially for UI events.
  54. CSP vs. MapReduce

    • CSP is more lightweight.
    • CSP is more flexible.
      • In MapReduce,
        • The algorithm must fit the MapReduce form, and
        • Even more fixed data flow than reactive.
    • MapReduce systems were originally designed for PB-level data.
  55. At the End

  56. At the End

    • Channel-based concurrency from CSP consists of
      • Concurrent units.
      • Channels.
    • CSP helps to avoid the pitfalls, but not all of the pitfalls.
    • Logging and visualizing help debugging.
    • When your problem fits a higher-level model? Use it!
    • But you can always model with CSP.
  57. At the End: Notes

    • The crawler is for showing the flexibility of using channels.
    • Looks good, but not perfect, since
      • url_q.get()
      • run_q.put(RUNNING)
      • must be synced.
    • The issue occurs with a high thread ratio.
    • Keep your graph simple!