Concurrency in Python

This deck is mainly about multithreading and multiprocessing in Python, *in Python's flavor*.

It was also presented as a talk at Taipei.py [1].

[1] http://www.meetup.com/Taipei-py/events/220452029/

Mosky Liu

March 26, 2015

Transcript

  1. CONCURRENCY IN PYTHON
    MOSKY

  2. MULTITHREADING & MULTIPROCESSING IN PYTHON
    MOSKY

  3. MOSKY
    PYTHON CHARMER @ PINKOI

    MOSKY.TW

  8. OUTLINE
    • Introduction
    • Producer-Consumer Pattern
    • Python’s Flavor
    • Misc. Techniques

  9. INTRODUCTION

  13. MULTITHREADING
    • GIL (global interpreter lock)
    • Only one thread runs Python code at any given time.
    • It can still speed up I/O-bound problems.
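A minimal sketch of the I/O-bound point, assuming `time.sleep` stands in for a blocking I/O call. The names `fake_io`, `run_sequential`, and `run_threaded` are made up for illustration. A blocking call releases the GIL while it waits, so the waits can overlap:

```python
import threading
import time


def fake_io(delay=0.1):
    # Stand-in for a blocking I/O call; sleeping releases the GIL.
    time.sleep(delay)


def run_sequential(n):
    # Run the n waits one after another and return the elapsed seconds.
    start = time.perf_counter()
    for _ in range(n):
        fake_io()
    return time.perf_counter() - start


def run_threaded(n):
    # Run the n waits in n threads and return the elapsed seconds.
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential = run_sequential(5)
threaded = run_threaded(5)
print("sequential: %.2fs, threaded: %.2fs" % (sequential, threaded))
```

The threaded run should take roughly one delay instead of five. CPU-bound work would not show the same gain under the GIL.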

  18. MULTIPROCESSING
    • It uses fork (on Unix).
    • Processes can run at the same time.
    • It uses more memory.
    • Mind the startup cost.
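A small sketch of what fork implies, assuming a Unix system where the "fork" start method is available. The names `counter`, `bump`, and `run_child` are hypothetical. The child gets its own copy of the parent's memory, so a change made in the child does not reach the parent:

```python
import multiprocessing
import os

counter = 0  # module-level state, copied into the child at fork time


def bump(results):
    # Runs in the child: changes only the child's copy of `counter`.
    global counter
    counter += 1
    results.put((os.getpid(), counter))


def run_child():
    # Assumption: Unix only, since the "fork" start method is requested explicitly.
    ctx = multiprocessing.get_context("fork")
    results = ctx.Queue()
    proc = ctx.Process(target=bump, args=(results,))
    proc.start()
    child_pid, child_counter = results.get()
    proc.join()
    # The parent's `counter` is returned last and is still unchanged.
    return child_pid, child_counter, counter


if __name__ == "__main__":
    print(run_child())
```

Data has to travel back through something like a queue, which is part of the extra memory and startup cost.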

  23. IS IT HARD?
    • Avoid shared resources.
    • e.g., variables or shared memory, files, connections, …
    • Understand Python’s flavor.
    • Then it will be easy.

  28. SHARED RESOURCE
    • Race condition:
      T1: RW
      T2: RW
      T1+T2: RRWW
    • Use a lock → thread-safe:
      T1+T2: (RW) (RW)
    • But locks hurt performance and can cause deadlocks.
    • That is the hard part.
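The R/W interleaving can be sketched as follows. The `Counter` class and the `hammer` helper are illustrative names, not from the talk. The unsafe method splits the read and the write so another thread can slip in between; the safe method wraps both in one lock:

```python
import threading


class Counter:
    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_add(self):
        current = self.value       # R
        self.value = current + 1   # W (another thread may have written in between)

    def safe_add(self):
        with self.lock:            # (RW) runs as one unit
            self.value += 1


def hammer(add, n_threads=4, n_calls=20000):
    # Call `add` n_calls times from each of n_threads threads.
    def loop():
        for _ in range(n_calls):
            add()

    threads = [threading.Thread(target=loop) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


unsafe, safe = Counter(), Counter()
hammer(unsafe.unsafe_add)
hammer(safe.safe_add)
print("unsafe:", unsafe.value)  # may be less than 80000 when updates are lost
print("safe:", safe.value)      # prints safe: 80000
```

Whether the unsafe counter actually loses updates depends on the interpreter and on timing, which is why such bugs are hard to reproduce.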

  31. DIAGNOSE PROBLEM
    • Where is the bottleneck?
    • Divide your problem.

  32. PRODUCER-CONSUMER PATTERN

  37. PRODUCER-CONSUMER PATTERN
    • A queue
    • Producers → A queue
    • A queue → Consumers
    • Python has a built-in Queue module for it.
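A minimal sketch of the pattern with one producer and one consumer. It uses Python 3, where the `Queue` module is named `queue`. The `SENTINEL` convention and the function names are assumptions for illustration:

```python
import queue
import threading

SENTINEL = None  # assumption: None never appears as a real item


def producer(q, items):
    # Producers -> a queue
    for item in items:
        q.put(item)
    q.put(SENTINEL)  # tell the consumer there is nothing more to come


def consumer(q, results):
    # A queue -> consumers
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        results.append(item * 2)


def run_pipeline(items):
    q = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=producer, args=(q, items)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return results


print(run_pipeline([1, 2, 3]))  # prints [2, 4, 6]
```

The queue is the only object both threads touch, so no explicit lock is needed in user code.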

  38. EXAMPLES
    • https://docs.python.org/2/library/queue.html#queue-objects
    • https://github.com/moskytw/mrbus/blob/master/mrbus/base/pool.py

  42. WHY .TASK_DONE?
    • It’s for .join.
    • When the counter of unfinished tasks reaches zero, it notifies the threads waiting in .join.
    • It’s implemented with threading.Condition.
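A short sketch of how `.task_done` and `.join` pair up, using Python 3's `queue` module. The `worker` and `square_all` names are made up for this example:

```python
import queue
import threading


def worker(q, results):
    while True:
        item = q.get()
        try:
            results.append(item * item)
        finally:
            q.task_done()  # one unfinished task fewer


def square_all(items, n_workers=3):
    q = queue.Queue()
    results = []
    for _ in range(n_workers):
        # Daemon workers, so they do not keep the program alive afterwards.
        threading.Thread(target=worker, args=(q, results), daemon=True).start()
    for item in items:
        q.put(item)  # each put adds one unfinished task
    q.join()         # blocks until every item has been marked done
    return sorted(results)


print(square_all(range(5)))  # prints [0, 1, 4, 9, 16]
```

Without the `task_done` call, `q.join()` would block forever because the counter never returns to zero.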

  46. THE THREADING MODULE
    • Lock: primitive lock, .acquire / .release
    • RLock: the owner can re-enter
    • Semaphore: blocks when the counter reaches zero
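A sketch of the last two primitives, with hypothetical helper names (`outer`, `inner`, `peak_concurrency`). The RLock part shows re-entry by the owning thread; the Semaphore part limits how many threads run a section at once:

```python
import threading
import time

rlock = threading.RLock()


def inner():
    with rlock:         # second acquire by the same thread
        return "re-entered"


def outer():
    with rlock:         # first acquire
        return inner()  # a plain Lock would deadlock here


def peak_concurrency(n_threads, limit):
    # Return the highest number of threads seen inside the section at once.
    slots = threading.Semaphore(limit)  # the counter starts at `limit`
    guard = threading.Lock()            # protects the bookkeeping below
    state = {"active": 0, "peak": 0}

    def task():
        with slots:                     # blocks while the counter is zero
            with guard:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)
            with guard:
                state["active"] -= 1

    threads = [threading.Thread(target=task) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


print(outer())  # prints re-entered
print(peak_concurrency(6, 2))
```

The reported peak never exceeds the limit passed to the semaphore.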

  50. • Condition: .wait for .notify / .notify_all
    • Event: .wait for .set; a simplified Condition
    • with lock: …
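A compact sketch of both waiting styles; the `demo` function and its inner helpers are illustrative only. The `with` statement acquires and releases the lock that belongs to the condition:

```python
import threading


def demo():
    out = []
    items = []
    ready = threading.Event()
    cond = threading.Condition()

    def wait_for_event():
        ready.wait()          # blocks until .set() is called
        out.append("event seen")

    def consume():
        with cond:            # acquires the condition's lock
            while not items:  # re-check the state after every wake-up
                cond.wait()   # releases the lock while waiting
            out.append(items.pop())

    threads = [
        threading.Thread(target=wait_for_event),
        threading.Thread(target=consume),
    ]
    for t in threads:
        t.start()

    ready.set()               # wakes every thread waiting on the event
    with cond:
        items.append("job")
        cond.notify()         # wakes one thread waiting on the condition

    for t in threads:
        t.join()
    return sorted(out)


print(demo())  # prints ['event seen', 'job']
```

An Event carries a single flag, while a Condition lets the waiter check arbitrary state under the lock.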

  55. THE MULTIPROCESSING MODULE
    • .Process
    • .JoinableQueue
    • .Pool
    • …
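A rough sketch combining `.Process` and `.JoinableQueue`, assuming a Unix system where the "fork" start method is available. The `worker` and `square_all` names and the `None` sentinel are assumptions for this example:

```python
import multiprocessing


def worker(tasks, results):
    # Runs in the child process.
    while True:
        item = tasks.get()
        if item is None:          # sentinel: no more work
            tasks.task_done()
            break
        results.put(item * item)
        tasks.task_done()


def square_all(items):
    # Assumption: Unix only, since the "fork" start method is requested explicitly.
    ctx = multiprocessing.get_context("fork")
    tasks = ctx.JoinableQueue()
    results = ctx.Queue()
    proc = ctx.Process(target=worker, args=(tasks, results))
    proc.start()
    for item in items:
        tasks.put(item)
    tasks.put(None)
    tasks.join()                  # wait until every task is marked done
    out = sorted(results.get() for _ in items)
    proc.join()
    return out


if __name__ == "__main__":
    print(square_all([1, 2, 3]))  # prints [1, 4, 9]
```

The interface mirrors the threading version, which makes it fairly cheap to switch between threads and processes.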

  56. PYTHON’S FLAVOR

  61. DAEMONIC THREAD
    • It’s not that kind of “daemon”.
    • It is just killed when Python shuts down.
    • Immediately.
    • Other threads keep running until they return.
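A tiny sketch of a daemonic thread; the `forever` function is a made-up stand-in for a background loop that never returns:

```python
import threading
import time


def forever():
    # Never returns on its own.
    while True:
        time.sleep(0.1)


background = threading.Thread(target=forever, daemon=True)
background.start()

print(background.daemon, background.is_alive())  # prints True True
# The script can end here: the daemonic thread is dropped at shutdown.
# With daemon=False the interpreter would wait for forever() to return.
```

Because the thread is stopped abruptly, it should not hold anything that needs a clean shutdown, such as a half-written file.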

  64. SO, HOW TO STOP?
    • Set daemon and let Python clean it up.
    • Let it return.

  66. BUT, THE THREAD IS BLOCKING
    • Set a timeout.
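One possible shape for this, assuming the blocking call is `queue.Queue.get` and that a `threading.Event` serves as the stop request. The names `worker` and `run_and_stop` are illustrative:

```python
import queue
import threading


def worker(q, handled, stop):
    while not stop.is_set():
        try:
            item = q.get(timeout=0.05)  # never block forever
        except queue.Empty:
            continue                    # nothing arrived; re-check the stop flag
        handled.append(item)
        q.task_done()


def run_and_stop(items):
    q = queue.Queue()
    handled = []
    stop = threading.Event()
    t = threading.Thread(target=worker, args=(q, handled, stop))
    t.start()
    for item in items:
        q.put(item)
    q.join()           # wait until every item has been handled
    stop.set()         # ask the worker to return
    t.join(timeout=1)  # a timeout here too, so the caller cannot hang
    return handled, t.is_alive()


print(run_and_stop(["a", "b"]))  # prints (['a', 'b'], False)
```

The timeout bounds how long the thread can go without noticing that it was asked to stop.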

  69. HOW ABOUT CTRL+C?
    • Only the main thread can receive it.
    • BSD-style.

  75. BROADCAST SIGNAL TO SUB-THREAD
    • Set a global flag when the signal arrives.
    • Let each thread read it before each task.
    • No, you can’t kill a non-daemonic thread.
    • You just can’t.
    • It’s Python.
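A sketch of the flag approach, assuming a `threading.Event` plays the role of the global flag. The names `stop_requested`, `handle_sigint`, and `worker` are made up, and `signal.raise_signal` (Python 3.8+) is used only to simulate Ctrl+C in the demo:

```python
import signal
import threading
import time

stop_requested = threading.Event()  # the global flag


def handle_sigint(signum, frame):
    # Runs in the main thread; it only flips the flag.
    stop_requested.set()


def worker(tasks, done):
    for task in tasks:
        if stop_requested.is_set():  # read the flag before each task
            break
        time.sleep(0.01)             # stand-in for real work
        done.append(task)


if __name__ == "__main__":
    # Handlers can only be installed from the main thread.
    signal.signal(signal.SIGINT, handle_sigint)
    done = []
    t = threading.Thread(target=worker, args=(range(1000), done))
    t.start()
    time.sleep(0.05)
    signal.raise_signal(signal.SIGINT)  # simulate Ctrl+C
    t.join()
    print("finished %d of 1000 tasks" % len(done))
```

The worker is never killed; it finishes its current task and then returns by itself.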

  78. BROADCAST SIGNAL TO SUB-PROCESS
    • Just broadcast the signal to the sub-processes.
    • Start by registering a signal handler:
      signal(SIGINT, _handle_to_term_signal)

  82. • Work out the process context if needed:
      pid = getpid()
      pgid = getpgid(0)
      proc_is_parent = (pid == pgid)
    • Turn off the handler:
      signal(signum, SIG_IGN)
    • Broadcast:
      killpg(pgid, signum)

  83. MISC. TECHNIQUES

  89. JUST THREAD IT OUT
    • Or process it out.
    • Let the main thread exit earlier. (Looks faster!)
    • Let the main thread keep dispatching tasks.
    • “Async”
    • And fix some stupid behavior. (I mean atexit with multiprocessing.Pool.)
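A minimal sketch of "threading it out"; `thread_it_out` and `send_report` are hypothetical names, and the sleep stands in for a slow call:

```python
import threading
import time


def thread_it_out(func, *args):
    # Start func(*args) in its own thread and hand the thread back at once.
    t = threading.Thread(target=func, args=args)
    t.start()
    return t


def send_report(sent, name):
    time.sleep(0.05)  # stand-in for a slow call
    sent.append(name)


sent = []
pending = [thread_it_out(send_report, sent, name) for name in ("a", "b", "c")]
# The main thread is free here to keep dispatching other tasks.
for t in pending:
    t.join()
print(sorted(sent))  # prints ['a', 'b', 'c']
```

The join at the end is only there to show the result; a caller that does not need it can simply move on.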

  93. COLLECT RESULTS SMARTER
    • Put them into a thread-safe queue.
    • Use a thread per instance.
    • Learn to “let it go”.
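One way to read the first two bullets, sketched with made-up names (`fetch_length`, `collect`): each item gets its own thread, and every thread puts its result into a shared `queue.Queue` instead of writing to shared state:

```python
import queue
import threading


def fetch_length(word, results):
    # Stand-in for real work; Queue.put is thread-safe.
    results.put((word, len(word)))


def collect(words):
    results = queue.Queue()
    threads = [
        threading.Thread(target=fetch_length, args=(word, results))
        for word in words
    ]
    for t in threads:
        t.start()
    # Take exactly one result per thread, in completion order.
    return dict(results.get() for _ in threads)


print(collect(["spam", "eggs", "ham"]))
```

The collector never needs to know which thread finished first; it just takes whatever arrives next.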

  94. EXAMPLES
    • https://github.com/moskytw/mrbus/blob/master/mrbus/base/pool.py#L45
    • https://github.com/moskytw/mrbus/blob/master/mrbus/model/core.py#L30

  98. MONITOR THEM
    • No one is a master at first.
    • Don’t guess.
    • Just use a function to print logs.
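Such a function can be very small. The `log` helper below is a hypothetical example that tags each line with the time, the process ID, and the thread name:

```python
import os
import threading
import time


def log(message):
    # Build one line that says when and where the message came from.
    line = "%.3f pid=%d %s: %s" % (
        time.time(),
        os.getpid(),
        threading.current_thread().name,
        message,
    )
    print(line, flush=True)  # flush so lines show up while workers still run
    return line


log("worker started")
```

Seeing which thread or process printed a line is often enough to spot a stuck or idle worker.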

  102. BENCHMARK THEM
    • No one is a master at first.
    • Don’t guess.
    • Just prove it.
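A bare-bones way to prove it, with illustrative names (`timed`, `count_down`, `sequential`, `threaded`). It times the same CPU-bound work run sequentially and in threads; the numbers depend on the machine and the interpreter, so none are claimed here:

```python
import threading
import time


def timed(func, *args):
    # Return the wall-clock seconds that func(*args) takes.
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def count_down(n):
    # CPU-bound busy loop.
    while n > 0:
        n -= 1


def sequential(n, parts):
    for _ in range(parts):
        count_down(n)


def threaded(n, parts):
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(parts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


print("sequential: %.3fs" % timed(sequential, 200000, 4))
print("threaded:   %.3fs" % timed(threaded, 200000, 4))
```

Swapping `count_down` for the real workload turns a guess about the bottleneck into a measurement.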

  108. CONCLUSION
    • Avoid shared resources, or just use the producer-consumer pattern.
    • Signals only go to the main thread.
    • Just thread it out.
    • Collect your results smarter.
    • Monitor and benchmark your code.