Slide 1

Slide 1 text

Intro to Multiprocessing with Python
Rudy Gilmore
Data Scientist, TrueCar Analytics Team
PyData Meetup, 11/3/14

Slide 2

Slide 2 text

Code Parallelization

•  Modern processors are not becoming much faster, but cores are becoming more numerous
•  Many problems in analytics are easily parallelizable
•  Writing parallel code will often let you finish in as little as 1/n of the time on n cores
•  Amdahl's Law: the serial portion of a program limits the achievable speedup
•  Python has some barriers to parallelization, but there are simple workarounds

There are many options for high-performance parallel computing:
   -  Cluster computing?
   -  Hadoop?
   -  Distributed processing?
   -  GPGPUs?

Let's start simple: how to get multiple cores on one machine into the action.
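The formula for Amdahl's Law did not survive in this transcript. As a minimal sketch, assuming the usual statement of the law (speedup = 1 / (serial fraction + parallel fraction / n)), the function below computes the theoretical speedup. The names `amdahl_speedup`, `parallel_fraction` and `n_workers` are illustrative and not from the talk.

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Theoretical speedup when `parallel_fraction` of the runtime
    is spread evenly over `n_workers` and the rest stays serial."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


if __name__ == "__main__":
    # With 90% of the work parallelizable, extra cores help less and less.
    for n in (1, 2, 4, 8, 1000):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this formula, the "1/n of the time" figure is reached only when the whole program is parallelizable; any serial portion caps the gain regardless of core count.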

Slide 3

Slide 3 text

"Embarrassingly Parallel" (processes completely independent)
Examples:
1.  Independent for loops
2.  .map operations on a dataset
3.  Integration
4.  Monte Carlo methods
5.  Some ML problems

"Inherently Serial" (difficult or impossible to run in parallel)
Example: numerical PDE

"Somewhat Parallelizable" (some communication needed)
Example: sorting

Parallel algorithms can be classified by the data transfer required between processes. This transfer can be done via message passing or shared memory.
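As an illustration of the "embarrassingly parallel" category, here is a minimal Monte Carlo sketch that estimates pi from independent batches of random points. It is not from the talk: the function names, the per-task seeds and the sample counts are assumptions, and it is written for Python 3 rather than the 2.7 used in the slides.

```python
import multiprocessing as mp
import random


def count_hits(args):
    """Count random points in the unit square that land inside the quarter circle."""
    seed, n_samples = args
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(hit_counts, n_samples_each):
    """Combine the per-task hit counts into a single estimate of pi."""
    total_samples = len(hit_counts) * n_samples_each
    return 4.0 * sum(hit_counts) / total_samples


if __name__ == "__main__":
    n_tasks, n_each = 4, 100000
    tasks = [(seed, n_each) for seed in range(n_tasks)]
    with mp.Pool(processes=n_tasks) as pool:
        counts = pool.map(count_hits, tasks)  # no communication between tasks
    print(estimate_pi(counts, n_each))
```

The only data transfer is the task description going in and one integer coming back, which is what makes this kind of problem easy to spread across cores.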

Slide 4

Slide 4 text

Python's Global Interpreter Lock (GIL)

Only one thread may execute code in the Python interpreter at a time.

•  Multiple threads automatically take turns, switching at a regular interval
•  The GIL is a feature of CPython; some other implementations, such as Jython, do not have this limitation (PyPy, like CPython, has a GIL)
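The switching interval can be inspected from within Python. The sketch below assumes Python 3.2 or later, where `sys.getswitchinterval()` reports the interval in seconds. On the Python 2.7 used in this talk, the closest equivalent is `sys.getcheckinterval()`, which counts bytecode instructions instead.

```python
import sys

# How long a thread may hold the GIL before the interpreter
# asks it to give another thread a turn (Python 3.2+).
interval = sys.getswitchinterval()
print("GIL switch interval: %.4f seconds" % interval)
```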

Slide 5

Slide 5 text

Python's thread and threading modules

•  Provide resources for splitting a program into multiple threads
•  However, for CPU-intensive tasks there will not be any speedup from multithreading alone
•  The GIL is still in effect
•  So what good is multithreading anyway?
•  CPU-bound vs. I/O-bound: threading is useful in the latter but not the former

[Figure: "What you want" vs. "What you're gonna get"]
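To make the I/O-bound case concrete, here is a minimal sketch in which `time.sleep()` stands in for a blocking download or query (an assumption: like real blocking I/O, sleeping releases the GIL). The helper names `fake_download` and `run_threaded` are hypothetical, and the code is written for Python 3.

```python
import threading
import time


def fake_download(delay, results, index):
    """Stand-in for blocking I/O: the thread waits without holding the GIL."""
    time.sleep(delay)
    results[index] = delay


def run_threaded(n_tasks=4, delay=0.2):
    """Run n_tasks simulated downloads concurrently; return results and elapsed time."""
    results = [None] * n_tasks
    threads = [threading.Thread(target=fake_download, args=(delay, results, i))
               for i in range(n_tasks)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.time() - start


if __name__ == "__main__":
    results, elapsed = run_threaded()
    # Run serially, four 0.2 s waits would take about 0.8 s.
    print(results, "took %.2f seconds" % elapsed)
```

Because the threads spend their time waiting rather than computing, the waits overlap and the total should be close to a single delay.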

Slide 6

Slide 6 text

multiprocessing module

•  Part of the standard library as of Python 2.6
•  Launches multiple processes
•  Processes include separate interpreters, and therefore separate GILs
•  Each process operates on a separate copy of memory from the time of launch
•  Similar syntax to threading
•  Beware: processes have significant overhead on some operating systems, notably Windows

[Figure labels: GIL 1, GIL 2]

Slide 7

Slide 7 text

Some simple examples of threading and multiprocessing
Running CPython v2.7.6

First, let's set up a CPU-bound task:

    def isprime(n):
        for i in range(2, int(n**0.5) + 1):
            if n % i == 0:
                return False
        return True

    def prime(Nth, q=None):  # prints Nth prime
        n_found = 0
        i = 0
        while n_found
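The listing above breaks off at `while n_found`, and the rest of the function is not recoverable from this transcript. The following is only a hypothetical completion, written for Python 3: the loop body, the `q.put()` call and the `n < 2` guard in `isprime` are assumptions, so its indexing may differ from the original and from the output shown on the later slides.

```python
def isprime(n):
    """Trial division up to sqrt(n); the n < 2 guard is an added assumption."""
    if n < 2:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True


def prime(nth, q=None):
    """Return the nth prime (1-indexed); also put it on queue `q` if one is given."""
    n_found = 0
    i = 0
    while n_found < nth:
        i += 1
        if isprime(i):
            n_found += 1
    if q is not None:
        q.put(i)  # lets a thread or process hand its result back
    return i


if __name__ == "__main__":
    print(prime(10))
```

Accepting an optional queue keeps the same function usable in the serial, threaded and multiprocess tests.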

Slide 8

Slide 8 text

    import time
    import threading as th
    import multiprocessing as mp

    # isprime() and prime() are defined on the previous slide

    start = 20000

    if __name__ == '__main__':
        t1 = time.time()  # time serial segment
        print prime(start), prime(start+1), prime(start+2), prime(start+3)
        print 'Serial test took', time.time() - t1, 'seconds'

        q = mp.Queue()  # created before first use; shared by both tests below
        t2 = time.time()  # time multithreaded segment
        jobs = [th.Thread(target=prime, args=(start, q)),
                th.Thread(target=prime, args=(start+1, q)),
                th.Thread(target=prime, args=(start+2, q)),
                th.Thread(target=prime, args=(start+3, q))]
        for j in jobs:
            j.start()
        for j in jobs:
            j.join()
        print 'Multithreaded test took', time.time() - t2, 'seconds'

        t3 = time.time()  # time multiprocessing segment
        jobs = [mp.Process(target=prime, args=(start, q)),
                mp.Process(target=prime, args=(start+1, q)),
                mp.Process(target=prime, args=(start+2, q)),
                mp.Process(target=prime, args=(start+3, q))]
        for j in jobs:
            j.start()
        for j in jobs:
            j.join()
        print 'Multiprocessing test took', time.time() - t3, 'seconds'

Slide 9

Slide 9 text

Output:

    224729 224737 224743 224759
    Serial test took 3.68699979782 seconds
    Multithreaded test took 5.64900016785 seconds
    Multiprocessing test took 1.29299998283 seconds

Slide 10

Slide 10 text

multiprocessing.Pool() provides a map-like interface with automatic parallelization among a pool of workers

    # converting into a pool process
    t4 = time.time()
    pool = mp.Pool(processes=4)
    result = pool.map(prime, range(start, start+4))
    print result
    print 'Pool test took', time.time() - t4, 'seconds'

Output:

    Serial test took 3.68699979782 seconds
    Multithreaded test took 5.64900016785 seconds
    Multiprocessing test took 1.29299998283 seconds
    [224729, 224737, 224743, 224759]
    Pool test took 1.31299996376 seconds

Notes:

•  Tasks should be of roughly equal size; adjust manually if possible
•  map() blocks until all jobs are complete; map_async() returns immediately with a result object that can be collected later
•  Multiple arguments must be packed into a single list or tuple per task and unpacked with * inside the worker
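The last two notes can be sketched as follows, assuming Python 3. The worker `power_mod` and its wrapper `power_mod_star` are hypothetical names chosen for illustration: each task is a tuple of arguments, and the wrapper unpacks it with `*` before calling the real function.

```python
import multiprocessing as mp


def power_mod(base, exponent, modulus):
    """A worker that needs three arguments."""
    return pow(base, exponent, modulus)


def power_mod_star(args):
    """Pool.map passes one item per task, so unpack the tuple here."""
    return power_mod(*args)


if __name__ == "__main__":
    tasks = [(2, 10, 1000), (3, 5, 7), (10, 3, 6)]
    with mp.Pool(processes=2) as pool:
        # map() blocks until every task has finished.
        blocking = pool.map(power_mod_star, tasks)

        # map_async() returns at once; call get() when the results are needed.
        pending = pool.map_async(power_mod_star, tasks)
        later = pending.get(timeout=30)
    print(blocking, later)
```

On Python 3.3 and later, `Pool.starmap()` performs the unpacking itself, so the wrapper is only needed on older versions such as the 2.7 used in the talk.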

Slide 11

Slide 11 text

Further reading:

•  multiprocessing supports inter-process communication using Queue() and Pipe()
•  Support for sharing objects in memory using Value() and Array()
•  "Premature optimization is the root of all evil". Discuss.

In Conclusion:

•  Use threading if you have a potentially blocking I/O procedure, like a download or SQL query
•  Use multiprocessing.Process() and multiprocessing.Pool() to run CPU-intensive tasks in parallel

References:
http://sebastianraschka.com/Articles/2014_multiprocessing_intro.html#An-introduction-to-parallel-programming-using-Python%27s-multiprocessing-module
http://www.quantstart.com/articles/Parallelising-Python-with-Threading-and-Multiprocessing
http://www.dabeaz.com/python/GIL.pdf
http://calcul.math.cnrs.fr/Documents/Ecoles/2010/cours_multiprocessing.pdf
http://pymotw.com/2/multiprocessing/communication.html#process-pools
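As a starting point for the shared-memory item under "Further reading", here is a minimal sketch using `multiprocessing.Value` and `multiprocessing.Array`, written for Python 3. The worker `add_square` and the choice of four processes are assumptions made for illustration.

```python
import multiprocessing as mp


def add_square(i, total, squares):
    """Write i*i into the shared array and add it to the shared total."""
    squares[i] = i * i
    with total.get_lock():  # += on a shared Value is not atomic
        total.value += i * i


if __name__ == "__main__":
    n = 4
    total = mp.Value("i", 0)    # one shared C int
    squares = mp.Array("i", n)  # shared array of n C ints, zero-initialized
    jobs = [mp.Process(target=add_square, args=(i, total, squares))
            for i in range(n)]
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print(squares[:], total.value)
```

Unlike ordinary Python objects, which are copied into each process, these objects live in shared memory, so every process sees the same data. The lock is needed because several processes update the same total.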