Beating Python's GIL
to
Max Out Your CPUs
Andrew Montalenti
CTO, Parse.ly
@amontalenti
Slide 2
OR:
Scaling Python
to
3,000 Cores
Andrew Montalenti
CTO, Parse.ly
@amontalenti
Slide 3
Slide 4
Slide 5
Slide 6
Slide 7
What happens when you have 153 TB of compressed
customer data that may need to be reprocessed at any time,
and it’s now growing at 10-20TB per month?
Slide 8
Slide 9
Slide 10
@dabeaz = “the GIL guy”
Slide 11
Slide 12
Is the GIL a feature, not a bug?!
In one Python process,
at any given time,
only one Python bytecode instruction
is executing.
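The claim can be checked with a minimal sketch (function and variable names are illustrative): two threads each run a CPU-bound loop, and because the GIL lets only one bytecode instruction run at a time, they finish correctly but take roughly as long as running the loops back to back.

```python
import threading

def countdown(n, results, slot):
    # Pure-Python CPU-bound loop: it holds the GIL while running.
    total = 0
    while n > 0:
        total += n
        n -= 1
    results[slot] = total

results = [0, 0]
threads = [
    threading.Thread(target=countdown, args=(100_000, results, i))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads compute the correct sum, but wall-clock time is about
# the same as two serial countdown() calls: the GIL serializes them.
print(results)
```

Timing this against a serial run with `time.perf_counter()` shows near-zero speedup for CPU-bound work; I/O-bound threads, which release the GIL while blocked, do overlap.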
Slide 13
Slide 14
Slide 15
should we just rewrite it in Go?
Slide 16
Slide 17
Slide 18
Slide 19
Slide 20
fast functions!
running in parallel
Slide 21
[diagram: three servers, two cores each; one Python process (code + state) so far uses only a single core]
from urllib.parse import urlparse
urls = ["http://arstechnica.com/",
        "http://ars.to/1234",
        "http://ars.to/5678",
        ...]
Slide 22
[diagram: three servers, two cores each; map(urlparse, urls) still runs in one Python process on one core]
from urllib.parse import urlparse
urls = ["http://arstechnica.com/",
        "http://ars.to/1234",
        "http://ars.to/5678",
        ...]
map(urlparse, urls)
Slide 23
Cython
speeding up functions on a single core
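Cython needs a compile step, so as a minimal sketch here is the kind of pure-Python hot loop it targets, with the typed .pyx version (hypothetical identifiers) shown only in the comment:

```python
# In a Cython .pyx file, adding C types lets this loop compile down to
# a plain C for-loop, often orders of magnitude faster on one core:
#
#   cpdef long typed_sum(long n):
#       cdef long i, total = 0
#       for i in range(n):
#           total += i
#       return total
#
# The untyped Python equivalent, runnable as-is:
def typed_sum(n):
    total = 0
    for i in range(n):
        total += i
    return total

print(typed_sum(1_000_000))
```

Note this speeds up one function on one core; it does nothing about spreading work across cores, which is what the rest of the tools here address.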
Slide 24
Slide 25
concurrent.futures
good map API, but odd implementation details
Slide 26
[diagram: three servers, two cores each; all of the ThreadPoolExecutor's threads share one Python process (code + state) on one core]
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor()
executor.map(urlparse, urls)
Slide 27
[diagram: the parent Python process uses os.fork() to start subprocess workers on each core and pickle.dumps() to ship work to them; each subprocess holds its own copy of code + state]
from concurrent.futures import ProcessPoolExecutor
executor = ProcessPoolExecutor()
executor.map(urlparse, urls)
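One of those odd implementation details is visible without even starting a pool: ProcessPoolExecutor ships work to its subprocesses with pickle, so the mapped function must be picklable. A sketch of the constraint (no executor is actually started here):

```python
import pickle
from urllib.parse import urlparse

# Top-level functions pickle by reference, so urlparse can be sent
# to a ProcessPoolExecutor worker without trouble:
assert pickle.loads(pickle.dumps(urlparse)) is urlparse

# Lambdas (and locally defined functions) cannot be pickled, so
# executor.map(lambda u: urlparse(u), urls) would fail at runtime:
try:
    pickle.dumps(lambda u: urlparse(u))
    picklable = True
except Exception:
    picklable = False
print("lambda picklable?", picklable)
```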
Slide 28
Slide 29
joblib
map functions over local machine cores
by cleaning up stdlib facilities
Slide 30
[diagram: as with ProcessPoolExecutor, joblib fans work out with os.fork() and pickle.dumps() to Python subprocesses on the local machine's cores]
from joblib import Parallel, delayed
par = Parallel(n_jobs=2)
do_urlparse = delayed(urlparse)
par(do_urlparse(url)
    for url in urls)
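Under the hood, joblib leans on the stdlib's multiprocessing machinery; as a hedged sketch of the raw facilities it cleans up, here is the rough stdlib equivalent of `Parallel(n_jobs=2)` (using the Unix-only "fork" start method; function names are illustrative):

```python
import multiprocessing as mp
from urllib.parse import urlparse

urls = ["http://arstechnica.com/",
        "http://ars.to/1234",
        "http://ars.to/5678"]

def parse_all(urls):
    # "fork" start method: workers inherit the parent's memory (Unix only).
    # joblib's Parallel wraps pools like this one behind a nicer API.
    with mp.get_context("fork").Pool(processes=2) as pool:
        return pool.map(urlparse, urls)

if __name__ == "__main__":
    for parsed in parse_all(urls):
        print(parsed.netloc)
```

The arguments and results cross the process boundary via pickle, so the same picklability constraints apply here too.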
Slide 31
ipyparallel
map functions over a pet compute cluster
Slide 32
[diagram: an ipcontroller uses pickle.dumps() to ship work to ipengine processes spread across the servers; each engine is a full Python process with its own code + state]
from ipyparallel import Client
rc = Client()
rc[:].map_sync(urlparse, urls)
Slide 33
Slide 34
pykafka
map functions over a multi-consumer log
Slide 35
[diagram: a pykafka.producer writes messages to a Kafka log; balanced consumers, one Python process per core across the servers, each read their share of the partitions]
import json
from urllib.parse import urlparse
consumer = ...  # balanced
while True:
    msg = consumer.consume()
    msg = json.loads(msg)
    urlparse(msg["url"])
Slide 36
Slide 37
pystorm
map functions over a stream of inputs
to generate a stream of outputs
Slide 38
[diagram: a Storm topology spread across the servers; a pykafka.producer feeds the spout, and Python bolt processes (code + state) on each core talk to Storm over its multi-lang JSON protocol]
class UrlParser(Topology):
    url_spout = UrlSpout.spec(p=1)
    url_bolt = UrlBolt.spec(p=4,
                            input=url_spout)
Slide 39
Slide 40
Slide 41
pyspark
map functions over a dataset representation
to perform transformations and actions
Slide 42
[diagram: a pyspark.SparkContext driver ships functions to executors with cloudpickle and talks to the JVM over py4j and binary pipes; Python worker processes run on every core of the cluster]
sc = SparkContext()
file_rdd = sc.textFile(files)
file_rdd.map(urlparse).take(1)
Slide 43
Slide 44
Slide 45
Slide 46
Slide 47
"lambda architecture"
Slide 48
Slide 49
Technology Component Summary
Parse.ly "Batch Layer" Topologies
with Spark & S3
Parse.ly "Speed Layer" Topologies
with Storm & Kafka
Parse.ly Dashboards and APIs
with Elasticsearch & Cassandra
Parse.ly Raw Data Warehouse
with Streaming & SQL Access
Slide 50
parting thoughts
Slide 51
Slide 52
Slide 53
the free lunch is over,
but not how we thought
Slide 54
multi-process, not multi-thread
multi-node, not multi-core
message passing, not shared memory
heaps of data and streams of data
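A minimal sketch of the third principle: two processes exchanging serialized messages over a pipe instead of touching shared memory (using the Unix-only "fork" start method; names are illustrative).

```python
import multiprocessing as mp

def worker(conn):
    # Receive a message, transform it, send a message back:
    # all coordination happens by message passing, not shared state.
    msg = conn.recv()
    conn.send(msg.upper())
    conn.close()

ctx = mp.get_context("fork")  # Unix-only start method
parent_conn, child_conn = ctx.Pipe()
proc = ctx.Process(target=worker, args=(child_conn,))
proc.start()
parent_conn.send("heaps and streams")
reply = parent_conn.recv()
proc.join()
print(reply)
```

Every tool in this talk, from ProcessPoolExecutor's pickled arguments to Kafka's log and Storm's JSON protocol, is a scaled-up version of this same pattern.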
Slide 55
Slide 56
GIL: it's a feature, not a bug.
help us!
pystorm
pykafka
streamparse