Slide 1

Slide 1 text

A Python and a Gopher walk into a bar - Embedding Python in Go. Massimiliano Pippi

Slide 2

Slide 2 text

Hi, I’m Massi!
● Software Engineer
  ○ C++, Python and Go
● OSS fan and contributor
● 1+ year at Datadog, working on Agent and Integrations

Slide 3

Slide 3 text

About Datadog
● SaaS-based infrastructure, app and logs monitoring
● Open Source Agent
● Time series data (metrics and events)
● Intelligent Alerting and Insightful Dashboards
● Trillions of data points processed per day

Slide 4

Slide 4 text

Monitor everything

Slide 5

Slide 5 text

Meet the Datadog Agent
Agent check
● Written in Python
● Open Source, https://github.com/DataDog/dd-agent

Slide 6

Slide 6 text

The anatomy of a check

    import psutil
    from checks import AgentCheck

    class SystemSwap(AgentCheck):
        def check(self, instance):
            swap_mem = psutil.swap_memory()
            # Counters that only grow: submit them as rates
            self.rate('system.swap.swapped_in', swap_mem.sin)
            self.rate('system.swap.swapped_out', swap_mem.sout)
            # Point-in-time values: submit them as gauges
            self.gauge('system.swap.total', swap_mem.total)
            self.gauge('system.swap.used', swap_mem.used)

Slide 7

Slide 7 text

The way to Go - Our Goals
● Make it smaller, faster, stronger
● Keep Python as an extension language
  ○ ~75 Python checks currently part of the official package
  ○ Python is the right tool to implement most of them
  ○ Undetermined number of custom checks in the wild
  ○ Update a check without recompiling the Agent

Slide 8

Slide 8 text

Embedding for the win
● Python can be embedded: you keep an interpreter in memory and make it run Python code at will
● CPython provides a C API to allow embedding
● Cgo enables the creation of Go packages that call C code

Slide 9

Slide 9 text

Demo time!
Let’s run a Python module from a Go application
● https://github.com/masci/golab17/tree/master/01

Slide 10

Slide 10 text

The dreadful GIL
● GIL stands for Global Interpreter Lock
● Prevents multiple threads from executing Python code at once
● Yes, even if those threads run on a multi-core processor

Slide 11

Slide 11 text

The dreadful GIL
● Embedding CPython means embedding the GIL
● The GIL knows about threads created from Python code...
● ...but the GIL can’t do its job if we run Python from separate Go threads!

Slide 12

Slide 12 text

Demo time!
Run Python in different goroutines and watch the world burn
● https://github.com/masci/golab17/tree/master/02
● Rule of thumb: any time you use Python in some piece of code that could be executed in a separate thread, lock the GIL!
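The rule of thumb can be sketched in pure Go. This is only an illustration, not the demo's code: a sync.Mutex stands in for the GIL, and the hypothetical helper withGIL marks where a real program would call the C API's PyGILState_Ensure and PyGILState_Release around every piece of code that touches the interpreter.

```go
package main

import (
	"fmt"
	"sync"
)

// gil is a stand-in for the interpreter lock (assumption: a real program
// would acquire and release the actual GIL through the CPython C API).
var gil sync.Mutex

// interpreterState stands in for data that is only safe to touch
// while the lock is held.
var interpreterState int

// withGIL is a hypothetical helper: it runs f with the stand-in lock held.
func withGIL(f func()) {
	gil.Lock()
	defer gil.Unlock()
	f()
}

// runChecks starts n goroutines that all touch the shared state,
// each one going through withGIL, and returns the final state.
func runChecks(n int) int {
	interpreterState = 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			withGIL(func() { interpreterState++ })
		}()
	}
	wg.Wait()
	return interpreterState
}

func main() {
	fmt.Println(runChecks(100)) // prints 100
}
```

Funnelling every interpreter call through one helper keeps the lock and unlock paired in a single place, so a code path cannot forget one of them.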

Slide 13

Slide 13 text

The revenge of the dreadful GIL
● We lock and unlock the GIL in goroutines, not threads
● The GIL protects a specific thread state; we cannot lock it from one thread and unlock it from another!
● But the Go scheduler might pause a goroutine on one thread and resume it on a different one
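One common way to keep the scheduler from moving interpreter work between threads is runtime.LockOSThread. The sketch below is an assumption about how this could be structured, not necessarily what the Agent does: a hypothetical pyWorker type pins a single goroutine to its OS thread and runs every submitted job there, so lock and unlock always happen on the same thread. The jobs here are plain Go closures standing in for calls into Python.

```go
package main

import (
	"fmt"
	"runtime"
)

// pyWorker (hypothetical) runs every submitted job on one OS thread.
type pyWorker struct {
	jobs chan func()
}

// newPyWorker starts the worker goroutine and pins it to its thread.
func newPyWorker() *pyWorker {
	w := &pyWorker{jobs: make(chan func())}
	go func() {
		// From here on, the scheduler will not move this goroutine
		// to another OS thread.
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
		for job := range w.jobs {
			job()
		}
	}()
	return w
}

// Do blocks until job has finished running on the worker's thread.
func (w *pyWorker) Do(job func()) {
	done := make(chan struct{})
	w.jobs <- func() {
		defer close(done)
		job()
	}
	<-done
}

// Close stops the worker once pending jobs are done.
func (w *pyWorker) Close() { close(w.jobs) }

func main() {
	w := newPyWorker()
	defer w.Close()

	total := 0
	for i := 1; i <= 3; i++ {
		n := i
		w.Do(func() { total += n }) // stand-in for a call into Python
	}
	fmt.Println(total) // prints 6
}
```

The trade-off is that all interpreter work is serialized on one thread, which costs little here because the GIL already prevents Python code from running in parallel.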

Slide 14

Slide 14 text

Demo time!
See what happens when the Go scheduler relocates our Pythonic goroutines (spoiler alert: it’ll crash your software)
● https://github.com/masci/golab17/tree/master/03

Slide 15

Slide 15 text

Beyond embedding: extending!
● Once you have an embedded interpreter, you can extend Python capabilities with Go code
● This involves a little bit of C, so no demo here
● Still very easy to achieve: Python scripts import a module that actually lives in memory and points to Go instructions

Slide 16

Slide 16 text

Extending Python: the Go code

    //export MyGoFunc
    func MyGoFunc() {
        fmt.Println("Hello, World from Go!")
    }

Slide 17

Slide 17 text

Extending Python: the C code

    static PyMethodDef MyMethods[] = {
        {"my_func", (PyCFunction)MyGoFunc, METH_VARARGS, "YAY!"},
        {NULL, NULL} // guards
    };

    PyObject *m = Py_InitModule("my_module", MyMethods);

Slide 18

Slide 18 text

Extending Python: the Python code

    # WARNING! This only works on the embedded interpreter
    import my_module
    my_module.my_func()  # prints "Hello, World from Go!"

Slide 19

Slide 19 text

Lessons learned: the good
● Embedded Python plays nicely with Go’s concurrency model
● The overhead of calling from Go into Python and back is negligible
  ○ BenchmarkCallPyFunc 300000 3606 ns/op
● Extending Python is a very powerful tool
  ○ Expose functions and data to the Python world
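The benchmark line is in the format printed by go test -bench: benchmark name, number of iterations, time per iteration. A minimal sketch of how such a figure can be produced, assuming a hypothetical callPyFunc stand-in (the real one would cross into the embedded interpreter through cgo, so its numbers would differ); testing.Benchmark lets the benchmark run outside go test.

```go
package main

import (
	"fmt"
	"testing"
)

// callPyFunc is a hypothetical stand-in for a Go function that calls
// into the embedded interpreter; here it does no real work.
func callPyFunc() int {
	return 42
}

// benchmarkCallPyFunc has the usual benchmark shape: call the function
// under test b.N times and let the testing package pick b.N.
func benchmarkCallPyFunc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		callPyFunc()
	}
}

func main() {
	res := testing.Benchmark(benchmarkCallPyFunc)
	fmt.Printf("%d iterations, %d ns/op\n", res.N, res.NsPerOp())
}
```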

Slide 20

Slide 20 text

Lessons learned: the bad
● The GIL prevents parallel execution of Python code
  ○ This was expected
● The GIL feels the effects of the Go scheduler
  ○ Honestly, didn’t see this coming
● Using multiple interpreters doesn’t help
  ○ They all share a single GIL

Slide 21

Slide 21 text

Lessons learned: the ugly
● You must carry along some C code; how much depends on the use case
● You will likely carry along some Python code too, to offer base classes and utilities to external modules running in embedded mode

Slide 22

Slide 22 text

What we have now
● Datadog Agent 6.0.0-beta1
  ○ Embedded CPython 2.7.13
  ○ Linux, OSX and Windows
  ○ We now run checks concurrently, knowing that many of them will wait for each other on the GIL...
  ○ ...but some checks were ported to Go, so we also get some real parallelism

Slide 23

Slide 23 text

Thanks for listening!
● Try Datadog at https://www.datadoghq.com
● Find our OSS at https://github.com/DataDog
● Our tech blog is http://engineering.datadoghq.com
Check out the new Agent on GitHub! https://github.com/DataDog/datadog-agent/

Slide 24

Slide 24 text

WE’RE HIRING! NYC, PARIS, Remote

Slide 25

Slide 25 text

Embedding: an example

    // #cgo pkg-config: python-2.7
    // #include <Python.h>
    // #include <stdlib.h>
    import "C"
    import "unsafe"

    C.Py_Initialize()
    cmd := C.CString("print 'Hello, World!'")
    defer C.free(unsafe.Pointer(cmd)) // cmd must be freed!
    C.PyRun_SimpleString(cmd)
    C.Py_Finalize()

● We still just run go build, and that’s it.

Slide 26

Slide 26 text

Embedding: a better example
● Use go-python to eliminate boilerplate
  ○ https://github.com/sbinet/go-python

    import "github.com/sbinet/go-python"

    python.Initialize()
    python.PyRun_SimpleString("print 'Hello, World!'")
    python.Finalize()