A Python and a Gopher walk into a bar - Embedding Python in Go.

Success stories about rewriting Python applications in Go are not big news anymore. The pros and cons are well known, best practices are in place, and the standard library is there to help. But what if you want to keep some of your Python code? When we chose to port the Datadog Agent to Go, we needed to maintain support for our existing library of plugins written in Python. During the talk we will share lessons learned from our experiences with cgo, the GIL and the quest for performance as we bridge multiple languages in a single application.

Massimiliano Pippi

October 05, 2017

Transcript

  1. A Python and a Gopher walk into a bar - Embedding Python in Go.
    Massimiliano Pippi
  2. Hi, I’m Massi!
    • Software Engineer
      ◦ C++, Python and Go
    • OSS fan and contributor
    • 1+ year at Datadog, working on Agent and Integrations
  3. About Datadog
    • SaaS based infrastructure, app and logs monitoring
    • Open Source Agent
    • Time series data (metrics and events)
    • Intelligent Alerting and Insightful Dashboards
    • Trillions of data points processed per day
  4. Meet the Datadog Agent
    Agent check
    • Written in Python
    • Open Source, https://github.com/DataDog/dd-agent
  5. The anatomy of a check
    import psutil
    from checks import AgentCheck

    class SystemSwap(AgentCheck):
        def check(self, instance):
            swap_mem = psutil.swap_memory()
            self.rate('system.swap.swapped_in', swap_mem.sin)
            self.rate('system.swap.swapped_out', swap_mem.sout)
            self.gauge('system.swap.total', swap_mem.total)
            self.gauge('system.swap.used', swap_mem.used)
  6. The way to Go - Our Goals
    • Make it smaller faster stronger
    • Keep Python as an extension language
      ◦ ~75 Python checks currently part of the official package
      ◦ Python is the right tool to implement most of them
      ◦ Undetermined number of custom checks in the wild
      ◦ Update a check without recompiling the Agent
  7. Embedding for the win
    • Python can be embedded: you keep an interpreter in memory and make it run Python code at will
    • CPython provides a C API to allow embedding
    • Cgo enables the creation of Go packages that call C code.
  8. Demo time! Let’s run a Python module from a Go application
    • https://github.com/masci/golab17/tree/master/01
  9. The dreadful GIL
    • GIL stands for global interpreter lock
    • Prevents multiple threads from executing Python code at once
    • Yes, even if those threads are run on a multi-core processor
  10. The dreadful GIL
    • Embedding CPython means embedding the GIL
    • The GIL knows about threads created from Python code...
    • ...but the GIL can’t do its job if we run Python from separate Go threads!
  11. Demo time! Run Python in different goroutines and watch the world burn
    • https://github.com/masci/golab17/tree/master/02
    • Rule of thumb: any time you use Python in some piece of code that could be executed in a separate thread, lock the GIL!
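The rule of thumb on this slide can be sketched in pure Go. This is only a model under stated assumptions: `gil` is a plain `sync.Mutex` standing in for the interpreter lock, and `withGIL`, `runCheck` and `runChecks` are hypothetical names that do not come from the talk or from the demo repository. A real embedding would acquire and release the lock through the CPython C API (for example `PyGILState_Ensure` and `PyGILState_Release` via cgo) rather than through a Go mutex.

```go
package main

import (
	"fmt"
	"sync"
)

// gil is a stand-in for the interpreter lock (assumption: a real embedding
// would call into the CPython C API through cgo instead).
var gil sync.Mutex

// inPython and maxInPython are only touched while gil is held.
var (
	inPython    int // goroutines currently inside "Python" code
	maxInPython int // highest value inPython ever reached
)

// withGIL runs fn while holding the stand-in lock and always releases it.
func withGIL(fn func()) {
	gil.Lock()
	defer gil.Unlock()
	fn()
}

// runCheck is a hypothetical placeholder for a call into Python code.
func runCheck() {
	inPython++
	if inPython > maxInPython {
		maxInPython = inPython
	}
	inPython--
}

// runChecks starts n goroutines that each run a check under the lock and
// returns the highest number of goroutines seen inside runCheck at once.
func runChecks(n int) int {
	maxInPython = 0
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			withGIL(runCheck)
		}()
	}
	wg.Wait()
	return maxInPython
}

func main() {
	fmt.Println("max goroutines inside Python at once:", runChecks(8))
}
```

Routing every call through one wrapper keeps the lock and unlock paired even when the wrapped function returns early, which is the point of the `defer`.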
  12. The revenge of the dreadful GIL
    • We lock and unlock the GIL in goroutines, not threads
    • The GIL protects a specific thread state, we cannot lock/unlock it from different threads!
    • But the Go scheduler might pause and resume goroutines in different threads
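One common way to keep a goroutine on a single OS thread between acquiring and releasing a thread-bound lock is `runtime.LockOSThread` from the Go standard library. Whether the talk's demos use it is an assumption; the sketch below only illustrates the pattern. `callPinned` and `runPinned` are hypothetical names, and the places where a real embedding would call `PyGILState_Ensure` and `PyGILState_Release` through cgo are marked with comments rather than implemented.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// callPinned runs fn with the calling goroutine wired to its current OS
// thread, so the code before and after fn executes on the same thread.
func callPinned(fn func()) {
	runtime.LockOSThread()
	defer runtime.UnlockOSThread()
	// Assumption: a real embedding would acquire the GIL here
	// (for example PyGILState_Ensure through cgo) ...
	fn()
	// ... and release it here (PyGILState_Release), on the same thread.
}

// runPinned runs n goroutines through callPinned and returns how many
// of the wrapped functions completed.
func runPinned(n int) int {
	var (
		mu   sync.Mutex
		done int
		wg   sync.WaitGroup
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			callPinned(func() {
				runtime.Gosched() // yield; the goroutine stays on its thread
				mu.Lock()
				done++
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	return done
}

func main() {
	fmt.Println("pinned calls completed:", runPinned(4))
}
```

While a goroutine is locked to a thread, no other goroutine runs on that thread, so pinning is best kept to the shortest span that covers the lock and unlock pair.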
  13. Demo time! See what happens when the Go scheduler relocates our Pythonic goroutines (spoiler alert: it’ll crash your software)
    • https://github.com/masci/golab17/tree/master/03
  14. Beyond embedding: extending!
    • Once you have an embedded interpreter, you can extend Python capabilities with Go code
    • This involves a little bit of C so no demo here
    • Still very easy to achieve, Python scripts import a module that actually lives in memory and points to Go instructions
  15. Extending Python: the Go code
    //export MyGoFunc
    func MyGoFunc() {
        fmt.Println("Hello, World from Go!")
    }
  16. Extending Python: the C code
    static PyMethodDef MyMethods[] = {
        {"my_func", (PyCFunction)MyGoFunc, METH_VARARGS, "YAY!"},
        {NULL, NULL} // guards
    };

    PyObject *m = Py_InitModule("my_module", MyMethods);
  17. Extending Python: the Python code
    # WARNING! This only works on the embedded interpreter
    import my_module
    my_module.my_func()  # prints "Hello, World from Go!"
  18. Lessons learned: the good
    • Embedded Python plays nice with Go concurrency model
    • From/To Python overhead is negligible
      ◦ BenchmarkCallPyFunc 300000 3606 ns/op
    • Extending Python is a very powerful tool
      ◦ Expose functions and data to the Python world
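The benchmark line on this slide reads as 300000 iterations at 3606 ns each, roughly 1.08 s of measured time. As a rough sketch of how a line in that shape is produced, the program below uses `testing.Benchmark` from the standard library. `callPyFunc` is a hypothetical placeholder that does not touch an interpreter, so the numbers it prints say nothing about the real Go-to-Python overhead; only the harness shape is illustrated.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the placeholder call from being optimized away.
var sink int

// callPyFunc is a hypothetical stand-in for one round trip into the
// embedded interpreter; the real call is not shown in the slides.
func callPyFunc() { sink++ }

// benchmarkCallPyFunc has the usual benchmark shape: repeat the call b.N times.
func benchmarkCallPyFunc(b *testing.B) {
	for i := 0; i < b.N; i++ {
		callPyFunc()
	}
}

// measure runs the benchmark outside of "go test" and returns the result.
func measure() testing.BenchmarkResult {
	return testing.Benchmark(benchmarkCallPyFunc)
}

func main() {
	r := measure()
	// Same layout as the slide: name, iterations, nanoseconds per operation.
	fmt.Printf("BenchmarkCallPyFunc %d %d ns/op\n", r.N, r.NsPerOp())
}
```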
  19. Lessons learned: the bad
    • The GIL prevents Python parallel execution
      ◦ This was expected
    • The GIL feels the effects of the Go scheduler
      ◦ Honestly, didn’t see this coming
    • Using multiple interpreters doesn’t help
      ◦ They share a single GIL
  20. Lessons learned: the ugly
    • You must carry along some C code, how much depending on the use case
    • You will likely carry along some Python code too, to offer base classes and utilities to external modules running in embedded mode
  21. What we have now
    • Datadog Agent 6.0.0-beta1
      ◦ Embedded CPython 2.7.13
      ◦ Linux, OSX and Windows
      ◦ We now run checks concurrently, knowing that many of them will wait for each other...
      ◦ ...even if some of them were ported to Go, so we also have some parallelism
  22. Thanks for listening!
    • Try Datadog at https://www.datadoghq.com
    • Find our OSS on https://github.com/DataDog
    • Our tech blog is http://engineering.datadoghq.com
    Check out the new Agent on GitHub! https://github.com/DataDog/datadog-agent/
  23. Embedding: an example
    package main

    // #cgo pkg-config: python-2.7
    // #include <Python.h>
    // #include <stdlib.h>
    import "C"

    import "unsafe"

    func main() {
        C.Py_Initialize()
        defer C.Py_Finalize()

        cmd := C.CString("print 'Hello, World!'")
        defer C.free(unsafe.Pointer(cmd)) // cmd must be freed!
        C.PyRun_SimpleString(cmd)
    }
    • we still do go build and that’s it.
  24. Embedding: a better example
    • Use go-python to eliminate boilerplate
      ◦ https://github.com/sbinet/go-python
    import "github.com/sbinet/go-python"

    python.Initialize()
    python.PyRun_SimpleString("print 'Hello, World!'")
    python.Finalize()