A Python and a Gopher walk into a bar - Embedding Python in Go.

Success stories about rewriting Python applications in Go are not big news anymore. The pros and cons are well known, best practices are in place, and the standard library is there to help. But what if you want to keep some of your Python code? When we chose to port the Datadog Agent to Go, we needed to maintain support for our existing library of plugins written in Python. During the talk we will share lessons learned from our experiences with cgo, the GIL and the quest for performance as we bridge multiple languages in a single application.

Massimiliano Pippi

October 05, 2017

Transcript

  1. A Python and a Gopher walk into a bar -
    Embedding Python in Go.
    Massimiliano Pippi

  2. Hi, I’m Massi!
    ● Software Engineer
    ○ C++, Python and Go
    ● OSS fan and contributor
    ● 1+ year at Datadog, working on Agent and Integrations

  3. About Datadog
    ● SaaS-based infrastructure, app and logs monitoring
    ● Open Source Agent
    ● Time series data (metrics and events)
    ● Intelligent Alerting and Insightful Dashboards
    ● Trillions of data points processed per day

  4. Monitor everything

  5. Meet the Datadog Agent
    Agent
    check
    ● Written in Python
    ● Open Source,
    https://github.com/DataDog/dd-agent

  6. The anatomy of a check
    import psutil
    from checks import AgentCheck
    class SystemSwap(AgentCheck):
        def check(self, instance):
            swap_mem = psutil.swap_memory()
            self.rate('system.swap.swapped_in', swap_mem.sin)
            self.rate('system.swap.swapped_out', swap_mem.sout)
            self.gauge('system.swap.total', swap_mem.total)
            self.gauge('system.swap.used', swap_mem.used)

  7. The way to Go - Our Goals
    ● Make it smaller, faster, stronger
    ● Keep Python as an extension language
    ○ ~75 Python checks currently part of the official package
    ○ Python is the right tool to implement most of them
    ○ Undetermined number of custom checks in the wild
    ○ Update a check without recompiling the Agent

  8. Embedding for the win
    ● Python can be embedded: you keep an interpreter in
    memory and make it run Python code at will
    ● CPython provides a C API to allow embedding
    ● Cgo enables the creation of Go packages that call C
    code.

  9. Demo time!
    Let’s run a Python module from a Go application
    ● https://github.com/masci/golab17/tree/master/01

  10. The dreadful GIL
    ● GIL stands for global interpreter lock
    ● Prevents multiple threads from executing
    Python code at once
    ● Yes, even if those threads are run on a
    multi-core processor

  11. The dreadful GIL
    ● Embedding CPython means embedding the GIL
    ● The GIL knows about threads created from
    Python code...
    ● ...but the GIL can’t do its job if we run Python
    from separate Go threads!

  12. Demo time!
    Run Python in different goroutines and watch the
    world burn
    ● https://github.com/masci/golab17/tree/master/02
    ● Rule of thumb: any time you use Python in
    some piece of code that could be executed in
    a separate thread, lock the GIL!
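
The rule of thumb can be sketched in pure Go, with a `sync.Mutex` standing in for the GIL. This is an analogy rather than the demo code: the `interpreter` type, its `run` method and `runChecks` are hypothetical names invented here. With the real CPython C API, the corresponding acquire/release pair would be `PyGILState_Ensure` and `PyGILState_Release`.

```go
package main

import (
	"fmt"
	"sync"
)

// interpreter is a hypothetical stand-in for an embedded interpreter whose
// internal state is not safe for concurrent use.
type interpreter struct {
	gil   sync.Mutex // plays the role of the GIL in this sketch
	calls int        // shared state that the lock protects
}

// run takes the lock, executes fn, and releases the lock, so only one
// goroutine at a time touches the interpreter state.
func (in *interpreter) run(fn func()) {
	in.gil.Lock()
	defer in.gil.Unlock()
	in.calls++
	fn()
}

// runChecks starts n goroutines that all go through run, then reports how
// many calls the interpreter recorded.
func runChecks(in *interpreter, n int) int {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			in.run(func() {})
		}()
	}
	wg.Wait()
	return in.calls
}

func main() {
	in := &interpreter{}
	fmt.Println(runChecks(in, 100))
}
```

Without the lock around `calls++`, the goroutines would race on the shared counter, which is the pure-Go equivalent of the burning world in the demo.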

  13. The revenge of the dreadful GIL
    ● We lock and unlock the GIL in goroutines, not
    threads
    ● The GIL protects a specific thread state, so we
    cannot lock/unlock it from different threads!
    ● But the Go scheduler might pause and resume
    goroutines in different threads

  14. Demo time!
    See what happens when the Go scheduler
    relocates our Pythonic goroutines (spoiler alert:
    it’ll crash your software)
    ● https://github.com/masci/golab17/tree/master/03
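
One common way to keep thread-affine work on a single OS thread in Go is `runtime.LockOSThread`. The sketch below is an assumption about how such pinning can be arranged, not the code from the demo: `pinnedWorker`, `newPinnedWorker`, `Do` and `Close` are hypothetical names. A dedicated goroutine is locked to its thread and runs every submitted job there, so a lock taken inside a job is always released on the same thread.

```go
package main

import (
	"fmt"
	"runtime"
)

// pinnedWorker runs every submitted job on one goroutine that is locked
// to a single OS thread for its whole lifetime.
type pinnedWorker struct {
	jobs chan func()
}

// newPinnedWorker starts the worker goroutine and pins it to its thread.
func newPinnedWorker() *pinnedWorker {
	w := &pinnedWorker{jobs: make(chan func())}
	go func() {
		runtime.LockOSThread() // the scheduler may no longer move this goroutine
		defer runtime.UnlockOSThread()
		for job := range w.jobs {
			job()
		}
	}()
	return w
}

// Do runs fn on the pinned thread and blocks until fn has returned.
func (w *pinnedWorker) Do(fn func()) {
	done := make(chan struct{})
	w.jobs <- func() {
		defer close(done)
		fn()
	}
	<-done
}

// Close stops the worker goroutine once no more jobs will be submitted.
func (w *pinnedWorker) Close() { close(w.jobs) }

func main() {
	w := newPinnedWorker()
	defer w.Close()

	total := 0
	for i := 1; i <= 3; i++ {
		i := i
		w.Do(func() { total += i })
	}
	fmt.Println(total)
}
```

The trade-off is that all jobs sent to one worker run sequentially, which matches what the GIL would enforce for Python code anyway.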

  15. Beyond embedding: extending!
    ● Once you have an embedded interpreter, you
    can extend Python capabilities with Go code
    ● This involves a little bit of C so no demo here
    ● Still very easy to achieve: Python scripts
    import a module that actually lives in memory
    and points to Go instructions

  16. Extending Python: the Go code
    //export MyGoFunc
    func MyGoFunc() {
        fmt.Println("Hello, World from Go!")
    }

  17. Extending Python: the C code
    static PyMethodDef MyMethods[] = {
        // MyGoFunc is the function exported from the Go code
        {"my_func", (PyCFunction)MyGoFunc, METH_VARARGS, "YAY!"},
        {NULL, NULL, 0, NULL} // sentinel marking the end of the table
    };
    PyObject *m = Py_InitModule("my_module", MyMethods);

  18. Extending Python: the Python code
    # WARNING! This only works on the embedded interpreter
    import my_module
    my_module.my_func() # prints "Hello, World from Go!"

  19. Lessons learned: the good
    ● Embedded Python plays nicely with Go's
    concurrency model
    ● The overhead of calling into and out of Python is negligible
    ○ BenchmarkCallPyFunc 300000 3606 ns/op
    ● Extending Python is a very powerful tool
    ○ Expose functions and data to the Python world
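
A figure such as `300000 3606 ns/op` is the standard output of Go's benchmark harness: 300000 iterations at about 3.6 µs each, roughly 1.08 s of measured time in total. The sketch below shows how a number of that kind is typically obtained. `callPyFunc` is a hypothetical placeholder that does trivial pure-Go work instead of a real round trip into Python, so its timing says nothing about the talk's result.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the compiler from optimizing the benchmarked call away.
var sink int

// callPyFunc is a hypothetical placeholder for a Go -> Python -> Go round
// trip; here it only does trivial work.
func callPyFunc(x int) int { return x + 1 }

// benchmarkCall runs the placeholder under the standard benchmark harness,
// which chooses the iteration count b.N on its own.
func benchmarkCall() testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = callPyFunc(i)
		}
	})
}

func main() {
	r := benchmarkCall()
	fmt.Printf("%d iterations, %d ns/op\n", r.N, r.NsPerOp())
}
```

In a real project the same loop would normally live in a `_test.go` file as a `func BenchmarkCallPyFunc(b *testing.B)` and be run with `go test -bench .`.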

  20. Lessons learned: the bad
    ● The GIL prevents Python parallel execution
    ○ This was expected
    ● The GIL feels the effects of the Go scheduler
    ○ Honestly, didn’t see this coming
    ● Using multiple interpreters doesn’t help
    ○ They all share a single GIL

  21. Lessons learned: the ugly
    ● You have to maintain some C code; how much
    depends on the use case
    ● You will likely maintain some Python code too,
    to offer base classes and utilities to external
    modules running in embedded mode

  22. What we have now
    ● Datadog Agent 6.0.0-beta1
    ○ Embedded CPython 2.7.13
    ○ Linux, OSX and Windows
    ○ We now run checks concurrently, knowing that many of
    them will wait for each other...
    ○ ...but some of them were ported to Go, so we also
    have some parallelism

  23. Thanks for listening!
    ● Try Datadog at https://www.datadoghq.com
    ● Find our OSS on https://github.com/DataDog
    ● Our tech blog is http://engineering.datadoghq.com
    Check out the new Agent on GitHub!
    https://github.com/DataDog/datadog-agent/

  24. WE’RE HIRING!
    NYC, PARIS, Remote

  25. Embedding: an example
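    package main
    // #cgo pkg-config: python-2.7
    // #include <Python.h>
    // #include <stdlib.h>
    import "C"
    import "unsafe"
    func main() {
        C.Py_Initialize()
        cmd := C.CString("print 'Hello, World!'")
        C.PyRun_SimpleString(cmd)
        C.free(unsafe.Pointer(cmd)) // cmd is allocated by C and must be freed
        C.Py_Finalize()
    }
    ● We still just run go build, and that’s it.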

  26. Embedding: a better example
    ● Use go-python to eliminate boilerplate
    ○ https://github.com/sbinet/go-python
    import "github.com/sbinet/go-python"
    python.Initialize()
    python.PyRun_SimpleString("print 'Hello, World!'")
    python.Finalize()
