Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Go implementation for Flatdata

Go implementation for Flatdata

Once in the past, routing team at HERE Technologies was in the situation when routing backend was in danger of failing SLA. It’s triggered a massive work to improve its performance. As a side effect of this work, https://github.com/heremaps/flatdata was born: zero-copy memory-mapped data storage.

This talk will be about sharing an experience of creation of Go implementation for Flatdata: zero-copy memory-mapped data storage.

We will compare Go implementation with implementations for other languages. I will show in which cases Go has advantages compared to other languages and opposite cases (hi, generics!). We will look also at the performance aspect of different implementations.

Avatar for Artem Nikitin

Artem Nikitin

May 23, 2018
Tweet

More Decks by Artem Nikitin

Other Decks in Programming

Transcript

  1. A long time ago in a galaxy far, far away....

    Routing team at HERE Technologies was in the situation when routing backend was in danger of failing SLA. It’s triggered a massive work to improve its performance. As a side effect of this work, Flatdata was born: zero-copy memory-mapped data storage © 2018 HERE | Public DevDays Vilnius | May, 2018
  2. Routing • Creating a path from point A to point

    B • Hmm… It looks familiar… • Can we represent a map as a graph and then use Dijkstra? © 2018 HERE | Public DevDays Vilnius | May, 2018
  3. Routing • It looks easy, but reality is different •

    What about negative “turn costs”? • What to do with time awareness? • How to handle dynamic info? • Usually, different algorithms are used for different cases © 2018 HERE | Public DevDays Vilnius | May, 2018
  4. Go • Initially developed by: Ken Thompson (C, Unix) Rob

    Pike (Plan9, UTF-8) Robert Griesemer (V8) • Statically typed, compiled • With GC • Created with support for concurrency in mind © 2018 HERE | Public DevDays Vilnius | May, 2018
  5. Go Why do we need yet another language? • Fast

    compilation • Simple syntax and quick start • One code style! © 2018 HERE | Public DevDays Vilnius | May, 2018
  6. Go What is written in Go? • Docker • Kubernetes

    • HashiCorp tools • Prometheus and many more… © 2018 HERE | Public DevDays Vilnius | May, 2018
  7. Go What I like in Go: • Simplicity and even

    primitivity No more AbstractSingletonProxyFactoryBean ever J • Code which is easy to understand © 2018 HERE | Public DevDays Vilnius | May, 2018
  8. Go What I DON’T like about Go: • Dependency management

    • Ecosystem isn’t that good comparing to more mature languages, like Java © 2018 HERE | Public DevDays Vilnius | May, 2018
  9. Go If you are interested, then: • https://tour.golang.org • The

    Go Programming Language (Alan A. A. Donovan, Brian W. Kernighan) © 2018 HERE | Public DevDays Vilnius | May, 2018
  10. What is Flatdata? https://github.com/heremaps/flatdata Flatdata is a library providing data

    structures for convenient creation, storage and access of packed memory-mappable immutable data structures with minimal overhead © 2018 HERE | Public DevDays Vilnius | May, 2018
  11. Why Flatdata? Existed data format was optimized for embedded clients:

    • Reduced space consumption (Disk) • Incremental processing tile-by-tile (CPU) © 2018 HERE | Public DevDays Vilnius | May, 2018
  12. Why Flatdata? Drawbacks for backend: • Decoding tiles and reconstructing

    world scene (CPU Time) • Custom tile cache not shared between processes (RAM) © 2018 HERE | Public DevDays Vilnius | May, 2018
  13. Solution • Store the data in directly-accessible memory-mapped files •

    Store it sorted to preserve locality • Use constant-time lookups wherever possible • Avoid unnecessary copying of data © 2018 HERE | Public DevDays Vilnius | May, 2018
  14. When it's useful? • Your data updates infrequently and accessed

    much more often than updated • You can afford to recreate the full data archive on every update • Your data fits in the RAM on target instance • You want to optimize your data to be cache-friendly © 2018 HERE | Public DevDays Vilnius | May, 2018
  15. Alternatives We looked at several alternatives with Flatbuffers as most

    well-known. We prototyped a routing graph with 10M nodes and compared our solution with Flatbuffers © 2018 HERE | Public DevDays Vilnius | May, 2018
  16. Prototype Graph creation © 2018 HERE | Public DevDays Vilnius

    | May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 1 936.2096 1 486 116 reference 1 446.0785 743 640
  17. Prototype Dijkstra © 2018 HERE | Public DevDays Vilnius |

    May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 62 526.3584 1 735 836 reference 57 970.6155 1 718 032
  18. Prototype BFS © 2018 HERE | Public DevDays Vilnius |

    May, 2018 cpu_time (ms) peak memory consumption (kb) flatbuffers 10 997.4264 1 163 352 reference 9 300.9597 1 144 048
  19. Prototype © 2018 HERE | Public DevDays Vilnius | May,

    2018 • Double memory footprint during graph compilation due to immutability of Flatbuffers • Performance is a little bit worse then our solution • Possibilities for fine-grained optimizations are limited
  20. Flatdata Library consists of: • Schema language • Code generator

    for C++, Python and Go • Target language libraries. © 2018 HERE | Public DevDays Vilnius | May, 2018
  21. Implementations comparison © 2018 HERE | Public DevDays Vilnius |

    May, 2018 C++ feature complete very efficient used in production has writer Python feature complete slow debugging, tooling etc. no writer Go feature complete performance unclear beta no writer
  22. Current status of Go implementation • It's feature complete comparing

    to C++/Python • It's passes tests for C++/Python implementations • Probably, a lots of bugs missed :) • Performance is unclear • No guarantees of backward compatible changes for public API • No writer :( © 2018 HERE | Public DevDays Vilnius | May, 2018