Slide 1

Slide 1 text

Go at Cybozu @ymmt2005 Go Conference 2018 Spring

Slide 2

Slide 2 text

Me ▌@ymmt2005 ▌Love computers  and Go ☺ ▌Currently working as  Architect, and  Project manager

Slide 3

Slide 3 text

Cybozu ▌No.1 Groupware vendor in Japan  for Japan  for world-wide

Slide 4

Slide 4 text

New projects at Cybozu ▌Neco  Rearchitecting with  Kubernetes and  CLOS + Pure L3 network ▌Yakumo  Migrate to AWS Here I am

Slide 5

Slide 5 text

History of Go usage ▌2013-08  Start using Go to replace Python services ▌2014-02  First Go OSS: kintone Go SDK ▌2016-02  Org. for Go OSS:

Slide 6

Slide 6 text

Go usage right now ▌In-house  30+ programs  40K LoC ▌Open Source  12 repositories ▌People  4 teams  20+ developers

Slide 7

Slide 7 text

Go at Yakumo ▌Yakumo consists of 50%+ Go! 50.5% ⇒

Slide 8

Slide 8 text

Why do we prefer Go? ▌Statically typed, yet compiles fast ▌Small and container-friendly executables ▌Concurrent programming made easy ▌Less to learn than C++, Java, …

Slide 9

Slide 9 text

Managing Go Code

Slide 10

Slide 10 text

Problem ▌We have a lot of Go products  using a lot of third-party libraries  to run in a production environment. ▌We should be able to:  update third-party libraries easily,  reproduce the executable reliably, and  keep enough quality for production use.

Slide 11

Slide 11 text

Our solutions 1. Mono-repository 2. Mirroring 3. Frameworks

Slide 12

Slide 12 text

Mono-repository ▌We maintain a single Git repository  to share library packages  between all programs used in-house ▌No package-local vendoring ▌Updating a library is just easy!

Slide 13

Slide 13 text

Mirroring ▌We use `git subtree` to mirror packages from the Internet,  including our own. ▌By mirroring, we can:  tolerate network failures,  tolerate repository deletions, and  reproduce the same executable.

Slide 14

Slide 14 text

Frameworks ▌No, it doesn’t mean Web frameworks. ▌We think production-grade programs should:  leak no resources,  output enough logs for troubleshooting,  have proper timeouts, and  exit / restart gracefully.

Slide 15

Slide 15 text

Frameworks (contd) ▌To help create production-grade Go products, we have two frameworks:  standardize log fields and formats.  manage goroutines, logging, signals, etc. ▌Both are open source.

Slide 16

Slide 16 text

Our Go Products

Slide 17

Slide 17 text ▌Structured logging framework (≠ library) ▌Support three formats:  plain, logfmt, JSON Lines ▌Simple and very fast  449K log/s in JSON  Plain is a bit slow because it sorts fields.

Slide 18

Slide 18 text ▌ This framework imposes the use of context in virtually all goroutines. ▌ Features  Standardized signal handling  Graceful restart of servers  Logging using cybozu-go/log  etc.

Slide 19

Slide 19 text

transocks and usocksd ▌  Redirect outgoing TCP connections to a SOCKS or HTTP proxy transparently  Use iptables instead of LD_PRELOAD  works for programs independent of libc. ▌  SOCKS4/5 server (and library)  Dynamic IP deselection using DNSBL

Slide 20

Slide 20 text ▌Provide alternatives to  apt-cacher-ng  apt-mirror ▌Features  No inconsistent cache/mirror!

Slide 21

Slide 21 text ▌Virtual data center construction tool  A product from Neco project to simulate CLOS & BGP network  Under active development ▌Features  Stateless (unlike libvirt)  cloud-init & Ignition  UEFI boot  …

Slide 22

Slide 22 text

Vermeer ▌Not open source yet ▌Thumbnail generation service  supports JPEG, GIF, TIFF, PNG, BMP ▌Pure Go implementation  using  to minimize vulnerabilities

Slide 23

Slide 23 text

Logshipper ▌Not open source yet ▌Export logs to Kafka  at least once. ▌That’s all!

Slide 24

Slide 24 text

We were so successful using Go, but

Slide 25

Slide 25 text

Journald Journald Journald …

Slide 26

Slide 26 text

We are running Go programs as systemd services.

Slide 27

Slide 27 text

systemd unit file to run transocks

[Unit]
Description=transocks

[Service]
Type=simple
Restart=on-failure
ExecStart=/path/to/transocks

Slide 28

Slide 28 text

One day, transocks exited silently. systemd did not restart it.

Slide 29

Slide 29 text

This led us to HUGE service breakage.

Slide 30

Slide 30 text

What happened
1. Journald died.
2. Go got EPIPE when writing to stderr and sent SIGPIPE to itself.
3. transocks died with SIGPIPE.
4. systemd did not restart transocks because it does not treat an exit caused by SIGPIPE as a failure!

Slide 31

Slide 31 text

Journald dies ▌Journald is not PID 1. ▌So, the process may be killed, for example, by OOM killer. ▌Journald had bugs that killed it.  One bug was fixed by us:

Slide 32

Slide 32 text

Go and SIGPIPE ▌Go handles SIGPIPE itself so that writes to broken sockets or pipes return EPIPE errors. ▌For stdout & stderr, Go raises SIGPIPE manually when a write fails with EPIPE, which terminates the process.

Slide 33

Slide 33 text

systemd and SIGPIPE ▌The SuccessExitStatus directive defines what should be considered a successful exit, in addition to the defaults. ▌The defaults are exit code 0, SIGHUP, SIGINT, SIGTERM, and SIGPIPE!

Slide 34

Slide 34 text

So, Restart=on-failure did not help.

Slide 35

Slide 35 text

Our recommendations
▌Output logs to files rather than journald.
▌Adjust OOM score of journald:
$ cat /etc/systemd/system/systemd-journald.service.d/oom_score_adj.conf
[Service]
OOMScoreAdjust=-1000
▌Add this line to your service unit files:
RestartForceExitStatus=SIGPIPE

Slide 36

Slide 36 text

Thank you for listening! Meet us at:
