Slide 1

Slide 1 text

Cloning Git in Go
Aditya Mukerjee
Risk & Systems Engineer at Stripe

Slide 2

Slide 2 text

Why write Gitgo?
• Git operations from Go applications
• Portability
• Simplicity
• …why not?
@chimeracoder

Slide 3

Slide 3 text

The dirty secret about Git
• Git wasn’t always meant to be a version control system
• Git is a decentralized, userland filesystem for preserving historical state and synchronizing data

Slide 4

Slide 4 text

What We’ve Enabled
• Backup
• Content management
• Distributed build systems
• Establishing consensus

Slide 5

Slide 5 text

Have you ever seen a “friend’s” computer that looked like this?

Slide 6

Slide 6 text

107724969b2af751a1ffd74e3f38d0d76cb8aa55

Slide 7

Slide 7 text

[Diagram labels: commits, trees, blobs, objects]

Slide 8

Slide 8 text

Discovery #1: Layered interfaces are more flexible than union types
[Diagram labels: commits, trees, blobs, objects]
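One way to read this discovery, as a minimal sketch: a small base interface covers every object, and narrower interfaces are layered on top for the kinds that can do more. The names Object, Lister, Blob, Tree, and describe are hypothetical and chosen for illustration; they are not taken from Gitgo.

```go
package main

import "fmt"

// Object is the base layer: behavior shared by every stored object.
// (Hypothetical name, for illustration only.)
type Object interface {
	Kind() string
}

// Lister is a second layer: an Object that can also list named entries.
type Lister interface {
	Object
	Entries() []string
}

// Blob satisfies only the base layer.
type Blob struct{ Data []byte }

func (Blob) Kind() string { return "blob" }

// Tree satisfies both layers.
type Tree struct{ Names []string }

func (Tree) Kind() string        { return "tree" }
func (t Tree) Entries() []string { return t.Names }

// describe accepts any Object and uses the richer layer when it is available,
// instead of switching over a closed set of concrete types.
func describe(o Object) string {
	if l, ok := o.(Lister); ok {
		return fmt.Sprintf("%s with %d entries", o.Kind(), len(l.Entries()))
	}
	return o.Kind()
}

func main() {
	fmt.Println(describe(Blob{Data: []byte("I love Go")}))
	fmt.Println(describe(Tree{Names: []string{"README", "main.go"}}))
}
```

Adding a new object kind here means adding a type that satisfies Object; describe does not need to change.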

Slide 9

Slide 9 text

Bootstrapping Git in Go
“We can just use the real .git directory for our tests, right?”
…nope.

Slide 10

Slide 10 text

Packfiles
hash(“I love Go”) != hash(“I love Go!”)
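The inequality above can be checked directly. This sketch assumes Git's SHA-1 object format, in which a blob's ID is the SHA-1 of the header "blob <size>", a NUL byte, and then the content. The function name blobHash is made up for this example.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// blobHash returns the hex object ID Git assigns to a blob with this content,
// assuming the SHA-1 object format: sha1("blob <size>\x00" + content).
func blobHash(content []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	a := blobHash([]byte("I love Go"))
	b := blobHash([]byte("I love Go!"))
	fmt.Println(a)
	fmt.Println(b)
	fmt.Println("different:", a != b)
}
```

Because one extra byte produces an unrelated ID, near-identical contents are stored as separate objects unless something like a packfile stores one as a delta of the other.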

Slide 11

Slide 11 text

Reference: Unpacking Git Packfiles
binarydiff(1, 2, 3)
1. “I love Go”
2. “I love Go!”
3. “You love Go, and so do I!”
“I love Go” + “!”
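The “I love Go” + “!” idea can be sketched as a list of copy and insert instructions applied to a base object. This is a deliberately simplified model, not the real packfile delta encoding (which packs instructions into a compact binary form); the op type and the apply function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// op is one delta instruction in this simplified model:
// if Insert is non-nil, append those bytes; otherwise copy
// Len bytes from the base starting at Offset.
type op struct {
	Offset, Len int
	Insert      []byte
}

// apply rebuilds a target object from a base object and a list of instructions.
func apply(base []byte, ops []op) ([]byte, error) {
	var out []byte
	for _, o := range ops {
		if o.Insert != nil {
			out = append(out, o.Insert...)
			continue
		}
		if o.Offset < 0 || o.Len < 0 || o.Offset+o.Len > len(base) {
			return nil, errors.New("copy instruction out of range")
		}
		out = append(out, base[o.Offset:o.Offset+o.Len]...)
	}
	return out, nil
}

func main() {
	base := []byte("I love Go")
	// Object 2 is stored as "all of object 1, plus one extra byte".
	delta := []op{{Offset: 0, Len: len(base)}, {Insert: []byte("!")}}
	target, err := apply(base, delta)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s\n", target)
}
```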

Slide 12

Slide 12 text

Discovery #2: Pay attention to interface contracts
[Diagram labels: zlib data, zlib data]

Slide 13

Slide 13 text

type Reader interface {
    Read(p []byte) (n int, err error)
}
[Diagram labels: Object 1, Object 2, Object 3, Object 4, …]
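A plausible reading of these two slides: a packfile holds several zlib streams back to back, and the io.Reader contract alone does not say how many bytes a decompressor will consume from its source. The compress/zlib documentation notes that the decompressor may read more data than necessary unless the source also implements io.ByteReader. The sketch below relies on that documented behavior by wrapping the input in a bufio.Reader; the helper names compress and readStreams are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// compress returns s as a single zlib stream.
func compress(s string) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	w.Close()
	return buf.Bytes()
}

// readStreams decodes n zlib streams stored back to back in data.
func readStreams(data []byte, n int) ([]string, error) {
	// bufio.Reader implements io.ByteReader, so each zlib reader stops
	// consuming exactly at the end of its own stream.
	br := bufio.NewReader(bytes.NewReader(data))
	var out []string
	for i := 0; i < n; i++ {
		zr, err := zlib.NewReader(br)
		if err != nil {
			return nil, err
		}
		payload, err := io.ReadAll(zr)
		zr.Close()
		if err != nil {
			return nil, err
		}
		out = append(out, string(payload))
	}
	return out, nil
}

func main() {
	packed := append(compress("I love Go"), compress("I love Go!")...)
	objects, err := readStreams(packed, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, o := range objects {
		fmt.Printf("object %d: %s\n", i+1, o)
	}
}
```

The design point is that the same bufio.Reader is shared across all streams, so the position where one object ends is the position where the next begins.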

Slide 14

Slide 14 text

Discovery #3: Use concurrency to define state machines, and use channels for the input/output
[Diagram labels: f1, f3, f2, g, g1, if f, g2]
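As a rough illustration of this discovery, the sketch below parses a Git-style object header ("blob 3", a NUL byte, then the body) with a state machine in which each state is a function returning the next state. Input arrives on one channel and parsed fields leave on another. All names (stateFn, parser, parse, and the state functions) are hypothetical and not taken from Gitgo.

```go
package main

import (
	"fmt"
	"strconv"
)

// stateFn handles one state and returns the next; nil means "stop".
type stateFn func(*parser) stateFn

type parser struct {
	in   <-chan byte   // input bytes
	out  chan<- string // parsed fields
	size int           // body size announced by the header
}

// readType consumes bytes up to the first space.
func readType(p *parser) stateFn {
	var kind []byte
	for b := range p.in {
		if b == ' ' {
			p.out <- "type=" + string(kind)
			return readSize
		}
		kind = append(kind, b)
	}
	return nil
}

// readSize consumes decimal digits up to the NUL separator.
func readSize(p *parser) stateFn {
	var digits []byte
	for b := range p.in {
		if b == 0 {
			n, err := strconv.Atoi(string(digits))
			if err != nil || n < 0 {
				p.out <- "error=bad size"
				return nil
			}
			p.size = n
			p.out <- "size=" + string(digits)
			return readBody
		}
		digits = append(digits, b)
	}
	return nil
}

// readBody consumes the rest of the input and checks it against the size.
func readBody(p *parser) stateFn {
	var body []byte
	for b := range p.in {
		body = append(body, b)
	}
	if len(body) != p.size {
		p.out <- "error=size mismatch"
		return nil
	}
	p.out <- "body=" + string(body)
	return nil
}

// parse runs the state machine in its own goroutine and collects its output.
func parse(input []byte) []string {
	in := make(chan byte)
	out := make(chan string)
	go func() {
		for _, b := range input {
			in <- b
		}
		close(in)
	}()
	go func() {
		p := &parser{in: in, out: out}
		for state := stateFn(readType); state != nil; {
			state = state(p)
		}
		for range in { // drain leftovers so the feeder goroutine can exit
		}
		close(out)
	}()
	var events []string
	for e := range out {
		events = append(events, e)
	}
	return events
}

func main() {
	for _, e := range parse([]byte("blob 3\x00abc")) {
		fmt.Println(e)
	}
}
```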

Slide 15

Slide 15 text

Discovery #4: Treat errors as values, and ask how they behave
[Diagram labels: httpdir.Open(), ParsePackfile(), GitFetch()]
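One common way to "ask how an error behaves" in Go is to test it against a small behavior interface rather than against a concrete type. The sketch below assumes a hypothetical fetchError type and a hypothetical isTemporary helper; neither is from Gitgo, and the function names on the slide are not modeled here.

```go
package main

import (
	"errors"
	"fmt"
)

// temporary describes a behavior, not a concrete error type.
type temporary interface {
	Temporary() bool
}

// fetchError is a hypothetical error value carrying retry information.
type fetchError struct {
	op    string
	retry bool
}

func (e *fetchError) Error() string   { return e.op + " failed" }
func (e *fetchError) Temporary() bool { return e.retry }

// isTemporary reports whether any error in err's chain says it is temporary.
func isTemporary(err error) bool {
	var t temporary
	return errors.As(err, &t) && t.Temporary()
}

func main() {
	// Each layer wraps the error below it; the behavior survives the wrapping.
	low := &fetchError{op: "read packfile", retry: true}
	wrapped := fmt.Errorf("fetch: %w", low)

	fmt.Println(wrapped)
	fmt.Println("retry?", isTemporary(wrapped))
	fmt.Println("retry?", isTemporary(errors.New("corrupt object")))
}
```

Callers several layers up can decide whether to retry without knowing which layer produced the error.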

Slide 16

Slide 16 text

Discovery #5: Well-architected interfaces perform very well

Slide 17

Slide 17 text

#1: Layered interfaces are more flexible than union types
#2: Pay attention to interface contracts
#3: Use concurrency to define state machines, and use channels for the input/output
#4: Treat errors as values, and ask how they behave
#5: Well-architected interfaces perform very well

Slide 18

Slide 18 text

Further References
• “Unpacking Git Packfiles”: https://codewords.recurse.com/issues/three/unpacking-git-packfiles/
• Gitgo: https://github.com/ChimeraCoder/gitgo/
• “Git from the Inside Out”: https://codewords.recurse.com/issues/two/git-from-the-inside-out

Slide 19

Slide 19 text

Aditya Mukerjee
@chimeracoder
https://github.com/ChimeraCoder