Cloning Git in Go

Gitgo began as an experiment: how would programming today be different if Git had been written in Go instead of C?

The result: an implementation of Git that is compatible with existing Git repositories, but simpler, more portable, and fast enough to be used as a general-purpose data store. With Gitgo, Git is no longer just for managing source code. Go makes Git a practical choice for content distribution, distributed build systems, establishing consensus, and more.

Aditya Mukerjee

November 18, 2016

Transcript

  1. Cloning Git in Go: Aditya Mukerjee, Risk & Systems Engineer at Stripe
  2. Why write Gitgo? • Git operations from Go applications • Portability • Simplicity • …why not? @chimeracoder
  3. The dirty secret about Git • Git wasn’t always meant to be a version control system • Git is a decentralized, userland filesystem for preserving historical state and synchronizing data
  4. What We’ve Enabled • Backup • Content management • Distributed build systems • Establishing consensus
  5. Have you ever seen a “friend’s” computer that looked like this?
  6. 107724969b2af751a1ffd74e3f38d0d76cb8aa55
  7. [Diagram: commits, trees, blobs, objects]
  8. Discovery #1: Layered interfaces are more flexible than union types [Diagram: commits, trees, blobs, objects]
  9. Bootstrapping Git in Go: “We can just use the real .git directory for our tests, right?” …nope.
  10. Packfiles: hash(“I love Go”) != hash(“I love Go!”)
  11. Reference: Unpacking Git Packfiles • binarydiff(1, 2, 3) • 1. “I love Go” • 2. “I love Go!” • 3. “You love Go, and so do I!” • “I love Go”; <string 1> + “!”
  12. Discovery #2: Pay attention to interface contracts [Diagram: zlib data, <arbitrary data>, zlib data, <arbitrary data>]
  13. type Reader interface { Read(p []byte) (n int, err error) } [Diagram: Object 1, Object 2, Object 3, Object 4, …]
  14. Discovery #3: Use concurrency to define state machines, and use channels for the input/output [Diagram: f1 f3 f2 g g1 if f g2]
  15. Discovery #4: Treat errors as values, and ask how they behave [Diagram: httpdir.Open(), ParsePackfile(), GitFetch()]
  16. Discovery #5: Well-architected interfaces perform very well
  17. #1: Layered interfaces are more flexible than union types • #2: Pay attention to interface contracts • #3: Use concurrency to define state machines, and use channels for the input/output • #4: Treat errors as values, and ask how they behave • #5: Well-architected interfaces perform very well
  18. Further References • “Unpacking Git Packfiles”: https://codewords.recurse.com/issues/three/unpacking-git-packfiles/ • Gitgo: https://github.com/ChimeraCoder/gitgo/ • “Git from the Inside Out”: https://codewords.recurse.com/issues/two/git-from-the-inside-out
  19. Aditya Mukerjee @chimeracoder https://github.com/ChimeraCoder
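
Slides 6 and 10 show a bare object hash and the inequality hash(“I love Go”) != hash(“I love Go!”). As a minimal sketch of what that means for a blob, the Go program below computes the ID that SHA-1-based Git assigns to a blob: the SHA-1 of the header "blob <size>" plus a NUL byte, followed by the content. The function name blobID is made up for this illustration and is not taken from Gitgo.

```go
package main

import (
	"crypto/sha1"
	"fmt"
)

// blobID is a hypothetical helper (not part of Gitgo). It returns the hex
// object ID of a blob in a SHA-1 repository: the SHA-1 of the header
// "blob <size>\x00" followed by the raw content.
func blobID(content []byte) string {
	h := sha1.New()
	fmt.Fprintf(h, "blob %d\x00", len(content))
	h.Write(content)
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	// The two strings from slide 10 differ by one byte but get unrelated IDs.
	for _, s := range []string{"I love Go", "I love Go!"} {
		fmt.Printf("%-12q %s\n", s, blobID([]byte(s)))
	}
}
```

Because a one-byte change produces an unrelated ID and therefore a whole new object, storing every version in full is wasteful. That appears to be the motivation for the packfile deltas on slide 11.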
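
Slides 12 and 13 pair a stream of zlib data followed by other data with the io.Reader interface. One plausible reading of "pay attention to interface contracts", offered here as an assumption rather than the speaker's own example, is that io.Reader says nothing about how far a consumer may read ahead. The documentation for zlib.NewReader states that if the source does not also implement io.ByteReader, the decompressor may read more data than necessary. The sketch below stores three zlib streams back to back and reads them through both kinds of source; onlyReader, readObjects, and demo are illustrative names, and the layout is a simplification that omits the per-object headers of a real packfile.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

// deflate zlib-compresses s, standing in for one packed object.
func deflate(s string) []byte {
	var buf bytes.Buffer
	w := zlib.NewWriter(&buf)
	w.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	w.Close()
	return buf.Bytes()
}

// onlyReader hides every method except Read, so a consumer cannot detect
// io.ByteReader on the wrapped value.
type onlyReader struct{ r io.Reader }

func (o onlyReader) Read(p []byte) (int, error) { return o.r.Read(p) }

// readObjects inflates n zlib streams stored back to back in r.
func readObjects(r io.Reader, n int) ([]string, error) {
	var out []string
	for i := 0; i < n; i++ {
		zr, err := zlib.NewReader(r)
		if err != nil {
			return out, fmt.Errorf("object %d: %w", i+1, err)
		}
		data, err := io.ReadAll(zr)
		zr.Close()
		if err != nil {
			return out, fmt.Errorf("object %d: %w", i+1, err)
		}
		out = append(out, string(data))
	}
	return out, nil
}

// demo concatenates three streams and reads them back. With byteReader set,
// the source is one shared bufio.Reader, which implements io.ByteReader;
// otherwise the source exposes Read only.
func demo(byteReader bool) ([]string, error) {
	var pack []byte
	for _, s := range []string{"I love Go", "I love Go!", "You love Go, and so do I!"} {
		pack = append(pack, deflate(s)...)
	}
	var src io.Reader = onlyReader{bytes.NewReader(pack)}
	if byteReader {
		src = bufio.NewReader(src)
	}
	return readObjects(src, 3)
}

func main() {
	for _, byteReader := range []bool{true, false} {
		objs, err := demo(byteReader)
		fmt.Printf("io.ByteReader=%v objects=%q err=%v\n", byteReader, objs, err)
	}
}
```

With the shared bufio.Reader, each decompressor stops at the end of its own stream, so the next object starts where it should. With the Read-only source, the first decompressor buffers ahead internally, and the bytes it consumed are no longer available when the second object is opened.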