My Mom told me that Git doesn't scale

Vicent Martí
November 08, 2012

Transcript

  1. Thanks
    for being here

  2. github
    Git hosting:
    No longer a pain in the ass

  3. github
    Git hosting:
    No longer a pain in the ass
    for you.
    Not for us.
    Because, goddamnit,
    if I ever find the guy who invented
    this thing I’m going to hang him from

  4. Let’s host some
    Git repos!
     file.c
     src
     file.h
     README.md
     COPYING.md
     .git
     Bare Repository
     HEAD
     index
     objects
     refs
    git-daemon

  5. OK, now about
    the web...
    grit Ruby - Git
    interface

  6. OK, now about
    the web...
    grit  Bare Repo
     Bare Repo
     Bare Repo
    Ruby - Git
    interface

  7. 1VM
    grit

    storage
    rails app

  8. nVM

    storage
    (GFS)

  9. Rails was making us slow.

  10. Time to move to
    Real Hardware

  11. fileservers
    frontends

    db

  12. fileservers
    frontends

    db
    ?????????

  13. bert
    (binary Erlang term)

  14. bert
    (binary Erlang term)
    ernie
    (not an acronym)

  15. chimney
    (Redis)
    frontend
    fileserver
    smoke
    grit
    ernie
    grit

  16. Horizontal Scaling
    Vertical Scaling

  17. Horizontal Scaling
    Vertical Scaling
    problums wit them gigabits
    problums wit them gigahurtz

  18. them
    gigabits

  19. bummer
    x4180 =
    A LOT.

  20. NetShard

         ...
    alternate
    network

  21. NetShard

         ... 
    alternate
    network

  22. them
    gigahurtz

  23. bottleneck:
    grit

  24. bottleneck:
    grit
    solution:
    shell out to
    git

  25. bottleneck:
    shell out to
    git

  26. bottleneck:
    shell out to
    git
    solution:
    shell out to
    git

  27. bottleneck:
    shell out to
    git
    solution:
    shell out to
    git
    properly

  28. posix_spawn
    Seriously.
    < 1ms

  29. posix_spawn
    Seriously.
    < 1ms
    The issue is not in
    “shelling out”,
    the issue is in the
    spawned process.
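
The claim on this slide can be checked with a rough timing sketch. It assumes Python 3.8+ on a POSIX system, where `os.posix_spawn` wraps the C function of the same name. `spawn_and_wait` is a hypothetical helper written for illustration, not something from the talk, and the time it reports includes whatever the child does before it exits.

```python
import os
import time


def spawn_and_wait(argv):
    """Spawn argv[0] via posix_spawn, wait for it, return (exit_code, seconds)."""
    start = time.perf_counter()
    # posix_spawn takes a path to the executable; it does not search PATH.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    elapsed = time.perf_counter() - start
    # Report -1 when the child was killed by a signal instead of exiting.
    exit_code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return exit_code, elapsed
```

Timing a trivial binary and then a heavier command with this helper gives a feel for the slide's point: the spawn itself is cheap, and most of the elapsed time belongs to what the spawned process does.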

  30. GUISE
    GUISE
    GUISE

  31. GUISE
    GUISE
    GUISE
    ...what?

  32. Why don’t we take

  33. Why don’t we take
    the Git binary...

  34. Why don’t we take
    the Git binary...
    yeah?

  35. Why don’t we take
    the Git binary...
    yeah? and compile it as

  36. Why don’t we take
    the Git binary...
    yeah? and compile it as
    a library

  37. Why don’t we take
    the Git binary...
    yeah? and compile it as
    a library
    oh... go on...

  38. Why don’t we take
    the Git binary...
    yeah? and compile it as
    a library
    oh... go on...
    and link that into

  39. Why don’t we take
    the Git binary...
    yeah? and compile it as
    a library
    oh... go on...
    and link that into
    our server

  40. Scientific
    Graph™

  41. Memory Usage
    Time
    Scientific
    Graph™

  42. Memory Usage
    Time
    Scientific
    Graph™

  43. Memory Usage
    Time
    Scientific
    Graph™

  44. Memory Usage
    Time
    Scientific
    Graph™

  45. Memory Usage
    Time
    Scientific
    Graph™

  46. Memory Usage
    Time
    Scientific
    Graph™

  47. Well, we didn’t think about

  48. Well, we didn’t think about
    freeing memory, but...

  49. Well, we didn’t think about
    freeing memory, but...
    THIS IS THE KIND
    OF PROBLEM
    WE COULD SOLVE
    WITH CGI

  50. Well, we didn’t think about
    freeing memory, but...
    THIS IS THE KIND
    OF PROBLEM
    WE COULD SOLVE
    WITH CGI
    IN 1995

  51. Scientific
    Graph™

  52. Memory Usage
    Time
    Scientific
    Graph™

  53. Memory Usage
    Time
    Scientific
    Graph™

  54. Memory Usage
    Time
    Scientific
    Graph™

  55. Memory Usage
    Time
    Scientific
    Graph™

  56. Memory Usage
    Time
    Scientific
    Graph™

  57. Memory Usage
    Time
    Scientific
    Graph™

  58. Memory Usage
    Time
    Scientific
    Graph™

  59. What do you mean
    the server died?

  60. die("BUG: non-INDEX attr direction
    in a bare repo");
    What do you mean
    the server died?

  61. die("BUG: non-INDEX attr direction
    in a bare repo");
    die("a bad revision is needed");
    What do you mean
    the server died?

  62. die("BUG: non-INDEX attr direction
    in a bare repo");
    die("a bad revision is needed");
    die("'%s' is not a valid
    branch name.", name);
    What do you mean
    the server died?

  63. die("BUG: non-INDEX attr direction
    in a bare repo");
    die("a bad revision is needed");
    die("'%s' is not a valid
    branch name.", name);
    die("Empty patch.
    Aborted.");
    What do you mean
    the server died?

  64. die("BUG: non-INDEX attr direction
    in a bare repo");
    die("a bad revision is needed");
    die("'%s' is not a valid
    branch name.", name);
    die("Empty patch.
    Aborted.");
    die("unable to read index file");
    What do you mean
    the server died?
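
Each of those `die()` calls ends the whole process. That is harmless in a short-lived command but fatal once the same code is linked into a long-running server. Below is a minimal sketch of the process-per-request isolation that the earlier CGI joke alludes to, assuming a POSIX system. `die` and `run_isolated` are hypothetical stand-ins written for illustration; exit status 128 mirrors what Git's own `die()` uses.

```python
import os
import sys


def die(message):
    """Stand-in for a C-style die(): report the error and end the process."""
    sys.stderr.write(f"fatal: {message}\n")
    sys.stderr.flush()
    os._exit(128)


def run_isolated(request):
    """Run request() in a forked child so a die() only kills that child."""
    pid = os.fork()
    if pid == 0:
        # Child: do the work and always exit here, never return to the caller.
        code = 1
        try:
            request()
            code = 0
        finally:
            os._exit(code)
    # Parent: survives whatever happened in the child and reads its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

The same isolation also sidesteps the memory graph from the earlier slides, since whatever the child allocated goes back to the operating system when it exits.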

  65. libgit2
    the “2” means this one
    frees memory

  66. libgit2
    the “2” means this one
    frees memory
    NOT ENOUGH
    ABSTRACT
    FACTORIES

  67. JGit
    the “J” means this one
    is in Java
    ...not our thing.

  68. Java
    a brief timeline
    New companies
    don’t use Java
    because it’s
    not like Unix
    1995
    New companies
    use Java
    because it’s
    new and shiny
    1997
    New companies
    don’t use Java
    because it’s
    ooooooold
    2005
    New companies
    use the JVM
    because
    2011

  69. Java
    a brief timeline
    New companies
    don’t use Java
    because it’s
    not like Unix
    1995
    New companies
    use Java
    because it’s
    new and shiny
    1997
    New companies
    don’t use Java
    because it’s
    ooooooold
    2005
    New companies
    use the JVM
    because
    2011
    github

  70. If you think you understand
    the JVM, you are either:

  71. If you think you understand
    the JVM, you are either:
    a) Very smart

  72. If you think you understand
    the JVM, you are either:
    a) Very smart
    b) Very wrong

  73. If you think you understand
    the JVM, you are either:
    a) Very smart
    b) Very wrong

  74. Some people think
    that
    github
    is a
    Rails shop
    or even a
    Ruby shop.

  75. Some people think
    that
    github
    is a
    Rails shop
    or even a
    Ruby shop.
    github
    is a
    Unix shop
    and everything else is
    just a detail.

  76. libgit2
    a brief timeline
    Shawn
    Pearce
    The Past

  77. libgit2
    a brief timeline
    Shawn
    Pearce
    myself
    The Past

  78. libgit2
    a brief timeline
    Shawn
    Pearce
    myself
    myself
    (about to have a
    mental breakdown)
    The Past

  79. libgit2
    a brief timeline
    Shawn
    Pearce
    myself
    myself
    (about to have a
    mental breakdown)
    myself
    (having a mental
    breakdown)
    The Past

  80. libgit2
    a brief timeline
    Shawn
    Pearce
    myself
    myself
    (about to have a
    mental breakdown)
    myself
    (having a mental
    breakdown)
    myself
    (reaching Git nirvana)
    The Past

  81. libgit2
    a brief timeline
    Shawn
    Pearce
    myself
    myself
    (about to have a
    mental breakdown)
    myself
    (having a mental
    breakdown)
    myself
    (reaching Git nirvana)
    The Past
    Russell
    Belfer
    Carlos
    Martín
    Michael
    Schubert
    Ben
    Straub
    real contributors

  82. libgit2
    a brief timeline
    ?

  83. libgit2
    a brief timeline
    ?

  84. libgit2
    a brief timeline
    ?
    1.0 release

  85. Good Heavens,
    just look at the
    time.
    It’s NoSQL o’clock
    NoSQL
    NoSQL NoSQL
    NoSQL
    NoSQL NoSQL
    NoSQL
    NoSQL

  86. ...do you even

  87. ...do you even
    mongo?

  88. The Magic of
    Key-Value
    Stores
    If you wish upon a star,
    and have a pure heart...

  89. The Magic of
    Key-Value
    Stores
    If you wish upon a star,
    and have a pure heart...
    Anything can be
    a Key-Value store!

  90. The Magic of
    Key-Value
    Stores
    id  name           state  lat
    13  San Francisco  CA     24
    24  Phoenix        AZ     33
    7   Denver         CO     40
    8   Caribou        ME     47
    2   Los Angeles    CA     22
    SELECT * FROM CITIES
    WHERE name = 'San Francisco'
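
The query on this slide can be run as written against the slide's five rows. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the column types and the `find_city` helper are assumptions made for illustration, while the row values are copied from the slide unchanged.

```python
import sqlite3

# Rows exactly as they appear on the slide: (id, name, state, lat).
ROWS = [
    (13, "San Francisco", "CA", 24),
    (24, "Phoenix", "AZ", 33),
    (7, "Denver", "CO", 40),
    (8, "Caribou", "ME", 47),
    (2, "Los Angeles", "CA", 22),
]


def find_city(name):
    """Load the slide's table into memory and look a city up by name."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE cities (id INTEGER, name TEXT, state TEXT, lat INTEGER)"
        )
        conn.executemany("INSERT INTO cities VALUES (?, ?, ?, ?)", ROWS)
        # The name acts as the "key"; the whole row comes back as the "value".
        cursor = conn.execute("SELECT * FROM cities WHERE name = ?", (name,))
        return cursor.fetchall()
    finally:
        conn.close()
```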

  91. Git is queried like
    a Key-Value Store
    But it is not a
    Key-Value store
    git show
    f3c896c1949476e85abc0d75bb2143656a9580a6
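
The key in that `git show` command is the SHA-1 of the object it names, so a lookup is a plain get by key. Below is a minimal sketch of that content-addressed scheme, using the header Git prepends before hashing (`<type> <size>` plus a NUL byte, then the payload). `ObjectStore` is a hypothetical in-memory stand-in, not Git's on-disk format.

```python
import hashlib


def object_id(kind, payload):
    """Hash an object the way Git does: header, NUL byte, then the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


class ObjectStore:
    """Toy key-value store whose keys are derived from the stored values."""

    def __init__(self):
        self._objects = {}

    def put(self, kind, payload):
        oid = object_id(kind, payload)
        self._objects[oid] = (kind, payload)
        return oid

    def get(self, oid):
        return self._objects[oid]
```

Because the key is computed from the value, storing the same content twice yields the same key, and a value can be checked against its key after retrieval.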

  92. a brief introduction
    to the Git data model

  93.  file.c
     src
     file.h
     README.md
     COPYING.md

  94.  file.c
     src
     file.h
     README.md
     COPYING.md
    tree
    src/
    README.md
    COPYING.md
    tree
    file.c
    file.h
    blob
    blob
    blob
    blob
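
The layout on this slide can be sketched as hashes of hashes: each file becomes a blob, each directory becomes a tree listing its entries, and the root tree's ID covers everything beneath it. The serialization below follows Git's tree format (mode, name, NUL byte, raw 20-byte ID); `snapshot` and its parameters are hypothetical names chosen for illustration.

```python
import hashlib


def object_id(kind, payload):
    """Hash an object the way Git does: header, NUL byte, then the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def tree_payload(entries):
    """Serialize (mode, name, hex_id) entries into a raw tree body."""
    def sort_key(entry):
        mode, name, _ = entry
        # Git orders directory names as if they ended with a slash.
        return (name + "/" if mode == "40000" else name).encode()

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode() + b"\x00" + bytes.fromhex(hex_id)
    return body


def snapshot(file_c, file_h, readme, copying):
    """Hash the slide's layout: src/{file.c, file.h}, README.md, COPYING.md."""
    src = object_id("tree", tree_payload([
        ("100644", "file.c", object_id("blob", file_c)),
        ("100644", "file.h", object_id("blob", file_h)),
    ]))
    return object_id("tree", tree_payload([
        ("40000", "src", src),
        ("100644", "README.md", object_id("blob", readme)),
        ("100644", "COPYING.md", object_id("blob", copying)),
    ]))
```

Changing a single byte in `file.c` changes its blob ID, which changes the `src` tree, which changes the root tree.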

  95. commit
    parent
    tree T
    metadata
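
A commit is a small text object that names one tree, zero or more parents, and some metadata. The sketch below builds two such objects in Git's commit format and hashes them. The author line, timestamp, and messages are made-up values for illustration, and both commits point at the empty tree to keep the example short.

```python
import hashlib

# Well-known ID of a tree with no entries.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
# Hypothetical identity and timestamp, for illustration only.
WHO = "Example Author <author@example.com> 1352332800 +0000"


def object_id(kind, payload):
    """Hash an object the way Git does: header, NUL byte, then the payload."""
    header = f"{kind} {len(payload)}".encode() + b"\x00"
    return hashlib.sha1(header + payload).hexdigest()


def commit_payload(tree_id, parent_ids, who, message):
    """Build the text body of a commit: tree, parents, metadata, message."""
    lines = [f"tree {tree_id}"]
    lines += [f"parent {parent}" for parent in parent_ids]
    lines += [f"author {who}", f"committer {who}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()


# A root commit, and a second commit whose parent link points back at it.
root = object_id("commit", commit_payload(EMPTY_TREE, [], WHO, "first"))
child = object_id("commit", commit_payload(EMPTY_TREE, [root], WHO, "second"))
```

The only way to get from `child` back to `root` is to fetch `child`, read its `parent` line, and fetch again.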

  96. commit
    T
    commit
    T
    commit
    T
    commit
    T
    commit
    T
    commit
    T
    Behold,
    a graph.

  97. Well that was easy.

  98. master
    Oh
    god
    kill
    me

  99. Little known
    torture
    methods:

  100. warning:
    the rabbit
    hole is
    pretty deep

  101. warning:
    git totally
    wasn’t designed
    for this

  102. Git doesn’t give
    a #!%$ about CAP

  103. Number of hops on a complex query
    1,000,000

  104. Number of hops on a complex query
    1,000,000
    Required hops for a successful query
    1,000,000

  105. Number of hops on a complex query
    1,000,000
    Required hops for a successful query
    1,000,000
    Replica count to ensure 100% availability
    a metric shitton
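
The arithmetic behind these numbers can be sketched as follows. Walking history is one dependent lookup per commit, and a query that needs every hop to succeed multiplies the per-hop success probabilities. The 0.999999 per-hop figure, the 1,000-commit toy history, and the helper names are assumptions for illustration, not numbers from the talk.

```python
def count_hops(parents, tip):
    """Follow parent links from tip to the root, counting lookups."""
    hops = 0
    current = tip
    while current is not None:
        current = parents[current]  # one key-value lookup per commit
        hops += 1
    return hops


def query_success_probability(per_hop_success, hops):
    """Every hop must succeed, so the probabilities multiply."""
    return per_hop_success ** hops


# A linear history of 1,000 commits: commit i points at commit i - 1.
history = {i: (i - 1 if i > 0 else None) for i in range(1000)}
```

Under that assumption, a million-hop query against a store that answers 99.9999% of lookups still succeeds only about 37% of the time.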

  106. We could fix it.

  107. We could fix it.
    But we won’t.

  108. GitRPC
    Rugged
    libgit2
    server
    Ruby
    Ruby
    C

  109. chimney
    (Redis)
    frontend
    fileserver
    smoke
    grit
    ernie
    grit
    GitRPC GitRPC

  110. chimney
    (Redis)
    frontend fileserver
    GitRPC GitRPC
    server
    client

  111. New
    serialization
    protocol
    Banana
    Pack
    MessagePack
    + more

  112. New
    serialization
    protocol
    Banana
    Pack
    MessagePack
    + more
    mochilo

  113. New
    serialization
    protocol
    Banana
    Pack
    MessagePack
    + more
    mochilo
    Banana Phone

  114. evolutionary
    (disappointing?)

  115. Summary:
    Ruby
    C
    Unix
    Unix
    Unix
    Unix
    Unix
    Unix
    Unix
    Unix
    Boring
    Boring
    Boring
    Boring
    Boring

  116. Use the
    most reliable
    tools you know.

  117. Challenge yourself to build
    the simplest thing.
    Not because it’s easy,
    but because it works.

  118. Innovate
    where it really matters.

  119. create a
    revolutionary
    product
    not a
    revolutionary
    backend.

  120. Q: Does Git scale?

  121. Q: Does Git scale?
    A: Who cares?
