MongoDB — confessions of a PostgreSQL lover

Conrad Irwin
October 17, 2013

So, I switched to MongoDB. After using PostgreSQL for 7 years, and enjoying every minute, it wasn't a switch I was expecting to make. But I'm loving it!

This talk will explore some of the differences between building apps with relational databases and document databases. We'll cover a few of the expected differences, and also some of the more subtle ones. We'll also discuss how some of the perceived limitations of a document database are actually strengths in disguise.

Transcript

  1. MongoDB
    Confessions of a PostgreSQL lover
    @ConradIrwin

  2. I ♥

  3. I ♥

  4. Fight!

  5. • Documents • Records
    • Flexibility • Integrity
    • Availability • Consistency

  6. Documents
    {
      "_id": "5233bb7731ef20521d000002",
      "name": "Conrad Irwin",
      "facebook": {"access_key": "$uper$ecret"},
      "linkedin": {
        "access_key": "$omewhat$ecret",
        "access_secret": "$tupendou$ly$ecret"
      }
    }
    {
      "_id": "5233bb7731ef20542e000009",
      "name": "James Smith",
      "facebook": {"access_key": "$omethingel$e"},
      "twitter": {
        "access_key": "$eem$legit",
        "access_secret": "$eriou$ly"
      }
    }

  7. Records
    Users
    id  name
    1   Conrad Irwin
    2   James Smith

    Accounts
    id  user  site      access_key      access_secret
    1   1     Facebook  $uper$ecret     -
    2   1     LinkedIn  $omewhat$ecret  $tupendou$ly$ecret
    3   2     Facebook  $omethingel$e   -
    4   2     Twitter   $eem$legit      $eriou$ly

  8. Records vs. Documents
    Records                    Documents
    • Lots of small things     • One big thing
    • All the same             • Can differ
    • Split into chunks        • Related data together
    • Atomic access to many    • Atomic within document
    • "Space efficient"        • "Locality efficient"
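
To make the "split into chunks" versus "related data together" contrast concrete, here is a minimal sketch that rebuilds one document from the Users and Accounts rows shown on the Records slide. The in-memory arrays and the `toDocument` helper are hypothetical stand-ins for illustration only; a real application would do this with a SQL join or an ORM.

```javascript
// Rows copied from the Records slide; null stands for an empty access_secret.
const usersTable = [
  { id: 1, name: 'Conrad Irwin' },
  { id: 2, name: 'James Smith' },
];
const accountsTable = [
  { id: 1, user: 1, site: 'Facebook', access_key: '$uper$ecret', access_secret: null },
  { id: 2, user: 1, site: 'LinkedIn', access_key: '$omewhat$ecret', access_secret: '$tupendou$ly$ecret' },
  { id: 3, user: 2, site: 'Facebook', access_key: '$omethingel$e', access_secret: null },
  { id: 4, user: 2, site: 'Twitter', access_key: '$eem$legit', access_secret: '$eriou$ly' },
];

// Hypothetical helper: gather every account row for a user into one nested object.
function toDocument(user, accounts) {
  const doc = { name: user.name };
  for (const account of accounts) {
    if (account.user !== user.id) continue; // skip rows owned by other users
    const connection = { access_key: account.access_key };
    if (account.access_secret !== null) {
      connection.access_secret = account.access_secret;
    }
    doc[account.site.toLowerCase()] = connection;
  }
  return doc;
}

console.log(JSON.stringify(toDocument(usersTable[0], accountsTable), null, 2));
```

The relational layout needs a lookup across two tables to answer "what does this user look like?", while the document layout stores the answer in one place.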

  9. Aside: Space vs. Locality
    Disk latency (EBS): ~2 ms
    Disk throughput (EBS): ~60 MB/s
    Read 4 KB from disk: 2 ms + 0.07 ms = ~2 ms
    Read 8 KB from disk: 2 ms + 0.14 ms = ~2.1 ms
    Read 2 * 4 KB from disk: 4 ms + 0.14 ms = ~4.1 ms
    (Network round-trip (EC2): 2-3 ms)
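
The arithmetic on this slide can be reproduced with a small model: every separate read pays the seek latency once, then streams its bytes at the disk throughput. This is a rough sketch using the approximate EBS figures quoted above; `readTimeMs` and the constant names are made up for illustration.

```javascript
// Approximate figures from the slide, not measurements.
const LATENCY_MS = 2;            // ~2 ms per seek
const THROUGHPUT_KB_PER_MS = 60; // ~60 MB/s is about 60 KB per millisecond

// Hypothetical helper: time for `reads` separate reads of `kbPerRead` KB each.
function readTimeMs(reads, kbPerRead) {
  const seekMs = reads * LATENCY_MS;
  const transferMs = (reads * kbPerRead) / THROUGHPUT_KB_PER_MS;
  return seekMs + transferMs;
}

console.log(readTimeMs(1, 4).toFixed(2)); // one 4 KB read
console.log(readTimeMs(1, 8).toFixed(2)); // one 8 KB read
console.log(readTimeMs(2, 4).toFixed(2)); // two separate 4 KB reads
```

Doubling the size of a single read barely changes the total, while splitting the same data across two reads roughly doubles it. That is the sense in which locality can matter more than space.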

  10. Aside: Space vs. Locality
    PostgreSQL: Overhead per row: ~24 bytes
    MongoDB: Overhead per document: size of JSON keys + padding
    Both: Per-table/per-collection overhead

  11. Flexibility
    MongoDB: You can put anything in any document.
    PostgreSQL: You can only store data that matches the schema.

  12. Integrity
    MongoDB: You could read anything out.
    PostgreSQL: You will only read valid data.

  13. Flexibility vs. Integrity
    Flexibility optimizes for change.
    Integrity optimizes for validity.
    You can't set a schema in MongoDB.
    PostgreSQL requires an explicit schema.
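
One consequence of having no schema in the database is that validity checks move into application code. The sketch below shows what such a check might look like for the user documents from the Documents slide. The `validateUser` function and its rules are assumptions for illustration, not something the talk prescribes.

```javascript
// Hypothetical read-time check: the database accepts any shape, so the
// application decides what counts as a valid user document.
function validateUser(doc) {
  const problems = [];
  if (typeof doc.name !== 'string' || doc.name.length === 0) {
    problems.push('name must be a non-empty string');
  }
  // Every other top-level key is treated as a social network connection.
  // This convention is assumed from the example documents.
  for (const [key, value] of Object.entries(doc)) {
    if (key === '_id' || key === 'name') continue;
    const ok =
      value !== null &&
      typeof value === 'object' &&
      typeof value.access_key === 'string';
    if (!ok) {
      problems.push(key + ' must be an object with a string access_key');
    }
  }
  return problems;
}

console.log(validateUser({ name: 'Conrad Irwin', facebook: { access_key: '$uper$ecret' } }));
console.log(validateUser({ name: 42, twitter: 'oops' }));
```

With an explicit schema the second document could never have been written. Without one, the check has to run wherever the data is read.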

  14. e.g. Social network connections
    Each social network has some things in common.
    But they're all different, OAuth, OAuth2, etc.
    Don't want a table for each...

  15. In PostgreSQL:
    CREATE TABLE accounts (
      id INTEGER PRIMARY KEY,
      user_id INTEGER NOT NULL,
      social_network TEXT NOT NULL,
      properties JSON NOT NULL DEFAULT '{}',
      created_at TIMESTAMP,
      updated_at TIMESTAMP
    )

  16. In MongoDB:
    {
      "_id": "5233bb7731ef20521d000002",
      "name": "Conrad Irwin",
      "facebook": {"access_key": "$uper$ecret"},
      "linkedin": {
        "access_key": "$omewhat$ecret",
        "access_secret": "$tupendou$ly$ecret"
      }
    }

  17. e.g. Start using first names
    Have a table with full names.
    Start collecting first name & last name instead.
    Ensure all code works for all users...

  18. MongoDB:
    user.first_name || user.name.split(" ")[0];

  19. MongoDB data-layer
    // e.g. using mongoskin for node
    db.bind('users', {
      fetch: function (email, callback) {
        this.findOne({email: email}, function (err, user) {
          if (user && user.name && !user.first_name) {
            user.first_name = user.name.split(" ")[0];
          }
          callback(err, user)
        })
      }
    });
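
The same upgrade-on-read idea can be tried without a database. The sketch below swaps the mongoskin collection for a plain in-memory array, so `users`, `upgradeUser`, and `fetchUser` are hypothetical names and not part of any driver API. Only the first-name fallback comes from the slide.

```javascript
// In-memory stand-in for the users collection (not the mongoskin API).
const users = [
  { email: 'old@example.com', name: 'Conrad Irwin' },
  { email: 'new@example.com', first_name: 'James', last_name: 'Smith' },
];

// Fill in first_name for documents written before the field existed.
// Returns a copy so the stored document is left untouched.
function upgradeUser(user) {
  if (user && user.name && !user.first_name) {
    return Object.assign({}, user, { first_name: user.name.split(' ')[0] });
  }
  return user;
}

// Every read goes through the upgrade step, so callers only see the new shape.
function fetchUser(email) {
  const found = users.find((u) => u.email === email);
  return upgradeUser(found);
}

console.log(fetchUser('old@example.com').first_name);
console.log(fetchUser('new@example.com').first_name);
```

Keeping the upgrade in a single fetch path means the rest of the code can assume `first_name` exists, even though old documents were never rewritten.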

  20. PostgreSQL:
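    -- A column DEFAULT cannot reference another column, and the new
    -- column needs a type, so add it, backfill it, then constrain it.
    ALTER TABLE users ADD COLUMN first_name TEXT;
    UPDATE users SET first_name = split_part(name, ' ', 1);
    ALTER TABLE users ALTER COLUMN first_name SET NOT NULL;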

  21. Consistency
    PostgreSQL: Written means written, no exceptions.
    (except disk failure, but use RAID)
    MongoDB: Written means written, unless something goes wrong.
    (e.g. server crash, network partition, disk failure)

  22. Availability
    PostgreSQL: If the master dies, stop to avoid corruption.
    MongoDB: If the master dies, rebalance to avoid downtime.
    'You cannot have consistency, availability and partition tolerance'.
    — CAP theorem

  23. Which is better?
    PostgreSQL: Easier to understand.
    MongoDB: Pretty much "just works".
    Which do you prefer: A broken app, or data loss?

  24. Scaling
    RAM is fast, Disk is slow.
    Ideal: fit all data in RAM.
    Good: fit working set in RAM.
    Bearable: fit working set indexes in RAM.
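
The three tiers above amount to a simple comparison against available RAM. Here is a minimal sketch of that comparison. The `memoryFit` helper, its field names, and the example sizes are all hypothetical, and every size is assumed to be in the same unit.

```javascript
// Hypothetical sizing helper; all sizes share one unit (e.g. GB).
function memoryFit(ramSize, sizes) {
  if (sizes.allData <= ramSize) return 'ideal';
  if (sizes.workingSet <= ramSize) return 'good';
  if (sizes.workingSetIndexes <= ramSize) return 'bearable';
  return 'too small';
}

// Made-up example: 500 GB of data, 40 GB working set, 6 GB of hot indexes.
const sizes = { allData: 500, workingSet: 40, workingSetIndexes: 6 };
console.log(memoryFit(64, sizes));
console.log(memoryFit(16, sizes));
console.log(memoryFit(4, sizes));
```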

  25. Making things faster
    Use a bigger database server.
    Replicate all your data over multiple servers.
    Shard portions of your data across multiple servers.

  26. Use a bigger server.
    Good for PostgreSQL, up to 64 cores, 1TB RAM.
    Bad for MongoDB, per-database write locks.
    Expensive? Can't use cloud?

  27. Sharding
    Good for MongoDB, built in support via mongos.
    Bad for PostgreSQL. Hard to choose shards to maintain integrity.
    Cheaper? Works in the cloud.
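
For readers new to sharding, the core idea is that a key deterministically picks the server that owns a record. The sketch below shows generic hash-based routing only. It is not how mongos routes requests, and `hashKey` and `shardFor` are illustrative names.

```javascript
// Simple deterministic string hash (a djb2-style variant), kept in the
// unsigned 32-bit range. Any stable hash would do for this illustration.
function hashKey(key) {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = ((hash * 33) ^ key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Map a shard key to one of `shardCount` shards, numbered from 0.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

for (const id of ['5233bb7731ef20521d000002', '5233bb7731ef20542e000009']) {
  console.log(id, '-> shard', shardFor(id, 4));
}
```

A document keeps everything about one user under one key, so it lands on one shard as a unit. Rows related through foreign keys could hash to different servers, which is part of why choosing shards that preserve integrity is harder in a relational design.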

  28. Replication
    Doesn't help write-throughput, always hits master.
    Doesn't give you more working-set RAM.
    Gives you more disk heads.
    Gives you faster failover.

  29. I ♥ PostgreSQL
    I ♥ MongoDB

  30. MongoDB
    Confessions of a PostgreSQL lover
    @ConradIrwin
