Consistency types of Riak

Riak's consistency becomes far more tunable as of 2.0; this talk explains how (25 minutes)

UENISHI Kota

June 04, 2014

Transcript

  1. Consistency types in Riak
    2014/6/4 Riak Meetup #4
    @kuenishi

  2. About me
    •UENISHI Kota @kuenishi
    •github, twitter, …
    •5 years of Erlang, 6 years of distributed systems
    •Recently working on Riak CS development

  3. Consistency
    •"NoSQL has no consistency"
    •"SQL has consistency"
    •(; °Д°)

  4. SQL => consistency?
    NO

  5. Consistency ≈ transactions: △ (only partly true)

  6. Consistency = invariants

  7. Invariants
    •Logical consistency
    •Constraints between tables, indexes
    •Physical consistency
    •Whether replicas are always observed in the same way

  8. In Riak
    •Siblings
    •CRDT (2.0~)
    •Strong Consistency (2.0~)

  9. Why Siblings?

  10. Consistent Hashing
    • 160-bit key space
    • The space is divided into equal partitions
    • Each partition is managed individually by a node
    • Replicas are copied to N partitions
    [Diagram: four boxes labeled "node"; hash(“meetups/spamham”); N=3]
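
The bullets above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Riak's actual implementation: the partition count (64), the node names, the round-robin ownership, and hashing the string "bucket/key" with SHA-1 are all choices made for this example only.

```python
import hashlib

RING_SIZE = 2 ** 160          # SHA-1 yields a 160-bit key space
NUM_PARTITIONS = 64           # assumed partition count for this sketch
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def partition_of(bucket: str, key: str) -> int:
    """Map bucket/key onto one of the equally sized partitions."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def owner_of(partition: int) -> str:
    """Assign partitions to nodes round-robin (an assumption of this sketch)."""
    return NODES[partition % len(NODES)]


def preference_list(bucket: str, key: str, n: int = 3) -> list:
    """Return the n consecutive partitions, with owners, that hold the replicas."""
    first = partition_of(bucket, key)
    result = []
    for i in range(n):
        partition = (first + i) % NUM_PARTITIONS
        result.append((partition, owner_of(partition)))
    return result


replicas = preference_list("meetups", "spamham", n=3)
for partition, node in replicas:
    print(f"partition {partition} -> {node}")
```

Because ownership is round-robin over four nodes, any three consecutive partitions land on three different nodes, which is the point of copying each object to N partitions.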

  11. The CAP theorem and the ideal DB
    •Under any kind of failure (partition tolerance)
    •the data is always consistent (consistency)
    •and the system never stops (availability)
    No system can satisfy all three at the same time

  12. Consistency is hard
    •The only choices are to stop updates (lowering availability) or to allow updates to be overwritten (losing data)
    [Diagram: Server1, Server2, Server3; concurrent PUT V=42 and PUT V=0; resulting value V=?]

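  13. Instead of consistency
    •For now, allow multiple versions to coexist
    •Decide at read time which version is correct, or whether to merge them
    [Diagram: Server1, Server2, Server3; concurrent PUT V=42 and PUT V=0; the replicas hold V=0, V=0 or 42, and V=42; a read returns V=0 or 42]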

  14. Achieving AP
    •Writes are accepted for now even while a network partition is in progress
    [Diagram: Server1, Server2, Server3, Server4; PUT V=42 and PUT V=0; annotations: "write back after recovery", "keep both"]

  15. Shopping cart example
    •Just take the union
    [Diagram: Server1, Server2, Server3; PUT cart=[a,b,d] and PUT cart=[a,b,c]; the replicas hold [a,b,c], [a,b,c] or [a,b,d], and [a,b,d]]
    union([a,b,c], [a,b,d]) => [a,b,c,d]
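
The union rule for the cart can be written as a small sibling resolver. This is a minimal sketch: `merge_cart_siblings` is a hypothetical helper name, and the siblings are plain Python lists rather than objects returned by a Riak client.

```python
from functools import reduce


def merge_cart_siblings(siblings: list) -> list:
    """Resolve concurrent cart versions by taking their set union."""
    merged = reduce(lambda acc, cart: acc | set(cart), siblings, set())
    return sorted(merged)


siblings = [["a", "b", "c"], ["a", "b", "d"]]
resolved = merge_cart_siblings(siblings)
print(resolved)  # ['a', 'b', 'c', 'd']
```

A plain union cannot express removals: an item deleted in one sibling but still present in another comes back after the merge.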

  16. Siblings
    riak_object = Riak.fetch(bucket, key)
    riak_object.version
    riak_object.has_siblings
    for value in riak_object.values: …
    riak_object.data = new_value
    riak_object.store

  17. Invariant of siblings
    •None in particular… if pressed, the union of the replicas
    •Data = R1 ∪ R2 ∪ R3

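  18. Drawbacks of allowing multiple versions
    •Programming is hard (transactions are wonderful)
    •The real world is not just shopping carts and counters
    •Every time, you have to design a data structure that supports safe merge and update
    •Over time, similar libraries end up being written all over the place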

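  19. Why is it hard?
    •Writes can be reordered relative to each other, so far from being serializable, even the writes cannot be brought into a consistent state
    [Diagram: Server1, Server2, Server3; writes w1 and w2 reach the replicas in different orders; final value w2 (w1 lost)]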

  20. The answer: CRDT
    •"Replicable commutative data types"
    •Conflict-Free Replicated Data Types
    •Commutative Replicated Data Types
    •…
    •(Going to be included in Riak 2.0)
    Note: the authors of CRDTs do not use the term "Logical Monotonicity"

  21. Invariant of CRDTs
    •Only commutative operations on the data are allowed!
    Data = update(w2, update(w1, Data0))
         = update(w1, update(w2, Data0))
    Data = merge(update(w2, Data0), Data)
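
Both equations can be checked on the simplest type that satisfies them, a grow-only set. This is a sketch for illustration: the choice of a grow-only set and the concrete `update` and `merge` bodies are assumptions, since the slide only states the properties.

```python
def update(write: str, data: frozenset) -> frozenset:
    """Apply a write by adding the element; adding elements commutes."""
    return data | {write}


def merge(left: frozenset, right: frozenset) -> frozenset:
    """Merge two replica states by set union."""
    return left | right


data0 = frozenset()
order_a = update("w2", update("w1", data0))
order_b = update("w1", update("w2", data0))
assert order_a == order_b  # same state regardless of write order

# Re-merging a state that already contains w2 changes nothing.
data = order_a
assert merge(update("w2", data0), data) == data
```

Set union is commutative, associative, and idempotent, which is why replicas converge no matter how often or in what order states are exchanged.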

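  22. CRDT in Riak 2.0
    •The V of the KVS is given a "type", and the type determines the update and merge logic
    •Merge is executed automatically on the server side at read time
    •The application only needs to specify the type; handling multiple versions becomes unnecessary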

  23. CRDT
    riak_object = Riak.fetch(bucket, key)
    riak_object.type => counter|set|…
    riak_object.set << element
    riak_object.set.delete(old_element)
    riak_object.store

  24. CRDT example
    •PN-Counter
    •Set
    •OR-sets
    •LWW-register
    •Graph…

  25. PN-Counter
    • merge
    • {a: {1,-1}, b: {1,0}, c: {2,0}}
    • {a: {0,0}, b: {2, 0}, c: {0, -2}}
    • => {a: {1,-1}, b:{2,0}, c:{2,-2}} => 2
    • update
    • When a receives {increment, 3}
    • {a: {4,-1}, b: {1,0}, c: {2,0}}
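
The merge and update rules above can be reproduced with the slide's own numbers. This is a minimal sketch that assumes each actor's entry is a pair (increments, decrements), with the decrement total stored as a value at or below zero as on the slide; the function names are illustrative.

```python
def merge(left: dict, right: dict) -> dict:
    """Per actor, keep the larger increment total and the more negative decrement total."""
    merged = {}
    for actor in left.keys() | right.keys():
        lp, ln = left.get(actor, (0, 0))
        rp, rn = right.get(actor, (0, 0))
        merged[actor] = (max(lp, rp), min(ln, rn))
    return merged


def increment(counter: dict, actor: str, amount: int) -> dict:
    """An actor only ever changes its own entry."""
    p, n = counter.get(actor, (0, 0))
    return {**counter, actor: (p + amount, n)}


def value(counter: dict) -> int:
    """The counter value is the sum of all increments and decrements."""
    return sum(p + n for p, n in counter.values())


left = {"a": (1, -1), "b": (1, 0), "c": (2, 0)}
right = {"a": (0, 0), "b": (2, 0), "c": (0, -2)}

merged = merge(left, right)
print(merged, value(merged))
print(increment(left, "a", 3))
```

Because every actor's totals only ever move away from zero, taking the per-actor maximum and minimum never discards an update.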

  26. OR-Sets
    • merge
    • {a:{“foo”:true}, b:{“bar”:false}}
    • + {a:{“foo”:true}, b:{“foo”:false, “bar”:false}}
    • => {a:{“foo”:true}, b:{“foo”:false, “bar”:true}}
    • => [“bar”]
    • update
    • add: {a:{}} => +”foo” => {a:{“foo”:false}}
    • remove: {a: {“foo”:false}} => {a: {“foo”:true}}
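
For comparison, here is a sketch of an observed-remove set in its common textbook form, where every add carries a unique tag and a remove tombstones only the tags it has seen. This is not the per-actor boolean encoding shown on the slide, and the class and method names are illustrative.

```python
import uuid


class ORSet:
    def __init__(self):
        self.adds = set()      # (element, unique tag) pairs
        self.removes = set()   # tombstoned (element, tag) pairs

    def add(self, element):
        self.adds.add((element, uuid.uuid4().hex))

    def remove(self, element):
        # Only tags observed locally are tombstoned.
        self.removes |= {pair for pair in self.adds if pair[0] == element}

    def merge(self, other):
        merged = ORSet()
        merged.adds = self.adds | other.adds
        merged.removes = self.removes | other.removes
        return merged

    def value(self):
        return sorted({element for element, _ in self.adds - self.removes})


a, b = ORSet(), ORSet()
a.add("foo")
b = b.merge(a)        # b observes a's add
b.remove("foo")       # b removes the tag it observed
a.add("foo")          # concurrent re-add on a gets a fresh tag
b.add("bar")
result = a.merge(b).value()
print(result)  # ['bar', 'foo']
```

The concurrent re-add survives because its tag was never observed by the remove, which is the behavior a shopping cart usually wants.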

  27. Use cases
    •Counting clicks (G-counter)
    • riak-server/types/counters/buckets/likes/datatypes/basho.com -d 1
    •Shopping cart (OR-sets)
    •Number of logged-in users (PN-counter)
    •Combinations of these (map & LWW-register, boolean)
    •{ name : “basho.com”, likes: 20000, users: 3000, links: [ “basho.co.jp”, “basho.co.uk” ], cool: true }

  28. What it cannot do
    •A PN-counter constrained to "0 or greater"
    •Issuing unique IDs
    •Other data structures and operations that require CAS
    •RESTful operations

  29. Strong Consistency
    •Invariant: t(w1) > t(w2) or t(w2) > t(w1)
    •Close to sequential consistency
    [Diagram: clients c1 and c2 each get v1 from Riak; w1(v1) succeeds (ok), w2(v1) fails]

  30. Strong Consistency
    •GET once, then send an operation against that version
    •If the value was updated in the meantime the operation fails; otherwise it succeeds
    •MultiPaxos

  31. Strong Consistency
    do {
      riak_object = Riak.fetch(bucket, key)
      riak_object.data = new_value
    } while (riak_object.store != ok)
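
The retry loop can be tried out against an in-memory stand-in. This is a sketch: `VersionedStore` and `update_with_retry` are hypothetical names, not a Riak client API, and the integer version simply models "the version that was read".

```python
class VersionedStore:
    """In-memory stand-in for a strongly consistent key (hypothetical, not a Riak API)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def fetch(self, key):
        return self._data.get(key, (0, None))

    def store(self, key, expected_version, value):
        """Conditional write: succeeds only if nobody wrote since expected_version."""
        current_version, _ = self._data.get(key, (0, None))
        if current_version != expected_version:
            return False
        self._data[key] = (current_version + 1, value)
        return True


def update_with_retry(store, key, compute, max_attempts=10):
    """Fetch, compute a new value, and retry while the conditional write fails."""
    for _ in range(max_attempts):
        version, current = store.fetch(key)
        if store.store(key, version, compute(current)):
            return True
    return False


store = VersionedStore()
store.store("k", 0, "v1")                    # initial write
v_c1, _ = store.fetch("k")                   # c1 reads the current version
v_c2, _ = store.fetch("k")                   # c2 reads the same version
w1_ok = store.store("k", v_c1, "from c1")    # succeeds
w2_ok = store.store("k", v_c2, "from c2")    # fails: c1 already wrote
retried = update_with_retry(store, "k", lambda current: "from c2")
print(w1_ok, w2_ok, retried, store.fetch("k"))
```

Bounding the number of attempts is a choice of this sketch; the loop on the slide retries until the store succeeds.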

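  32. Summary
    •Riak 1.x is a highly available distributed database
    •If you want a RESTful design, use siblings
    •From 2.0 on, you can choose among multiple consistency models
    •If you want to build apps easily, use CRDTs
    •If you need CAS-style updates, use Strong Consistency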

  33. Questions?
    •Look forward to Riak 2.0
    •Web: http://basho.co.jp
    •Twitter: @BashoJapan
    •Me: [email protected]
    •ML: [email protected]