Starting with 2.0, Riak's consistency can be tuned in many more ways; this talk explains how (25 minutes)
Consistency types in Riak
2014/6/4 Riak Meetup #4 @kuenishi
About me
• UENISHI Kota @kuenishi
• github, twitter, …
• 5 years of Erlang, 6 years of distributed systems
• Recently working on Riak CS development
Consistency
• "NoSQL has no consistency"
• "SQL has consistency"
• (;°Д°)
SQL => consistency? NO
Consistency ≈ transactions △
Consistency means invariants
Invariants
• Logical consistency
  • Constraints between tables, indexes
• Physical consistency
  • Are replicas always observed in the same way?
In Riak
• Siblings
• CRDT (2.0~)
• Strong Consistency (2.0~)
Why Siblings?
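Consistent Hashing
• 160-bit key space
• The space is divided into partitions
• Each partition is managed individually by a node
• Replicas are copied to N partitions
(Figure: ring of four nodes; hash("meetups/spamham"), N=3)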
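The ring on the slide above can be sketched in a few lines of Python. This is a minimal illustration, not Riak's implementation: SHA-1 is assumed as the 160-bit hash, the ring size of 64, the node names, and the round-robin ownership rule are all made up for the example, and `partition_of` and `preference_list` are hypothetical helper names.

```python
import hashlib

RING_BITS = 160            # size of the key space, as on the slide
NUM_PARTITIONS = 64        # assumed ring size for this sketch
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def key_hash(bucket, key):
    """Map bucket/key to a point in the 160-bit space (SHA-1 yields 160 bits)."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_of(point):
    """Index of the equal-sized partition that contains the point."""
    partition_size = 2 ** RING_BITS // NUM_PARTITIONS
    return point // partition_size


def owner(partition):
    """Assumed round-robin assignment of partitions to nodes."""
    return NODES[partition % len(NODES)]


def preference_list(bucket, key, n=3):
    """The N consecutive partitions (with their owners) that hold the replicas."""
    first = partition_of(key_hash(bucket, key))
    partitions = [(first + i) % NUM_PARTITIONS for i in range(n)]
    return [(p, owner(p)) for p in partitions]


replicas = preference_list("meetups", "spamham", n=3)
```

With round-robin ownership and a ring size divisible by the node count, the three consecutive partitions always land on three different nodes, which is the point of copying to N partitions.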
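The CAP theorem and the ideal DB
• Tolerates any kind of failure (partition tolerance)
• Data is always consistent (consistency)
• The system never stops (availability)
No system can satisfy all three at the same time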
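Consistency is hard
• The only choices are to stop updates (lowering availability) or to allow updates to be overwritten (losing data)
(Figure: Server1, Server2, Server3; concurrent PUT V=42 and PUT V=0; V=?)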
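Instead of consistency
• Allow multiple versions to coexist for the time being
• Decide at read time which version is correct, or how to merge them
(Figure: Server1, Server2, Server3; PUT V=42 and PUT V=0; replicas hold V=0, V=0 or 42, V=42; a read returns V=0 or 42)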
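Achieving AP
• Accept writes for the time being, even while a network partition is in progress
(Figure: Server1, Server2, Server3, Server4; PUT V=42 and PUT V=0; write back after recovery; keep both values)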
Shopping cart example
• Just take the union
(Figure: Server1, Server2, Server3; PUT cart=[a,b,c] and PUT cart=[a,b,d]; replicas hold [a,b,c], [a,b,c] or [a,b,d], [a,b,d]; union([a,b,c], [a,b,d]) => [a,b,c,d])
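The union on the slide above, written out as a small Python sketch. `merge_carts` is a hypothetical helper name; the siblings are plain lists here rather than objects returned by a real Riak client.

```python
from functools import reduce


def merge_carts(siblings):
    """Resolve sibling carts by taking the union of their items."""
    return sorted(reduce(lambda acc, cart: acc | set(cart), siblings, set()))


siblings = [["a", "b", "c"], ["a", "b", "d"]]
merged = merge_carts(siblings)  # ['a', 'b', 'c', 'd']
```

A plain union cannot express removals: an item deleted on one replica reappears after the merge if another sibling still contains it.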
Siblings
riak_object = Riak.fetch(bucket, key)
riak_object.version
riak_object.has_siblings
for value in riak_object.values: …
riak_object.data = new_value
riak_object.store
Invariants of Siblings
• None in particular… if anything, the union of the replicas
• Data = R1 ∪ R2 ∪ R3
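The downside of allowing multiple versions
• Programming is hard (transactions are wonderful)
• The real world is not just shopping carts and counters
• You have to design a data structure that can be merged and updated safely every time
• Before long, similar libraries spring up all over the place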
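Why is it hard?
• Two writes to the same data can be applied in different orders, so far from being serializable, writes cannot even be brought into a consistent state
(Figure: Server1, Server2, Server3; one replica applies w1 then w2, another applies only w2, so w1 is lost)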
The answer: CRDT
• "Replicable, commutative data types"
• Conflict-Free Replicated Data Types
• Commutative Replicated Data Types
• …
• (Going to be included in Riak 2.0)
Note: the authors of CRDT do not use the term Logical Monotonicity
Invariants of CRDT
• Only commutative operations on the data are allowed!
Data = update(w2, update(w1, Data0)) = update(w1, update(w2, Data0))
Data = merge(update(w2, Data0), Data)
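Both equations above can be checked against the simplest possible CRDT. The sketch below assumes a grow-only set, where `update` adds one element and `merge` is set union; this choice is only for illustration and is not one of the Riak 2.0 data types.

```python
def update(write, data):
    """Apply a write by adding it to the grow-only set."""
    return data | {write}


def merge(left, right):
    """Merge two replica states by set union."""
    return left | right


data0 = frozenset()
w1, w2 = "w1", "w2"

# Applying the writes in either order yields the same state.
data = update(w2, update(w1, data0))
same_in_other_order = update(w1, update(w2, data0))

# Merging a replica that has only seen w2 changes nothing.
merged_again = merge(update(w2, data0), data)
```

Because the order of updates does not matter and merging already-seen state is a no-op, replicas can exchange state in any order and still converge.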
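CRDT in Riak 2.0
• Give the V of the KVS a "type", and let the type determine the update and merge logic
• Merge runs automatically on the server side at read time
• The application only has to specify the type; handling multiple versions is no longer needed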
CRDT
riak_object = Riak.fetch(bucket, key)
riak_object.type => counter|set|…
riak_object.set << element
riak_object.set.delete(old_element)
riak_object.store
CRDT example
• PN-Counter
• Set
• OR-sets
• LWW-register
• Graph…
PN-Counter
• merge
  • {a: {1,-1}, b: {1,0}, c: {2,0}}
  • {a: {0,0}, b: {2,0}, c: {0,-2}}
  • => {a: {1,-1}, b: {2,0}, c: {2,-2}} => 2
• update
  • When a receives {increment, 3}
  • {a: {4,-1}, b: {1,0}, c: {2,0}}
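The merge and update rules on the slide above can be reproduced with a small Python sketch. It assumes the slide's notation, where each actor holds a pair of (increments, decrements) and the decrement total is stored as a non-positive number; the function names are illustrative and this is not Riak's data type code.

```python
def merge(left, right):
    """Per actor, keep the larger increment total and the larger decrement total."""
    merged = {}
    for actor in left.keys() | right.keys():
        lp, ln = left.get(actor, (0, 0))
        rp, rn = right.get(actor, (0, 0))
        # Decrements are stored as non-positive numbers, so "larger" means min().
        merged[actor] = (max(lp, rp), min(ln, rn))
    return merged


def value(counter):
    """The counter value is the sum of all increments and decrements."""
    return sum(p + n for p, n in counter.values())


def increment(counter, actor, amount):
    """An actor only ever changes its own entry."""
    p, n = counter.get(actor, (0, 0))
    updated = dict(counter)
    updated[actor] = (p + amount, n)
    return updated


x = {"a": (1, -1), "b": (1, 0), "c": (2, 0)}
y = {"a": (0, 0), "b": (2, 0), "c": (0, -2)}

merged = merge(x, y)
total = value(merged)
after_increment = increment(x, "a", 3)
```

Since each actor only grows its own pair, taking the per-actor maximum never loses an update, regardless of the order in which replicas are merged.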
OR-Sets
• merge
  • {a:{"foo":true}, b:{"bar":false}}
  • + {a:{"foo":true}, b:{"foo":false, "bar":false}}
  • => {a:{"foo":true}, b:{"foo":false, "bar":true}}
  • => ["bar"]
• update
  • add: {a:{}} => +"foo" => {a:{"foo":false}}
  • remove: {a: {"foo":false}} => {a: {"foo":true}}
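For comparison, here is a sketch of the textbook observed-remove set, which uses unique tags per add rather than the per-actor flag notation on the slide above. The `ORSet` class and its method names are hypothetical, and the scenario is made up to show the "add wins" behavior; it does not reproduce the slide's example.

```python
import uuid


class ORSet:
    """Tag-based observed-remove set: a remove only cancels the adds it has seen."""

    def __init__(self):
        self.adds = {}      # element -> set of unique add tags
        self.removes = {}   # element -> set of add tags observed at removal

    def add(self, element):
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        observed = self.adds.get(element, set())
        self.removes.setdefault(element, set()).update(observed)

    def merge(self, other):
        merged = ORSet()
        for source in (self, other):
            for element, tags in source.adds.items():
                merged.adds.setdefault(element, set()).update(tags)
            for element, tags in source.removes.items():
                merged.removes.setdefault(element, set()).update(tags)
        return merged

    def value(self):
        return sorted(
            element
            for element, tags in self.adds.items()
            if tags - self.removes.get(element, set())
        )


a = ORSet()
a.add("foo")
b = ORSet().merge(a)   # replica b learns about "foo"
b.remove("foo")        # b removes the "foo" it has observed
b.add("bar")
a.add("foo")           # concurrently, a adds "foo" again
merged = a.merge(b)
```

After the merge, "foo" is still present because the second add carries a tag that the remove on b never observed.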
Use cases
• Counting clicks (G-counter)
  • riak-server/types/counters/buckets/likes/datatypes/basho.com -d 1
• Shopping cart (OR-sets)
• Logged-in users (PN-counter)
• Combinations of these (map & LWW-register, boolean)
  • { name: "basho.com", likes: 20000, users: 3000, links: [ "basho.co.jp", "basho.co.uk" ], cool: true }
What it cannot do
• A PN-counter constrained to "0 or greater"
• Issuing unique IDs
• Other data structures and operations that require CAS
• RESTful operations
Strong Consistency
• Invariant: t(w1) > t(w2) or t(w2) > t(w1)
• Close to Sequential Consistency
(Figure: clients c1 and c2 both get v1 from Riak; w1(v1) returns ok, w2(v1) fails)
Strong Consistency
• GET once, then send an operation against that version
• It fails if the value was updated in the meantime, and succeeds otherwise
• MultiPaxos
Strong Consistency
do {
  riak_object = Riak.fetch(bucket, key)
  riak_object.data = new_value
} while (riak_object.store != ok)
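The fetch-modify-store loop above can be tried out against an in-memory stand-in. The `VersionedStore` class below is a hypothetical single-key model of compare-and-swap on a version number; it is not a Riak client and says nothing about how the server reaches agreement.

```python
class VersionedStore:
    """In-memory stand-in for one strongly consistent key (not a Riak client)."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def fetch(self):
        return self.value, self.version

    def store(self, new_value, expected_version):
        """Accept the write only if nobody has updated the key since the fetch."""
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True


def update_with_retry(store, transform, max_attempts=10):
    """Fetch, modify, and store; start over whenever the store is rejected."""
    for _ in range(max_attempts):
        value, version = store.fetch()
        if store.store(transform(value), version):
            return True
    return False


store = VersionedStore(0)

# Two clients read the same version, then both try to write.
_, seen_by_c1 = store.fetch()
_, seen_by_c2 = store.fetch()
first_ok = store.store(1, seen_by_c1)    # succeeds
second_ok = store.store(2, seen_by_c2)   # rejected: the version has moved on

# The losing client retries from a fresh fetch.
retried_ok = update_with_retry(store, lambda v: v + 10)
```

Bounding the number of attempts is a design choice for the sketch; the loop on the slide retries until the store succeeds.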
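Summary
• Riak 1.x is a distributed database built for availability
• If you want a RESTful design, use Siblings
• From 2.0 on, you can choose among multiple consistency models
• If you want to build applications easily, use CRDT
• If you want CAS-style updates, use Strong Consistency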
Questions?
• Look forward to Riak 2.0
• Web: http://basho.co.jp
• Twitter: @BashoJapan
• Me: [email protected]
• ML: [email protected]