Slide 1

Slide 1 text

A New Consistency Model in Distributed Databases and Its Implementation in Riak 2013 / 11 / 28 WebDB Forum Basho Kota Uenishi

Slide 2

Slide 2 text

An Old Yet New Consistency Model in Distributed Databases and Its Implementation in Riak 2013 / 11 / 28 WebDB Forum Basho Kota Uenishi

Slide 3

Slide 3 text

Basho and Riak •Distributed databases? •Do you know Riak? •Do you know Basho?

Slide 4

Slide 4 text

The CAP theorem and the ideal DB •Withstands any kind of failure (partition tolerance) •Data is always consistent (consistency) •The system never stops (availability) No system satisfies all three of these at the same time

Slide 5

Slide 5 text

•A database whose defining feature is availability •Easy to operate, and it holds large data too •Stability, predictability •"Never, ever lose data"

Slide 6

Slide 6 text

Where Riak is running •Rovio (Angry Birds) •Yahoo! JAPAN's cloud storage •NHS (UK National Health Service) •Bump (=>Google) •Banks, games, retail, sensors, etc…

Slide 7

Slide 7 text

How Riak Works

Slide 8

Slide 8 text

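Consistent Hashing • 160-bit key space • The space is divided into equal partitions • Each partition is managed individually by a node • Replicas are copied to N partitions [Figure labels: node, node, node, node; hash(“meetups/spamham”); N=3]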
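The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not Riak's actual implementation: the partition count (64), the node names, and the round-robin assignment of partitions to nodes are all assumptions made for the example. Only the 160-bit key space, the equal division, and N=3 come from the slide.

```python
import hashlib

RING_SIZE = 2 ** 160      # SHA-1 produces a 160-bit key space
NUM_PARTITIONS = 64       # assumed partition count for this sketch
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names
N_VAL = 3                 # replicas per key, as on the slide


def key_position(bucket: str, key: str) -> int:
    """Hash "bucket/key" to an integer position in the 160-bit space."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def partition_of(position: int) -> int:
    """Map a position to one of the equally sized partitions."""
    return position // (RING_SIZE // NUM_PARTITIONS)


def owner_of(partition: int) -> str:
    """Assumed round-robin assignment of partitions to nodes."""
    return NODES[partition % len(NODES)]


def preference_list(bucket: str, key: str, n: int = N_VAL):
    """Return the n consecutive partitions (and owners) holding replicas."""
    first = partition_of(key_position(bucket, key))
    partitions = [(first + i) % NUM_PARTITIONS for i in range(n)]
    return [(p, owner_of(p)) for p in partitions]


print(preference_list("meetups", "spamham"))
```

With four nodes and round-robin ownership, three consecutive partitions always land on three different nodes, which is the property the replica placement relies on.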

Slide 9

Slide 9 text

Consistency is hard •The only choices are to block updates (lowering availability) or to allow updates to be overwritten (losing data) [Figure labels: Server2, Server1, Server3; PUT V=42; PUT V=0; V=?]

Slide 10

Slide 10 text

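Instead of consistency •For the time being, allow multiple versions to coexist •Decide at read time which version is correct, or whether to merge them [Figure labels: Server2, Server1, Server3; PUT V=42; PUT V=0; V=0 or 42; V=0; V=0 or 42; V=42]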

Slide 11

Slide 11 text

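Achieving AP •Writes are accepted for the time being even while a network partition is in progress [Figure labels: Server2, Server1, Server3, Server4; PUT V=42; PUT V=0; "write back once recovered"; "keep both"]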

Slide 12

Slide 12 text

Shopping cart example •Just take the union [Figure labels: Server2, Server1, Server3; PUT cart=[a,b,d]; PUT cart=[a,b,c]; union([a,b,c], [a,b,d]) => [a,b,c,d]; [a,b,c]; [a,b,c] or [a,b,d]; [a,b,d]]
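The union on this slide can be written as a read-time resolver over the coexisting versions. This is a sketch only; `merge_carts` and the use of `frozenset` for a cart are illustrative choices, not part of any Riak client API.

```python
from functools import reduce


def merge_carts(siblings):
    """Resolve concurrent cart versions by taking their set union."""
    return reduce(frozenset.union, siblings, frozenset())


# Two versions written concurrently to different servers.
siblings = [frozenset(["a", "b", "c"]), frozenset(["a", "b", "d"])]
merged = merge_carts(siblings)
print(sorted(merged))  # ['a', 'b', 'c', 'd']
```

A plain union can only grow: an item removed on one replica comes back if another replica still holds it, which is the gap that OR-sets address.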

Slide 13

Slide 13 text

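The drawbacks of allowing multiple versions •Programming is hard (transactions are wonderful) •The real world is more than shopping carts and counters •You have to design a data structure with safe merge and update every time •As people use it, similar libraries spring up all over the place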

Slide 14

Slide 14 text

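Why is it hard? •Writes to the data can be reordered relative to one another, so far from being serializable, even the writes cannot be brought into a consistent state [Figure labels: Server2, Server1, Server3; w1 w2; w1 w2; w2 (w1 lost)]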

Slide 15

Slide 15 text

Logical Monotonicity •Allow only commutative operations on the data! Data = update(w2, update(w1, Data0)) = update(w1, update(w2, Data0)) Data = merge(update(w2, Data0), Data)
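The two equations can be checked on a concrete data type. The sketch below assumes a grow-only set, where an update adds one element and merge is set union; the slide does not name a data type, so this choice is only an illustration.

```python
def update(write, data):
    """Apply a write by adding its element; set insertion commutes."""
    return data | {write}


def merge(left, right):
    """Merge two replica states by set union."""
    return left | right


data0 = frozenset()
w1, w2 = "x", "y"

# Applying the writes in either order gives the same state.
order_a = update(w2, update(w1, data0))
order_b = update(w1, update(w2, data0))
assert order_a == order_b

# Merging a state that saw only w2 into the full state changes nothing.
data = order_a
assert merge(update(w2, data0), data) == data
```

Because the operations commute and merge never discards information, replicas that receive w1 and w2 in different orders still converge.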

Slide 16

Slide 16 text

The answer: CRDT •"Replicable commutative data types" •Conflict-Free Replicated Data Types •Commutative Replicated Data Types •… •(Going to be included in Riak 2.0) Note: the authors of CRDT do not use the term Logical Monotonicity

Slide 17

Slide 17 text

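CRDT in Riak 2.0 •Give the V of the KVS a "type", and let the type determine the update and merge logic •Merge runs automatically on the server side at read time •The application only has to specify the type, and handling of multiple versions becomes unnecessary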

Slide 18

Slide 18 text

CRDT example •PN-Counter •Set •OR-sets •LWW-register •Graph…

Slide 19

Slide 19 text

PN-Counter •Demo

Slide 20

Slide 20 text

PN-Counter • merge • {a: {1,-1}, b: {1,0}, c: {2,0}} • {a: {0,0}, b: {2, 0}, c: {0, -2}} • => {a: {1,-1}, b:{2,0}, c:{2,-2}} => 2 • update • when a accepts {increment, 3} • {a: {4,-1}, b: {1,0}, c: {2,0}}
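The merge and update on this slide can be reproduced with a small sketch. It follows the slide's notation, where each actor holds a pair (increments, decrements) and the decrement total is stored as a non-positive number. The function names are illustrative and are not taken from Riak.

```python
def merge(left, right):
    """Per actor, keep the largest increment total and the most negative
    decrement total seen by either replica."""
    merged = {}
    for actor in left.keys() | right.keys():
        lp, ln = left.get(actor, (0, 0))
        rp, rn = right.get(actor, (0, 0))
        merged[actor] = (max(lp, rp), min(ln, rn))
    return merged


def value(counter):
    """The counter value is the sum of all increments and decrements."""
    return sum(p + n for p, n in counter.values())


def increment(counter, actor, amount):
    """Only the actor's own entry is touched by its updates."""
    p, n = counter.get(actor, (0, 0))
    return {**counter, actor: (p + amount, n)}


replica_1 = {"a": (1, -1), "b": (1, 0), "c": (2, 0)}
replica_2 = {"a": (0, 0), "b": (2, 0), "c": (0, -2)}

merged = merge(replica_1, replica_2)
print(merged == {"a": (1, -1), "b": (2, 0), "c": (2, -2)})  # True
print(value(merged))                                        # 2
print(increment(replica_1, "a", 3)["a"])                    # (4, -1)
```

Taking the per-actor maximum (and minimum for the negative side) makes merge commutative, associative, and idempotent, so replicas can exchange state in any order.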

Slide 21

Slide 21 text

OR-Sets • merge • {a:{“foo”:true}, b:{“bar”:false}} • + {a:{“foo”:true}, b:{“foo”:false, “bar”:false}} • => {a:{“foo”:true}, b:{“foo”:false, “bar”:true}} • => [“bar”] • update • add: {a:{}} => +”foo” => {a:{“foo”:false}} • remove: {a: {“foo”:false}} => {a: {“foo”:true}}
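For comparison, here is a sketch of an observed-remove set in the commonly described formulation, where every add creates a unique tag and a remove tombstones only the tags it has observed. This is not the per-actor boolean representation shown on the slide, and the class and method names are hypothetical.

```python
import uuid


class ORSet:
    def __init__(self):
        self.adds = {}        # element -> set of unique add tags
        self.removed = set()  # tags that have been tombstoned

    def add(self, element):
        """Record a fresh tag for this add."""
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        """Tombstone only the tags this replica has observed."""
        self.removed |= self.adds.get(element, set())

    def merge(self, other):
        """Union both the add tags and the tombstones."""
        merged = ORSet()
        for element in self.adds.keys() | other.adds.keys():
            merged.adds[element] = (
                self.adds.get(element, set()) | other.adds.get(element, set())
            )
        merged.removed = self.removed | other.removed
        return merged

    def value(self):
        """An element is present if any of its tags is not tombstoned."""
        return sorted(e for e, tags in self.adds.items() if tags - self.removed)


a = ORSet()
a.add("foo")
b = ORSet().merge(a)  # b starts as a copy of a
a.remove("foo")       # a removes the tag it has seen
b.add("foo")          # b concurrently adds "foo" again with a new tag
print(a.merge(b).value())  # ['foo']
```

The concurrent add survives because its tag was never observed by the remove, which is the behaviour a plain union of carts cannot provide.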

Slide 22

Slide 22 text

OR-Sets •Demo

Slide 23

Slide 23 text

Use cases •Counting clicks (G-counter) • riak-server/types/counters/buckets/likes/datatypes/basho.com -d 1 •Shopping cart (OR-sets) •Number of logged-in users (PN-counter) •Combinations of these (map & LWW-register, boolean) •{ name : “basho.com”, likes: 20000, users: 3000, links: [ “basho.co.jp”, “basho.co.uk” ], cool: true }

Slide 24

Slide 24 text

What it cannot do •A PN-counter constrained to be "0 or greater" •Issuing unique IDs •Other data structures and operations that require CAS
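The first limitation can be demonstrated with a short sketch: each replica checks the bound against its own local view, yet the merged result still violates it. The representation follows the PN-Counter slide (pairs with a non-positive decrement total). `guarded_decrement` is a hypothetical helper written for this example.

```python
def merge(left, right):
    """Per-actor max of increments and min of (negative) decrements."""
    merged = {}
    for actor in left.keys() | right.keys():
        lp, ln = left.get(actor, (0, 0))
        rp, rn = right.get(actor, (0, 0))
        merged[actor] = (max(lp, rp), min(ln, rn))
    return merged


def value(counter):
    return sum(p + n for p, n in counter.values())


def guarded_decrement(counter, actor):
    """Decrement only if the locally visible value is positive."""
    if value(counter) <= 0:
        return counter
    p, n = counter.get(actor, (0, 0))
    return {**counter, actor: (p, n - 1)}


shared = {"a": (1, 0)}                      # both replicas see value 1
replica_a = guarded_decrement(shared, "a")  # local check passes, value 0
replica_b = guarded_decrement(shared, "b")  # local check passes, value 0

print(value(merge(replica_a, replica_b)))   # -1
```

Enforcing the bound would require the two replicas to coordinate before decrementing, which is the compare-and-swap style of operation the slide lists as out of scope.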

Slide 25

Slide 25 text

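Summary •Riak is a highly available distributed database •It ensures availability by allowing multiple versions to be held at the same time •The difficulty of application development is the challenge •By introducing types called CRDTs, we built a mechanism that is simple and does not lose data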

Slide 26

Slide 26 text

Questions? •Please look forward to Riak 2.0 •Web: http://basho.co.jp •Twitter: @BashoJapan •Me: [email protected] •ML: [email protected]

Slide 27

Slide 27 text

Useful links http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf http://arxiv.org/pdf/1210.3368.pdf https://gist.github.com/russelldb/f92f44bdfb619e089a4d http://gsd.di.uminho.pt/members/cbm/ps/scadt3.pdf http://arxiv.org/abs/1011.5808