Trends in Database Architecture and How to Choose Among Them

Presentation slides for QConTokyo ( http://www.qcontokyo.com/KotaUENISHI_2015.html )

April 21, 2015

2. Self-introduction
• @kuenishi
• GitHub, Twitter, etc
• 7 years in distributed systems
• From Basho Japan
• Development of Riak CS
• I will be giving a spiritual talk

7. Durability
“The ACID property which guarantees that transactions that have committed will survive permanently.”
http://en.wikipedia.org/wiki/Durability_(database_systems)

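18. Replication is hard
• CAP theorem
• Keep sending a sequence of changes to objects that are held by multiple nodes and start out identical
• When messages fail to arrive, the nodes cannot all hold the same sequence of changes
• The root of the difficulty is that the unit of failure has been split
• The task is to guarantee that separate things are identical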

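20. Solution 0: Master-Slave
• Guarantee that there is exactly one sequence of update commands
• Slaves (replicas) only receive the decided sequence of update commands and apply it locally
• Nothing can be done while the master is absent
[Figure: the command sequence w1: x=a, w2: x=b, r1: read x, w3: x=c applied on two nodes, both ending with x = c]
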
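As a rough illustration of the Master-Slave idea above, the following Python sketch keeps a single ordered log on the master and lets a replica replay it. The class names `Node` and `Master` and the in-memory log are assumptions made for this example; they are not taken from the slides or from any particular database.

```python
class Node:
    """A replica that applies an ordered log of write commands."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries applied so far

    def apply(self, log):
        # Apply only the entries this node has not seen yet, in log order.
        for key, value in log[self.applied:]:
            self.state[key] = value
        self.applied = len(log)


class Master(Node):
    """The only node allowed to decide the order of writes (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.log = []

    def write(self, key, value):
        # The master alone fixes the position of each write in the log.
        self.log.append((key, value))
        self.apply(self.log)


master = Master()
replica = Node()
for value in ("a", "b", "c"):  # w1: x=a, w2: x=b, w3: x=c
    master.write("x", value)
replica.apply(master.log)  # the replica just replays the decided sequence
print(master.state, replica.state)
```

Because the replica never decides anything itself, it cannot diverge from the master, but it also cannot accept writes when the master is gone.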
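21. Solution 1: Consensus
• The goal is for distinct entities to hold the same state
• Replication is the so-called distributed agreement problem
• A communication scheme that solves it is called a consensus protocol
[Figure: three nodes that all hold x = c after the sequence w1: x=a, w2: x=b, r1: read x, w3: x=c]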

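24. The 2000s: the Web era (1/2)
• Replication at the application and middleware layer (TCP/IP) becomes common
• Synchronous replication at the network level; operation continues even if one side fails
• Several systems appear that can scale out reads
• Streaming diffs from the Master to the Slave (Replica) is the mainstream type
  • MySQL's binlog, GFS (BigTable), HDFS (HBase)
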
25. Master-Slave is hard
• Time lag when switching masters
• Tolerance to split brain
[Figure: two nodes holding x = c and x = b after w1: x=a, w2: x=b, w3: x=c, with r1: read x in between]

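26. The 2000s: the Web era (2/2)
• Web systems grow more complex and much larger
• Consensus-based replication becomes practical
• Things keep working somehow even when a network partition occurs
• Dynamo, Chubby, ZooKeeper, SQL Server (2008?~)
• Paxos (1989), Quorum (1979), etc.
[Figure: 2/3 Ack]
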
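27. Quorum: consensus is hard
• A naive protocol design that permits overwrites easily destroys data
• It is not practical in the real world, where anyone can fail, go silent, and come back at any time
[Figure: three nodes holding "a?", "c?" and "b?" after w1: x=a, w2: x=b, r1: read x, w3: x=c]
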
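One way to see why naive overwriting breaks is the small Python sketch below. The arrival orders are invented for illustration: every write is acknowledged by at least 2 of 3 replicas, yet the replicas end up with three different values, much like the "a? c? b?" picture on the slide.

```python
from collections import Counter


def deliver(messages):
    """Apply writes to one replica in arrival order; the last write wins."""
    value = None
    for v in messages:
        value = v
    return value


# Hypothetical arrival order per replica. w2 (x=b) and w3 (x=c) are
# concurrent, arrive in different orders, and never reach replica 2.
arrivals = [
    ["a", "b", "c"],  # replica 0
    ["a", "c", "b"],  # replica 1
    ["a"],            # replica 2
]

# Number of replicas that acknowledged each write.
acks = Counter(v for messages in arrivals for v in messages)
replicas = [deliver(messages) for messages in arrivals]

print(dict(acks))
print(replicas)
```

Each writer believes it succeeded, but a later reader has no version information to tell which of the three values is current.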
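28. Paxos: consensus is hard
• A two-phase agreement protocol
  • Decide the Proposer (the one who proposes a value) by majority vote
  • Decide the Proposed Value (the value that was proposed) by majority vote
• Proposals are given sequence numbers to track which one is newer
• Even under the constraint that anyone can fail, go silent, and come back at any time, agreement is reached given unlimited time and as long as a majority has not failed
• It is hard to implement, but doable with enough effort
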
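The two phases can be sketched as a single-decree Paxos round in Python. This is a simplified, in-memory sketch under strong assumptions (synchronous calls, no message loss, no persistence); `Acceptor` and `propose` are names chosen for this example and do not come from the slides.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0          # highest proposal number promised so far
        self.accepted = (0, None)  # (number, value) of the last accepted proposal

    def prepare(self, n):
        # Phase 1: promise to ignore proposals numbered below n.
        if n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, self.accepted

    def accept(self, n, value):
        # Phase 2: accept unless a higher-numbered promise has been made.
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def propose(acceptors, n, value):
    """Run one round with proposal number n; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None
    # If some acceptor already accepted a value, adopt the newest one.
    _, previous = max(promises, key=lambda accepted: accepted[0])
    if previous is not None:
        value = previous
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, 1, "a")   # nothing decided yet, so "a" is chosen
second = propose(acceptors, 2, "b")  # must adopt the already chosen "a"
stale = propose(acceptors, 1, "c")   # old proposal number is rejected
print(first, second, stale)
```

The sequence numbers are what keep a later proposer from overwriting a value that a majority has already accepted.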
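29. Classification of consensus-based replication
• CP type
  • Guarantees that replicas are identical
  • Uses algorithms such as Paxos and Raft
  • During a network partition, only the nodes on the majority side of the network are usable
• AP type
  • Tolerates replicas that are not perfectly identical
  • Guarantees causal consistency with Vector Clocks or CRDTs (or just plain timestamps)
  • All nodes remain usable even during a network partition
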
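For the AP type, a vector clock is one common way to tell causally ordered updates from concurrent ones. The following is a minimal Python sketch, assuming clocks are plain dicts from node name to counter; the function names are chosen for this example.

```python
def tick(clock, node):
    """Return a copy of clock with node's counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum of two vector clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a, b):
    """Classify a relative to b: equal, before, after, or concurrent."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v0 = tick({}, "a")   # first write, accepted by node a
v1 = tick(v0, "a")   # node a updates again
v2 = tick(v0, "b")   # node b updates from v0 during a partition
print(compare(v0, v1), compare(v1, v2), merge(v1, v2))
```

Here `v1` and `v2` are reported as concurrent, which is the signal that the application (or a CRDT) has to reconcile them.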
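30. Classification of databases from the viewpoint of replication
• Master-Slave type
  • Simple to implement, fast
• Hybrid of consensus and Master-Slave
  • Uses a consensus protocol for master election
  • Replication itself is Master-Slave
• Consensus type
  • Uses a consensus protocol for replication as well
  • No downtime caused by master failure
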
31. Classification of databases from the viewpoint of replication
• Master-Slave type
  • MySQL, PostgreSQL
• Hybrid of consensus and Master-Slave
  • MongoDB, HBase, Redis
• Consensus type
  • Riak, Cassandra (both have AP and CP modes)
  • CouchBase (CP type)

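32. The 2010s: the cloud era
• Emergence of the category called NewSQL
  • FoundationDB, NuoDB
  • Cases where an existing NoSQL system implements SQL (or something SQL-like)
  • Some NewSQL systems satisfy(?) ACID
• Replication across multiple data centers becomes essential
  • Network partitions and latency become more important issues
• MPP becomes practical for OLAP workloads (distributed query processing)
  • BigQuery, Impala, PrestoDB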

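35. ACID in distributed DBs
• There is essentially only one practical design
• Master election by consensus plus M/S replication, or CP-type replication
• A mechanism that guarantees timestamp synchronization
• Optimistic concurrency control
• MegaStore (2011), Spanner (2012)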

CRDT, boom
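38. CRDT
• One of the data types and replication techniques that make optimistic replication easy
• Conflict-Free Replicated Data Types
• A combination of data types, data structures and operators that satisfies w1(w2(x)) == w2(w1(x))
• Updates and reads remain possible even during a network partition
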
39. CRDT example: G-Counter
• merge
  • Data held by a: {a: 1, b: 1, c: 2}
  • Data held by b: {a: 0, b: 2, c: 0}
  • x => {a: 1, b: 2, c: 2} => 5
• update
  • When a receives {increment, 3}, it becomes {a: 4, b: 1, c: 2}
• A conditional operation of the form C < x can be handled

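The G-Counter example above can be reproduced with a few lines of Python. This is a sketch that assumes the counter is a dict from node name to that node's local count; `merge`, `value` and `increment` are names chosen here, not an API from any library.

```python
def merge(x, y):
    """Element-wise maximum of the per-node counts."""
    return {n: max(x.get(n, 0), y.get(n, 0)) for n in x.keys() | y.keys()}


def value(counter):
    """The counter's value is the sum of all per-node counts."""
    return sum(counter.values())


def increment(counter, node, amount):
    """Return a copy with node's own count raised; a G-Counter only grows."""
    if amount < 0:
        raise ValueError("a G-Counter cannot be decremented")
    new = dict(counter)
    new[node] = new.get(node, 0) + amount
    return new


a_data = {"a": 1, "b": 1, "c": 2}
b_data = {"a": 0, "b": 2, "c": 0}
merged = merge(a_data, b_data)
print(merged, value(merged))
print(increment(a_data, "a", 3))
```

Since `merge` is commutative and idempotent, replicas can exchange state in any order. The value never decreases, which is presumably why a threshold check such as C < x stays true once it becomes true.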
40. CRDT example: PN-Counter
• merge
  • {a: {1,-1}, b: {1,0}, c: {2,0}}
  • {a: {0,0}, b: {2,0}, c: {0,-2}}
  • => {a: {1,-1}, b: {2,0}, c: {2,-2}} => 2
• update
  • When a accepts {increment, 3}
  • {a: {4,-1}, b: {1,0}, c: {2,0}}
• A conditional operation of the form c < x cannot be handled

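A matching sketch for the PN-Counter, following the slide's notation where each node holds a pair of (total increments, total decrements) and the decrement total is stored as a non-positive number. The function names are again assumptions for this example.

```python
def merge(x, y):
    """Per node: keep the largest increment total and the most negative decrement total."""
    out = {}
    for n in x.keys() | y.keys():
        xp, xn = x.get(n, (0, 0))
        yp, yn = y.get(n, (0, 0))
        out[n] = (max(xp, yp), min(xn, yn))
    return out


def value(counter):
    """Sum of all increments plus all (negative) decrements."""
    return sum(p + n for p, n in counter.values())


def update(counter, node, amount):
    """Return a copy with amount added to node's increment or decrement total."""
    p, n = counter.get(node, (0, 0))
    new = dict(counter)
    new[node] = (p + amount, n) if amount >= 0 else (p, n + amount)
    return new


a_data = {"a": (1, -1), "b": (1, 0), "c": (2, 0)}
b_data = {"a": (0, 0), "b": (2, 0), "c": (0, -2)}
merged = merge(a_data, b_data)
print(merged, value(merged))
print(update(a_data, "a", 3))
```

Unlike the G-Counter, the value can go down as well as up, so a threshold check that holds on one replica may stop holding after a merge.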
41. CRDT example: OR-Sets
• merge
  • a: {“foo”:false, “bar”:true, “baz”:true}
  • + b: {“bar”:true, “baz”:false}
  • => {“foo”:false, “bar”:true, “baz”:true}
  • => [“foo”]
• update
  • add: a: {} => +“foo” => a: {“foo”:false}
  • remove: a: {“foo”:false} => a: {“foo”:true}

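The set example above can be sketched in Python by reading the boolean as a "removed" flag (a tombstone), which is an interpretation of the slide's notation. Note that this simplified form behaves like a two-phase set: once removed, an element cannot be added back. A full OR-Set attaches a unique tag to each add to lift that restriction, which this sketch does not attempt.

```python
def merge(x, y):
    """Union of the elements; an element is removed if either side removed it."""
    return {k: x.get(k, False) or y.get(k, False) for k in x.keys() | y.keys()}


def add(s, element):
    """Return a copy with element present (keeps an existing tombstone)."""
    new = dict(s)
    new.setdefault(element, False)
    return new


def remove(s, element):
    """Return a copy with element's tombstone set, if the element is known."""
    new = dict(s)
    if element in new:
        new[element] = True
    return new


def members(s):
    """Elements that have been added and not removed."""
    return sorted(k for k, removed in s.items() if not removed)


a_data = {"foo": False, "bar": True, "baz": True}
b_data = {"bar": True, "baz": False}
merged = merge(a_data, b_data)
print(merged, members(merged))
print(add({}, "foo"), remove({"foo": False}, "foo"))
```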
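42. CRDT
• Updates and reads remain possible even during a network partition
• Data for which writes can be "processed concurrently"
• There are certain constraints on how values can be computed
• Efficient CRDT implementations are still at the research stage
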
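43. Forecast: the late 2010s
• There is still room for ingenuity on the implementation side, and distributed DBs that aim to satisfy ACID will keep appearing
  • Concurrency control designed with distribution in mind
  • CP-type replication and transactions that span data centers
  • Spread of operational know-how
• Adoption of NoSQL databases will continue for a while (and some will be weeded out)
• Hybrid databases that combine strong consistency with optimistic replication will appear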

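46. Summary
• Since around 2000, distributed-systems technology has gradually become an essential building block alongside the two major technical elements of databases
• A brief explanation of the database replication techniques that had appeared by 2015
• In the second half of 2015, distributed databases will appear that satisfy ACID and let you choose between CP and AP replication through the same interface
• A rough delusion^H^H^H^H^H^H^H^Hforecast for 2020
※ Disclaimer: The contents of this material are Uenishi's personal forecasts and do not guarantee any particular future.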
