Riak, a Distributed Database, and Riak CS, an Object Store

ksauzz
August 06, 2013

Open Source Conference 2013 @ Kyoto


Transcript

  1. ©2013 BASHO TECHNOLOGIES INC. ALL RIGHTS RESERVED. Agenda
     •  Basho
     •  Riak (distributed database)
     •  Riak CS (object storage)
     •  Use cases
  2. Basho Technologies, Inc.
     •  Founded: January 2008
     •  Headquarters: Cambridge, Massachusetts
     •  Employees: about 130
     •  Japanese subsidiary established September 2012
  3. Key / Value / Bucket
     •  Key/value pairs are stored in a bucket
     •  A value can be any binary data (JSON, XML, Msgpack, etc.)
     [Diagram: a bucket holding three KEY/VALUE pairs]
  4. Masterless
     •  A cluster is made up of multiple nodes
     •  All nodes are peers; there is no master and no single point of failure
     •  All nodes are equivalent; each one handles requests and holds data
     [Diagram: five nodes]
  5. Data replication
     •  The key space of 160-bit integers = the Ring
     •  The Ring is divided into equally sized partitions
     •  Partitions are assigned to the nodes of the cluster
     [Diagram: ring with node 0, node 1, node 2, node 3]
  6. Data replication
     •  The key space of 160-bit integers = the Ring
     •  The Ring is divided into equally sized partitions
     •  Partitions are assigned to the nodes of the cluster
     •  The hash of bucket/key determines the partition where the data is stored
     [Diagram: ring with node 0, node 1, node 2, node 3; hash("bucket/key")]
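  7. Data replication
     •  The key space of 160-bit integers = the Ring
     •  The Ring is divided into equally sized partitions
     •  Partitions are assigned to the nodes of the cluster
     •  The hash of bucket/key determines the partition where the data is stored
     •  Replicas are stored in the consecutive partitions that follow
     [Diagram: ring with node 0, node 1, node 2, node 3; hash("bucket/key")]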
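The placement rule on these slides can be sketched in a few lines of Python. This is only an illustration, not Riak's actual code: the partition count (64), the replica count (3), the round-robin assignment of partitions to nodes, and every function name are assumptions made for the sketch. SHA-1 is used because it yields a 160-bit value, matching the ring size on the slide.

```python
import hashlib

RING_SIZE = 2 ** 160          # 160-bit key space, as on the slide
NUM_PARTITIONS = 64           # assumption for this sketch
N_REPLICAS = 3                # assumption for this sketch
NODES = ["node 0", "node 1", "node 2", "node 3"]


def partition_owner(partition):
    """Assign partitions to nodes round-robin (a simplification)."""
    return NODES[partition % len(NODES)]


def key_partition(bucket, key):
    """Hash "bucket/key" onto the ring and return the partition index."""
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    return position // (RING_SIZE // NUM_PARTITIONS)


def preference_list(bucket, key):
    """Return the consecutive partitions (with owners) that hold the replicas."""
    first = key_partition(bucket, key)
    partitions = [(first + i) % NUM_PARTITIONS for i in range(N_REPLICAS)]
    return [(p, partition_owner(p)) for p in partitions]


if __name__ == "__main__":
    for partition, node in preference_list("people", "alice"):
        print(f"partition {partition:2d} -> {node}")
```

Because partitions are handed out round-robin and 64 is a multiple of 4, any three consecutive partitions land on three different nodes, which is the point of storing replicas in adjacent partitions.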
  8. When a temporary failure occurs
     •  A temporary node failure occurs (node 2)
     •  PUT, GET, and DELETE requests go to a fallback node (node 0)
     [Diagram: ring with node 0, node 1, node 2, node 3; hash("bucket/key")]
  9. When the temporarily failed node recovers
     •  A temporary node failure occurs (node 2)
     •  PUT, GET, and DELETE requests go to a fallback node (node 0)
     •  The failed node (node 2) recovers
     •  "Handoff" moves the data from the fallback node (node 0) to the recovered node (node 2)
     •  Normal operation resumes
     [Diagram: ring with node 0, node 1, node 2, node 3; hash("bucket/key")]
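The failure and recovery sequence can be modelled with a toy in-memory cluster. This is a sketch under stated assumptions, not how Riak implements it: the class `ToyCluster`, the fixed preference list (node 1, node 2, node 3), and the rule "the fallback is the first live node outside the preference list" are all invented here so that node 0 ends up as the fallback, as on the slides.

```python
NODES = ["node 0", "node 1", "node 2", "node 3"]
PREFLIST = ["node 1", "node 2", "node 3"]   # assumed replica owners for one key


class ToyCluster:
    def __init__(self):
        self.store = {n: {} for n in NODES}    # data each node owns
        self.hinted = {n: {} for n in NODES}   # data held on behalf of a down node
        self.down = set()

    def _fallback(self):
        """First live node outside the preference list (assumed policy)."""
        for n in NODES:
            if n not in PREFLIST and n not in self.down:
                return n
        raise RuntimeError("no fallback node available")

    def put(self, key, value):
        for owner in PREFLIST:
            if owner in self.down:
                # Write to the fallback, remembering the intended owner.
                self.hinted[self._fallback()].setdefault(owner, {})[key] = value
            else:
                self.store[owner][key] = value

    def get(self, key):
        for owner in PREFLIST:
            if owner in self.down:
                held = self.hinted[self._fallback()].get(owner, {})
                if key in held:
                    return held[key]
            elif key in self.store[owner]:
                return self.store[owner][key]
        return None

    def fail(self, node):
        self.down.add(node)

    def recover(self, node):
        """Handoff: move hinted data from fallback nodes to the recovered node."""
        self.down.discard(node)
        for fallback in NODES:
            self.store[node].update(self.hinted[fallback].pop(node, {}))


if __name__ == "__main__":
    cluster = ToyCluster()
    cluster.fail("node 2")
    cluster.put("alice", "v1")            # node 0 accepts the write for node 2
    print(cluster.hinted["node 0"])
    cluster.recover("node 2")             # handoff back to node 2
    print(cluster.store["node 2"])
```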
  10. Interfaces
     •  Protocols: HTTP, Protocol Buffers
     •  Clients: Java, Ruby, Python, PHP, Node.js, Haskell, etc.
  11. Addressing by bucket/key
     GET /buckets/people/keys/alice
     PUT /buckets/people/keys/alice
     DELETE /buckets/people/keys/alice
     [Diagram: BUCKET/KEY maps to VALUE]
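As a rough illustration of the three requests, the following Python builds them with the standard library without sending anything. The base URL is an assumption (a local node listening on port 8098), and `build_request` is a helper invented for this sketch; only the `/buckets/people/keys/alice` path comes from the slide.

```python
import urllib.request

BASE_URL = "http://localhost:8098"   # assumption: a local node's HTTP endpoint


def build_request(method, bucket, key, body=None):
    """Build (but do not send) a request for /buckets/<bucket>/keys/<key>."""
    url = f"{BASE_URL}/buckets/{bucket}/keys/{key}"
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(url, data=body, headers=headers, method=method)


put_req = build_request("PUT", "people", "alice", b'{"name": "alice", "age": 32}')
get_req = build_request("GET", "people", "alice")
delete_req = build_request("DELETE", "people", "alice")

for req in (put_req, get_req, delete_req):
    print(req.get_method(), req.full_url)

# To send one against a running node:
#     with urllib.request.urlopen(get_req) as resp:
#         print(resp.status, resp.read())
```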
  12. Secondary indexes (2i)
     •  Binary and integer index types are available
     •  Queries are exact match or range
     [Example: KEY 14 holds VALUE { "name": "alice", "age": 32 }; INDEX age_int: 32 points to KEY: 14]
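The two query styles can be illustrated with a small in-memory stand-in for an integer index such as `age_int`. This is not Riak's implementation; the class `ToyIntIndex` and the extra entries (keys 15 and 16) are hypothetical. Only key 14 with age 32 comes from the slide.

```python
import bisect


class ToyIntIndex:
    """In-memory stand-in for an integer secondary index such as age_int."""

    def __init__(self):
        self._entries = []   # sorted list of (index_value, key)

    def add(self, key, value):
        bisect.insort(self._entries, (value, key))

    def exact(self, value):
        """Exact match is a range query with equal bounds."""
        return self.range(value, value)

    def range(self, low, high):
        """Return keys whose index value lies in [low, high], inclusive."""
        start = bisect.bisect_left(self._entries, (low,))
        end = bisect.bisect_left(self._entries, (high + 1,))
        return [key for _, key in self._entries[start:end]]


age_int = ToyIntIndex()
age_int.add("14", 32)   # from the slide: key 14, {"name": "alice", "age": 32}
age_int.add("15", 45)   # hypothetical entry
age_int.add("16", 27)   # hypothetical entry

print(age_int.exact(32))       # ['14']
print(age_int.range(25, 40))   # ['16', '14']
```

Keeping the entries sorted by index value is what makes a range query a pair of binary searches rather than a scan.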
  13. MapReduce
     •  Data queries, distributed filtering, analysis, and aggregation
     •  Can be written in Erlang or JavaScript
     •  Erlang is faster
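The map and reduce phases themselves are easy to picture with a plain-Python sketch. The slide says Riak jobs are written in Erlang or JavaScript, so this is only a local illustration of the pattern; the stored documents and the function names are hypothetical.

```python
import json
from collections import Counter
from functools import reduce

# Hypothetical stored values: one JSON document per key.
OBJECTS = {
    "alice": '{"name": "alice", "age": 32}',
    "bob": '{"name": "bob", "age": 38}',
    "carol": '{"name": "carol", "age": 45}',
}


def map_age_bracket(raw_value):
    """Map phase: emit a count of 1 for the object's ten-year age bracket."""
    person = json.loads(raw_value)
    bracket = person["age"] // 10 * 10
    return Counter({bracket: 1})


def reduce_counts(left, right):
    """Reduce phase: merge partial counts."""
    return left + right


def run_job(objects):
    mapped = map(map_age_bracket, objects.values())
    return reduce(reduce_counts, mapped, Counter())


print(dict(run_job(OBJECTS)))   # {30: 2, 40: 1}
```

In a distributed setting the map phase would run next to the data on each node and only the partial counts would travel over the network.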
  14. Full-text search (Yokozuna, beta)
     •  Riak + Solr
     •  Japanese language support
     •  Planned for release in Riak 2.0
  15. Demo
     1.  Start with a single-node cluster
     2.  Add 10 key/value pairs
     3.  Add 4 nodes to the cluster (Join)
     4.  kill -9 two nodes, including the first node
     5.  Fetch all keys
  16. Riak CS Architecture
     [Diagram labels: S3 REST API; object operations (a manifest plus multiple blocks); bucket operations; user operations and reports; Stanchion]
     ✓  High availability
     ✓  Distributed placement
     ✓  Replication
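The architecture slide shows an object as several blocks plus a manifest. A minimal sketch of that idea follows, assuming a plain dict as the backing key/value store. The block size, the key naming scheme, and the manifest fields are all invented for illustration and are not Riak CS's actual format.

```python
import hashlib

BLOCK_SIZE = 4   # bytes; deliberately tiny for the example


def put_object(store, bucket, key, data):
    """Split data into blocks, store each block, then store a manifest."""
    block_ids = []
    for offset in range(0, len(data), BLOCK_SIZE):
        block_id = f"{bucket}/{key}/block/{offset // BLOCK_SIZE}"
        store[block_id] = data[offset:offset + BLOCK_SIZE]
        block_ids.append(block_id)
    store[f"{bucket}/{key}/manifest"] = {
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "blocks": block_ids,
    }


def get_object(store, bucket, key):
    """Read the manifest, reassemble the blocks, and verify the checksum."""
    manifest = store[f"{bucket}/{key}/manifest"]
    data = b"".join(store[block_id] for block_id in manifest["blocks"])
    if hashlib.sha256(data).hexdigest() != manifest["sha256"]:
        raise ValueError("checksum mismatch")
    return data


if __name__ == "__main__":
    kv = {}
    put_object(kv, "photos", "cat.jpg", b"0123456789")
    print(kv["photos/cat.jpg/manifest"]["blocks"])
    print(get_object(kv, "photos", "cat.jpg"))
```

Since every block is an ordinary key/value pair, the blocks inherit whatever distribution and replication the underlying store provides.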
  17. API and interfaces
     •  Conforms to the AWS S3 REST API
     •  Common S3 libraries and tools can be used
     •  REST GET, PUT, and DELETE operations
     •  S3-style ACLs and bucket policies
  18. Demo
     •  DragonDisk http://www.dragondisk.com
     •  s3cmd http://s3tools.org/s3cmd
  19. Summary
     •  Implemented on top of Riak
     •  Object storage that is easy to use
     •  AWS S3-compatible API
  20. Multi-datacenter replication
     One-way or two-way data synchronization between multiple DCs
     Uses:
     •  Keep the service running through a major disaster
     •  Data locality
     •  Active backup
     •  Build a verification environment from a production cluster and a staging cluster
     [Diagram: a client updates the Primary Cluster (DC#1), which replicates to Secondary Cluster (DC#2) and Secondary Cluster (DC#3)]
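  21. Storing product information
     •  Improved scalability and user-experience performance
     •  Riak was chosen for the online product catalog and ratings used by bestbuy.com and the retail stores.
     •  Nodes are added in time for holiday shopping (Christmas sales and the like).
     •  The SLA for rendering the Bestbuy.com home page is within 1 second.
     •  For the 2013 holiday season, growth from 26 million to 500 million SKUs is expected.
     •  The Riak clusters run across multiple Availability Zones on Amazon AWS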
  22. Online advertising
     •  The need for DC replication and scalability drove the replacements MySQL → Cassandra → Riak
     •  Stores user activity data and trafficking data
     •  Trafficking data was migrated from MySQL (data replication across multiple DCs was required)
     •  User data was migrated from Cassandra (past releases had failed to preserve backward compatibility, so it could not be trusted)
     •  Data replication across 5 DCs
     •  Handled 4 trillion pieces of ad data in 2011.
     [Slide image, partly cropped: Basho case-study text]
     OpenX uses Riak for user and trafficking data behind its data services API. They selected Riak due to its highly available, low-latency, redundant architecture. OpenX also uses Riak's multi-datacenter replication across several data centers, providing up-to-date data throughout its global infrastructure. For more details about how OpenX uses Riak, check out the video of Anthony Molinaro, OpenX engineer, speaking at RICON2012, Basho's 2012 developer conference.
     … they needed to move to an architecture that could gracefully … new platform because it is distributed, scalable, and highly … they opted to build two geographically separated, mirrored sites … As Marcus Kern, VP of Technology at Velti, … 140 customers. We cannot afford a single minute of … exceed our requirements for scale, data durability, and …
     As of 2009
  23. Menu system
     •  Initially used Amazon S3; achieved lower round-trip latency for the on-demand menu
     •  Riak was chosen for fast reads and writes
     •  Stores the data needed to aggregate the Video On Demand menu
     •  Stores user-related information likely to be used in future marketing campaigns
     •  Processes 25 million clicks from remote controls every day
     •  Riak clusters were built in 3 DCs for the on-demand menu.
     •  4 clusters were built for marketing
  24. Healthcare information management: Danish Health Services
     •  Doctors and medical systems access Electronic Health Records (prescription information) from a variety of devices
     •  Serves all 5.5 million Danish citizens
  25. Riak CS use cases
     •  Public cloud storage: S3-compatible storage outside AWS
     •  Cloud drive (general-purpose content storage)
     •  Backup-as-a-Service
     •  Archive storage
     •  Storage for employees and internal departments
  26. Yahoo! JAPAN
     •  Yahoo! JAPAN provides the platform for Internet shopping sites
     •  Image data for the shopping sites is stored in Riak CS
     •  Stored objects: 200,000 (as of the end of 2012)
     •  Requests: 450 req/sec
     •  Response time: 10 ms to 80 ms
     •  Build time: 1 day.
     •  Provides an S3-compatible cloud storage service
     •  Data replication between 2 DCs