
Techniques for Fighting Large Data (大きなデータと戦う技術) / fighting-large-data

Asu no Kaihatsu Conference (明日の開発カンファレンス), Fall 2018

yuuki takezawa

October 13, 2018

Transcript

  1. Techniques for Fighting Large Data
    yuuki takezawa
    asucon 2018 Fall

  2. Profile
    • Yuuki Takezawa (竹澤 有貴) / ytake
    • CTO, istyle Inc. (株式会社アイスタイル)
    • PHP, Hack, Go, Scala
    • Apache Hadoop, Apache Spark, Apache Kafka
    • twitter https://twitter.com/ex_takezawa
    • facebook https://www.facebook.com/yuuki.takezawa
    • github https://github.com/ytake

  3. (no text)

  4. Agenda
    • Applications and data design
    • Working toward a solution

  5. Applications and data design

  6. About application data
    • RDBMSs that support web applications and similar systems
    • Large-scale data, typified by IoT

  7. Can you keep supporting the application's growth?
    • Web applications that grow beyond expectations
      Can you carry out regular database refactoring and
      application refactoring?

  8. Can you keep supporting the application's growth?
    • Hardware and application failures
      Choose cloud or on-premises to suit the application

  9. The small-team case

  10. The first application
    • Database design + Active Record etc.
      An application built with a framework
    • A development organization made up of a few developers

  11. Growing into multiple teams

  12. Application growth
    • More and more application features
    • More people on the development team
      Skill levels vary

  13. Application growth
    • Performance degrades as records and implementation code grow
      Do you understand the SQL that your libraries issue?
      Storing binaries in the database!?

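  14. Applications and databases
    • Question your tools instead of using them just because they are easy to use
      Does the SQL being issued fit the current scale of the application?
    • Riding it out with more hardware
      The problem may simply be postponed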
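
One way to act on "do you understand the SQL your libraries issue?" is to log every statement at the driver level. The sketch below is an illustration, not material from the talk: it uses Python's standard `sqlite3` module and its `set_trace_callback` hook, and the `users`/`posts` tables and both helper functions are hypothetical. It compares an ORM-style lazy-loading pattern with a single JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO posts VALUES (1, 1, 'first'), (2, 2, 'second'), (3, 3, 'third');
""")

issued = []                                # every statement SQLite reports from here on
conn.set_trace_callback(issued.append)


def titles_lazy(conn):
    """Lazy-loading pattern: one query for users, then one more per user."""
    result = {}
    users = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
    for user_id, name in users:
        rows = conn.execute(
            "SELECT title FROM posts WHERE user_id = ? ORDER BY id", (user_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_joined(conn):
    """Same data fetched with a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT u.name, p.title FROM users u "
        "JOIN posts p ON p.user_id = u.id ORDER BY u.id, p.id"
    ).fetchall()
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


def count_selects(fn):
    """Run fn and count the SELECT statements it caused."""
    issued.clear()
    fn(conn)
    return sum(1 for sql in issued if sql.lstrip().upper().startswith("SELECT"))


lazy_count = count_selects(titles_lazy)
joined_count = count_selects(titles_joined)
print(lazy_count, joined_count)
```

With three users the lazy version issues four SELECTs and the JOIN version one. The gap grows linearly with the number of rows, which is the kind of cost that stays invisible until the table is large.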

  15. Application growth and the database
    • Database design aimed at making data retrieval simpler
    • Storing data such as access logs
      Handle with care

  16. Growing into a large team

  17. Further application growth
    • Application features keep increasing
    • The development team becomes huge
      Multiple teams and multiple stakeholders

  18. Application growth
    • Performance degrades further as records and implementation code grow
    • Failures start to occur all over the place

  19. "Prioritize the release!"

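  20. Harmful effects
    • An application patched together piece by piece because releases come first
    • The application grows more complex as stakeholders increase
    • More defects that stem from the design and implementation of the
      small-to-medium-scale era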

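  21. Problems that stem from data design
    • Full scans over large volumes of data
    • Performance degradation caused by missing indexes
    • Queries that grow complex when features are added, precisely because
      the original structure was simple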
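
The full-scan and missing-index problems above can be observed directly in a query plan. The following is a minimal sketch using Python's standard `sqlite3` module and SQLite's `EXPLAIN QUERY PLAN`; the `access_logs` table, its columns, and the index name are hypothetical, and other RDBMSs expose the same information through their own `EXPLAIN` variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_logs (id INTEGER PRIMARY KEY, user_id INTEGER, path TEXT)"
)
conn.executemany(
    "INSERT INTO access_logs (user_id, path) VALUES (?, ?)",
    [(i % 1000, f"/items/{i}") for i in range(10000)],
)

QUERY = "SELECT COUNT(*) FROM access_logs WHERE user_id = 42"


def plan(conn, sql):
    """Return the planner's description of how it will execute sql."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)   # last column holds the detail text


plan_before = plan(conn, QUERY)                  # no index on user_id yet
conn.execute("CREATE INDEX idx_access_logs_user_id ON access_logs (user_id)")
plan_after = plan(conn, QUERY)                   # planner can now use the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan differs between SQLite versions, but the first plan reports a scan of the whole table and the second names the index. Checking plans like this for the queries a framework generates is a cheap way to catch the problems on this slide before the data grows.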

  22. Working toward a solution

  23. The fight against complexity

  24. Classifying applications
    • Write-heavy applications
    • Read-heavy applications
      Every application falls into one of the two

  25. "But ours does both...?"

  26. The unit of an application
    • Cases where many features are packed into a single application
      Break it down and consider each feature on its own

  27. Write-heavy applications
    • Move to a database that handles writes well and scales easily
      Cassandra, DynamoDB, MongoDB
    • Think feature by feature
      For example, use an RDBMS alongside it for payment features

  28. Read-heavy applications
    • An RDBMS alone is often sufficient
    • Move LIKE searches and similar queries to Elasticsearch or Solr
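
To illustrate why LIKE-style searches are usually handed to a search engine: a `LIKE '%term%'` predicate has to inspect every row, whereas engines such as Elasticsearch and Solr answer term queries from an inverted index. The sketch below is a toy model of that difference in plain Python. The documents and function names are hypothetical, and real engines add analysis, scoring, and distribution on top of this idea.

```python
from collections import defaultdict

DOCS = {
    1: "Apache Kafka message broker",
    2: "Apache Spark batch and stream processing",
    3: "Cassandra write heavy storage",
}


def like_scan(docs, term):
    """Rough equivalent of WHERE body LIKE '%term%': every document is inspected."""
    term = term.lower()
    return sorted(doc_id for doc_id, body in docs.items() if term in body.lower())


def build_inverted_index(docs):
    """Map each whitespace-separated token to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, body in docs.items():
        for token in body.lower().split():
            index[token].add(doc_id)
    return index


def index_lookup(index, term):
    """Answer a term query with one dictionary lookup, independent of corpus size."""
    return sorted(index.get(term.lower(), set()))


index = build_inverted_index(DOCS)
print(like_scan(DOCS, "apache"), index_lookup(index, "apache"))
```

The two approaches are not interchangeable: the index matches whole tokens, so a partial term such as "kaf" is found by the scan but not by this lookup. Search engines close that gap with tokenizer choices such as n-grams, which is part of what is being delegated to them.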

  29. Combining writes and reads
    • One option is not to insist on doing everything with a single store
    • Distributed processing with a message broker alongside
      Apache Kafka, RabbitMQ
      Amazon SQS, Amazon Kinesis (Firehose)

  30. CQRS
    See "A few myths about CQRS", Ouarzy's Blog.
    http://www.ouarzy.com/2016/10/02/a-few-myths-about-cqrs/
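
As a rough picture of CQRS combined with a message broker, the sketch below keeps a write side that validates and appends events, an in-memory queue standing in for the broker, and a read side that maintains a query-shaped view. Every class name and the review/rating domain are hypothetical and chosen only for illustration; none of it is taken from the talk or the linked article.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewPosted:
    product_id: str
    rating: int


class Broker:
    """In-memory stand-in for a message broker such as Kafka or RabbitMQ."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        self._queue.append(event)

    def drain(self):
        while self._queue:
            yield self._queue.popleft()


class WriteSide:
    """Command model: validates input, appends to its own store, publishes an event."""

    def __init__(self, broker):
        self.broker = broker
        self.log = []                      # append-only, write-optimized store

    def post_review(self, product_id, rating):
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        event = ReviewPosted(product_id, rating)
        self.log.append(event)
        self.broker.publish(event)


class ReadSide:
    """Query model: a denormalized view updated only from events."""

    def __init__(self):
        self._stats = {}                   # product_id -> (count, total)

    def apply(self, event):
        count, total = self._stats.get(event.product_id, (0, 0))
        self._stats[event.product_id] = (count + 1, total + event.rating)

    def average_rating(self, product_id):
        count, total = self._stats.get(product_id, (0, 0))
        return total / count if count else None


broker = Broker()
writes = WriteSide(broker)
reads = ReadSide()

writes.post_review("p1", 4)
writes.post_review("p1", 5)
writes.post_review("p2", 3)
stale = reads.average_rating("p1")         # the view has not seen the events yet

for event in broker.drain():               # the consumer catches up
    reads.apply(event)
```

The `stale` value is `None` because the read side only learns about writes after the consumer runs. That delay is the trade-off this style accepts in exchange for letting each side use the storage that suits it.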

  31. A real-world example

  32. Approaching large volumes of data

  33. Approaching large volumes of data
    Time-series user behavior logs
    are sent in via php-rdkafka

  34. Approaching large volumes of data
    Apache Kafka
    Apache ZooKeeper
    partition
    Currently about … hundred million
    No application failures and no dropped messages
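
Partitions are what let a Kafka topic absorb this kind of volume while keeping each user's events in order: records with the same key go to the same partition, and ordering is guaranteed within a partition. The sketch below shows the idea with a stand-in hash (`zlib.crc32`) and a hypothetical partition count. It is not the partitioner that php-rdkafka or any other client actually uses, and the event data is invented.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical; the deck's actual partition count is not recoverable


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a record key to a partition with a stable hash (stand-in for a real partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events, num_partitions=NUM_PARTITIONS):
    """Append each (user_id, action) event to the partition chosen by its key."""
    partitions = defaultdict(list)
    for user_id, action in events:
        partitions[partition_for(user_id, num_partitions)].append((user_id, action))
    return partitions


def events_of(partitions, user_id):
    """Read one user's actions back from the single partition that holds them."""
    return [a for u, a in partitions[partition_for(user_id)] if u == user_id]


events = [
    ("u1", "view"), ("u2", "view"), ("u1", "cart"),
    ("u3", "view"), ("u1", "buy"), ("u2", "cart"),
]
partitions = produce(events)
print(events_of(partitions, "u1"))
```

Because all of a user's events land in one partition, a consumer sees them in the order they were produced, while different users can be processed in parallel by different consumers.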

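  35. Approaching large volumes of data
    Join with database data to absorb the business logic
    Send events such as push-notification instructions
    (because other services use RabbitMQ)

  36. Approaching large volumes of data
    Transfer via Kafka Connect is used alongside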

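  37. Approaching large volumes of data
    Cassandra Cluster
    This one also holds a little over … hundred million
    No failures, and no trouble thanks to
    compaction running on a regular schedule

  38. The fight against aggregation data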

  39. Making use of data such as access logs
    • Ways to handle access logs and similar data
    • Features the application provides
      Recommendations that use logs
      Analytics features

  40. Approaching log data
    • Mostly aggregation of past data,
      which is almost immutable
    • After aggregation,
      combine the results with other data, and so on

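  41. Approaching log data
    • Aggregating in an RDBMS
      Keep the aggregation database separate from plain replicas
      Do not force it once the data exceeds several billion rows
    • Do not mix aggregation tables with the tables
      the application uses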

  42. Approaching log data
    • Aggregating on HDFS
      Transfer from the RDBMS with Apache Sqoop,
      Apache Spark, or similar
    • Run the aggregation with Apache Spark or similar,
      join with other databases, and store the result

  43. A real-world example

  44. Approaching log data

  45. Approaching log data
    Transfer the databases and tables
    to be aggregated

  46. Approaching log data
    Store the RDBMS data
    on HDFS

  47. Approaching log data
    Join the data on HDFS with
    data held in other RDBMSs

  48. Approaching log data
    After aggregation, store the results
    back on HDFS or elsewhere, etc.
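
The four steps on the preceding slides (transfer, store, join, aggregate and store again) can be sketched in plain Python. This is only a shape-of-the-pipeline illustration: a dictionary stands in for HDFS, lists of dictionaries stand in for RDBMS tables, and all table names, columns, and rows are hypothetical. In the deck the same steps are carried out with Apache Sqoop and Apache Spark.

```python
from collections import Counter

# Hypothetical source tables, standing in for RDBMS tables.
ACCESS_LOGS = [
    {"user_id": 1, "path": "/items"},
    {"user_id": 2, "path": "/items"},
    {"user_id": 1, "path": "/cart"},
    {"user_id": 3, "path": "/items"},
    {"user_id": 9, "path": "/items"},      # no matching user row
]
USERS = [
    {"id": 1, "segment": "member"},
    {"id": 2, "segment": "guest"},
    {"id": 3, "segment": "member"},
]


def transfer(rows, storage, name):
    """Steps 1 and 2: copy a table out of the source database into bulk storage."""
    storage[name] = [dict(row) for row in rows]


def join_logs_with_users(logs, users):
    """Step 3: inner-join stored logs with user rows read from another database."""
    users_by_id = {user["id"]: user for user in users}
    for log in logs:
        user = users_by_id.get(log["user_id"])
        if user is not None:
            yield {"path": log["path"], "segment": user["segment"]}


def aggregate(joined_rows):
    """Step 4a: count page views per (segment, path)."""
    return Counter((row["segment"], row["path"]) for row in joined_rows)


storage = {}                                           # HDFS stand-in
transfer(ACCESS_LOGS, storage, "access_logs")
counts = aggregate(join_logs_with_users(storage["access_logs"], USERS))
storage["segment_path_counts"] = dict(counts)          # step 4b: store the result back
print(storage["segment_path_counts"])
```

Because the heavy work runs against the copy in bulk storage, the application's own database is touched only during the transfer step.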

  49. Approaching aggregated data and real-time data
    • Combine a data store that holds aggregation results
      with stream processing
    • Do not run aggregation and the like
      in the web application layer

  50. Kappa architecture

  51. Kappa architecture
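
The core idea of the Kappa architecture is that an append-only log is the source of truth and every derived view is produced by a stream job, so recomputing a view means replaying the log through new logic rather than maintaining a separate batch path. The sketch below models that with an in-memory log. The class, the two versions of the counting logic, and the sample records are all hypothetical.

```python
class EventLog:
    """Append-only log; a stand-in for a retained Kafka topic."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def read_from(self, offset):
        return self._records[offset:]


def run_job(log, fold, state=None, offset=0):
    """Fold records from offset onward into state; return (state, next_offset)."""
    state = {} if state is None else state
    for record in log.read_from(offset):
        fold(state, record)
        offset += 1
    return state, offset


def count_all(state, record):
    """Version 1 of the processing logic: count every hit per path."""
    state[record["path"]] = state.get(record["path"], 0) + 1


def count_humans(state, record):
    """Version 2: same log, but bot traffic is excluded."""
    if not record["bot"]:
        count_all(state, record)


log = EventLog()
for path, bot in [("/a", False), ("/a", True), ("/b", False)]:
    log.append({"path": path, "bot": bot})

# The live job processes what is there, then continues from its saved offset.
live_state, live_offset = run_job(log, count_all)
log.append({"path": "/a", "bot": False})
live_state, live_offset = run_job(log, count_all, live_state, live_offset)

# Changing the logic means replaying the whole log into a fresh view.
replayed_state, _ = run_job(log, count_humans)
print(live_state, replayed_state)
```

Once the replayed view has caught up, readers can be switched over to it and the old view dropped. That switch is an operational step outside the scope of this sketch.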

  52. A real-world example

  53. Approaching log data, part 2

  54. Approaching log data, part 2
    Data is sent from
    a variety of applications

  55. Approaching log data, part 2
    Apache Kafka receives
    all of the data

  56. Approaching log data, part 2
    Kafka and Spark Streaming
    Data sent from the applications is joined with
    data stored in the RDBMS,
    then aggregated and summarized

  57. Approaching log data, part 2
    The aggregated and summarized data is stored in a form
    that suits the applications that read it
    In some cases input and output are handled by
    Cassandra and Spark Streaming alone

  58. Approaching log data, part 2
    Aggregated and summarized data that is used
    in many places also goes to HDFS,
    for recomputation, recovery after a failure, and so on

  59. Approaching log data, part 2
    The application side queries
    Cassandra only
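
Read end to end, slides 54 to 59 describe a flow in which events arrive in small batches, are enriched with reference data, aggregated, written to a store shaped for the reading application, and also archived for recomputation. The sketch below follows that flow in plain Python under loud assumptions: `PRODUCTS` stands in for an RDBMS table, `ServingStore` for a Cassandra table, a list for HDFS, and the per-minute view counts are an invented example.

```python
from collections import defaultdict

PRODUCTS = {"p1": "skincare", "p2": "makeup"}   # RDBMS reference data (stand-in)


class ServingStore:
    """Read-optimized table keyed the way the application queries it (Cassandra stand-in)."""

    def __init__(self):
        self._rows = defaultdict(int)

    def add(self, category, minute, views):
        self._rows[(category, minute)] += views

    def views(self, category, minute):
        return self._rows.get((category, minute), 0)


def process_micro_batch(events, products, store, archive):
    """Join events with reference data, count views per category and minute, then store."""
    counts = defaultdict(int)
    for event in events:
        category = products.get(event["product_id"])
        if category is None:                    # unknown product: skip the event
            continue
        counts[(category, event["ts"] // 60)] += 1
    for (category, minute), views in counts.items():
        store.add(category, minute, views)              # what the application reads
        archive.append((category, minute, views))       # kept for recomputation and recovery


def rebuild(archive):
    """Recreate the serving store from the archive, e.g. after a failure."""
    store = ServingStore()
    for category, minute, views in archive:
        store.add(category, minute, views)
    return store


store, archive = ServingStore(), []
process_micro_batch(
    [{"product_id": "p1", "ts": 5}, {"product_id": "p2", "ts": 30},
     {"product_id": "p1", "ts": 59}],
    PRODUCTS, store, archive,
)
process_micro_batch(
    [{"product_id": "p1", "ts": 61}, {"product_id": "zz", "ts": 62}],
    PRODUCTS, store, archive,
)
print(store.views("skincare", 0), store.views("skincare", 1))
```

The web application only ever calls `views` on the serving store, so no aggregation runs in the request path. The archive makes it possible to rebuild that store without going back to the original events.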

  60. Summary

  61. Summary
    • Data design that matches your scale
      Regular database refactoring
    • The skill to judge the right tool for the right job
    • The will to take on the application