The Log Analysis Platform Behind SmartNews's Global Expansion #jawsdays #tech

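Ahead of launching its US edition on October 1 last year, SmartNews overhauled its log analysis platform. Adding the US and other countries to Japan does more than increase the number of users: it multiplies the aggregation axes (OS x country x time zone x a wide variety of metrics), so log preprocessing, aggregation, and visualization all call for some ingenuity. This session looks back at how our log aggregation platform changed as the company grew, and covers what we considered for the global expansion, how we rebuilt the platform, and the tools that support it, including Amazon EMR, Hive, Presto, Azkaban, Shib, and Chartio.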

Takumi Sakamoto

March 22, 2015
Transcript

  1. The log analysis platform behind SmartNews's global expansion
    Takumi Sakamoto
    SmartNews, Inc.
    SmartNews official mascot: Chikyu-kun

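  2. About Me
    • Takumi Sakamoto
    • github/twitter: @takus
    • SmartNews, Inc.
    • Looks after the underpinnings of the service, mainly infrastructure
    • Joined the company [year and month not preserved in the transcript]
    • Started working on log analysis at that point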

  3. (no text on this slide)

  4. ML-based News App

  5. Number of Downloads
    US edition released
    Advertising business launched in Japan
    International edition released

  6. MAU in US
    https://www.techinasia.com/smartnews-us-1m-monthly-active-users/

  7. http://aws.amazon.com/jp/solutions/case-studies/smartnews/

  8. Behind every line of log, there is a user.
    http://ihara2525.tumblr.com/post/17029509298

  9. In SmartNews's case

  10. KPI dashboard
    For example, checking A/B test results to improve the service

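  11. Log Festival
    A development camp with everyone from the CEO to app engineers to server engineers
    A hackathon where everyone discusses problems with the logs and works on fixing them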

  12. About the log analysis platform

  13. Data Pipelines
    Collect: the app captures logs; the API server receives them
    Store: logs are persisted to storage
    Process: preprocessing (ETL) and aggregation
    Visualize: BI tools and similar

  14. Logs sent from the app
    • User actions
      • e.g. read a news article
    • Properties of the action
      • Article URL, time spent on the article
    • Other information that only the device knows
      • Device information, OS version, app version

  15. Example JSON log
    {
      "event" : "viewArticle",
      "timestamp" : "2014-11-28 12:45:14",
      "properties" : {
        "userId" : 1224434,
        "os" : "android",
        "country" : "Japan",
        "url" : "http://www.example.com/",
        "duration" : 224.8
      }
    }

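  16. Sending from the app to the server
    • Buffer events and send them in bulk
      • Along the lines of Cookpad's Puree (※)
      • Events captured offline are sent once the device is back online
    • The server adds information where needed
      • e.g. A/B test participation
    ※ http://techlife.cookpad.com/entry/2014/11/25/132008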
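
A minimal sketch of the buffering and bulk-send behavior described on this slide. It is not SmartNews's client code or Puree itself: the class name, the `send` callback, and the batch size are all hypothetical, chosen only to show events being held while offline and flushed in one batch once sending succeeds.

```python
import json
from typing import Callable, Dict, List


class BufferedLogSender:
    """Buffers log events and sends them in bulk (hypothetical sketch)."""

    def __init__(self, send: Callable[[str], bool], max_batch: int = 3) -> None:
        self._send = send            # assumed to return True when the upload succeeded
        self._max_batch = max_batch  # try to flush once this many events are buffered
        self._buffer: List[Dict] = []

    def log(self, event: Dict) -> None:
        """Record one event and flush if the batch threshold is reached."""
        self._buffer.append(event)
        if len(self._buffer) >= self._max_batch:
            self.flush()

    def flush(self) -> bool:
        """Send everything buffered; keep the events if sending fails."""
        if not self._buffer:
            return True
        payload = json.dumps(self._buffer)
        if self._send(payload):
            self._buffer = []
            return True
        return False  # e.g. offline: the events stay buffered for a later flush

    def pending(self) -> int:
        """Number of events still waiting to be sent."""
        return len(self._buffer)
```

Keeping the buffer intact on a failed send is what lets offline events go out later; a real client would also persist the buffer so it survives an app restart.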

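  17. Data Pipelines
    Collect: the app captures logs; the API server receives them
    Store: logs are persisted to storage
    Process: preprocessing (ETL) and aggregation
    Visualize: BI tools and similar
    The pipeline changes with the company's situation and with technology trends.
    Let's look back at it chronologically.
    ※ Anything from before I joined the company is secondhand.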

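  18. Period: [dates not preserved in the transcript]
    Size: [headcount not preserved]
    Situation
    iOS version released [date not preserved]
    A big response, in more ways than one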

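  19. Challenges at the time and how we solved them
    • Google Analytics alone could not show the KPIs we wanted
      • We wanted a KPI tool with cohort views, like the one at a certain game company
    • Built our own KPI tool
      • The mood of the era: Fluentd + MongoDB
      • The front end was a Rails app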

  20. Store: the most recent few days of logs kept in MongoDB
    Process: preprocessing with MongoDB MapReduce; aggregation in Ruby, results saved to MongoDB
    Visualize: a visualization tool built with Rails

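  21. Period: [dates not preserved in the transcript]
    Size: [headcount not preserved]
    Situation
    Installs growing in Japan
    Demand for ad hoc analysis starts to rise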

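  22. Challenges at the time and how we solved them
    • Processing time kept growing, and ad hoc analysis was hard
      • MongoDB MapReduce jobs kept taking longer
      • MongoDB held only recent data, which limited ad hoc analysis
    • Introduced EMR and Redshift
      • Persisted all logs to S3 and moved preprocessing to EMR
      • Introduced Redshift as a separate system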

  23. Store: all logs saved to S3
    Process: preprocessing on EMR (mrjob); aggregation in Ruby, results saved to MongoDB
    Visualize: the Rails visualization tool; Redshift visualized with Tableau and similar tools

  24. Period: [dates not preserved in the transcript]
    Size: [headcount not preserved]
    Situation
    US edition released
    Advertising team formed

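  25. Challenges at the time and how we solved them
    • The number of aggregation items grew until the system was close to breaking down
      • Aggregation needed per OS, per country, per language, and per time zone
      • The Ruby part, which was not parallelized, became a bottleneck too
      • Schemas in MongoDB and Redshift drifted apart
    • Evaluated and introduced Hive and Presto
      • A phase for solving the immediate problems while building up know-how
      • It turned out to be a good step toward the next phase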

  26. Store: all logs saved to S3
    Process: preprocessing on EMR (mrjob, Hive); aggregation with Presto, results written to MongoDB
    Visualize: our own Rails visualization tool

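  27. Period: [dates not preserved in the transcript]
    Size: [headcount not preserved]
    Situation
    Hiring in the US starts to pick up
    International edition released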

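  28. Challenges at the time and how we solved them
    • Log analysis tasks were concentrated on a few individuals
      • Tough on non-engineers: SSH into a server? MongoDB?
      • Few people could maintain the Rails visualization tool
    • We could not keep up with requests for new views
      • Ad hoc feature additions drove the maintenance cost up further
    • We wanted everyone to be able to access the data freely
      • Introduced tools such as Chartio and Shib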

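  29. Challenges at the time and how we solved them
    • ETL processing grew complex
      • Running jobs serially increased the processing time
      • Running them in parallel complicated the flow, and re-running became a craft of its own
    • Introduced a job flow manager
      • It takes care of dependency definitions, retries, SLAs, and so on
      • We tried several and settled on Azkaban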

  30. Store: all logs saved to S3
    Process: preprocessing with Hive on EMR; batch jobs managed by Azkaban; aggregation with Presto
    Visualize: each person visualizes logs in Chartio; query results are shared through Shib

  31. System Architecture

  32. Data Schema
    • Star schema
      • The simplest schema used in data warehouses
    • Facts
      • User action logs
    • Dimensions
      • User attributes (OS, country, A/B test participation, and so on)

  33. Data Schema
    Actions
    user_id  action       url
    1        readArticle  http://example.com/1
    2        readArticle  http://example.com/2
    3        readArticle  http://example.com/3

    User OS
    user_id  os
    1        ios
    2        android
    3        ios

    User Location
    user_id  country
    1        US
    2        JP
    3        GB

    User A/B Test
    user_id  identifier   behavior
    1        tutorial_04  A
    2        tutorial_04  B
    3        tutorial_04  A

    User App Version
    user_id  version
    1        2.2.1
    2        1.9.8
    3        2.2.0

  34. Computing KPIs per A/B test variant
    SELECT
      date,
      behavior,
      count(distinct facts.user_id) as DAU,
      count_if(action='readArticle') as DPV
    FROM actions facts
    LEFT JOIN abtest_users dimensions ON facts.user_id = dimensions.user_id
    WHERE
      definition = 41 AND
      date BETWEEN '2015-01-01' AND '2015-01-31'
    GROUP BY date, behavior
    ORDER BY date, behavior
    ;
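
The fact/dimension join above can be tried locally with a small sketch. This one uses SQLite through Python's standard library rather than Presto, so two things differ from the slide and are assumptions of the sketch: `count_if` is replaced by `SUM(CASE ...)`, and the test is selected by the `identifier` column from the schema slide. The table contents and the `openApp` action are made-up sample data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory star schema with a few sample rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE actions (date TEXT, user_id INTEGER, action TEXT, url TEXT);
        CREATE TABLE abtest_users (user_id INTEGER, identifier TEXT, behavior TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO actions VALUES (?, ?, ?, ?)",
        [
            ("2015-01-01", 1, "readArticle", "http://example.com/1"),
            ("2015-01-01", 1, "readArticle", "http://example.com/2"),
            ("2015-01-01", 2, "readArticle", "http://example.com/2"),
            ("2015-01-01", 3, "openApp", None),
        ],
    )
    conn.executemany(
        "INSERT INTO abtest_users VALUES (?, ?, ?)",
        [(1, "tutorial_04", "A"), (2, "tutorial_04", "B"), (3, "tutorial_04", "A")],
    )
    return conn


def kpi_by_behavior(conn: sqlite3.Connection, identifier: str):
    """Return (date, behavior, DAU, DPV) rows for one A/B test."""
    sql = """
        SELECT f.date, d.behavior,
               COUNT(DISTINCT f.user_id) AS dau,
               SUM(CASE WHEN f.action = 'readArticle' THEN 1 ELSE 0 END) AS dpv
        FROM actions f
        JOIN abtest_users d ON f.user_id = d.user_id
        WHERE d.identifier = ?
        GROUP BY f.date, d.behavior
        ORDER BY f.date, d.behavior
    """
    return conn.execute(sql, (identifier,)).fetchall()
```

Because the dimension table carries one row per user and test, adding another breakdown (OS, country, app version) is one more join against another small dimension table rather than a change to the fact table.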

  35. Data Flow
    Diagram components: API Server, Archive Bucket, ETL Cluster, Analysis Bucket, Analysis Cluster, BI Tool, Web UI, User

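  36. Design philosophy
    • For developers, non-engineers, and executives
      • Anyone can easily access large datasets
      • SQL (HQL) is the common language
      • People who want to see data can visualize it in the form they want
      • Drilling down is easy when needed
    • For operators
      • Nobody works on logs full time, so operating cost is kept to a minimum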

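  37. Design philosophy
    • Keep the storage layer and the applications separate
      • Inspired by the Aurora and Netflix sessions at re:Invent
      • It struck me as a sound design
    • Cloud-style benefits of many kinds
      • Capacity can be added flexibly as needed
      • Clusters can die at any time, so spot instances can be used
      • Verifying middleware version upgrades becomes simple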

  38. (SDD415) NEW LAUNCH: Amazon Aurora: Amazon’s New Relational Database Engine | AWS re:Invent 2014

  39. (BDT403) Netflix's Next Generation Big Data Platform | AWS re:Invent 2014

  40. Storing logs
    Diagram components: API Server, Archive Bucket, ETL Cluster, Analysis Bucket, Analysis Cluster, BI Tool, Web UI, User

  41. Storing logs
    • Each server uploads directly to S3 with Fluentd
      • Splitting the prefix by date and hour is recommended
    Diagram: API Server → Archive Bucket
    Example destination:
    s3://smartnews/log/production/raw_actions/date=2015-03-22/hour=00/
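
As a small illustration of the date/hour prefix layout above, the sketch below builds the destination prefix for a given timestamp. The function name is hypothetical, and which time zone the timestamp is expressed in is left to the caller; the slide does not say which one is used.

```python
from datetime import datetime


def raw_log_prefix(base: str, ts: datetime) -> str:
    """Build a date=/hour= partitioned prefix under `base` for timestamp `ts`."""
    return f"{base.rstrip('/')}/date={ts:%Y-%m-%d}/hour={ts:%H}/"
```

The `key=value` form of each path segment matches the partition columns used by the Hive external table later in the deck, so each uploaded hour maps onto exactly one partition.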

  42. Preprocessing the data
    Diagram components: API Server, Archive Bucket, ETL Cluster, Analysis Bucket, Analysis Cluster, BI Tool, Web UI, User

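  43. Hive
    • Launch a Hive cluster on EMR
    • Keep adding partitions to external tables
      • Facts
        • Just point at the path of the raw logs stored in S3
      • Dimensions
        • Put a table dump from RDS on S3 and point at that path
    • Convert to columnar storage and save to S3
      • Formats suited to data analysis, such as ORC and Parquet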

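  44. Launching the Hive cluster
    • Launch through the AWS CLI
      • Instance types, counts, and so on are defined in JSON
    • Set up middleware with bootstrap actions
      • e.g. Hive Metastore configuration
    • The talk on poor man's log aggregation with Python and AWS EMR (※) is a useful reference
    ※ http://www.slideshare.net/akirachiku/python-hive-on-emr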

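  45. Launching the Hive cluster
    Diagram components: User A, User B, Old Hive Cluster, New Hive Cluster, MetaStore, Archive Bucket, Analysis Bucket (arrows labeled launch, read, write)
    The Hive Metastore is shared through RDS.
    Launching a new cluster and pointing its S3 writes at a different destination makes verification easy.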

  46. ETL with Hive
    Define a table for reading the Fluentd logs, then add partitions.
    -- Define the Fluentd log format as an external table
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_actions (
      timestamp STRING,
      tag STRING,
      data STRING
    )
    PARTITIONED BY ( date STRING, hour STRING )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION 's3://smartnews/log/production/raw_actions/';
    -- Add a partition
    ALTER TABLE raw_actions ADD IF NOT EXISTS PARTITION (`date`='${DATE}', `hour`='${HOUR}')
    LOCATION 's3://smartnews/log/production/raw_actions/date=${DATE}/hour=${HOUR}';
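
Adding one partition per hour is easy to script. The sketch below generates the `ALTER TABLE` statements for a whole day in the same shape as the slide's statement; the helper name is hypothetical, and how the statements are then submitted to Hive (for example through a wrapper script) is outside the sketch.

```python
from datetime import date
from typing import List


def add_partition_statements(table: str, base: str, day: date) -> List[str]:
    """Return one ADD PARTITION statement per hour of `day`."""
    statements = []
    d = day.isoformat()
    for hour in range(24):
        h = f"{hour:02d}"
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION (`date`='{d}', `hour`='{h}') "
            f"LOCATION '{base.rstrip('/')}/date={d}/hour={h}';"
        )
    return statements
```

Because each statement uses `IF NOT EXISTS`, re-running the generated batch after a partial failure is harmless.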

  47. ETL with Hive
    Define a table in columnar storage.
    -- Define an external table backed by columnar storage
    CREATE EXTERNAL TABLE IF NOT EXISTS orc_actions (
      timestamp INT,
      user_id INT,
      os STRING,
      country STRING,
      action STRING,
      data STRING
    )
    PARTITIONED BY ( date STRING, hour STRING )
    STORED AS ORC
    LOCATION 's3://smartnews/log/production/raw_actions/'
    TBLPROPERTIES ("orc.compress"="SNAPPY");

  48. ETL with Hive
    Expand the JSON with functions such as json_tuple and explode.
    A custom UDF removes unneeded fields from the JSON held in a specific field.
    -- Import into the columnar external table
    INSERT OVERWRITE TABLE orc_actions
    PARTITION (`date` = '${DATE}', `hour` = '${HOUR}')
    SELECT
      -- column order follows the orc_actions definition
      a.timestamp, a.user_id, COALESCE(a.os, "undefined"), COALESCE(a.country, "undefined"), a.action,
      modify_json(a.data, 'unnecessary1', 'unnecessary2')
    FROM raw_actions
    LATERAL VIEW json_tuple(
      raw_actions.data, 'userId', 'timestamp', 'platform', 'country', 'action', 'data'
    ) a as user_id, timestamp, os, country, action, data
    WHERE date = '${DATE}' and hour = '${HOUR}'
    ORDER BY os, country, action, user_id;
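
To make the two ETL ideas concrete outside Hive, here is a rough Python equivalent: flattening selected top-level JSON fields with defaults (the role `json_tuple` and `COALESCE` play above) and dropping unneeded keys from a nested JSON field (the role of the custom UDF). The function names are hypothetical, and the real `modify_json` UDF's behavior is not shown in the deck, so this only mirrors what the slide describes.

```python
import json
from typing import Any, Dict


def strip_fields(data_json: str, *unwanted: str) -> str:
    """Return the JSON object in `data_json` without the `unwanted` keys."""
    obj = json.loads(data_json)
    for key in unwanted:
        obj.pop(key, None)
    return json.dumps(obj)


def coalesce(value: Any, default: Any) -> Any:
    """Like SQL COALESCE for a single value: replace only None."""
    return default if value is None else value


def flatten_event(raw_json: str) -> Dict[str, Any]:
    """Pull the columns of interest out of one raw JSON log line."""
    rec = json.loads(raw_json)
    nested = json.dumps(coalesce(rec.get("data"), {}))
    return {
        "timestamp": rec.get("timestamp"),
        "user_id": rec.get("userId"),
        "os": coalesce(rec.get("platform"), "undefined"),
        "country": coalesce(rec.get("country"), "undefined"),
        "action": rec.get("action"),
        "data": strip_fields(nested, "unnecessary1", "unnecessary2"),
    }
```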

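  49. Azkaban
    • A job flow management tool
      • Released as OSS by LinkedIn
    • Main features
      • Dependency definitions and dependency visualization
      • Scheduled execution, ad hoc execution, and retries
      • Email notification on success or failure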

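  50. What is good about Azkaban
    • Jobs are easy to manage
      • Visualizes job dependencies and how far the flow has completed
      • Jobs are managed as code
      • Timeout (SLA) settings, e.g. notify if a job has not finished within a set number of hours

    Example job definition file:
    type=command
    command=hive-wrapper -d 2015-02-02 -q query_c
    dependencies=query_a, query_b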

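  51. Example of job dependencies
    Diagram labels: User Features, Raw table, Intermediate Table
    Green jobs have succeeded; red jobs have failed and need to be re-run.
    You can enable only the jobs that need re-running and execute just those.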
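
The re-run idea can be sketched as a small graph computation: given each job's dependencies and the set of failed jobs, find the failed jobs plus everything downstream of them, in an order that respects the dependencies. This is only an illustration of the concept, not how Azkaban implements it; the function name and the `report` job are hypothetical, while `query_a`, `query_b`, and `query_c` follow the job file example in this deck.

```python
from typing import Dict, List, Set


def jobs_to_rerun(dependencies: Dict[str, List[str]], failed: Set[str]) -> List[str]:
    """Return failed jobs and their downstream jobs in a runnable order."""
    # Invert the graph: for each job, which jobs depend on it?
    dependents: Dict[str, List[str]] = {}
    for job, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(job)

    # Everything reachable downstream of a failed job must run again.
    affected: Set[str] = set()
    stack = list(failed)
    while stack:
        job = stack.pop()
        if job in affected:
            continue
        affected.add(job)
        stack.extend(dependents.get(job, []))

    # Depth-first ordering restricted to affected jobs: dependencies come first.
    ordered: List[str] = []
    visited: Set[str] = set()

    def visit(job: str) -> None:
        if job in visited:
            return
        visited.add(job)
        for dep in sorted(dependencies.get(job, [])):
            if dep in affected:
                visit(dep)
        ordered.append(job)

    for job in sorted(affected):
        visit(job)
    return ordered
```

Jobs that already succeeded and sit upstream of the failure (here `query_a`) are left out, which is the point of enabling only the jobs that need re-running.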

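  52. Tricks for using Azkaban
    • Amazon EMR and Azkaban each have constraints
      • EMR steps always run serially
      • Azkaban can have only one Executor
    • Run jobs from Beeline over Thrift
      • Place JARs with their dependencies resolved on the Executor
      • Resolve the hostname from EMR tags to reach the cluster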

  53. Tricks for using Azkaban
    Diagram components: User A, User B, azkaban, beeline, Production Hive Cluster, Development Hive Cluster, Temp Hive Cluster, Archive Bucket, Analysis Bucket (arrows labeled register job, submit job, read, write)
    The hostname passed to beeline is resolved from EMR tags before the job runs.

  54. The log analysis cluster
    Diagram components: API Server, Archive Bucket, ETL Cluster, Analysis Bucket, Analysis Cluster, BI Tool, Web UI, User

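  55. Presto
    • A distributed SQL query engine
      • Released as OSS by Facebook
      • ANSI SQL compliant, with support for the common aggregate functions
    • Presto has no storage of its own
      • It accesses a variety of data sources (Hive, MySQL)
    • For details, see the slides from Treasure Data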

  56. Presto: Interactive SQL Query Engine for Big Data | Hadoop Conference in Japan 2014

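  57. What is good about Presto
    • Queries run fast
      • Scanning billions of rows takes seconds to tens of seconds
      • Even a JOIN of billions x about ten million rows takes tens of seconds to a few minutes
      • Tuning or adding nodes would probably make it faster still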

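  58. What is good about Presto
    • It can JOIN across multiple data sources
      • e.g. logs in Hive with master data in MySQL
        • Note that this runs a full scan on MySQL
      • e.g. the app's Hive logs with the ad system's Hive logs
        • Logs from loosely coupled systems can be tied together
      • It should be able to JOIN with Amazon Aurora as well
        • We have not verified this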

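  59. What is good about Presto
    • It is easy to operate
      • AWS provides an EMR bootstrap action
        • https://github.com/awslabs/emr-bootstrap-actions
        • We extend part of it for internal use
      • Version upgrades by blue-green deployment
        • When a new version comes out, launch a new cluster
        • Verify it, and if there are no problems, switch the DNS / tag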

  60. Visualization
    Diagram components: API Server, Archive Bucket, ETL Cluster, Analysis Bucket, Analysis Cluster, BI Tool, Web UI, User

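  61. Chartio
    • Builds dashboards by combining a variety of data sources
      • Think of it as Kibana with a switchable back end
      • MySQL, Redshift, Google Analytics, BigQuery, etc.
      • Presto connects through Prestogres or an ssh tunnel
    • It takes the pain out of visualization work
      • Create charts by drag and drop or with SQL
      • The UI is attractive enough that you want to look at it every day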

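  62. How to build a dashboard
    Build a query: drag and drop, or enter SQL directly
    Shape the data while previewing it: filter, sort, add columns
    Visualize in a variety of formats: tables, pie charts, bar charts, and so on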

  63. Example dashboard

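  64. What is good about Chartio
    • No more detailed front-end work
      • Once the back end is built, the rest is left to whoever wants the visualization
      • Quite a lot is possible within what SQL can express
    • Categorical Dropdown narrows results down by attribute
      • Overall aggregates
      • A specific OS x a specific country x a specific channel
      • Makes it easy to look at the data from many angles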

  65. Example of filtering by various attributes
    "What does the ranking of bbc.co.uk articles look like on iOS in the UK?"
    Example: a request from media relations

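  66. Example of filtering by various attributes
    "How is that A/B test we are running in California doing?"
    "Things seem to be going badly on iPhone 6. Can you check?"
    Example: a request from an app developer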

  67. Networking at re:Invent

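  68. What improving the log analysis platform has achieved
    • Metrics visualized after filtering along various axes
      • By country, OS, channel, and device
    • Checking A/B test results
      • Article selection algorithms, tutorial improvements
      • Metrics we could not see before are now visible
    • Use of machine learning libraries built for large datasets
      • Hivemall, Spark MLlib, and so on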

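  69. About Google BigQuery
    • A fairly disruptive technology
      • Full scans over logs using containers on the scale of thousands
    • We are sending the logs held in Hive to BQ as well, for evaluation
      • Loaded via S3 → GCS → BQ (gsutil sync)
      • Cheap in terms of storage and query charges
      • My impression: for KPI aggregation alone, this might be all you need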

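  70. Should we leave everything to BQ?
    • Be conscious of what to do ourselves and what to hand off to others
      • Doing it ourselves builds up experience and knowledge
      • Handing it off teaches us less but lowers cost
    • For us, the technology for handling data is something to value
      • We should keep up with it ourselves to some degree
      • As a result we can use things like Hivemall and Spark MLlib
    • Even so, we plan to use BQ actively wherever it fits
      • Rumor has it that UDFs are coming too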

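  71. Future outlook
    • Expanding the use cases further
      • Slice all kinds of logs along all kinds of axes to improve the service
      • e.g. checking response times by app version
    • Anomaly detection for the service
      • e.g. slow responses only on a specific OS x a specific version
    • A fortunate environment, with many machine learning experts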

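  72. Summary
    • The past and present of SmartNews's log analysis platform
      • It has evolved with the company's size and situation
    • Improved with the help of AWS, OSS, and X as a Service
      • EMR / Fluentd / Hive / Azkaban / Presto / Chartio
    • We plan to keep evolving into a smarter news app
      • We are actively looking for people to help!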

  73. Announcement

  74. https://atnd.org/events/64096

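  75. SmartNews TGIF
    • What is SmartNews TGIF?
      • A social gathering at our office with guests invited from outside
      • Good catering and alcohol are served (free!)
    • If you are interested, contact @takus
      • You want to hear more detail about today's talk
      • You want to hear about running A/B tests hard in the app
      • You want to hear about logs and A/B tests on the advertising side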