Overview
[Architecture diagram. Components shown: PlazmaDB, DataTank, DynamoDB and so on…; Lambda; EC2; API Gateway; RDS; RDS for DataMart; S3 for Image/Video; S3 for Data Lake; Treasure Data for DWH; EMR; Internal System for Data Store.]
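Transform (Parse, Filter, Enrich, etc.)
[Diagram: S3 for Data Lake, EMR, Treasure Data for DWH]
ETL processing that takes the many external data structures and lets data users browse them with simple SQL.
In most cases, the Transform step was where the cost went.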
[FYI] Develop Embulk Plugin with Scala (1)
[Diagram: S3 for Data Lake, Treasure Data for DWH]
Plugin: […]_stats, a domain-specific plugin for Twitter Ads stats
Contributor: kimutyam
Use cases:
- Flatten multi-level nested arrays under specific conditions
- Output to multiple tables under specific conditions
Approach:
- Group parameters in […]_stats and generate a JSON-typed column
- Load into Treasure Data through Data Connector's ExplodeJsonColumn
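As a rough illustration of the second use case (sending records to different tables depending on a condition), here is a minimal Python sketch. The `route` function, the `entity` field, and the table names are all hypothetical and chosen only for illustration; the plugin itself is written in Scala against the Embulk plugin API and its code is not shown in these slides.

```python
from collections import defaultdict


def route(records, table_for):
    """Group records by the destination table chosen by table_for(record)."""
    tables = defaultdict(list)
    for record in records:
        tables[table_for(record)].append(record)
    return dict(tables)


# Hypothetical sample records; the field names are illustrative only.
records = [
    {"entity": "campaign", "id": 1, "impressions": 100},
    {"entity": "line_item", "id": 2, "impressions": 40},
    {"entity": "campaign", "id": 3, "impressions": 7},
]

# Hypothetical condition: one destination table per entity type.
tables = route(records, lambda r: f"{r['entity']}_stats")
```

Passing the condition in as a function keeps the routing rule separate from the grouping loop, so a different rule only needs a different `table_for`.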
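Overview
[Diagram: Data Connector, S3 for Data Lake, Treasure Data for DWH]
Benefits:
- Presto is hosted by Treasure Data
- Less load on the Digdag server
- No need to run our own distributed environment
Drawbacks:
- Workflow definitions get longer
- Presto queries can become hard to read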
IN / OUT
In:
    {
      "data": [
        {
          "id": 1,
          "name": "kimutyam",
          "details": [
            { "type": "Company", "value": "Septeni Original,Inc." },
            { "type": "Language", "value": "Scala" }
          ]
        },
        {
          "id": 2,
          "name": "john",
          "details": [
            { "type": "Country", "value": "America" }
          ]
        }
      ]
    }
Out:
    id  name      type      value
    1   kimutyam  Company   Septeni Original,Inc.
    1   kimutyam  Language  Scala
    2   john      Country   America
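ETL with Presto
    -- The ARRAY element types are missing in the source;
    -- MAP<VARCHAR, JSON> is an assumption based on how the elements are used.
    INSERT INTO user_details
    WITH users AS (
      -- Pull the top-level "data" array out of the raw JSON payload.
      SELECT
        time,
        CAST(JSON_EXTRACT(payload, '$.data') AS ARRAY<MAP<VARCHAR, JSON>>) AS data
      FROM raw
    ),
    details AS (
      -- One row per user; "details" is still a nested array here.
      SELECT
        time,
        CAST(d['id'] AS INTEGER) AS id,
        CAST(d['name'] AS VARCHAR) AS name,
        CAST(d['details'] AS ARRAY<MAP<VARCHAR, JSON>>) AS details
      FROM users
      CROSS JOIN UNNEST(data) AS data(d)
    )
    -- One row per (user, detail) pair.
    SELECT
      time,
      id,
      name,
      CAST(detail['type'] AS VARCHAR) AS type,
      CAST(detail['value'] AS VARCHAR) AS value
    FROM details
    CROSS JOIN UNNEST(details) AS details(detail)
Note: partly omitted to fit the slide.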
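The two CROSS JOIN UNNEST steps amount to two nested loops: one over the users in `data`, one over each user's `details`. The following is a minimal Python sketch of the same flattening, using the sample input from the IN / OUT slide. The `flatten` function and the placeholder `time=0` are assumptions for illustration; this is not how Presto runs the query.

```python
import json


def flatten(payload, time):
    """Return one (time, id, name, type, value) row per detail entry."""
    rows = []
    for user in json.loads(payload)["data"]:   # first UNNEST: data -> d
        for detail in user["details"]:         # second UNNEST: details -> detail
            rows.append((
                time,
                int(user["id"]),
                str(user["name"]),
                str(detail["type"]),
                str(detail["value"]),
            ))
    return rows


# Sample input from the IN / OUT slide.
payload = """
{
  "data": [
    {"id": 1, "name": "kimutyam", "details": [
      {"type": "Company", "value": "Septeni Original,Inc."},
      {"type": "Language", "value": "Scala"}
    ]},
    {"id": 2, "name": "john", "details": [
      {"type": "Country", "value": "America"}
    ]}
  ]
}
"""

# time=0 is a placeholder; in the query it comes from the raw table.
rows = flatten(payload, time=0)
```

Each user with N details yields N output rows, which is why `kimutyam` appears twice in the Out table.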