A Microservices-Friendly Log Collection Platform at Merpay / Microservices-friendly Data Pipeline

Ryo Okubo
September 08, 2018

A lightning talk (LT) slide deck for builderscon 2018

Transcript

  1. Microservices-friendly Data Pipeline, merpay DataPlatform, @syucream

  2. Microservices in merpay (diagram labels: ServiceA, ServiceB, ServiceC; App, App, App)

  3. Data sources vs. use cases
     Data sources: A-service, B-service, C-service, D-service
     Use cases: KPI Analytics, Fraud Detection, Credit Scoring, Funnel Analytics, ML system, Customer Support
  4. [Diagram: merpay DataPipeline]
     Sources: microservice-A, microservice-B, microservice-C; Event Log (x3), DB (x3)
     Transfers: batch transfer, stream transfer
     Components: Data Pipeline, BqLoad Tool, Cloud Storage, ToGCS Tool?, Cloud Dataflow, Publish message, Subscribe message
     Destinations: BigQuery; datauser-A, datauser-B
  5. Batch (prototype)
     [Diagram labels]
     Projects: microservice-A, microservice-B, microservice-C, merpay-dataplatform, Data User - A
     Data stores: Cloud SQL, Cloud Spanner, Cloud Datastore
     Pipeline: Data Pipeline, Cloud Storage, data lake, change notification, Cloud Pub/Sub, Pub/Sub trigger (BqLoad path), Cloud Functions, BqLoad, BigQuery, data mart
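Reading the batch diagram labels as a chain (a Cloud Storage change notification published to Cloud Pub/Sub, which triggers a Cloud Function that runs BqLoad), the first step inside such a function might look like the sketch below. It only plans the load and calls no Google Cloud API. The `eventType`, `bucketId`, and `objectId` attributes are those carried by Cloud Storage Pub/Sub notifications; the function names and the `<dataset>/<table>/<file>` object layout are assumptions made for illustration, not something the slides describe.

```python
import base64
import json
from typing import Optional


def plan_bq_load(event: dict) -> Optional[dict]:
    """Turn a Pub/Sub-delivered Cloud Storage notification into a load plan.

    Returns None when the notification should not trigger a load.
    """
    attrs = event.get("attributes", {})
    # Only newly written objects are candidates for loading.
    if attrs.get("eventType") != "OBJECT_FINALIZE":
        return None
    bucket = attrs["bucketId"]
    name = attrs["objectId"]
    # Hypothetical layout: <dataset>/<table>/<file>
    parts = name.split("/")
    if len(parts) < 3:
        return None
    dataset, table = parts[0], parts[1]
    return {
        "source_uri": f"gs://{bucket}/{name}",
        "destination": f"{dataset}.{table}",
    }


def make_event(bucket: str, name: str, event_type: str = "OBJECT_FINALIZE") -> dict:
    """Build a message shaped like a Cloud Storage notification, for local use."""
    payload = json.dumps({"bucket": bucket, "name": name}).encode()
    return {
        # The payload is included for realism; plan_bq_load reads attributes only.
        "data": base64.b64encode(payload).decode(),
        "attributes": {
            "eventType": event_type,
            "bucketId": bucket,
            "objectId": name,
        },
    }


if __name__ == "__main__":
    print(plan_bq_load(make_event("lake", "payments/orders/2018-09-08.avro")))
```

Keeping the planning step free of API calls makes it easy to check locally; the actual load job would be issued by whatever BqLoad does with the returned plan.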
  6. Stream (prototype)
     [Diagram labels]
     Microservice platform team: Kubernetes cluster, A-service, B-service
     merpay-dataplatform: Logging, Cloud Pub/Sub, Cloud Dataflow, BigQuery
     Data User - A: BigQuery DWH
     Flow labels: stdout via logging library, Sink to Pub/Sub, Subscribe, Streaming Insert
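The two ends of the stream path (services writing to stdout via a logging library, and a subscriber preparing rows for a streaming insert) can be sketched as follows. This is a minimal illustration under stated assumptions: the field names `event_type`, `emitted_at`, and `payload`, and both function names, are hypothetical, and the subscriber side assumes the exported log entry carries the parsed JSON line under `jsonPayload`.

```python
import json
import sys
from datetime import datetime, timezone


def emit_event(event_type: str, payload: dict, stream=sys.stdout) -> str:
    """Write one event to the stream as a single JSON line and return that line."""
    line = json.dumps(
        {
            "event_type": event_type,
            "emitted_at": datetime.now(timezone.utc).isoformat(),
            "payload": payload,
        },
        sort_keys=True,
    )
    stream.write(line + "\n")
    stream.flush()
    return line


def to_bq_row(log_entry: dict) -> dict:
    """Flatten an exported log entry into a flat row for a streaming insert."""
    body = log_entry["jsonPayload"]
    return {
        "event_type": body["event_type"],
        "emitted_at": body["emitted_at"],
        # Keep the free-form payload as a JSON string column.
        "payload": json.dumps(body["payload"], sort_keys=True),
    }


if __name__ == "__main__":
    line = emit_event("payment_created", {"amount": 100})
    print(to_bq_row({"jsonPayload": json.loads(line)}))
```

Writing one JSON object per line keeps each event self-delimiting, so a log agent reading stdout does not need to reassemble multi-line records.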
  7. Schema Registry
     • Pre-define log schemas in Protocol Buffers
       ◦ It's popular at mercari/merpay
       ◦ There are some useful protoc plugins
     • Manage .proto files on GitHub
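The idea of checking logs against a pre-defined schema can be illustrated without protoc. In the sketch below, a hypothetical `PaymentEvent` dataclass stands in for a message class that would normally be generated from a .proto file, and `parse_event` is an illustrative helper, not part of any Protocol Buffers API. It is also stricter than proto3, which fills in default values for missing fields rather than rejecting them.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PaymentEvent:
    """Hypothetical log schema; a stand-in for a protoc-generated message."""
    user_id: str
    amount: int
    currency: str


def parse_event(schema, raw: dict):
    """Build a schema instance, rejecting unknown, missing, or mistyped fields."""
    expected = {f.name: f.type for f in fields(schema)}
    unknown = set(raw) - set(expected)
    missing = set(expected) - set(raw)
    if unknown or missing:
        raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
    for name, typ in expected.items():
        if not isinstance(raw[name], typ):
            raise TypeError(f"{name} must be {typ.__name__}")
    return schema(**raw)


if __name__ == "__main__":
    print(parse_event(PaymentEvent, {"user_id": "u1", "amount": 100, "currency": "JPY"}))
```

Rejecting a malformed event at the producer is cheaper than discovering a mismatched column after the data has reached the warehouse, which is the main argument for agreeing on schemas up front.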
  8. Remaining issues
     • Batch
       ◦ Scalability
     • Stream
       ◦ Reliability of the google-fluentd DaemonSet
     • Schema Registry
       ◦ Who defines and reviews the schemas?