A Microservice-Friendly Log Collection Platform at Merpay / Microservices-friendly Data Pipeline

Ryo Okubo
September 08, 2018

A lightning talk (LT) slide deck for builderscon 2018

Transcript

  1. Data sources vs. use cases

    Data sources: A-service, B-service, C-service, D-service
    Use cases: KPI Analytics, Fraud Detection, Credit Scoring, Funnel Analytics, ML system, Customer Support
  2. (Architecture diagram: merpay DataPipeline)

    Producers: microservice-A, microservice-B, microservice-C (Event Log and DB)
    Data Pipeline: batch transfer, stream transfer, BqLoad Tool, Cloud Storage, ToGCS Tool?, Cloud Dataflow, Publish message, Subscribe message
    Consumers: datauser-A, datauser-B (BigQuery)
  3. Batch (prototype)

    Producers: microservice-A, microservice-B, microservice-C
    Source stores: Cloud SQL, Cloud Spanner, Cloud Datastore
    merpay-dataplatform (Data Pipeline): Cloud Pub/Sub, Cloud Functions, BqLoad, Cloud Storage, BigQuery
    Diagram labels: data lake, data mart, change notification, Pub/Sub trigger (BqLoad path)
    Consumer: Data User - A
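The "Pub/Sub trigger (BqLoad path)" label suggests that a message carrying a file path kicks off a load. A minimal sketch of how such a message might be decoded is shown below. The message body format, the `path` field, the `gs://<bucket>/<dataset>/<table>/<file>` layout, and the function name `parse_load_request` are all assumptions made for illustration; the slides do not specify them.

```python
import base64
import json


def parse_load_request(event):
    """Decode a Pub/Sub-style event into a load request (hypothetical format)."""
    # Pub/Sub-triggered functions receive the message body base64-encoded under "data".
    body = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    path = body["path"]  # assumed field name, not taken from the slides
    if not path.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage path: {path}")
    # Assumed layout: gs://<bucket>/<dataset>/<table>/<file>
    _bucket, dataset, table, _filename = path[len("gs://"):].split("/", 3)
    return {"source_uri": path, "dataset": dataset, "table": table}
```

The returned dictionary only describes what would be loaded; calling an actual load tool is left out on purpose, since the slides do not show how BqLoad is invoked.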
  4. Stream (prototype)

    Microservice platform team Kubernetes cluster: A-service, B-service
    merpay-dataplatform: Logging, Cloud Pub/Sub, Cloud Dataflow, BigQuery
    Flow labels: stdout via logging library, Sink to Pub/Sub, Subscribe, Streaming Insert
    Consumer: Data User - A (BigQuery DWH)
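As a rough illustration of the "stdout via logging library" step on the stream path, the sketch below writes one JSON object per line to stdout, which a cluster log agent could then pick up and forward. The function names `format_event` and `emit_event` and the field names (`service`, `event_type`, `timestamp`, `payload`) are hypothetical; the slides do not describe the actual logging library or record layout.

```python
import json
import sys
from datetime import datetime, timezone


def format_event(service, event_type, payload, now=None):
    """Serialize one event as a single-line JSON string (hypothetical layout)."""
    now = now or datetime.now(timezone.utc)
    record = {
        "service": service,
        "event_type": event_type,
        "timestamp": now.isoformat(),
        "payload": payload,
    }
    # sort_keys gives stable output; compact separators keep the line short.
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


def emit_event(service, event_type, payload, stream=sys.stdout):
    """Write the event as exactly one line so it can be collected per record."""
    stream.write(format_event(service, event_type, payload) + "\n")
    stream.flush()
```

A service would call something like `emit_event("A-service", "payment_created", {"amount": 100})` and leave delivery to whatever collects the container's stdout.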
  5. Schema Registry

    • Pre-define log schemas in Protocol Buffers
      ◦ It's popular at mercari/merpay
      ◦ There are some useful protoc plugins
    • Manage .proto files on GitHub
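The slide names Protocol Buffers as the schema language. Because using it requires protoc-generated code, the sketch below swaps in a plain Python mapping of field names to types, purely to illustrate what checking a log record against a pre-defined schema can look like. The schema `PAYMENT_EVENT_SCHEMA`, its fields, and the function `validate_record` are hypothetical and not part of the talk.

```python
# Hypothetical schema: field name -> expected Python type.
# A stand-in for a message definition that would live in a .proto file.
PAYMENT_EVENT_SCHEMA = {
    "user_id": str,
    "amount": int,
    "currency": str,
}


def validate_record(record, schema):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            problems.append(f"{field}: expected {expected.__name__}, got {actual}")
    # Fields the schema does not know about are reported as well.
    for field in record:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems
```

Keeping the schema in one reviewed place, as the slide proposes with .proto files on GitHub, is what lets producers and consumers run the same check.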
  6. Remaining issues

    • Batch
      ◦ Scalability
    • Stream
      ◦ Reliability of the google-fluentd DaemonSet
    • Schema Registry
      ◦ Who defines and reviews the schemas?