LINE DMP: Connect the All B2B Products with Over 100M Users

LINE DevDay 2020

November 26, 2020

Transcript

  1. None
  2. LINE B2B Products

  3. Closing the Distance between companies and users

  4. How?

  5. Data!

  6. LINE DMP
    › DMP = Data Management Platform
    › Close the distance between our partners and users by data
    › Extract / Transform essential data of LINE users and provide it to the companies
  7. What types of data do we manage?
    Provide more detailed and effective feedback
      › Ads clicks, post-backs from installed apps, conversions
    Make targeting more efficient
      › Estimated user attributes, audience delivery, geographic information
      › Tools to estimate reach of contents
    Protect user privacy and respect user choice
      › Data handling with care, Opt-out
  8. Scale
    Tag Events Per Day: 1.5B
    Audience Data Records: 200B
    Servers (VM, PM, DB..): 500+
  11. Audience Delivery?
    Who should we show this content to?
    What content should be shown to this user?
    [Diagram labels: Users, B2B Products, Companies, contents, LINE DMP, Audience DB, Store audience information]
  12. Examples of Audience Types: Categorization by how to accumulate each audience
    Batch Accumulation
      › Look Alike Audience
      › …
    Upload by Companies
      › Phone Number / E-mail Address Audience
      › IDFA/AAID Audience
      › LINE User ID Audience
      › …
    Real-time Accumulation
      › Mobile App Event Audience
      › LINE Tag Event Audience
      › Official Account Message Click Event Audience
      › …
    Example: LINE Tag Event Audience
      User A ▷ Visit ▷ Advertiser X’s Web Site ▷ Accumulate ▷ DB: Audience Group 001 for X
      UserID | Timestamp
      user_a |
      user_b |
      …      | …
  13. DMP meets DSP: What kind of B2B product uses Audience?
    › DSP = Demand Side Platform
    › Optimizes ads distribution based on several KPIs
    › Decides which ad is shown to a user
    › Wants to respond to the media side as fast as possible
    › To distribute contents to as many users as possible!
    › Utilizes audience data retrieved from DMP’s DB: Redis
  14. Original Architecture
    [Diagram labels: Realtime Accumulation, Batch Accumulation, User Upload, Redis (Hash), Our DSP]
    Redis Data Schema: Using Redis Hashes
      Key (UserID) | Field          | Value
      user_a       | audience_group |
                   | audience_group |
                   | …              | …
      user_b       | audience_group |
                   | audience_group |
                   | …              | …
      user_c       | …              |
      Field: ID for a group of Audience
      Value: When a user is added to the group?
  15. Problem: HKEYS Latency
    › Using HKEYS command
      › Retrieves all field names of a single key
    › As the service expanded, we got more users and more audience groups
    › The more audience groups a user belongs to, the higher the HKEYS latency becomes
    › Difficult to satisfy the DSP’s requirement
  16. New Method of Storing Data
    › Use Redis String with MessagePack
    › Key = User ID
    › Value = MessagePack({List of Audience Group ID})
      Key (UserID) | Field (String)
      user_a       | MessagePack(audience_group, audience_group, …)
      user_b       | MessagePack(audience_group, audience_group, …)
      user_c       | …
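
The two storage layouts can be compared with a minimal sketch. It is not the team's implementation: plain dictionaries stand in for Redis, `json` stands in for MessagePack so that only the standard library is needed, and all function names, group names and timestamps are hypothetical.

```python
import json

# In-memory stand-ins for Redis; names and data here are hypothetical.
hash_store = {}    # models a Redis Hash: key -> {field: value}
string_store = {}  # models a Redis String: key -> bytes


def hash_add(user_id, group_id, added_at):
    """Original layout: one hash field per audience group."""
    hash_store.setdefault(user_id, {})[group_id] = added_at


def hash_groups(user_id):
    """Like HKEYS: return every field name stored under the key."""
    return list(hash_store.get(user_id, {}))


def pack(group_ids):
    # json stands in for MessagePack in this sketch.
    return json.dumps(group_ids).encode("utf-8")


def unpack(blob):
    return json.loads(blob.decode("utf-8"))


def string_set(user_id, group_ids):
    """New layout: the whole group list is one serialized value."""
    string_store[user_id] = pack(group_ids)


def string_groups(user_id):
    """Like GET followed by one deserialization on the client."""
    blob = string_store.get(user_id)
    return unpack(blob) if blob is not None else []


hash_add("user_a", "audience_group_x", 1606348800)
hash_add("user_a", "audience_group_y", 1606352400)
string_set("user_a", ["audience_group_x", "audience_group_y"])
```

Both `hash_groups("user_a")` and `string_groups("user_a")` return the same two group IDs; the difference is that the string layout reads a single value and leaves the decoding to the client.
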
  17. New Method of Storing Data: Benchmark
    [Chart: Time (ms), 0 to 40, vs. # of audience groups which a user belongs to (10, 100, 1000, 10000); series: hash (HKEYS), string - MessagePack (GET)]
  18. Another Issue: Data Atomicity
    › Each record now has only one value = Need for GET before SET
    › Need to execute the entire GET-to-SET process atomically
      › because we had multiple points that store data into Redis
    › The First Approach: Redis Lua scripting
      › Introduced in Redis v2.6.0
    [Diagram labels: App, Redis: 1. GET, 2. Deserialize, 3. Put a new group, 4. Serialize, 5. SET; Realtime Accumulation, Batch Accumulation, User Upload, Redis (Hash)]
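
Why the GET-to-SET cycle has to be atomic can be shown with a small simulation. This is a sketch under assumptions: a dictionary stands in for Redis, `json` stands in for MessagePack, and the two interleaved writers and all names are hypothetical.

```python
import json

store = {}  # in-memory stand-in for Redis string keys (hypothetical)


def load_groups(key):
    """Steps 1-2: GET, then deserialize (json stands in for MessagePack)."""
    blob = store.get(key)
    return json.loads(blob) if blob is not None else []


def save_groups(key, groups):
    """Steps 4-5: serialize, then SET."""
    store[key] = json.dumps(groups)


def add_group(key, group_id):
    """The full cycle, run without interruption."""
    groups = load_groups(key)
    if group_id not in groups:  # step 3: put a new group
        groups.append(group_id)
    save_groups(key, groups)


# Two writers interleave: both read before either writes.
seen_by_realtime = load_groups("user_a")
seen_by_batch = load_groups("user_a")
save_groups("user_a", seen_by_realtime + ["group_1"])
save_groups("user_a", seen_by_batch + ["group_2"])
lost_update = load_groups("user_a")  # group_1 has been overwritten

# The same two additions applied one after another keep both groups.
store.clear()
add_group("user_a", "group_1")
add_group("user_a", "group_2")
serialized = load_groups("user_a")
```

In the interleaved run the second SET overwrites the first, so `lost_update` contains only `group_2`; when the cycles run one after another, `serialized` contains both groups.
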
  19. Lua Script
    -- KEYS[1]: user ID key, ARGV: audience group IDs to add
    local key = KEYS[1]
    local unpackedAudienceGroupIds = {}
    local packed = redis.call('GET', key)
    if (packed) then
      unpackedAudienceGroupIds = cmsgpack.unpack(packed)
    end
    -- add each group ID that is not stored yet
    for i = 1, #ARGV do
      if (not unpackedAudienceGroupIds[ARGV[i]]) then
        unpackedAudienceGroupIds[ARGV[i]] = 1
      end
    end
    local setDataPacked = cmsgpack.pack(unpackedAudienceGroupIds)
    redis.call('SET', key, setDataPacked)
    return 'OK'

    But, it was slow T_T
    - Redis’s Single Threaded Model
    - MessagePack Ser/Deser requires much time resource
  20. Final Overall Architecture: How do we accumulate custom audiences?
    [Diagram labels: Realtime Accumulation, Batch Accumulation, User Upload, Kafka (Kafka Key: “User ID”), Audience Indexer, Redis, Our DSP]
    › By using “User ID” as a Kafka key, we ensure the order of GET -> SET without any workarounds on Redis servers
    › Side benefit: it made global control of the rate limit to Redis easier
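
The ordering guarantee that comes from keying messages by user ID can be sketched as follows. This is an illustration, not the production pipeline: lists stand in for Kafka partitions, `zlib.crc32` stands in for the partitioner's hash (only determinism matters here), and the partition count, function names and messages are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(user_id):
    # crc32 stands in for the partitioner's hash; the same key always
    # maps to the same partition.
    return zlib.crc32(user_id.encode("utf-8")) % NUM_PARTITIONS


def produce(partitions, user_id, group_id):
    """Append an 'add group' message to the partition chosen by the key."""
    partitions[partition_for(user_id)].append((user_id, group_id))


def consume(partition, store):
    """One consumer per partition applies messages strictly in log order."""
    for user_id, group_id in partition:
        groups = store.get(user_id, [])  # GET
        if group_id not in groups:
            groups = groups + [group_id]
        store[user_id] = groups          # SET


partitions = defaultdict(list)
for user_id, group_id in [("user_a", "g1"), ("user_b", "g1"),
                          ("user_a", "g2"), ("user_a", "g3")]:
    produce(partitions, user_id, group_id)

store = {}
for partition in partitions.values():
    consume(partition, store)
```

Because every message for one user lands in one partition and each partition is read in order, no two GET -> SET cycles for the same user can overlap, which is the property the Lua script was providing before.
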
  21. Summary: How did we resolve the issue
    › Needed multi-faceted changes
    › 1. Changed the data structure
      › Redis Hash => Redis String
      › Improve GET latency
    › 2. Changed the process
      › Store directly to Redis => Via Kafka
      › Improve SET simplicity
  24. “Visitor” ▷ “Customer” = Conversion!

  25. Conversion? General Definition
    › Conversion = A “visitor” performs an action that you wanted them to do!
      › ex) Install an app, purchase a product, register an account etc..
    › Conversion Rate (CVR) = $\frac{\text{\# of conversions}}{\text{\# of ad clicks}}$
    Everyone wants better CVR ▷ Need to measure exact CVR!
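
As a worked example of the CVR definition, the short sketch below divides conversions by ad clicks. The numbers are made up, and returning 0 when there are no clicks is an assumption of this sketch rather than something stated in the talk.

```python
def conversion_rate(conversions, ad_clicks):
    """CVR = number of conversions / number of ad clicks."""
    if ad_clicks == 0:
        return 0.0  # assumption: report 0 when there were no clicks
    return conversions / ad_clicks


cvr = conversion_rate(conversions=30, ad_clicks=1200)  # made-up numbers
```

With 30 conversions out of 1,200 ad clicks, the CVR is 2.5%. The result is only as good as the two counts, which is why exact click and conversion measurement matters.
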
  26. Online Conversion: Web and App
    Mobile App Conversion / Web Conversion

  27. Architecture: How to measure conversion (rough image for web)
    [Diagram labels: Ads, Click, Redirector, Store clicks, Click Log, LINE Tag, Tag Event, Conversion Pipeline, Fetch click logs, Conversion Log]
  28. Architecture: How to measure conversion (rough image for web)
    [Diagram labels: Ads, Click, Redirector, Store clicks, Click Log, LINE Tag, Tag Event, Conversion Pipeline, Fetch click logs, Conversion Log, # of conversions / # of ad clicks]
  29. Click Measurement
    › Redirector (Simple web application)
    › Click Log Storage
      › Use Redis
      › Stored with some different DB keys
        › Cookies (3rd party) made on the LINE domain
        › click ID: unique ID for each click issued in the redirector
    [Diagram labels: Ads, Click, Redirector, Store clicks, Click Log]
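
A redirector that issues a click ID and stores the click under several keys might look like the sketch below. It is illustrative only: a dictionary stands in for the Redis click log, and the key prefixes, the `redirect` function and the `click_id` query parameter are assumptions, not details given in the talk.

```python
import uuid

click_log = {}  # in-memory stand-in for the Redis click log (hypothetical)


def redirect(ad_id, landing_url, third_party_cookie=None):
    """Record a click, then return the URL to redirect the browser to."""
    click_id = uuid.uuid4().hex  # unique ID issued per click
    record = {"click_id": click_id, "ad_id": ad_id}
    # Store the same record under several keys so it can be found later
    # by whichever identifier is available. Key prefixes are made up.
    click_log["click_id:" + click_id] = record
    if third_party_cookie is not None:
        click_log["cookie3p:" + third_party_cookie] = record
    separator = "&" if "?" in landing_url else "?"
    # Passing the click ID as a query parameter is an assumption of this sketch.
    return landing_url + separator + "click_id=" + click_id


location = redirect("ad_42", "https://example.com/lp", third_party_cookie="c3p_abc")
```

Storing one click under more than one key is what later allows a lookup by either the click ID or a cookie.
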
  30. Architecture: How to measure conversion (rough image for web)
    [Diagram labels: Ads, Click, Redirector, Store clicks, Click Log, LINE Tag, Tag Event, Conversion Pipeline, Fetch click logs, Conversion Log, # of conversions / # of ad clicks]
  31. Cookies: 1st party vs 3rd party
    › 1st party cookie = a cookie made on the domain in the address bar
      › Cookie owner = the owner of the website which a user visits
    › 3rd party cookie = a cookie made on a different domain
      › cf: iframe, image tags
    › Example: when a user visits https://example.com,
      › Cookies on “example.com” = 1st party cookie
      › Cookies on any other domains = 3rd party cookie
    [Diagram labels: LINE’s domain, Advertisers’ domain]
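
The distinction can be expressed as a small check against the address-bar host. This is a simplified sketch: it compares host names only, whereas browsers apply finer "site" rules, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def cookie_party(page_url, cookie_domain):
    """Classify a cookie relative to the page shown in the address bar."""
    host = urlsplit(page_url).hostname or ""
    domain = cookie_domain.lstrip(".").lower()
    # Simplification: compare host names only; browsers apply finer "site" rules.
    if host == domain or host.endswith("." + domain):
        return "1st party"
    return "3rd party"
```

For a page at `https://example.com/`, a cookie on `example.com` is classified as 1st party and a cookie on any other domain, such as `tracker.example.net`, as 3rd party.
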
  32. Question: How to find the past clicks?
    How can we judge if a user has already clicked the ads?
    › 1. Click ID: a unique ID for each click issued in the redirector
      › If their landing page = Conversion point, we can find a click log directly
    › 2. 3rd party cookie: a cookie associated with LINE’s domain
      › BUT: 3rd party cookie is almost dead (e.g. Safari, Chrome..)
    › 3. 1st party cookie: a cookie associated with advertiser’s domain
      › Created on JavaScript (by LINE Tag)
      › Oh wait, we don’t have a mapping between a click and 1st party cookie, right?
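
The three identifiers can be tried in turn when searching for a past click. The sketch below assumes the lookup order follows the numbering on the slide; the key prefixes, the `find_click` function and the sample click log are hypothetical.

```python
def find_click(click_log, click_id=None, cookie_3p=None, cookie_1p=None):
    """Try each identifier in turn and return (record, matched_by)."""
    candidates = [
        ("click_id", click_id),
        ("cookie3p", cookie_3p),
        ("cookie1p", cookie_1p),
    ]
    for prefix, value in candidates:
        if value is None:
            continue
        record = click_log.get(prefix + ":" + value)
        if record is not None:
            return record, prefix
    return None, None


# Hypothetical click log; the cookie1p entry exists only once the
# 1st party cookie has been mapped to the click.
click_log = {
    "click_id:abc": {"ad_id": "ad_42"},
    "cookie1p:fp_1": {"ad_id": "ad_42"},
}
```

A lookup by 1st party cookie succeeds only if a mapping from that cookie to a click has been stored beforehand, which is the gap raised in the last bullet.
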
  33. Lazy mapping for 1st party cookie Prerequisite for the mapping!

  34. Two Separate Flows Here!
    › Sub-flow 1 should be completed before Sub-flow 2
      › ex) click ads ▷ visit a LP ▷ (1sec) ▷ visit a conversion page immediately
    [Diagram labels: LINE Tag, LINE Tag Front End, Sub-flow 1: Lazy mapping for 1st party cookie, Sub-flow 2: Conversion Measurement]
  35. Two Separate Flows Here!
    › Sub-flow 1 should be completed before Sub-flow 2
      › ex) click ads ▷ visit a LP ▷ (1sec) ▷ visit a conversion page immediately
    › Add a delay before searching click logs by 1st party cookie!
      › To put Sub-flow 2 behind Sub-flow 1 (Best effort)
    [Diagram labels: LINE Tag, LINE Tag Front End, Sub-flow 1: Lazy mapping for 1st party cookie, Conversion Condition Matcher, An event matches any conditions?, Kafka, Add small delay, Click Log Finder]
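
The effect of the delay can be sketched with a simulated clock. This is an illustration under assumptions: a heap stands in for the delayed queue, the delay value, function names and identifiers are hypothetical, and time is passed in explicitly instead of sleeping.

```python
import heapq

DELAY = 5  # hypothetical delay, in seconds of simulated time

mapping = {}   # 1st party cookie -> click ID, filled by sub-flow 1
pending = []   # min-heap of (due_time, cookie) for sub-flow 2
results = []   # (cookie, click ID or None) once the lookup has run


def on_lazy_mapping(cookie, click_id):
    """Sub-flow 1: remember which click a 1st party cookie belongs to."""
    mapping[cookie] = click_id


def on_conversion_event(now, cookie):
    """Sub-flow 2: schedule the click lookup instead of running it at once."""
    heapq.heappush(pending, (now + DELAY, cookie))


def run_due_lookups(now):
    """Run every lookup whose delay has elapsed."""
    while pending and pending[0][0] <= now:
        _, cookie = heapq.heappop(pending)
        results.append((cookie, mapping.get(cookie)))


# The conversion event is handled at t=1, before the mapping arrives.
on_conversion_event(now=1, cookie="fp_1")
run_due_lookups(now=1)                # nothing is due yet
on_lazy_mapping("fp_1", "click_abc")
run_due_lookups(now=6)                # the delayed lookup now finds the mapping
```

Without the delay, the lookup at t=1 would have run before the mapping existed and found nothing. The delay is best effort: a mapping that arrives later than the delay is still missed.
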
  37. What types of data do we manage?
    Provide more detailed and effective feedback
      › Ads clicks, post-backs from installed apps, conversions
    Protect user privacy and respect user choice
      › Data handling with care, Opt-out
    Make targeting more efficient
      › Estimated user attributes, audience delivery, geographic information
      › Tools to estimate reach of contents
  38. Protect user privacy and choice
    › Our data is used only inside LINE
    › Protect personal information
    › Provide several opt-out functions
    › Respect user choice
    › One of DMP’s missions
    › How can we provide good touchpoints to companies while keeping user privacy?
  39. Future Work
    How has LINE DMP made progress?
    › Cross Platform (= Cross B2B Products)
      › Cross Targeting
      › Cross Report
    › Beyond Cookie & IDFA, AAID
      › Strengthen utilization of data owned by LINE
      › E.g. LINE Login
  40. Team & Environment
    How has LINE DMP made progress?
    › Manager + 3 Server-side Engineers + (1 Front-end Engineer)
    › Massive traffic makes developers join in discussions from the very beginning
    › Can enjoy the challenges as an engineer twice :)
      › Severe business requirement
      › Massive traffic and complex data architecture
  41. Summary
    › Common Goal: Closing the Distance between companies and users
    › LINE DMP supports it with the power of data
    › LINE DMP provides various data with massive traffic to:
      › Make targeting more efficient
      › Give more detailed and sophisticated feedback
      › Protect user privacy and respect user choice
    › Challenging business problems & data processing
    › So many next challenges are waiting for us and you!