Architecture Modifications to Deal with 5 to 10 Times More Access than Usual as a Regular Event

Yuki Matsumoto (ValueCommerce / Technology Development Division / CTO)

https://tech-verse.me/ja/sessions/233
https://tech-verse.me/en/sessions/233
https://tech-verse.me/ko/sessions/233

Tech-Verse2022

November 18, 2022

Transcript

  2. Load measures
    - Modify the system's bottlenecks ahead of the next campaign.
    - The previous issue is resolved, but a problem occurs elsewhere.
    - The system is overloaded by sudden traffic, such as a campaign.
    - As a result, we scale out and scale up in a hurry.
  3. Agenda
    - Service overview
    - System architecture (old)
    - System load issues
    - Points of system improvement
    - System architecture (new)
    - Summary
  4. STORE’s R∞ service overview: CRM (Customer Relationship Management)
    - Analysis: visualize your store's customer details.
      • Store’s customer data
      • Yahoo! JAPAN Shopping Premium member rate
      • Signs of customer churn
      • etc.
    - Campaign: automatic distribution of the optimal campaign (who, what, when).
    - Verify the effects: PDCA for customer nurturing.
  5. STORE’s R∞ service overview: CRM (Customer Relationship Management)
    - Diagram labels: User; Yahoo! JAPAN Shopping (cart pages, product detail pages); API; ValueCommerce.
  6. STORE’s R∞ service overview: CRM (Customer Relationship Management)
    - Cart pages: display the coupons that the customer is applying and the coupons that can be used.
    - Product detail pages: notify customers who are viewing product details about coupons that can be used.
  7. Yahoo! JAPAN Shopping campaign
    - Regular campaigns by day of the week, date, etc.
    - 5 to 10 times more requests per second (RPS) than usual.
    - The same applies to the amount of data, such as purchase data.
  8. Yahoo! JAPAN Shopping campaign
    - Chart: requests over a specific two-week period (Thu. to Wed., two weeks).
    - Marked on the chart: regular campaigns by day of the week and regular campaigns by date.
  9. System Architecture (old): Deliver Process
    - Diagram labels: User; Yahoo! JAPAN Shopping; Deliver Server (Java Servlet); Coupon data and User data (RDBMS); Store.
  10. System Architecture (old): Aggregate Process
    - Diagram labels: Store; Access Log and Purchase Log (RDBMS); Apache Spark; Deliver Server (Java Servlet).
  11. Issues
    - Delivery process: response deterioration
    - User data update: update delay
    - Analysis data update: aggregation delay
  12. Issues
    - Cart pages: display the coupons that the customer is applying and the coupons that can be used.
    - Product detail pages: notify customers who are viewing product details about coupons that can be used.
  13. Issues
    - Analysis: visualize your store's customer details.
      • Store’s customer data
      • Yahoo! JAPAN Shopping Premium member rate
      • Signs of customer churn
      • etc.
    - Campaign: automatic distribution of the optimal campaign (who, what, when).
    - Verify the effects: PDCA for customer nurturing.
  15. Improvement plan
    - API server scale-out
      • Automatically scale out in advance based on CPU / memory / load average.
      • Tune the thread count.
    - Apache Spark scale-out
      • Add cluster nodes in advance.
    - RDBMS scale-out
      • Add read-only replicas.
    - Fundamentally change the system configuration: move to a system architecture that is resistant to load.
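
The metric-driven scale-out named in the improvement plan could be sketched as a simple sizing rule. This is an illustration only: `metrics_t`, `desired_servers`, and the target-utilization rule are assumptions made for this sketch and are not taken from the talk.

```c
#include <assert.h>

/* Hypothetical utilization snapshot; each value is a fraction (1.0 = fully used). */
typedef struct {
    double cpu;
    double memory;
    double load_per_core;   /* load average divided by core count */
} metrics_t;

/* Returns the server count that would bring the busiest metric back to `target`.
 * Assumption: load spreads evenly across servers. This sketch never scales in. */
int desired_servers(int current, metrics_t m, double target) {
    assert(current > 0 && target > 0.0);

    double worst = m.cpu;
    if (m.memory > worst) worst = m.memory;
    if (m.load_per_core > worst) worst = m.load_per_core;

    double exact = current * worst / target;
    int desired = (int)exact;
    if ((double)desired < exact) desired++;   /* round up to a whole server */

    return desired > current ? desired : current;
}
```

A rule like this only reacts to load that is already visible, which fits the deck's point that scaling out alone stays a provisional measure.
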
  16. Points of system improvement
    - Deliver server
    - Aggregate process
    - User data
  19. Deliver server processing flow
    - Accept the request (params: user ID and product code).
    - Get user data.
    - Get coupon data.
    - Return the target coupon.
    - User data: 30-40 million. Coupon data: thousands.
    - Key-value store: mdbm (memory-mapped key/value store).
  20. Coupon data structure (mdbm)
    Key: Store ID

        typedef struct {
            int64_t client_id;
            int32_t campaign_num;
            campaign_data_t campaigns[];
        } campaign_list_t;

        typedef struct {
            _birth_t birth;
            _range_t age_range;
            _prefecture_t prefecture;
            _rank_t rank;
            _range_t purc_count_total;
            _range_t purc_amount;
            ・・・
        }

        typedef struct {
            int64_t campaign_id;
            _targeting_condition_t condition;
            ・・・
        } campaign_data_t;
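
To illustrate how a store's campaign list might be filtered for one user, here is a minimal sketch. The slide does not define `_range_t` or the other field types, so `range_t`, `user_t`, `campaign_t`, and `select_campaigns` below are hypothetical simplifications, and only two of the targeting conditions are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the slide's _range_t: inclusive bounds. */
typedef struct { int32_t min; int32_t max; } range_t;

/* Reduced targeting condition: two of the fields listed on the slide. */
typedef struct {
    range_t age_range;
    range_t purc_count_total;
} targeting_condition_t;

typedef struct {
    int64_t campaign_id;
    targeting_condition_t condition;
} campaign_t;

/* Hypothetical view of the user attributes needed for matching. */
typedef struct {
    int32_t age;
    int32_t purc_count_total;
} user_t;

static int in_range(range_t r, int32_t v) {
    return r.min <= v && v <= r.max;
}

/* Returns 1 when the user satisfies every modeled condition of the campaign. */
static int campaign_matches(const campaign_t *c, const user_t *u) {
    return in_range(c->condition.age_range, u->age)
        && in_range(c->condition.purc_count_total, u->purc_count_total);
}

/* Copies the ids of matching campaigns into `out` (at most out_cap entries)
 * and returns how many were written. */
size_t select_campaigns(const campaign_t *campaigns, size_t n,
                        const user_t *u, int64_t *out, size_t out_cap) {
    assert(campaigns != NULL && u != NULL && out != NULL);
    size_t found = 0;
    for (size_t i = 0; i < n && found < out_cap; i++) {
        if (campaign_matches(&campaigns[i], u)) {
            out[found++] = campaigns[i].campaign_id;
        }
    }
    return found;
}
```

With coupon data in the thousands, a linear scan over one store's campaigns held in memory is a plausible fit, though the talk does not describe the actual matching code.
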
  21. System Architecture (new): Deliver Process
    - Diagram labels: Yahoo Shopping; Deliver Server (Apache Module); User Data (redis); Coupon Data (mdbm); mdbm generate batch; Coupon Data (RDBMS); Store.
  22. Points of system improvement
    - User data
    - Deliver server
    - Aggregate process
  23. User data type
    - Demographic info
      • Sex
      • Residential areas
      • Age
      • Birth month
    - Action history
      • Last access
      • Access to specific products
      • Purchase of specific products
      • Non-purchase of specific products
      • Purchase amount in the past
    - Custom info
      • Yahoo premium member
      • SB smartphone user
      • Ymobile smartphone user
      • Rank specified by the store
  24. Redis data structure
    KEY = <User identifier>
    - field = "customer", value = <JSON string compressed in Zstandard format>

        { "sex": 1, "birth": [1964, 1], "prefecture": 27, ... }

    - field = "acc", value = <JSON string compressed in Zstandard format>

        {
          "<store id>": [
            {"item": "<item code>", "time": 1665975527},
            {"item": "<item code>", "time": 1666000534}
          ],
          ...
        }

    - field = "rank", value = <JSON string compressed in Zstandard format>

        {
          "<store id>": {
            "purcHist": [
              {"amount": 7251, "item": "<item code>", "time": 1666003823, "type": 3679},
              {"amount": 2709, "item": "<item code>", "time": 1666144923, "type": 3679}
            ],
            "repeat": 103428,
            "rank": 129283,
            "lastPurc": 1666144923,
            "pCount": 1,
            "pAmount": 9960,
            "tCount": 1,
            "tAmount": 9960
          },
          ...
          "mTCount": 247,
          "coupon": [ "Nzg1ZDRmYTAzOGRhZjRlMmRkNTNiMTk4YjA3" ]
        }
  26. Issue of updating
    - Updating an existing record is a multi-step process.
    - The update needs to be atomic.
    - Steps between Deliver Server (Apache Module) and User Data (redis): 1. Get 2. Decompression 3. JSON parse 4. Update 6. Set
    - Redis Module (introduced in Redis 4.x)
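
Why the Get ... Set cycle needs to be atomic can be shown with a small simulation. This is a sketch under stated assumptions: it uses no real Redis, a plain counter stands in for the compressed JSON value, and `record_t` and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one stored value: a counter instead of compressed JSON. */
typedef struct { int access_count; } record_t;

/* Client-side cycle: the value is copied out, changed, and written back. */
static int client_get(const record_t *store) { return store->access_count; }
static void client_set(record_t *store, int v) { store->access_count = v; }

/* Two clients interleave their Get and Set steps; one update is lost. */
int interleaved_client_updates(record_t *store) {
    assert(store != NULL);
    int a = client_get(store);   /* client A reads */
    int b = client_get(store);   /* client B reads the same old value */
    client_set(store, a + 1);    /* A writes its update */
    client_set(store, b + 1);    /* B overwrites it, discarding A's update */
    return store->access_count;
}

/* Server-side update: read, modify, and write happen as one step. */
static void server_side_increment(record_t *store) { store->access_count += 1; }

/* The same two requests applied one after another on the server. */
int serialized_server_updates(record_t *store) {
    assert(store != NULL);
    server_side_increment(store);   /* request from A */
    server_side_increment(store);   /* request from B */
    return store->access_count;
}
```

Starting from zero, the interleaved version ends at 1 while the server-side version ends at 2. Moving the whole cycle into a command that runs inside Redis avoids the interleaving, because Redis executes commands one at a time.
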
  27. Custom commands (original implementation)
    - VCROO.HGET: same as standard HGET
    - VCROO.HMGET: same as standard HMGET
    - VCROO.RECACC: update specific product access in the "acc" field
    - VCROO.RECCV: update specific product purchase in the "cv" field
    - ・・・
    Example:

        Redis> VCROO.HGET <UID> acc
        {
          "17603": [ { "item": "<item code>", "time": 1666459863 } ],
          "1273540": [ { "item": "<item code>", "time": 1666253214 } ],
          …
        }
  28. Points of system improvement
    - Aggregate process
    - Deliver server
    - User data
  29. Rank Aggregate Processing
    - Customer rank
      • Different settings for each store
      • Aggregated over up to 2 years of data
      • Re-aggregation is required when settings are changed
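
The three properties of customer rank listed here can be made concrete with a minimal sketch. The talk does not say how a rank is computed, so the threshold scheme, `rank_setting_t`, `windowed_amount`, and `rank_for` are assumptions chosen only for illustration. The purchase values in the usage note come from the "rank" example in the Redis data structure slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two years in seconds, ignoring leap days. */
#define TWO_YEARS_SEC (2LL * 365 * 24 * 60 * 60)

typedef struct { int64_t time; int64_t amount; } purchase_t;

/* Hypothetical per-store setting: ascending amount thresholds.
 * The rank is the number of thresholds the customer's total has reached. */
typedef struct { const int64_t *thresholds; size_t n; } rank_setting_t;

/* Sums purchases made within the two years before `now` (Unix seconds). */
int64_t windowed_amount(const purchase_t *p, size_t n, int64_t now) {
    assert(p != NULL || n == 0);
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i].time <= now && now - p[i].time <= TWO_YEARS_SEC) {
            total += p[i].amount;
        }
    }
    return total;
}

/* Applies one store's setting to an aggregated total. */
int rank_for(const rank_setting_t *s, int64_t total) {
    assert(s != NULL);
    int rank = 0;
    while ((size_t)rank < s->n && total >= s->thresholds[rank]) {
        rank++;
    }
    return rank;
}
```

With purchases of 7251 and 2709 inside the window, the total is 9960. Thresholds of 5000 and 10000 give rank 1, while thresholds of 3000 and 8000 give rank 2. Changing a store's thresholds changes the result for every one of its customers, which is why a settings change forces re-aggregation.
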
  30. Issue of adding nodes
    - Diagram labels: Data Source (RDBMS) ×2; node ×4 ・・・・・
  31. Issue of adding nodes
    - Diagram labels: Data Source (RDBMS) ×2; node ×6 ・・・・・
  32. Infrastructure cost (EMR / RDBMS)
    - Instance
    - Storage
    - Request
  33. System Architecture (new)
    - Diagram labels: Store; Side Data (RDBMS); Side data upload batch; Data (S3); node ×3; Deliver Server (Apache Module).
  37. What changed with the new system architecture
    - Infrastructure cost (number of servers during a campaign): about 40% reduction.
    - Aggregate processing time: 5-6 hours → 2-3 hours.
  38. Summary: system load and performance issues
    - Provisional solution: while the service continues to grow, provisional solutions alone will not be enough.
    - Fundamental architectural change: architecture changes should also be considered in light of medium- to long-term maintenance costs and infrastructure costs.
    - Scalability: new bottlenecks are discovered as the service grows.
  39. Thank you