Behind the Scenes of Generative AI-Driven Products and Productivity Enhancements

Slide 1

Slide 1 text

Confidential
Behind the Scenes of Generative AI-Driven Products and Productivity Enhancements
Yuki Ishikawa
April 16, 2024

Slide 2

Slide 2 text

Yuki Ishikawa
Mercari, Inc.
VP of Generative AI / LLM
After graduating from the University of Tokyo, he joined Nintendo Co., Ltd. in 2012. In 2014, he joined Moi Corporation (TwitCasting), where he worked on various development projects and new business launches. In June 2017, he joined the former Souzoh, Inc., part of the Mercari Group. He later transferred to Mercari, Inc., and from July 2020 served as Executive Officer and VP of Product at Merpay, Inc. In January 2021, he became Representative Director and CEO of Souzoh, Inc. From July 2022, he concurrently served as Executive Officer and VP at Mercari, Inc. He assumed his current position in May 2023.

Slide 3

Slide 3 text

Introduction of Mercari

Slide 4

Slide 4 text

Group Mission
“Circulate all forms of value to unleash the potential in all people”

Slide 5

Slide 5 text

About
● Established: February 1st, 2013
● Offices: Tokyo, Fukuoka, Palo Alto, Bangalore
● Headcount: 2,101 (including subsidiaries)

Slide 6

Slide 6 text

What Is Mercari?
The Mercari app is a C2C marketplace where individuals can easily sell used items. We want to provide both buyers and sellers with a service where they can enjoy safe and secure transactions.
● Service launch: July 2013
● Operating systems: Android, iOS *Can also be accessed through web browsers
● Usage fee: Free *Sales fee for sold items: 10% of the sales price
● Regions/languages supported: Base specs for Japan/Japanese
● Total number of listings to date: More than 3 billion *As of November 2021
Mercari offers a unique customer experience, with a transaction environment that uses an escrow system, where Mercari temporarily holds payments, and simple and affordable shipping options.
Many sellers enjoy having the items they no longer need purchased and used by buyers who need them, and buyers enjoy the feeling of hunting for treasure as they search through unique and diverse items for lucky finds. In addition to buying and selling, users actively communicate through the buyer/seller chat and the “Like” feature.

Slide 7

Slide 7 text

Marketplace: GMV/MAU
[Chart: GMV (Billion JPY) and MAU (Million users); growth labels on the chart: YoY +11%, YoY +14%]
1. GMV: Starting from FY2022.6, graph reflects retroactive adjustment to combine C2C and B2C
2. MAU: Quarterly average number of users who browsed our service (app or web) at least once during a given month

Slide 8

Slide 8 text

GenAI Initiatives

Slide 9

Slide 9 text

Mercari's initiatives related to generative AI and LLM
Timeline: May 2023 to March 2024
● Establishment of Dedicated Generative AI/LLM Team
● Utilizing LLMs for SEO
● Conducting Merpay LLM Hackathon
● Mercari ChatGPT Plugin Release
● Formulation of Guidelines for LLM Usage
● Utilizing Generative AI for Videos in OOH Advertising
● Conducting Company-Wide Hackathon "Mercari AI Builders Fest"
● Utilizing Generative AI Visuals in Recruitment
● Utilizing Generative AI Visuals in Advertising
● Release of "Mercari Listing Battle" on GPT Store
● Release of "Mercari Product Search" on GPT Store
● Release of Mercari AI Assist (Listing Support Feature)
● Release of Mercari AI Assist (Purchase Support Feature)
● Category Re-Mapping Using LLMs

Slide 10

Slide 10 text

Execution
Mission of the Dedicated Generative AI/LLM Team
● Creating new customer experiences and maximizing business impact by utilizing generative AI/LLM technology
● Dramatically improving company-wide productivity

Slide 11

Slide 11 text

Execution
Specific Initiatives

Slide 12

Slide 12 text

What the LLM Team Is Doing
Building and Enabling

Slide 13

Slide 13 text

What the LLM Team Is Doing
Building and Enabling

Slide 14

Slide 14 text

Application and Adaptation to Existing Products
● Individual Function Teams (team names are for illustrative purposes): Seller UX, Buyer UX, CS, Fintech, B2C, XB
● LLM team: Planning, model selection, prompt engineering, product implementation, etc.

Slide 15

Slide 15 text

Application and Adaptation to Existing Products
(1) Leadership: Cases where the dedicated LLM team takes ownership and carries out everything from planning to implementation
(2) Co-creation: Cases where the Function team takes the lead, and the LLM team works alongside them, reviewing LLM-related aspects as needed

Slide 16

Slide 16 text

SEO Improvement (Already Released)
Using an LLM to generate the SEO-related title information for Mercari's search result pages
● Keyword overlap between "parasol" and "umbrella"
● Notation of brand names
Titles are generated with the LLM for each Category x Brand Name combination.

Slide 17

Slide 17 text


Slide 18

Slide 18 text

How to Use the "Improvement Suggestion for Listed Items" Feature
STEP.1 Suggestions from the AI assistant are delivered for items that can be improved
STEP.2 Open the chat and choose from the AI assistant's suggestions

Slide 19

Slide 19 text

How to Use the "Improvement Suggestion for Listed Items" Feature
STEP.3 Follow the AI assistant's instructions to proceed with the selection
STEP.4 After you update the content and complete the process, the information for the listed item is updated

Slide 20

Slide 20 text

"Mercari AI Assist" Future Plans
Various features are scheduled for release
● Purchase Support Features
● Listing Support Features
● Troubleshooting Features

Slide 21

Slide 21 text

Publishing Official Mercari GPTs on the OpenAI GPT Store
Actively creating GPTs to explore new experiences
● Mercari Item search
● Mercari listed item battle

Slide 22

Slide 22 text

What the LLM Team Is Doing
Building and Enabling

Slide 23

Slide 23 text

Company-wide Initiatives for LLM Readiness
① Guideline Formulation
● Enabling not only the ML team but also general SWE teams to implement products
● Formulating guidelines in collaboration with Mercari's R&D organization "R4D"
● Publicly releasing developer guidelines (link)
② Study Sessions and Hackathons
● Conducting internal study sessions on an irregular basis
● Holding hackathons for all job types, not limited to engineers
● Conducted 3 times in half a year: April at Mercari, June at Merpay, and September at Mercari. Continuing to be held at various locations.

Slide 24

Slide 24 text

Mercari Employee-Only Tool
To promote internal usage, we created a Mercari employee-only "ChatGPT" that allows input of work-related information. It also supports GPT-4, Google Gemini, and Anthropic Claude 3.

Slide 25

Slide 25 text

Mercari Employee-Only Tool
In addition to the Code Interpreter and image generation features, it also includes a translation mode, an SQL generation function compatible with Mercari's data, and a document search function for engineers.

Slide 26

Slide 26 text

Confidential - Do Not Share
Behind the Scenes of Product Development with LLM

Slide 27

Slide 27 text

Product Development Using LLM
Fast? Cheap? Tasty (=good)?
[Image: Japanese beef bowl (gyudon)]

Slide 28

Slide 28 text

Fast
LLMs can perform a wide range of tasks with high accuracy
What you want to do
● Feasibility check → Data creation → Model training → Release → Effect verification: It takes time and cost to get verification results
● LLM + Few-shot learning as an alternative → Release → Effect verification: Enables PoCs for many tasks with reasonable accuracy
= Dramatic reduction in PoC costs
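
The "LLM + few-shot learning" path on this slide replaces data creation and model training with a handful of labeled examples placed directly in the prompt. A minimal sketch of that idea follows; the task, the example listings, and the function name `build_few_shot_prompt` are hypothetical and only illustrate the shape of such a prompt, not the prompts actually used at Mercari.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Assemble a few-shot prompt: instruction, labeled examples, then the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final input is left without an output so the model completes it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical task and examples, invented for illustration only.
examples = [
    ("Nintendo Switch console, used, with box", "Video games"),
    ("Women's running shoes, size 24cm", "Shoes"),
]
prompt = build_few_shot_prompt(
    "Classify the listing title into a category.",
    examples,
    "Wireless earbuds, barely used",
)
print(prompt)
```

The resulting string would be sent to whichever LLM API is in use; changing the task only means changing the instruction and examples, which is what keeps the PoC cost low.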

Slide 29

Slide 29 text

Fast?
An example initially tested on the item list screen of "Mercari AI Assist" (2023.09)
● GPT-3.5: 1.5~2.5 seconds/item
● GPT-4: 3~5 seconds/item
LLM responses can be slow depending on the use case and model, so consumer-facing services that require quick responses need some ingenuity.
Source: Artificial Analysis

Slide 30

Slide 30 text

Cheap
● A GPT-3.5-class model at this price is cheap
  ○ Until January 25, 2024 (GPT-3.5 Turbo)
    ■ Input: $1.0 / 1M tokens
    ■ Output: $2.0 / 1M tokens
  ○ After January 25, 2024 (GPT-3.5 Turbo)
    ■ Input: $0.5 / 1M tokens
    ■ Output: $1.5 / 1M tokens
● Claude 3 Haiku, which came out recently, is even cheaper.
Source: Artificial Analysis
See also: New embedding models and API updates

Slide 31

Slide 31 text

Cheap?
● While smaller models are cheaper, they are not always adequate. But as model size increases, cost also goes up rapidly.
● Especially for large-scale consumer-facing services like Mercari, using LLMs without careful consideration can get very expensive (although it has become much cheaper compared to last year)
Ex. Assuming Mercari has 2 million listings per day and some kind of LLM processing is applied to every item (with only 1 LLM call per item), the annual cost would be 550 million yen with GPT-4 Turbo (versus 27 million yen with GPT-3.5 Turbo)
Source: Artificial Analysis
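
The annual-cost estimate on this slide is simple arithmetic: calls per year times cost per call. The sketch below shows the calculation under stated assumptions. The GPT-3.5 Turbo prices come from the previous slide. The GPT-4 Turbo prices ($10 input / $30 output per 1M tokens), the token counts per call, and the exchange rate are assumptions introduced here, since the talk does not state which values were used.

```python
def annual_llm_cost_jpy(calls_per_day, input_tokens, output_tokens,
                        input_usd_per_1m, output_usd_per_1m,
                        jpy_per_usd=150.0, days=365):
    """Estimate the yearly cost in JPY of one LLM call per item."""
    usd_per_call = (input_tokens * input_usd_per_1m
                    + output_tokens * output_usd_per_1m) / 1_000_000
    return calls_per_day * days * usd_per_call * jpy_per_usd


LISTINGS_PER_DAY = 2_000_000   # from the slide
INPUT_TOKENS = 300             # hypothetical tokens per call
OUTPUT_TOKENS = 70             # hypothetical tokens per call

# GPT-3.5 Turbo prices after January 25, 2024 (from the previous slide).
gpt35_turbo = annual_llm_cost_jpy(LISTINGS_PER_DAY, INPUT_TOKENS, OUTPUT_TOKENS,
                                  input_usd_per_1m=0.5, output_usd_per_1m=1.5)
# GPT-4 Turbo prices are an assumption, not taken from the slides.
gpt4_turbo = annual_llm_cost_jpy(LISTINGS_PER_DAY, INPUT_TOKENS, OUTPUT_TOKENS,
                                 input_usd_per_1m=10.0, output_usd_per_1m=30.0)

print(f"GPT-3.5 Turbo: {gpt35_turbo / 1e6:.0f} million JPY/year")
print(f"GPT-4 Turbo:   {gpt4_turbo / 1e6:.0f} million JPY/year")
print(f"Ratio:         {gpt4_turbo / gpt35_turbo:.1f}x")
```

Because both assumed GPT-4 Turbo prices are 20 times the GPT-3.5 Turbo prices, the ratio between the two totals does not depend on the token counts or the exchange rate, which is consistent with the roughly 20x gap between the slide's two figures.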

Slide 32

Slide 32 text

Tasty (=good)
Although I can't write everything here, as you all know, it's really tasty (amazing) in so many ways!
● It translates unique words appropriately without any additional training
  ○ Pokémon character "リザードン": English: Charizard, French: Dracaufeu, Korean: 리자몽
● If you design appropriate flows, it can make decisions and take actions on its own
  ○ Mercari AI Assist Purchase Assistance Feature: "What would be a good birthday present for my 5-year-old daughter?" → GPT → Additional questions, hit the API and display search results, etc.
● It can understand and generate vision and audio as well
  ○ Customer Support: "How can I assist you today?" "Let me take a look at the photo you provided"
  ○ It can understand the other person's speech and respond verbally
  ○ It can also read image information

Slide 33

Slide 33 text

Tasty (=good)?
There are many great things about it! On the other hand, controlling the output content is particularly difficult
● It sometimes states incorrect things as if they were correct (hallucination)
● Even with the same input and prompt, and after extensive tuning, it may occasionally (e.g., 0.01% of the time) produce a different output
● It still struggles with tasks like providing confidence scores or calculating meaningful scores
● QA is extremely challenging 🤮

Slide 34

Slide 34 text

Ingenuity in Product Development Using LLM
It's the most exciting and enjoyable technology, but when using it in products provided to customers, you generally have to face how difficult it is to control
Control Challenges🔥
● Output content (hallucination)
● Output speed (latency)
● Cost

Slide 35

Slide 35 text

Ingenuity in Product Development Using LLM at Mercari
1. Using LLM as the underlying logic in a way that is not directly visible to customers
   ex. "Using in web SEO"
2. Interaction between customers and LLM (without free input)
   Initially adopting a selection-based approach with no free input from customers
   ex. "Mercari AI Assist (Listing Improvement Suggestion Feature)"
3. Interaction between customers and LLM (with free input)
   Allowing free input while controlling the input scope and output to a certain extent
   ex. "Mercari AI Assist (Purchase Support Feature)"
4. Moving towards initiatives with even greater freedom
For product initiatives with a large scope of impact, we are taking the approach of gradually expanding the allowable range while learning from the initiative in the production environment

Slide 36

Slide 36 text

Ingenuity in Mercari AI Assist (Listing Improvement Suggestion Feature)
We were using GPT-3.5 for the underlying logic (used for extracting product information), but we wondered whether we could find a model that has
● Higher accuracy & more stable output than GPT-3.5
● Lower latency
● Lower cost
=> With this motivation, we fine-tuned a small-scale OSS model

Slide 37

Slide 37 text

About the adopted method (Fine-tuning with QLoRA)
Fine-tuning is a technique for tuning a pre-trained model to optimize it for a specific task
Instead of updating all parameters, we adopted QLoRA, a type of PEFT (Parameter-efficient fine-tuning) that trains only a small number of parameters (low-rank adapters on top of a quantized base model). We chose it because it can be executed quickly without compromising quality.
Reference: [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs
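
As a rough illustration of why this is cheap, assume a LoRA-style adapter that adds two low-rank matrices of rank r beside a frozen d x k weight matrix: the trainable parameter count drops from d*k to r*(d+k), and storing the frozen base weights in 4 bits shrinks their memory footprint. The layer size, rank, and 7B parameter count below are hypothetical values picked for the arithmetic, not the settings used in the talk.

```python
def full_finetune_params(d, k):
    """Trainable parameters when the whole d x k weight matrix is updated."""
    return d * k


def lora_params(d, k, r):
    """Trainable parameters for a rank-r adapter: A is (r x k), B is (d x r)."""
    return r * (d + k)


def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights alone (1 GB taken as 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Hypothetical layer size and rank, for illustration only.
d, k, r = 4096, 4096, 8
full = full_finetune_params(d, k)
lora = lora_params(d, k, r)
print(f"full: {full:,} params, LoRA: {lora:,} params ({lora / full:.2%} of full)")

# Hypothetical 7B-parameter base model at different precisions.
n_params = 7_000_000_000
print(f"16-bit weights: {weight_memory_gb(n_params, 16):.1f} GB")
print(f"4-bit weights:  {weight_memory_gb(n_params, 4):.1f} GB")
```

This only counts weights; optimizer state, activations, and quantization overhead are ignored, so real memory use is higher.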

Slide 38

Slide 38 text

Consideration of models to adopt (Japanese LLMs)
We considered the top 5 models of 7B size in LLM JP Eval (Multi-Choice QA, NLI, QA, Reading Comprehension) as candidates (as of December 2023)
● tokyotech-llm/Swallow-7b-instruct-hf (Llama-based)
● rinna/nekomata-7b-instruction (Qwen-based)
● stabilityai/StableBeluga-7B (Llama-based)
● rinna/youri-7b-instruction (Llama-based)
● mistralai/Mistral-7B-Instruct-v0.2 (Mistral-based)
Reference: Nejumi LLM Leaderboard

Slide 39

Slide 39 text

Fine-tuning with QLoRA
1. Prepare the fine-tuning dataset
  ○ Output: JSON/CSV files
2. Fine-tuning (one A100 GPU (48GB))
  ○ Output: LoRA adapters + base model (merged) - ~29GB
3. Post-training Quantization
  ○ Using llama.cpp
  ○ Output: GGUF model file - ~4GB
4. Evaluate
  ○ BLEU Score (4-gram)
    ■ 1.3x better than GPT-3.5 Turbo
  ○ Cost (10 pods NVIDIA Tesla T4)
    ■ 14x cheaper than GPT-3.5 Turbo (OpenAI)
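
The evaluation step reports a 4-gram BLEU score. As a minimal sketch of what that metric computes, the function below implements sentence-level BLEU with clipped n-gram precision for n = 1 to 4, uniform weights, and a brevity penalty. Whitespace tokenization, a single reference, and no smoothing are assumptions made here; the talk does not say which implementation or tokenizer was used, and Japanese text would need a proper tokenizer.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(candidate) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
candidate = "the cat sat on the mat today".split()
score = bleu(candidate, reference)
print(f"BLEU-4: {score:.4f}")
```

A ratio such as "1.3x better" would then compare the average of this score over an evaluation set for two models.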

Slide 40

Slide 40 text

Behind the Scenes of Productivity Improvement with Generative AI

Slide 41

Slide 41 text

Company-wide use of generative AI services is increasing
● Usage of the "ChatGPT" service exclusively for Mercari employees grew by 230% in one year (chart: changes in DAUs)
● GitHub Copilot usage also skyrocketed in about half a year, from July 2023 to February 2024
  ○ Total Shown: 8x increase
  ○ Total Accepts: 6.5x increase
  ○ Acceptance Rate: Around 30%

Slide 42

Slide 42 text

Providing specific features for specific use cases
To promote deeper usage, it is important to provide specialized features for specific use cases in addition to the general features of the "ChatGPT" service exclusively for Mercari employees.
● mercari Dev Assist: An LLM provides answers based on Mercari's internal technical documents
● mercari Analytics Assist: A tool that assists with data analysis based on knowledge from Mercari's databases

Slide 43

Slide 43 text

Providing specific features for specific use cases
● Sources: Microservice Wiki, In-House Dev Services, GitHub Issue, Slack Message → Generate Summary & Create embeddings by LLM (GPT-4 & text-embedding-3) → Vector DB by FAISS
● Question: "How do I create MS from scratch?" → GPT-4 Generate Query: "implement a new microservice" → Vector DB by FAISS → Respond with Related documents → Answer based on documents
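
The flow on this slide is retrieval-augmented generation: rewrite the question into a search query, retrieve related documents by embedding similarity, then answer from those documents. The sketch below shows only that control flow. The bag-of-words `embed` function and the in-memory list are deliberately simple stand-ins for the embedding model and the FAISS vector DB named on the slide, the documents are invented, and the rewritten query is hard-coded where the real system would call an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(documents):
    """Stand-in for a vector DB: keep each document next to its embedding."""
    return [(doc, embed(doc)) for doc in documents]


def search(index, query, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_answer_prompt(question, related_docs):
    """Prompt asking the LLM to answer based only on the retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in related_docs)
    return (f"Answer using only these documents:\n{context}\n\n"
            f"Question: {question}\nAnswer:")


# Hypothetical documents, invented for illustration only.
docs = [
    "Microservice Wiki: steps to implement a new microservice from the starter template",
    "Slack message: lunch menu for Friday",
    "GitHub issue: flaky test in the payment service",
]
index = build_index(docs)
rewritten_query = "implement a new microservice"  # the slide's example query
top = search(index, rewritten_query, k=1)
prompt = build_answer_prompt("How do I create MS from scratch?", top)
print(prompt)
```

Rewriting the question first matters because the user's wording ("MS") may not match the vocabulary of the indexed documents, while the rewritten query does.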

Slide 44

Slide 44 text

Formation of Cross-Organizational Virtual Teams
The dedicated generative AI team forms virtual teams with various internal teams to execute advancement projects lasting 3 to 6 months.
● July~September: Generative AI x Marketing & Creative Team
● October~December: Generative AI x Analytics Team, Generative AI x CS Ops Team
● January~March: Generative AI x HR & Corporate Team

Slide 45

Slide 45 text

Recent Success Cases of Virtual Teams
The Generative AI Team formed a virtual team with the Marketing & Creative Team, set OKRs, and progressed through the following milestones.
● July: Recruitment Visuals by Generative AI
● August: Campaign Visuals by Generative AI
● September: Video Ads by Generative AI

Slide 46

Slide 46 text

Utilizing Generative AI for Video Creation in OOH
Broadcast at Shibuya Scramble Crossing
URL: https://youtu.be/IWgwWYrhaMs

Slide 47

Slide 47 text

Current Evaluation
● Production Man-hours: Rating ◯ (Significantly reduced)
  ○ Before implementation: 12 business days. Planning (3 days) → Illustration drawing (5 days) → Design creation (4 days)
  ○ After implementation: 4 business days. Planning (1 day) → AI-generated illustration (1 day) → Design creation (2 days)
  ○ → Succeeded in reducing production man-hours while creating novel illustrations that are not found in stock materials
● Reproducibility: Rating ◯
  ○ With recent updates, the breadth of quality and reproducibility has expanded, not only for hobby and merchandise goods but also for fashion items
● Contribution to Results: Rating ◯ (Particularly advantageous in terms of CVR)

Slide 48

Slide 48 text

CTVR Mapping
Visuals and videos produced using generative AI tend to have higher CVRs. Campaigns created with generative AI, such as "One in seven people use Mercari" and "Halloween," have achieved high acquisition efficiency overall.
[Chart labels: Creative A, Creative B]

Slide 49

Slide 49 text

Ample Room for Further Utilization
For Leadership
● => Incorporating generative AI into company policies and goal setting, such as OKRs, through a top-down approach can further promote adoption
For Members (General Use)
● Currently, the degree of utilization still varies significantly among individuals
● => Continuously implement overall improvement measures (e.g., hackathons) and provide services for specific use cases (it is also important to measure and visualize results with numbers)
Utilization in Each Project
● There are cases where side support works well and cases where it is challenging
● => Assign members who are committed to AI in some form (fully assign dedicated generative AI members, or hire dedicated members on the project side)

Slide 50

Slide 50 text

"Current State of Thinking" on Generative AI Utilization within the Organization
● Interacting with services
● Creating by oneself
● Increasing the number of creators

Slide 51

Slide 51 text

We are hiring!
● Senior Technical Product Manager
● Engineering Manager
● Mobile Engineer, Full Stack