Behind the Scenes of Generative AI-Driven Products and Productivity Enhancements

1 Confidential Behind the Scenes of Generative AI-Driven Products and Productivity Enhancements
Yuki Ishikawa
April 16, 2024

2 Yuki Ishikawa
Mercari, Inc. VP of Generative AI / LLM
After graduating from the University of Tokyo, he joined Nintendo Co., Ltd. in 2012. In 2014, he joined Moi Corporation (TwitCasting), where he worked on various development projects and new business launches. In June 2017, he joined Souzoh, Inc. (the former company of that name) in the Mercari Group, and later transferred to Mercari, Inc. From July 2020, he served as Executive Officer and VP of Product at Merpay, Inc. In January 2021, he became Representative Director and CEO of Souzoh, Inc. From July 2022, he concurrently served as Executive Officer and VP at Mercari, Inc. He assumed his current position in May 2023.

3 Introduction to Mercari

4 Group Mission
“Circulate all forms of value to unleash the potential in all people”

5 About
• Established: February 1, 2013
• Offices: Tokyo, Fukuoka, Palo Alto, Bangalore
• Headcount: 2,101 (including subsidiaries)

6 What Is Mercari?
• Service launch: July 2013
• Operating systems: Android, iOS *Can also be accessed through web browsers
• Usage fee: Free *Sales fee for sold items: 10% of the sales price
• Regions/languages supported: Base specs for Japan/Japanese
• Total number of listings to date: More than 3 billion *As of November 2021
The Mercari app is a C2C marketplace where individuals can easily sell used items. We want to provide both buyers and sellers with a service where they can enjoy safe and secure transactions. Mercari offers a unique customer experience, with a transaction environment that uses an escrow system, where Mercari temporarily holds payments, and simple and affordable shipping options.
Many sellers enjoy having the items they no longer need purchased and used by buyers who need them, and buyers enjoy the feeling of hunting for treasure as they search through unique and diverse items for lucky finds. In addition to buying and selling, users actively communicate through the buyer/seller chat and the “Like” feature.

7 Marketplace: GMV/MAU
[Chart: GMV (billion JPY) and MAU (million users), annotated with YoY +11% and YoY +14%]
1. GMV: Starting from FY2022.6, the graph reflects a retroactive adjustment to combine C2C and B2C
2. MAU: Quarterly average number of users who browsed our service (app or web) at least once during a given month

8 GenAI Initiatives

9 Mercari's initiatives related to generative AI and LLM
Timeline: May 2023 to March 2024
• Establishment of Dedicated Generative AI/LLM Team
• Utilizing LLMs for SEO
• Conducting Merpay LLM Hackathon
• Mercari ChatGPT Plugin Release
• Formulation of Guidelines for LLM Usage
• Utilizing Generative AI for Videos in OOH Advertising
• Conducting Company-Wide Hackathon "Mercari AI Builders Fest"
• Utilizing Generative AI Visuals in Recruitment
• Utilizing Generative AI Visuals in Advertising
• Release of "Mercari Listing Battle" on GPT Store
• Release of "Mercari Product Search" on GPT Store
• Release of Mercari AI Assist (Listing Support Feature)
• Release of Mercari AI Assist (Purchase Support Feature)
• Category Re-Mapping Using LLMs

10 Mission of the Dedicated Generative AI/LLM Team
• Creating new customer experiences and maximizing business impact by utilizing generative AI/LLM technology
• Dramatically improving company-wide productivity
Execution

11 Execution: Specific Initiatives

12 What the LLM Team Is Doing
Building and Enabling

Enabling

14 Application and Adaptation to Existing Products
• Individual Function Teams (team names are for illustrative purposes): Seller UX, Buyer UX, CS, Fintech, B2C, XB
• LLM team: planning, model selection, prompt engineering, product implementation, etc.

15 Application and Adaptation to Existing Products
(1) Leadership: Cases where the dedicated LLM team takes ownership and carries out everything from planning to implementation
(2) Co-creation: Cases where the Function team takes the lead, and the LLM team works alongside them, reviewing LLM-related aspects as needed

16 SEO Improvement (Already Released)
Using an LLM to generate SEO-related title information for Mercari's search result pages
• Keyword overlap between "parasol" and "umbrella"
• Notation of brand names
Titles are generated by the LLM for Category x Brand Name combinations.


18 How to Use the "Improvement Suggestion for Listed Items" Feature
STEP 1: Suggestions from the AI assistant are delivered for items that can be improved
STEP 2: Open the chat and choose from the AI assistant's suggestions

19 How to Use the "Improvement Suggestion for Listed Items" Feature
STEP 3: Follow the AI assistant's instructions to proceed with the selection
STEP 4: After updating the content and completing the process, the information for the listed item will be updated

20 "Mercari AI Assist" Future Plans
• Purchase Support Features
• Listing Support Features
• Troubleshooting Features
Various features are scheduled for release

21 Publishing Official Mercari GPTs on the OpenAI GPT Store
Actively creating GPTs to explore new experiences
• Mercari item search
• Mercari listed item battle

Enabling

23 Company-wide Initiatives for LLM Readiness
① Guideline Formulation
• Enabling not only the ML team but also general SWE teams to implement products
• Formulating guidelines in collaboration with Mercari's R&D organization "R4D"
• Publicly releasing developer guidelines (link)
② Study Sessions and Hackathons
• Conducting internal study sessions on an irregular basis
• Holding hackathons for all job types, not limited to engineers
• Conducted 3 times in half a year: April at Mercari, June at Merpay, and September at Mercari. Continuing to be held at various locations.

24 Mercari Employee-Only Tool
To promote internal usage, we created a Mercari employee-only "ChatGPT" that allows input of work-related information. It also supports GPT-4, Google Gemini, and Anthropic Claude 3.

25 Mercari Employee-Only Tool
In addition to the Code Interpreter and image generation features, it also includes a translation mode, an SQL generation function compatible with Mercari's data, and a document search function for engineers.

26 Confidential - Do Not Share Behind the Scenes of Product Development with LLM

27 Product Development Using LLM
Fast? Cheap? Tasty (= good)?
Japanese beef bowl (gyudon)

28 Fast
LLMs can perform a wide range of tasks with high accuracy
• Conventional flow: What you want to do → Feasibility check → Data creation → Model training → Release → Effect verification. It takes time and cost to get verification results.
• LLM + few-shot learning as an alternative: What you want to do → Release → Effect verification. Enables PoCs for many tasks with reasonable accuracy.
= Dramatic reduction in PoC costs
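As an illustration of the "LLM + few-shot learning" alternative, the following is a minimal sketch of how a handful of labeled examples can stand in for a trained model during a PoC. The task (extracting a brand from a listing title), the example data, and the function name `build_few_shot_prompt` are hypothetical and not taken from Mercari's implementation; sending the prompt to a model is left out.

```python
from typing import List, Tuple


def build_few_shot_prompt(instruction: str,
                          examples: List[Tuple[str, str]],
                          query: str) -> str:
    """Assemble a few-shot prompt: instruction, labeled examples, then the new input."""
    parts = [instruction.strip(), ""]
    for source, target in examples:
        parts.append(f"Input: {source}")
        parts.append(f"Output: {target}")
        parts.append("")
    # End with an open "Output:" so the model completes the answer.
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)


# Hypothetical labeled examples for a brand-extraction task.
EXAMPLES = [
    ("Nintendo Switch console, used, with box", "Nintendo"),
    ("Handmade wool scarf, never worn", "none"),
]

prompt = build_few_shot_prompt(
    "Extract the brand name from the listing title. Answer 'none' if there is no brand.",
    EXAMPLES,
    "Sony WH-1000XM4 headphones, good condition",
)
print(prompt)
```

Changing the task in such a setup means editing the instruction and examples rather than collecting data and retraining, which is where the PoC cost reduction comes from.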

29 Fast?
An example initially tested on the item list screen of "Mercari AI Assist" (2023.09)
• GPT-3.5: 1.5~2.5 seconds/item
• GPT-4: 3~5 seconds/item
LLM responses can be slow depending on the usage and model, so consumer-facing services that require quick responses need some ingenuity.
Reference: Artificial Analysis

30 Cheap
• Being able to use a model equivalent to GPT-3.5 at this price is cheap
 ◦ Until January 25, 2024 (GPT-3.5 Turbo)
  ▪ Input: $1.0 / 1M tokens
  ▪ Output: $6.0 / 1M tokens
 ◦ After January 25, 2024 (GPT-3.5 Turbo)
  ▪ Input: $0.5 / 1M tokens
  ▪ Output: $1.5 / 1M tokens
• Claude 3 Haiku, which recently came out, is even cheaper.
Reference: Artificial Analysis (Claude 3 Haiku)
See also: New embedding models and API updates

31 Cheap?
• While smaller models are cheaper, they are not always adequate. But as model size increases, cost also goes up rapidly.
• Especially for large-scale consumer-facing services like Mercari, using them without careful consideration can be very expensive (although prices have dropped considerably compared to last year).
Ex. Assuming Mercari has 2 million listings per day and some kind of LLM processing is applied to every item (with only 1 LLM call per item), the annual cost would be 550 million yen with GPT-4 Turbo (versus 27 million yen with GPT-3.5 Turbo).
Reference: Artificial Analysis
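The arithmetic behind this kind of estimate can be sketched as follows. The GPT-3.5 Turbo prices are the post-January 25, 2024 figures quoted earlier in the deck; everything else is an assumption made for illustration and is not stated in the slides: the GPT-4 Turbo list price ($10 / $30 per 1M tokens), the token counts per call (300 input, 65 output), and the exchange rate (150 JPY/USD). These values were picked so that the result lands in the same range as the slide's figures, not to reconstruct the author's actual calculation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD price per 1M tokens."""
    input_per_m: float
    output_per_m: float


# GPT-3.5 Turbo price after January 25, 2024, as quoted in the deck.
GPT_35_TURBO = Pricing(input_per_m=0.5, output_per_m=1.5)
# Assumption: GPT-4 Turbo list price in early 2024 (not stated in the deck).
GPT_4_TURBO = Pricing(input_per_m=10.0, output_per_m=30.0)


def annual_cost_jpy(pricing: Pricing,
                    calls_per_day: int,
                    input_tokens: int,
                    output_tokens: int,
                    jpy_per_usd: float,
                    days: int = 365) -> float:
    """Yearly cost in JPY for a fixed number of LLM calls per day."""
    usd_per_call = (input_tokens * pricing.input_per_m
                    + output_tokens * pricing.output_per_m) / 1_000_000
    return usd_per_call * calls_per_day * days * jpy_per_usd


# Assumptions: 300 input / 65 output tokens per call, 150 JPY per USD.
LISTINGS_PER_DAY = 2_000_000  # from the slide
for name, pricing in [("GPT-3.5 Turbo", GPT_35_TURBO), ("GPT-4 Turbo", GPT_4_TURBO)]:
    cost = annual_cost_jpy(pricing, LISTINGS_PER_DAY, 300, 65, 150.0)
    print(f"{name}: about {cost / 1_000_000:.0f} million JPY per year")
```

With fixed token counts the two models differ only by their per-token prices, so the roughly 20x gap between the two annual figures follows directly from the price ratio.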

32 Tasty (= good)
• It translates unique words appropriately without any additional training
 ◦ Pokémon character "リザードン": English: Charizard, French: Dracaufeu, Korean: 리자몽
• If you design appropriate flows, it can make decisions and take actions on its own
 ◦ Mercari AI Assist Purchase Assistance Feature: "What would be a good birthday present for my 5-year-old daughter?" → GPT → Additional questions, hit the API and display search results, etc.
• It can understand and generate vision and audio as well
 ◦ Customer Support: "How can I assist you today?" "Let me take a look at the photo you provided"
 ◦ It can understand the other person's speech and respond verbally
 ◦ It can also read image information
Although I can't write everything here, as you all know, it's really tasty (amazing) in so many ways!

33 Tasty (= good)?
There are many great things about it! On the other hand, controlling the output content is particularly difficult
• It sometimes states incorrect things as if they were correct (hallucination)
• Even with the same input, prompts, etc., and after a lot of correction, it may occasionally (e.g., 0.01% of the time) produce different outputs
• It still struggles with tasks like providing confidence scores or calculating meaningful scores
• QA is extremely challenging 🤮

34 Ingenuity in Product Development Using LLM
It's the most exciting and enjoyable technology, but when it is used in products provided to customers, the difficulty of controlling it generally has to be faced
Control Challenges🔥
• Output content (hallucination)
• Output speed (latency)
• Cost

35 Ingenuity in Product Development Using LLM at Mercari
1. Using LLM as the underlying logic in a way that is not directly visible to customers
 ex. "Using in web SEO"
2. Interaction between customers and LLM (without free input)
 Initially adopting a selection-based approach with no free input from customers
 ex. "Mercari AI Assist (Listing Improvement Suggestion Feature)"
3. Interaction between customers and LLM (with free input)
 Allowing free input while controlling the input scope and output to a certain extent
 ex. "Mercari AI Assist (Purchase Support Feature)"
4. Moving towards initiatives with even greater freedom
For product initiatives with a large scope of impact, we are taking the approach of gradually expanding the allowable range while learning from the initiative in the production environment

36 Ingenuity in Mercari AI Assist (Listing Improvement Suggestion Feature)
We were using GPT-3.5 for the underlying logic, but we wondered if we could find a model that has
• Higher accuracy & more stable output than GPT-3.5
• Lower latency
• Lower cost
=> With this motivation, we fine-tuned a small-scale OSS model
Used for extracting product information

37 About the adopted method (Fine-tuning with QLoRA)
• Fine-tuning is a technique for tuning a pre-trained model to optimize it for a specific task
• Instead of updating all parameters, we adopted QLoRA, a type of PEFT (parameter-efficient fine-tuning) that efficiently updates only a subset of the parameters (chosen because it can be executed quickly without compromising quality)
Reference: [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs
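To give a sense of how few parameters such a method trains, here is a small arithmetic sketch of the LoRA part of QLoRA. LoRA keeps a pre-trained weight matrix frozen and trains two small low-rank factors alongside it; QLoRA additionally stores the frozen base weights in 4-bit precision. The layer size (4096 x 4096) and the rank (8) below are assumptions chosen for illustration; the deck does not state the settings that were used.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Assumed sizes for a single projection layer in a 7B-class model.
D_OUT, D_IN, RANK = 4096, 4096, 8

full = full_params(D_OUT, D_IN)
lora = lora_trainable_params(D_OUT, D_IN, RANK)
print(f"full fine-tuning: {full:,} trainable parameters")
print(f"LoRA (rank {RANK}): {lora:,} trainable parameters ({lora / full:.2%} of the layer)")
```

Under these assumptions the adapter trains well under 1% of the layer's parameters, which is what makes fine-tuning on a single GPU practical.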

38 Consideration of models to adopt (Japanese LLMs)
We considered the top 5 models of 7B size in LLM JP Eval (Multi-Choice QA, NLI, QA, Reading Comprehension) as candidates (as of December 2023)
• tokyotech-llm/Swallow-7b-instruct-hf (Llama-based)
• rinna/nekomata-7b-instruction (Qwen-based)
• stabilityai/StableBeluga-7B (Llama-based)
• rinna/youri-7b-instruction (Llama-based)
• mistralai/Mistral-7B-Instruct-v0.2 (Mistral-based)
Reference: Nejumi LLM Leaderboard

39 Fine-tuning with QLoRA
1. Prepare the fine-tuning dataset
 ◦ Output: JSON/CSV files
2. Fine-tuning (one A100 GPU (48GB))
 ◦ Output: LoRA adapters + base model (merged) - ~29GB
3. Post-training Quantization
 ◦ Using llama.cpp
 ◦ Output: GGUF model file - ~4GB
4. Evaluate
 ◦ BLEU Score (4-gram)
  ▪ 1.3x better than GPT-3.5 Turbo
 ◦ Cost (10 pods NVIDIA Tesla T4)
  ▪ 14x cheaper than GPT-3.5 Turbo (OpenAI)
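For readers unfamiliar with the evaluation metric, the following is a minimal sketch of a sentence-level BLEU score using 1- to 4-gram precision and a brevity penalty. It is not the evaluation code used in the deck: whitespace tokenization is an assumption (Japanese text would need a proper tokenizer), no smoothing is applied, and the example strings are made up.

```python
import math
from collections import Counter
from typing import List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped overlap: a candidate n-gram counts at most as often as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # without smoothing, any empty precision gives a score of zero
        log_precision += math.log(overlap / total) / max_n
    # Brevity penalty: candidates shorter than the reference are penalized.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision)


reference = "brand Nintendo category game console color red"
print(bleu(reference, reference))                       # identical output
print(bleu("brand Nintendo category game", reference))  # shorter, partially matching output
```

A relative claim such as "1.3x better" would then compare the average of such scores over a test set for the two models.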

40 Behind the Scenes of Productivity Improvement with Generative AI

41 Company-wide use of generative AI services is increasing
• Changes in DAUs: usage of the "ChatGPT" service exclusively for Mercari employees grew by 230% in one year
• GitHub Copilot usage also skyrocketed in about half a year, from July 2023 to February 2024
 ◦ Total Shown: 8x increase
 ◦ Total Accepts: 6.5x increase
 ◦ Acceptance Rate: Around 30%

42 Providing specific features for specific use cases
To promote deeper usage, it is important to provide specialized features for specific use cases in addition to the general features of the "ChatGPT" service exclusively for Mercari employees.
• mercari Dev Assist: an LLM provides answers based on Mercari's internal technical documents
• mercari Analytics Assist: a tool that assists with data analysis based on knowledge from Mercari's databases

43 Providing specific features for specific use cases
• Indexing: Microservice Wiki, In-House Dev Services, GitHub Issue, Slack Message → Generate Summary & Create embeddings by LLM (GPT-4 & text-embedding-3) → Vector DB by FAISS
• Answering: "How do I create MS from scratch?" → GPT-4 → Generate Query: "implement a new microservice" → Vector DB by FAISS → Respond with Related documents → Answer based on documents
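The retrieval step in this flow can be sketched in a few lines. This is a toy illustration, not Mercari's implementation: the three-dimensional vectors are hand-written stand-ins for real text-embedding-3 embeddings, a brute-force cosine-similarity search stands in for the FAISS index, and the document names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: List[float],
          index: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k documents most similar to the query, best first."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical documents with toy embeddings (real ones would come from an embedding model).
INDEX = {
    "wiki/microservice-starter": [0.9, 0.1, 0.0],
    "slack/deploy-question": [0.6, 0.6, 0.1],
    "issue/flaky-test": [0.0, 0.2, 0.9],
}

# Toy embedding of the generated query "implement a new microservice".
query_vec = [1.0, 0.2, 0.0]

results = top_k(query_vec, INDEX, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

The retrieved documents would then be passed to the LLM together with the original question so that the answer is grounded in them.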

44 Formation of Cross-Organizational Virtual Teams
The dedicated generative AI team forms virtual teams with various internal teams to execute advancement projects lasting 3 to 6 months.
• July~September: Generative AI x Marketing & Creative Team
• October~December: Generative AI x Analytics Team, Generative AI x CS Ops Team
• January~March: Generative AI x HR & Corporate Team

45 Recent Success Cases of Virtual Teams
The Generative AI Team and the Marketing & Creative Team formed a virtual team, set OKRs, and progressed through the following milestones.
• July: Recruitment Visuals by Generative AI
• August: Campaign Visuals by Generative AI
• September: Video Ads by Generative AI

46 Utilizing Generative AI for Video Creation in OOH
Broadcast at Shibuya Scramble Crossing
URL: https://youtu.be/IWgwWYrhaMs

47 Current Evaluation
• Production Man-hours: Rating ◯ (Significantly reduced)
 ◦ Before implementation: 12 business days. Planning (3 days) → Illustration drawing (5 days) → Design creation (4 days)
 ◦ After implementation: 4 business days. Planning (1 day) → AI-generated illustration (1 day) → Design creation (2 days)
 ◦ → Succeeded in reducing production man-hours while creating novel illustrations that are not found in stock materials
• Reproducibility: Rating ◯
 ◦ With recent updates, the breadth of quality and reproducibility has expanded, not only for hobby and merchandise goods but also for fashion items
• Contribution to Results: Rating ◯ (Particularly advantageous in terms of CVR)

48 CTVR Mapping
Visuals and videos produced using generative AI tend to have higher CVRs. Campaigns created with generative AI, such as "One in seven people use Mercari" and "Halloween," have achieved high acquisition efficiency overall.
[Figure: Creative A, Creative B]

49 Ample Room for Further Utilization
For Leadership
• => Incorporating generative AI into company policies and goal setting (such as OKRs) through a top-down approach can further promote adoption
For Members (General Use)
• Currently, there is still a significant variation in the degree of utilization among individuals
• => Continuously implement overall improvement measures (e.g., hackathons) and provide services for specific use cases (it is also important to measure and visualize results with numbers)
Utilization in Each Project
• There are cases where side support works well and cases where it is challenging
• => Assign members who are committed to AI in some form (fully assign dedicated generative AI members, hire dedicated members on the project side)

50 "Current State of Thinking" on Generative AI Utilization within the Organization
• Interacting with services
• Creating by oneself
• Increasing the number of creators

51 We are hiring!
• Senior Technical Product Manager
• Engineering Manager
• Mobile Engineer, Full Stack

52 Thank you for listening.

53 Q & A
