domain is rich in structured/unstructured text and image data → Highly compatible with LLMs
• Examples of text and image data in Mercari Hallo:
  ◦ Job posting descriptions created by partners
  ◦ Sales materials and business meeting data for partners
• LLM applications in Mercari Hallo:
  ◦ Easy job posting creation
  ◦ Job posting risk prediction
  ◦ Sales productivity improvement
• What We Accomplished
  ◦ Validated AI-Native approach to on-demand work
• Technical expertise and infrastructure will power future AI initiatives across Mercari Group
Service Closing: December 18, 2025
Partner companies without knowledge of job posting creation can easily create high-quality job postings.
• Automatically generates job details from the selected required fields (job category, role, benefits…)
Partners simply select items and enter appeal points and job duties as free text.
Go Backend
• Prompt Creation
  ◦ Leveraged few-shot prompting to create prompts that consistently generate high-quality job postings
Implementation
Diagram components: Partner Company, GraphQL Server (Go), Cloud SQL
• Retrieve job information such as business name, work location, etc.
• Generate job posting draft using LLM based on obtained information
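The few-shot prompt creation described above can be sketched as follows. This is a minimal illustration in Python rather than the Go backend named on the slide. The `JobInfo` fields, the `### Input` / `### Job posting` delimiters, the instruction text, and the sample example are all assumptions made for illustration, not the prompts actually used in Mercari Hallo.

```python
from dataclasses import dataclass


@dataclass
class JobInfo:
    """Illustrative fields; the slides only name business name and work location."""
    business_name: str
    work_location: str
    job_category: str
    appeal_points: str
    job_duties: str


def format_input(job: JobInfo) -> str:
    """Render the structured job information as plain text for the prompt."""
    return (
        f"Business name: {job.business_name}\n"
        f"Work location: {job.work_location}\n"
        f"Job category: {job.job_category}\n"
        f"Appeal points: {job.appeal_points}\n"
        f"Job duties: {job.job_duties}"
    )


def build_few_shot_prompt(job: JobInfo, examples: list[tuple[JobInfo, str]]) -> str:
    """Build a few-shot prompt: instruction, worked examples, then the new input."""
    parts = ["Write an attractive, well-structured job posting from the given information."]
    for example_job, example_posting in examples:
        parts.append(
            f"### Input\n{format_input(example_job)}\n### Job posting\n{example_posting}"
        )
    # The prompt ends with an open "Job posting" section for the LLM to complete.
    parts.append(f"### Input\n{format_input(job)}\n### Job posting\n")
    return "\n\n".join(parts)


# Hypothetical worked example pairing an input with a high-quality posting.
EXAMPLE = (
    JobInfo("Cafe Sample", "Shibuya, Tokyo", "Hall staff",
            "Friendly team, staff meal included", "Taking orders and serving"),
    "[Job content]\nTake orders and serve customers in a friendly cafe.\n"
    "[Benefits]\nStaff meal included, casual dress code.",
)
```

In this sketch the examples carry the quality bar: each one pairs structured input with a posting that already meets the desired standard, so the model can imitate its structure and tone for the new input.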
of attractive "better job postings" → Defined job posting quality in 3 levels through interviews with sales staff and analysis of past job postings
• Criteria for defining job posting quality (3 levels):
  ◦ Whether information about job content, workplace atmosphere, and expected candidate profile is included
  ◦ Whether information about benefits (dress code, meals, etc.) is included
  ◦ Considerations for readability, such as emojis, section titles, and other formatting elements
Easy Job Posting Creation
posting quality in 3 levels using LLM as a Judge
  ◦ Prompt for “LLM as a Judge”: Modified the original paper's prompt for job posting evaluation
• Quality Comparison: Easy Job Posting vs Manual Creation
• Limitation:
  ◦ Validates quality consistency based on internal standards, not matching rates.
Experiments on Job Quality Evaluation by “LLM as a Judge”
Score Average (100 cases): Easy Job Posting 2.34, Manual Creation 1.32
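The evaluation above can be illustrated with a small sketch: a judge model scores each posting on the 3-level scale, and the scores are averaged per group. The prompt wording, the `Score: N` output format, and the 1 to 3 scale are assumptions for illustration. The slides only state that the original paper's prompt was adapted, and `judge` is a hypothetical callable standing in for the LLM call.

```python
import re
from statistics import mean
from typing import Callable, Iterable

# Hypothetical judge prompt; the actual adapted prompt is not shown in the slides.
JUDGE_PROMPT_TEMPLATE = (
    "Rate the quality of the following job posting on a scale of 1 to 3.\n"
    "Consider job content, benefits, and readability.\n"
    "Explain briefly, then finish with a line of the form 'Score: N'.\n\n"
    "Job posting:\n{posting}"
)


def parse_score(judge_output: str) -> int:
    """Extract the 1-3 score from the judge's free-text answer."""
    match = re.search(r"Score:\s*([1-3])\b", judge_output)
    if match is None:
        raise ValueError(f"no valid score found in judge output: {judge_output!r}")
    return int(match.group(1))


def average_quality(postings: Iterable[str], judge: Callable[[str], str]) -> float:
    """Score every posting with the judge and return the mean score."""
    scores = [
        parse_score(judge(JUDGE_PROMPT_TEMPLATE.format(posting=posting)))
        for posting in postings
    ]
    return mean(scores)
```

Running `average_quality` once over LLM-generated postings and once over manually written ones would yield the kind of per-group averages reported on the slide.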
(あんしん・あんぜん, "peace of mind and safety") commitment: Only legally compliant job postings are published, to ensure crew’s peace of mind.
• Strict review criteria:
  ◦ Is job content appropriate? Any inappropriate expressions?
  ◦ All postings undergo rigorous review
• Dual-check system: Human + LLM review for faster, more accurate screening.
  ◦ High performance in various NLP tasks like text classification.
• High explainability:
  ◦ Can output reasons for high risk in natural language.
• Rapid adaptation to new risks
  ◦ No labeled training data required
  ◦ Simply update prompts: fast & flexible
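The properties above (classification without labeled training data, plus a natural-language reason) can be sketched as a prompt that asks for a structured answer. The prompt text, the JSON keys `high_risk` and `reason`, and the `llm` callable are hypothetical. The slides do not show the actual prompt or output format.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical review prompt; adapting to a new risk means editing this text only.
RISK_PROMPT = (
    "You review job postings for legal and safety risks.\n"
    'Answer with JSON containing the keys "high_risk" (true or false) '
    'and "reason" (one sentence).\n\n'
    "Job posting:\n{posting}"
)


@dataclass
class RiskResult:
    high_risk: bool
    reason: str


def parse_risk_response(raw: str) -> RiskResult:
    """Parse the model's JSON answer into a typed result."""
    data = json.loads(raw)
    return RiskResult(high_risk=bool(data["high_risk"]), reason=str(data["reason"]))


def predict_risk(posting: str, llm: Callable[[str], str]) -> RiskResult:
    """Ask the LLM to classify one posting and explain its decision."""
    return parse_risk_response(llm(RISK_PROMPT.format(posting=posting)))
```

Because the reason comes back alongside the label, a human reviewer in the dual-check flow can see why a posting was flagged instead of receiving only a score.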
risk → Recall is the top priority KPI
  ◦ False Positives (false alarms): Poor user experience → Precision must be maintained
• Key point: Prompt quality management is critical for effective risk prediction
Challenges in Risk Prediction Using LLMs
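The recall and precision trade-off above can be made concrete with a short sketch that computes both metrics from labeled review results and checks them against targets. The threshold values and the `meets_targets` helper are illustrative assumptions. The slides do not state the actual KPI targets.

```python
def precision_recall(y_true: list[bool], y_pred: list[bool]) -> tuple[float, float]:
    """Compute precision and recall, treating True as 'high risk'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)      # risky, flagged
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)  # safe, flagged (false alarm)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)  # risky, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def meets_targets(precision: float, recall: float,
                  min_recall: float = 0.95, min_precision: float = 0.5) -> bool:
    """Hypothetical gate: recall is checked first, precision must not collapse."""
    return recall >= min_recall and precision >= min_precision
```

For example, with labels `[True, True, True, False, False]` and predictions `[True, True, False, True, False]`, one risky posting is missed and one safe posting is flagged, so both precision and recall are 2/3, and a prompt producing these results would fail a recall-first gate.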
Composer, LiteLLM, BigQuery … etc
• PdMs & ML Engineers can easily validate prompt quality by inputting prompts through the Airflow UI without writing code
Served 12M+ users with LLM-powered features
  ◦ Easy Job Posting Creation: Quality postings for everyone
  ◦ Job Posting Risk Prediction: Safety & security
• Key Learnings from Building an AI-Native Product → PromptOps infrastructure is critical for AI-Native products
• These experiences and infrastructure live on → Contributing to AI adoption across Mercari Group
Conclusion