From Setup to Strategy: The roadmap for integrating LLMs into your data work
The Semantic Layer Solution
Part 3. Solution in Action
Part 4. What's Next?
Part 5. Closing Thoughts
[Diagram: MCP request flow between MCP Client, MongoDB MCP server, and MongoDB]
• MCP Client: Request task → Analyze request → Decide on tools/resources to use → Call tool or request resources
• MongoDB MCP server: Execute tool/resource → MongoDB sends result → Server sends result back
• MCP Client: Format and interpret result → Present formatted result
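To make this flow concrete, here is a minimal sketch of a basic MongoDB MCP server exposing one raw query tool; it assumes the official MCP Python SDK's FastMCP helper and pymongo, and the server name, database, and tool signature are illustrative rather than the exact server used here.

# basic_mongodb_server.py -- illustrative sketch only (pip install mcp pymongo)
from mcp.server.fastmcp import FastMCP
from pymongo import MongoClient

mcp = FastMCP("mongodb-basic")                                  # server the MCP client calls
db = MongoClient("mongodb://localhost:27017")["analytics"]      # hypothetical database name

@mcp.tool()
def run_query(collection: str, query: dict, limit: int = 100) -> list:
    """Execute a raw find() against MongoDB and return matching documents."""
    docs = db[collection].find(query).limit(limit)
    return [{k: str(v) for k, v in d.items()} for d in docs]

if __name__ == "__main__":
    mcp.run()   # client requests task -> calls tool -> server executes -> result returned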
[Diagram: "What is the churn rate for this month?" – intended data flow vs. actual outcome]
• Intended: Event log + Subscription log → Extract active user data + Extract subscription data → Active users last month + Inactive OR Canceled users
• Actual outcome: Event log + Subscription log → Extract active user data only → Active users last month + Inactive users
The Root Problem: Missing Context
Why basic MCP servers produce wrong results
Because the LLM doesn't …
• … map natural language to queries
• … understand the business context
• … know the data structure
• … recognize business terms
• …
→ NO CONTEXT
Provide the LLM with context data
• Provide request interpretation
• Provide query mapping
• Provide business context
• Provide data schema and metadata
• Provide definitions for business terms
• …
→ HAS CONTEXT
Introducing the Semantic Layer
How to add business intelligence to AI analysis
What is semantic layering?
Natural language question → Semantic Layer (CONTEXT) → Precise business logic
What role does it play?
• Provides business and data context to the model
• Adds meaning & context to the raw tools and resources in MCP
Case Study: AI Assistant Service
A freemium LINE AI chatbot
AI Assistant
• Offers various AI-based services, such as web search, translation, OCR, and more
• Uses a freemium business model
Defining Business Context
Terms, metrics, and data relationships
Example context type 1 – Business terms
• Friend: User that has added AI Assistant as a friend
• Free User: User that has agreed to the terms of use of AI Assistant
• DAU: Number of unique users that have at least one event log that is not unsubscribe and block
Example context type 2 – Data schema
-- Users Table
users (
  user_id: VARCHAR,
  reg_date: DATE,
  status: CHAR(1),    -- 'A'=Active, 'I'=Inactive
  msg_count: INTEGER
)
-- Events Table
events (
  event_id: VARCHAR,
  user_id: VARCHAR,
  date: DATE,
  type: CHAR(1)       -- 'T'=Text, 'M'=Media, 'V'=Voice
)
The difference context makes
Good system prompt (w/ Semantic Layering):
"… with access to our messaging database.
DATA SCHEMA:
- users table: user_id, reg_date, status ('A'=Active, 'I'=Inactive), msg_count
- events table: event_id, user_id, date, type ('T'=Text, 'M'=Media, 'V'=Voice)
BUSINESS TERMS:
• Friend: User that has added AI Assistant as a friend
• Free User: User that has agreed to the terms of use of AI Assistant
• DAU: Number of unique users that have at least one event log that is not unsubscribe and block
Always translate technical codes into business terms and provide strategic insights."
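As a minimal sketch, the context-rich prompt above could be assembled from the schema and business-term definitions rather than hand-written; the function name and the opening sentence are assumptions, while the schema and term wording come straight from the slide.

# build_system_prompt.py -- illustrative prompt assembly
SCHEMA = (
    "- users table: user_id, reg_date, status ('A'=Active, 'I'=Inactive), msg_count\n"
    "- events table: event_id, user_id, date, type ('T'=Text, 'M'=Media, 'V'=Voice)"
)
BUSINESS_TERMS = {
    "Friend": "User that has added AI Assistant as a friend",
    "Free User": "User that has agreed to the terms of use of AI Assistant",
    "DAU": "Number of unique users that have at least one event log that is not unsubscribe and block",
}

def build_system_prompt() -> str:
    terms = "\n".join(f"• {name}: {definition}" for name, definition in BUSINESS_TERMS.items())
    return (
        "You are a data analyst with access to our messaging database.\n\n"   # opening line assumed
        f"DATA SCHEMA:\n{SCHEMA}\n\n"
        f"BUSINESS TERMS:\n{terms}\n\n"
        "Always translate technical codes into business terms and provide strategic insights."
    )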
Structured business logic definition
Business terms → business_terms.yaml
… "added AI Assistant as a friend"
definition: "events.type='add_friend' exists and events.type='agree' doesn't"
business_value: "12% of user base generating 67% of platform engagement"
...
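A small sketch of reading that file with PyYAML and resolving one term; the top-level key name is an assumption, while the definition and business_value fields follow the fragment above.

# load_business_terms.py -- illustrative reader for business_terms.yaml
import yaml   # pip install pyyaml

with open("business_terms.yaml", encoding="utf-8") as f:
    terms = yaml.safe_load(f)

friend = terms["friend"]              # "friend" as the key is assumed from the fragment above
print(friend["definition"])           # events.type='add_friend' exists and events.type='agree' doesn't
print(friend["business_value"])       # 12% of user base generating 67% of platform engagement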
Add semantics and layers to server code
Code implementation with semantic tools
…
│   ├── schema.yaml
│   ├── business_terms.yaml
│   └── semantic_orchestrator.py
└── requirements.txt
Server now uses semantics for query processing and response enhancement
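One way the server code could plug the semantic layer in is to register a tool that delegates to the orchestrator instead of exposing raw collections; this sketch assumes the MCP Python SDK's FastMCP helper and a hypothetical SemanticOrchestrator class inside semantic_orchestrator.py, so names and signatures are illustrative.

# server.py -- illustrative wiring only; class and method names are hypothetical
from mcp.server.fastmcp import FastMCP
from semantic_orchestrator import SemanticOrchestrator   # hypothetical class from the tree above

mcp = FastMCP("mongodb-semantic")
orchestrator = SemanticOrchestrator(
    schema_path="schema.yaml",
    terms_path="business_terms.yaml",
)

@mcp.tool()
def semantic_query(question: str) -> dict:
    """Answer a natural-language question using business definitions, not raw collections."""
    return orchestrator.answer(question)   # interpret -> map -> define -> execute

if __name__ == "__main__":
    mcp.run()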
… via MongoDB MCP server
The Question: "What's the churn rate for April 2025?"
The Result
• Calculated Result: 83.0%
• Calculation Logic: (April 2025 Inactive Users / March 2025 Active Users) × 100 = (15,520 / 18,695) × 100
• Problem: Missing cancelled user data
Core Issue
• The LLM found the data but lacked business context understanding
• Churn = Inactive + Cancelled → only calculated Inactive
• Raw data analysis without domain knowledge leads to incomplete analysis
"What is the churn rate for April?"
① Natural Language Interpretation → {intent: "metric_query", subject: "churn_rate"}
② Query Mapping (business term lookup) → metric_id: "customer_churn_rate"
③ Metric Definition (business context clarification) → calculation: "prev_month_mau vs current_month_inactive_or_canceled"
④ Business Logic Execution (domain-specific execution) → Result: churn rate of 83.6%
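Inside semantic_orchestrator.py, the four stages above could look roughly like this; the YAML layout, method names, and the query_runner callable are assumptions used only to illustrate the interpret → map → define → execute flow.

# semantic_orchestrator.py -- simplified four-stage sketch (names and YAML layout assumed)
import yaml

class SemanticOrchestrator:
    def __init__(self, schema_path: str, terms_path: str, query_runner=None):
        with open(schema_path, encoding="utf-8") as f:
            self.schema = yaml.safe_load(f)
        with open(terms_path, encoding="utf-8") as f:
            self.terms = yaml.safe_load(f)
        self.query_runner = query_runner                 # callable that executes a metric definition

    def interpret(self, question: str) -> dict:
        # 1) Natural language interpretation, e.g. {"intent": "metric_query", "subject": "churn_rate"}
        subject = "churn_rate" if "churn" in question.lower() else "unknown"
        return {"intent": "metric_query", "subject": subject}

    def map_to_metric(self, parsed: dict) -> str:
        # 2) Query mapping via business-term lookup, e.g. "customer_churn_rate"
        return self.terms[parsed["subject"]]["metric_id"]

    def metric_definition(self, metric_id: str) -> dict:
        # 3) Metric definition, e.g. "prev_month_mau vs current_month_inactive_or_canceled"
        return self.terms["metrics"][metric_id]

    def answer(self, question: str) -> dict:
        # 4) Business logic execution against the data source
        definition = self.metric_definition(self.map_to_metric(self.interpret(question)))
        result = self.query_runner(definition) if self.query_runner else None
        return {"definition": definition, "result": result}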
The Question: "What's the churn rate for April 2025?"
The Result
• Inactive Users: 15,520
• Subscription Cancellations: 112 (newly discovered)
• Total Churned Users: 15,632
• Improved Result: 83.6% (15,632 / 18,695)
[Diagram: March Active Users vs. April Inactive Users + April Canceled Users]
Execution Enhancement Benefits
Metadata optimization enables original semantic design execution
Metadata Strategy Example
// Strategic index design
{order_type: 1, dt: 1, emid: 1}
// Churn calculation requirements:
// - Find who cancelled (order_type filter)
// - Within target month (dt filter)
// - From previous active users (emid filter)
• Before: full collection scan, processing millions of records
• After: targeted data access, focusing only on cancelled users in the specific month
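A pymongo sketch of that strategy is shown below; the collection name, the 'cancel' value for order_type, and the April 2025 date range are assumptions, while the field order matches the compound index above.

# index_strategy.py -- illustrative pymongo version of the index design above
from datetime import datetime
from pymongo import ASCENDING, MongoClient

db = MongoClient("mongodb://localhost:27017")["analytics"]   # hypothetical database
subs = db["subscription_logs"]                               # hypothetical collection name

# Strategic compound index: who cancelled (order_type), when (dt), which user (emid)
subs.create_index([("order_type", ASCENDING), ("dt", ASCENDING), ("emid", ASCENDING)])

def cancelled_users(active_user_ids: list) -> set:
    """Targeted access: April 2025 cancellations drawn only from last month's active users."""
    return set(
        subs.distinct(
            "emid",
            {
                "order_type": "cancel",                                            # assumed code
                "dt": {"$gte": datetime(2025, 4, 1), "$lt": datetime(2025, 5, 1)},
                "emid": {"$in": active_user_ids},
            },
        )
    )

With the index in place, this query is satisfied by an index scan over the cancelled records of the target month instead of a full collection scan.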
Complete Implementation Results
Stage 3: Metadata optimization for semantic query execution
The Question: "What's the churn rate for April 2025?"
The Result
• Inactive Users: 15,520
• Subscription Cancellations: 76 (corrected from 112)
• Total Churned Users: 15,596
• Final Result: 83.4% (15,596 / 18,695)
[Diagram: March Active Users vs. April Inactive Users + April Canceled Users]
… Ask
Q1. Monthly Active Users: "What was the number of active users in June?" → "18,695 active users"
Q2. Historical Churn Analysis: "What was the customer churn rate in February 2025?" → "83.4%"
Q3. Relative Time Queries: "How many users were active last month?" → "18,695 active users in May 2025"
Q4. Regional Distribution: "Can I see a breakdown of users by region for this month?" → "JP 45%, TW 30%, TH 25%"
Request: "Would converting our service to a subscription model improve profitability?"
• … : service usage frequency + revisit patterns
• PostgreSQL: current users' purchase cycles + avg. purchase amounts
• Confluence: user interview results + subscription intention surveys
Result: "60% of users with 3+ monthly usage willing to convert to subscription"
What LLMs Excel At
• Large-scale data processing and pattern recognition
• Simultaneous multi-system query execution
• Repetitive analysis task automation
• 24/7 real-time monitoring and alerting
• Standardized report generation
What Only Humans Can Do
• Designing logic systems for LLMs to understand business context
• Building semantic layers and context systems
• Strategic question design and framing
• Structuring domain knowledge for AI utilization
• Analysis result integrity verification
The Future Analyst
New Partnership: Human + AI
… transaction data, and Tableau dashboards simultaneously, beyond individual analytical capacity
Context-Aware Workflow Automation
Automate analytical workflows that adapt to data patterns: anomaly detection triggers deeper investigation, seasonal trends adjust forecasting models, and outliers prompt validation processes
Advanced Analytics Implementation
Deploy statistical, rule-based, or machine-learning models for customer segmentation, demand forecasting, and A/B testing without requiring deep technical implementation
"LLMs are powerful partners that expand our analytical capabilities"