Enterprise-Ready Document Intelligence: Scaling RAG & Multi-Agent Workflows in Azure AI Foundry


About me
Asif Waquar, Solutions Architect, Munich Re | Singapore
asifwaquar | https://asifwaquar.com/ | @asifwaquar

The growing need for intelligent document processing
• 90% of new enterprise data is unstructured.
• 80% of enterprise processes rely on documents.

Common use cases
• Financial Services (invoice document automation): Extract data from invoices and purchase orders, validate it, and push it to a system of record.
• Government (application forms): Extract data from customs declaration forms and store it in the corresponding location for auditing.
• Healthcare (medical results): Extract data from a variety of medical results and trigger alerts when required.
• Energy (production reports): Extract data from drilling reports across different sites and consolidate it to generate BI insights.

Use Case
“We store all our data and contract documents in SharePoint sites. These contracts are with various vendors and clients, and they come in multiple formats: Word, Excel, PowerPoint, PDF, ZIP files, and even documents with embedded content. The requirement was to make it easier to search within these documents, for example to quickly find who signed a contract, what terms and conditions were agreed upon, or what discussions took place. Users were facing several challenges with the standard SharePoint Search. To address this, we developed “Doc Search”, an AI-powered solution built with Copilot Agents and the agentic framework (multi-agent workflows) on SharePoint sites. This solution makes document search faster, smarter, and more intuitive, helping users quickly find the right information with better accuracy and context.”

Autonomous Workflow, Supervised by Humans
• Collect: The agent monitors event triggers and scans data sources to collect documents, or waits for human input.
• Classify: The agent classifies document types.
• Extract: The agent reads, interprets and transforms content with generative AI technology.
• Validate: The agent validates extracted data based on maker-defined business rules and/or human input.
• Integrate: The agent connects to target systems, knowledge repositories and downstream apps to process data.
The user monitors the agent; the agent learns from human input and repeats.
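The Validate step above can be sketched in a few lines. This is a minimal illustration, not the product's implementation: the field names (`invoice_number`, `total`), the two rules, and the `human_review` route are all hypothetical, chosen only to show maker-defined rules deciding whether a record moves on or goes to a person.

```python
from typing import Callable

# Hypothetical extracted record; the field names are illustrative only.
Record = dict[str, object]
Rule = Callable[[Record], bool]

# Maker-defined business rules: a name mapped to a predicate over the record.
RULES: dict[str, Rule] = {
    "has_invoice_number": lambda r: bool(r.get("invoice_number")),
    "total_is_positive": lambda r: isinstance(r.get("total"), (int, float))
    and r["total"] > 0,
}


def validate(record: Record, rules: dict[str, Rule]) -> list[str]:
    """Return the names of the rules that the record fails."""
    return [name for name, rule in rules.items() if not rule(record)]


def route(record: Record, rules: dict[str, Rule]) -> str:
    """Clean records go on to integration; anything else goes to a person."""
    return "integrate" if not validate(record, rules) else "human_review"
```

Keeping the rules as data rather than burying them in a prompt means a maker can add or change one without touching the extraction step.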

The cautionary flip side: Gartner also predicts that over 40% of agentic AI projects will be cancelled by the end of 2027, driven by escalating costs, unclear value, and inadequate risk controls. Architecture discipline is the difference. This is not a research curiosity. It’s where the market is going.

What are agents?
Agents are apps that use AI to reason, plan, connect to systems and complete tasks, working alongside or on behalf of a person, team or organization.
Agents vary in complexity and capabilities depending on your need, from simple to advanced: Retrieval, Task, Autonomous.

A range of tools for agent creation (from no code to pro code)
• For end users: Agent builder
• For makers: Copilot Studio
• For developers: Copilot Studio, Azure AI Foundry & Visual Studio
Shared capabilities: data protection, agent sharing & usage limits, and reporting & cost management.

RAG (Retrieval Augmented Generation)
Reference: https://www.k2view.com/blog/rag-architecture/#RAG-Architecture-Step-by-Step
How it works
You ask: “Show me the latest treaty where ABC Insurance agreed to renewal terms.”
• Without RAG: SharePoint search might show a long list of files, and you have to open each one.
• With RAG + Copilot Agent: You get a concise, AI-generated summary: “The renewal treaty signed on 12 March 2024 with ABC Insurance includes a 10% rate adjustment and was approved by John Smith.”

Vector (Semantic) Indexing
It’s a process where each piece of text is turned into a list of numbers (a vector) that represents its meaning, so that text with similar meaning ends up close together. In keyword-based search, the system only looks for the exact words you type. In vector (semantic) search, it also finds text with similar meaning, even if the words are not exactly the same.
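To make "similar meanings are grouped together" concrete, here is a minimal sketch using cosine similarity. The three-dimensional vectors are hand-made for illustration only; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math

# Hypothetical, hand-made "embeddings"; a real system gets these from a model.
TOY_VECTORS = {
    "annual leave": [0.9, 0.1, 0.0],
    "vacation days": [0.8, 0.2, 0.1],
    "workplace safety": [0.1, 0.9, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, vectors):
    """Return the other phrase whose vector is closest to the query phrase."""
    q = vectors[query]
    scored = [
        (cosine_similarity(q, vec), name)
        for name, vec in vectors.items()
        if name != query
    ]
    return max(scored)[1]
```

With these toy vectors, "annual leave" lands next to "vacation days" even though the two phrases share no words, which is exactly what a keyword match would miss.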

KEYWORD VS SEMANTIC UNDERSTANDING
• Keyword Questions (specific terms found in the documents)
  • “What is the policy on annual leave?”
  • “Do we have a policy regarding workplace safety?”
  • “How many sick days are employees entitled to annually?”
  • “What are the guidelines for filing a harassment complaint?”
• Semantic Questions (more nuanced understanding required)
  • “How can an employee request a leave of absence for mental health reasons?”
  • “What steps should be taken if an employee feels unsafe at work?”
  • “Can part-time employees also participate in the health benefits program?”
  • “What should a manager do if they observe discriminatory behavior?”

Document Processing

THE QUESTION
“Pull the AAB Contract from last March, summarize the exclusion clauses, flag anything that violates our new sanctions policy, and give me a plain-English answer. I have a call in 10 minutes.”
What RAG does:
• Fetches 5 chunks that mention “AAB” and “exclusion”.
• Stuffs them into one prompt.
• Returns a confident-sounding paragraph.
• Misses the sanctions check entirely. (This is the step where enterprises run into problems.)

RAG works until the task has verbs.
One prompt. One retrieval. One answer.
• Compound tasks: Retrieve → summarize → check rules → synthesize. Each step needs different context, different model settings, different prompts.
• Context overload: Stuffing 40 clauses into one prompt dilutes attention. The model confidently answers the wrong question.
• No branching: What if a clause matches policy A but not B? A single chain can't route, it averages, and averages hide violations.
• No verification: There's no second pair of eyes. The model that writes the answer is the same one that checks it. That's not a check.

Stop asking "what's the best prompt?" Start asking "who should do each job?"
Think of a law firm. You don't ask one partner to do discovery, contract review, compliance, and client presentation. You have specialists and one partner who coordinates.
• Retrieval Agent: Finds the right documents. Hybrid search, filtering, ranking.
• Summarization Agent: Distills clauses into structured, citable outputs.
• Compliance Agent: Checks against policies, sanctions, rule libraries.
• Coordinator Agent: Owns the workflow. Decides who runs when.

Contract document intelligence, for real.
Reinsurance contracts: 80-page PDFs, dense legalese, scanned amendments, clause cross-references. This is where RAG demos go to die.
• 80+ pages per contract
• 200+ clauses per contract
• 15+ compliance rules to apply
• < 30s answer time underwriters expect
Why reinsurance? One missed exclusion clause can cost eight figures. Every clause matters. Every jurisdiction matters. 'The model was probably right' is not an acceptable answer when an underwriter is pricing a $50M risk. This is the exact domain where multi-agent design earns its keep: traceable, auditable, and each step accountable to a human reviewer.

How the agents actually talk to each other
A coordinator drives the loop. Each specialist has its own tools and returns structured results back to the thread.
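The coordinator loop can be sketched without any framework. Everything here is a stand-in: the `Thread` class, the three specialist functions, and the fixed plan are hypothetical plain-Python placeholders, and no Azure AI Foundry API is used. In a real system the specialists would call models and tools, and the coordinator would decide the plan itself.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    content: dict


@dataclass
class Thread:
    """Shared conversation thread that every agent reads from and posts to."""
    messages: list = field(default_factory=list)

    def post(self, sender: str, content: dict) -> None:
        self.messages.append(Message(sender, content))


# Hypothetical specialists: each reads the thread and returns structured data.
def retrieval_agent(thread: Thread) -> dict:
    return {"chunks": ["exclusion: war risk", "exclusion: sanctioned entities"]}


def summarization_agent(thread: Thread) -> dict:
    chunks = thread.messages[-1].content["chunks"]
    return {"summary": [f"summary of {chunk}" for chunk in chunks]}


def compliance_agent(thread: Thread) -> dict:
    summary = thread.messages[-1].content["summary"]
    return {"flags": [item for item in summary if "sanction" in item]}


SPECIALISTS = {
    "retrieval": retrieval_agent,
    "summarization": summarization_agent,
    "compliance": compliance_agent,
}


def coordinator(request: str, plan: list) -> Thread:
    """Run each planned step in order, posting every result to the thread."""
    thread = Thread()
    thread.post("user", {"request": request})
    for step in plan:
        result = SPECIALISTS[step](thread)
        thread.post(step, result)
    return thread
```

Because every hop is a dictionary posted to one thread, the full run can be replayed or audited afterwards by reading `thread.messages` in order.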

Multi Agent Workflow in Azure AI Foundry

Before agents can reason, chunks have to be right.
01 Extract: Azure Document Intelligence OCRs scanned pages, preserves layout, tables, headers.
02 Segment: Clause-level splitting, not fixed token windows. Use heading + numbering patterns.
03 Enrich: Attach metadata: treaty ID, jurisdiction, effective date, clause type, parent section.
04 Embed: text-embedding-3-large per clause. Store vector + text + metadata together.
05 Index: Azure AI Search with vector, keyword, and semantic config, hybrid from day one.
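Steps 02 and 03 (segment by clause, then attach metadata) might look like the sketch below. It assumes a numbering convention where each clause starts on its own line with a dotted number followed by a title, such as "9.1 War"; real contracts will need a pattern tuned to their own layout. The `ClauseChunk` fields and the treaty ID "T-001" are illustrative, not taken from the deck.

```python
import re
from dataclasses import dataclass

# Assumed convention: a clause heading is a line like "9.1 War".
CLAUSE_HEADING = re.compile(r"^(\d+(?:\.\d+)*)[ \t]+(.+)$", re.MULTILINE)


@dataclass
class ClauseChunk:
    number: str
    title: str
    text: str
    treaty_id: str
    parent_section: str


def split_into_clauses(document: str, treaty_id: str) -> list:
    """Split on clause headings and attach metadata to every chunk."""
    matches = list(CLAUSE_HEADING.finditer(document))
    chunks = []
    for i, match in enumerate(matches):
        # A clause body runs until the next heading, or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(document)
        number = match.group(1)
        chunks.append(
            ClauseChunk(
                number=number,
                title=match.group(2).strip(),
                text=document[match.end():end].strip(),
                treaty_id=treaty_id,
                parent_section=number.split(".")[0],  # top-level section number
            )
        )
    return chunks


SAMPLE = """9 Exclusions
General exclusions apply.
9.1 War
Losses caused by war are excluded.
9.2 Nuclear
Nuclear risks are excluded.
"""

chunks = split_into_clauses(SAMPLE, treaty_id="T-001")
```

Each chunk now carries its own clause number and parent section, so a later filter such as "only clauses under section 9" needs no second pass over the text.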

Same question. Now watch four agents handle it.
"Pull the AAB contract, summarize exclusions, flag sanctions violations."
1 Coordinator: Parses the request. Identifies three sub-tasks. Plans the order: retrieve → summarize → compliance.
2 Retrieval Agent: Runs hybrid search scoped to contract='AAB'. Returns 14 clause chunks ranked by semantic + BM25.
3 Summarization Agent: Takes the 14 chunks, filters to 'exclusion' type, produces a structured bullet summary with citations.
4 Compliance Agent: Cross-references summary against sanctions policy. Flags Clause 9.3 breach risk on Russian entities.
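The slide does not say how the semantic and BM25 rankings are combined. Azure AI Search documents Reciprocal Rank Fusion (RRF) as its method for merging hybrid results, so the sketch below shows that formula in plain Python. It illustrates the idea only and is not the service's implementation; the chunk IDs are made up, and `k=60` is a commonly used constant assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list.

    Each list contributes 1 / (k + rank) to an item's score, so items that
    rank well in more than one list rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical chunk IDs as returned by the two retrieval methods.
bm25_ranking = ["c1", "c2", "c3"]
vector_ranking = ["c3", "c1", "c4"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

RRF works on ranks rather than raw scores, which sidesteps the problem that BM25 scores and vector similarities live on different scales.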

So why pay the extra cost?
• Knowledge work productivity: 60%+
  A market intelligence firm with 500+ data-quality staff deployed a multi-agent system to detect anomalies, explain market shifts, and synthesize insights. 60%+ productivity gain, $3M+ annual savings. Source: McKinsey, 'Seizing the agentic AI advantage' (2025)
• Faster legacy modernization: 40–50%
  Multi-agent 'squads' (coders, reviewers, integrators) running modernization workflows have delivered 40–50% faster timelines and ~40% lower costs from technical-debt reduction. Source: McKinsey, 'AI for IT modernization' (2024–25)
• Autonomous resolution at scale: 80%
  By 2029, Gartner predicts agentic AI will autonomously resolve 80% of common service issues without human intervention, leading to a 30% reduction in operational costs. Source: Gartner, March 2025
• Competitive separation by 2028
  Gartner predicts organizations using multi-agent AI across 80% of customer-facing processes will materially outperform peers. Specialization + collaboration is the moat. Source: Gartner Strategic Predictions, 2026

Lessons Learnt
01 Understand the data: Understand file types, limitations & behaviors.
02 Agents are expensive interns: Every agent is another mouth to feed: tokens, latency, failure surface. Add them only when a single one provably can't handle the job.
03 Structured output is non-negotiable: Free-form text between agents is where systems go to die. Enforce JSON schemas at every hop.
04 The coordinator is your bug farm: 80% of production issues will be in routing logic, not in the specialist agents. Invest in traces, replay, and eval suites for the coordinator first.
05 Keep a human in the loop for now: Compliance agents should flag, not decide. An underwriter reviews. A regulator will ask who approved. Plan the audit trail on day one.
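Lessons 03 and 05 can be combined in one small gate between agents. The sketch below uses only the standard library and a hand-rolled check rather than a schema library. The field names (`clause_id`, `verdict`, `citations`) and the allowed verdicts are hypothetical; a real system would define its own contract per hop.

```python
import json

# Hypothetical contract for what a compliance agent hands back.
REQUIRED_FIELDS = {"clause_id": str, "verdict": str, "citations": list}
ALLOWED_VERDICTS = {"ok", "flagged"}


def parse_agent_output(raw: str) -> dict:
    """Parse and validate one agent message; raise ValueError on any mismatch."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(payload.get(name), expected_type):
            raise ValueError(
                f"field {name!r} is missing or not a {expected_type.__name__}"
            )
    if payload["verdict"] not in ALLOWED_VERDICTS:
        raise ValueError(f"unknown verdict: {payload['verdict']!r}")
    # The agent only flags; the decision stays with a human reviewer.
    payload["needs_human_review"] = payload["verdict"] == "flagged"
    return payload
```

Rejecting a malformed message at the hop where it appears keeps the failure local, instead of letting free-form text surface three agents later as a wrong answer.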

Resources
• Building and Evaluating Advanced RAG Applications: https://www.deeplearning.ai/short-courses/building-evaluating-advanced-rag/
• Preprocessing Unstructured Data for LLM Applications: https://www.deeplearning.ai/short-courses/preprocessing-unstructured-data-for-llm-applications/
• Langchain: https://dev.to/pavanbelagatti/learn-how-to-build-reliable-rag-applications-in-2026-1b7p
• AI Tools for document processing: https://aitoolsatlas.ai/blog/best-ai-tools-document-processing-data-extraction-2026
• Chunking Strategy: https://aishwaryasrinivasan.substack.com/p/all-you-need-to-know-about-rag-in
• RAG: https://pub.towardsai.net/building-a-modern-rag-pipeline-in-2026-qwen3-embeddings-and-vector-database-in-qdrant-ebeca2bbe338
• Foundry Lab: https://github.com/microsoft-foundry/Foundry-Local-Lab

Thank you.

Prompts

How to select LLMs
• The choice of model determines the intelligence and capabilities of your agent.
• Choose based on:
  • Complexity of tasks
  • Access to tools/plugins
  • Performance vs. cost
Reference: https://platform.openai.com/docs/models https://platform.openai.com/docs/pricing
