Enterprise-Ready Document Intelligence: Scaling RAG & Multi-Agent Workflows in Azure AI Foundry

Most RAG implementations today are still single-shot - user asks a question, system fetches some chunks, model spits out an answer. That works for simple lookups, but falls apart when you need to do something more involved, like pulling a document, summarizing the key clauses, checking them against compliance rules, and then presenting a coherent answer to an underwriter. No single prompt chain handles that well.

This session is about breaking that problem into specialized agents - each one responsible for a specific job. One agent handles retrieval. Another focuses on summarization. A third does compliance or rule checking. And a coordinator agent manages the handoff between them. Azure AI Foundry's Agent Service gives you the building blocks for this: tool definitions, function calling, and the ability to chain agents together without duct-taping everything with custom code.

I will walk through how this plays out in practice using a document intelligence use case from the reinsurance domain: ingesting contract documents, breaking them down into clause-level chunks, running hybrid search across them, and layering a conversational interface on top. The goal isn't to show a demo; it's to show the architecture decisions, the trade-offs, and the places where things get messy in real enterprise deployments.

Asif Waquar

April 08, 2026

Transcript

  1. About me: Asif Waquar, Solutions Architect, Munich Re | Singapore

    asifwaquar asifwaquar https://asifwaquar.com/ @asifwaquar
  2. The growing need for intelligent document processing

    • 90% of new enterprise data is unstructured.
    • 80% of enterprise processes rely on documents.
  3. Common use cases

    • Financial Services (Invoice Document Automation): Extract data from invoices and purchase orders, validate and push the data to a system of record.
    • Government (Application forms): Extract data from customs declaration forms and store them in the corresponding location for auditing.
    • Healthcare (Medical results): Extract data from a variety of medical results and trigger alerts when required.
    • Energy (Production reports): Extract data from drilling reports across different sites and consolidate the data to generate BI insights.
  4. Use Case

    “We store all our data and contract documents in SharePoint sites. These contracts are with various vendors and clients, and they come in multiple formats: Word, Excel, PowerPoint, PDF, ZIP files, and even documents with embedded content. The requirement was to make it easier to search within these documents, for example to quickly find who signed a contract, what terms and conditions were agreed upon, or what discussions took place. Users were facing several challenges with the standard SharePoint Search. To address this, we developed the “Doc Search” AI-powered solution, built with Copilot Agents and the agentic framework (multi-agent workflows) on SharePoint sites. This solution makes document search faster, smarter, and more intuitive, helping users quickly find the right information with better accuracy and context.”
  5. Autonomous Workflow, Supervised by Humans

    • Collect: Agent monitors event triggers and scans data sources to collect documents, or waits for human input.
    • Classify: Agent classifies document types.
    • Extract: Agent reads, interprets and transforms content with Generative AI technology.
    • Validate: Agent validates extracted data based on maker-defined business rules and/or human input.
    • Integrate: Agent connects to target systems, knowledge repositories and downstream apps to process data.

    User monitors the agent; the agent learns with human input and repeats.
  6. The cautionary flip side

    Gartner also predicts over 40% of agentic AI projects will be cancelled by end of 2027, driven by escalating costs, unclear value, and inadequate risk controls. Architecture discipline is the difference. This is not a research curiosity. It's where the market is going.
  7. What are agents?

    Agents are apps that use AI to reason, plan, connect to systems and complete tasks, working alongside or on behalf of a person, team or organization.

    Simple → Advanced: Retrieval, Task, Autonomous

    Agents vary in complexity and capabilities depending on your need.
  8. A range of tools for agent creation

    No code → Pro code:
    • For end users: Agent builder
    • For makers: Copilot Studio
    • For developers: Copilot Studio, Azure AI Foundry & Visual Studio

    Data protection, agent sharing & usage limits, and reporting & cost management
  9. RAG (Retrieval Augmented Generation)

    Reference: https://www.k2view.com/blog/rag-architecture/#RAG-Architecture-Step-by-Step

    How does it work?
    • You ask: “Show me the latest treaty where ABC Insurance agreed to renewal terms.”
    • Without RAG: SharePoint search might show a long list of files, and you have to open each one.
    • With RAG + Copilot Agent: You get a concise, AI-generated summary: “The renewal treaty signed on 12 March 2024 with ABC Insurance includes a 10% rate adjustment and was approved by John Smith.”
  10. Vector (Semantic) Indexing

    It's a process where each word or passage is mapped to a list of numbers (a vector) that represents its meaning, so that text with similar meaning ends up close together. In keyword-based search, the system only looks for the exact words you type. In vector (semantic) search, it also finds text with similar meaning, even if the words are not exactly the same.
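As a rough illustration of the idea on this slide, the sketch below ranks phrases by the cosine similarity of their vectors. The three-dimensional vectors, the phrase list, and the helper names are invented for illustration; a real embedding model returns much longer vectors.

```python
import math

# Hypothetical hand-written "embeddings"; real ones come from an embedding model.
EMBEDDINGS = {
    "annual leave": [0.9, 0.1, 0.0],
    "vacation days": [0.8, 0.2, 0.1],
    "workplace safety": [0.1, 0.9, 0.2],
}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query: str) -> list[tuple[str, float]]:
    """Rank every other phrase by similarity to the query phrase."""
    query_vec = EMBEDDINGS[query]
    scored = [
        (phrase, cosine_similarity(query_vec, vec))
        for phrase, vec in EMBEDDINGS.items()
        if phrase != query
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranking = most_similar("annual leave")
```

With these toy vectors, "vacation days" ranks above "workplace safety" for the query "annual leave", even though the two phrases share no words.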
  11. KEYWORD VS SEMANTIC UNDERSTANDING

    • Keyword Questions (specific terms found in the documents)
      • “What is the policy on annual leave?”
      • “Do we have a policy regarding workplace safety?”
      • “How many sick days are employees entitled to annually?”
      • “What are the guidelines for filing a harassment complaint?”
    • Semantic Questions (more nuanced understanding required)
      • “How can an employee request a leave of absence for mental health reasons?”
      • “What steps should be taken if an employee feels unsafe at work?”
      • “Can part-time employees also participate in the health benefits program?”
      • “What should a manager do if they observe discriminatory behavior?”
  12. THE QUESTION

    “Pull the AAB Contract from last March, summarize the exclusion clauses, flag anything that violates our new sanctions policy, and give me a plain-English answer. I have a call in 10 minutes.”

    What RAG does:
    • Fetches 5 chunks that mention “AAB” and “exclusion”.
    • Stuffs them into one prompt.
    • Returns a confident-sounding paragraph.
    • Misses the sanctions check entirely. (This is the step where enterprises run into problems.)
  13. RAG works until the task has verbs.

    One prompt. One retrieval. One answer.

    • Compound tasks: Retrieve → summarize → check rules → synthesize. Each step needs different context, different model settings, different prompts.
    • Context overload: Stuffing 40 clauses into one prompt dilutes attention. The model confidently answers the wrong question.
    • No branching: What if a clause matches policy A but not B? A single chain can't route; it averages, and averages hide violations.
    • No verification: There's no second pair of eyes. The model that writes the answer is the same one that checks it. That's not a check.
  14. Stop asking "what's the best prompt?" Start asking "who should do each job?"

    Think of a law firm. You don't ask one partner to do discovery, contract review, compliance, and client presentation. You have specialists and one partner who coordinates.

    • Retrieval Agent: Finds the right documents. Hybrid search, filtering, ranking.
    • Summarization Agent: Distills clauses into structured, citable outputs.
    • Compliance Agent: Checks against policies, sanctions, rule libraries.
    • Coordinator Agent: Owns the workflow. Decides who runs when.
  15. Contract document intelligence, for real.

    Reinsurance contracts: 80-page PDFs, dense legalese, scanned amendments, clause cross-references. This is where RAG demos go to die.

    • 80+ pages per contract
    • 200+ clauses per contract
    • 15+ compliance rules to apply
    • < 30s answer time underwriters expect

    Why reinsurance? One missed exclusion clause can cost eight figures. Every clause matters. Every jurisdiction matters. 'The model was probably right' is not an acceptable answer when an underwriter is pricing a $50M risk. This is the exact domain where multi-agent design earns its keep: traceable, auditable, and each step accountable to a human reviewer.
  16. How the agents actually talk to each other?

    A coordinator drives the loop. Each specialist has its own tools and returns structured results back to the thread.
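The loop described on this slide can be sketched in plain Python. This is not the Azure AI Foundry Agent Service API: the agents are ordinary functions, the "thread" is a list, and the corpus, field names, and the keyword-based compliance rule are all hypothetical stand-ins chosen to keep the example self-contained.

```python
from typing import Callable

# Hypothetical in-memory corpus; a real system would query a search index.
CORPUS = [
    {"contract": "AAB", "id": "2.2", "type": "premium",
     "text": "Premium is payable quarterly."},
    {"contract": "AAB", "id": "4.1", "type": "exclusion",
     "text": "Losses from war and civil unrest are excluded."},
    {"contract": "AAB", "id": "9.3", "type": "exclusion",
     "text": "Exclusion waived for counterparties in embargoed regions."},
]
RESTRICTED_TERMS = ("embargoed",)  # hypothetical stand-in for a rule library

Agent = Callable[[dict], dict]


def retrieval_agent(state: dict) -> dict:
    """Return all clauses belonging to the requested contract."""
    return {"clauses": [c for c in CORPUS if c["contract"] == state["contract"]]}


def summarization_agent(state: dict) -> dict:
    """Keep exclusion clauses only, each with a citation to its clause id."""
    return {"summary": [{"clause_id": c["id"], "text": c["text"]}
                        for c in state["clauses"] if c["type"] == "exclusion"]}


def compliance_agent(state: dict) -> dict:
    """Flag summarized clauses that mention a restricted term."""
    return {"flags": [s["clause_id"] for s in state["summary"]
                      if any(t in s["text"].lower() for t in RESTRICTED_TERMS)]}


AGENTS: dict[str, Agent] = {
    "retrieve": retrieval_agent,
    "summarize": summarization_agent,
    "compliance": compliance_agent,
}


def run_coordinator(request: dict, plan: list[str]) -> list[dict]:
    """Run each planned step in order, posting structured results to the thread."""
    state, thread = dict(request), []
    for step in plan:
        result = AGENTS[step](state)
        thread.append({"agent": step, "result": result})
        state.update(result)  # later agents see earlier results
    return thread


thread = run_coordinator({"contract": "AAB"}, ["retrieve", "summarize", "compliance"])
```

Each entry in `thread` records which agent ran and the structured result it returned, which is the kind of trace a coordinator needs for replay and audit.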
  17. Before agents can reason, chunks have to be right.

    01 Extract: Azure Document Intelligence OCRs scanned pages, preserves layout, tables, headers.
    02 Segment: Clause-level splitting, not fixed token windows. Use heading + numbering patterns.
    03 Enrich: Attach metadata: treaty ID, jurisdiction, effective date, clause type, parent section.
    04 Embed: text-embedding-3-large per clause. Store vector + text + metadata together.
    05 Index: Azure AI Search with vector, keyword, and semantic config; hybrid from day one.
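A minimal sketch of the Segment and Enrich steps, assuming the OCR output is already plain text and that clauses start with headings such as "9.1 War". The regular expression, the field names, and the sample text are assumptions made for illustration; real contracts will need more patterns (lettered sub-clauses, "Article" headings, and so on).

```python
import re

# Assumed heading shape: dotted number, spaces, then a capitalised title.
CLAUSE_HEADING = re.compile(
    r"^(?P<number>\d+(?:\.\d+)*)[ \t]+(?P<title>[A-Z].*)$", re.MULTILINE
)


def split_into_clauses(text: str, treaty_id: str) -> list[dict]:
    """Split text at numbered headings and attach basic metadata per clause."""
    matches = list(CLAUSE_HEADING.finditer(text))
    chunks = []
    for i, match in enumerate(matches):
        # A clause body runs until the next heading (or the end of the text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        number = match.group("number")
        chunks.append({
            "treaty_id": treaty_id,
            "clause_number": number,
            "title": match.group("title").strip(),
            # "9.1" -> parent "9"; top-level clauses have no parent.
            "parent_section": number.rsplit(".", 1)[0] if "." in number else None,
            "text": text[match.end():end].strip(),
        })
    return chunks


# Hypothetical sample text, not taken from any real treaty.
SAMPLE = """9 Exclusions
This treaty excludes the following.
9.1 War
Losses caused by war are excluded.
9.2 Nuclear
Losses caused by nuclear events are excluded.
"""

chunks = split_into_clauses(SAMPLE, treaty_id="T-001")
```

Splitting on headings keeps each clause intact, so a later citation such as "Clause 9.2" maps to exactly one chunk instead of a token window that straddles two clauses.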
  18. Same question. Now watch four agents handle it.

    "Pull the AAB contract, summarize exclusions, flag sanctions violations."

    1 Coordinator: Parses the request. Identifies three sub-tasks. Plans the order: retrieve → summarize → compliance.
    2 Retrieval Agent: Runs hybrid search scoped to contract='AAB'. Returns 14 clause chunks ranked by semantic + BM25.
    3 Summarization Agent: Takes the 14 chunks, filters to 'exclusion' type, produces a structured bullet summary with citations.
    4 Compliance Agent: Cross-references summary against sanctions policy. Flags Clause 9.3 breach risk on Russian entities.
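Ranking by "semantic + BM25" means merging two result lists. One widely used way to do that is Reciprocal Rank Fusion (RRF), which Azure AI Search documents as its merge step for hybrid queries. The sketch below is a standalone illustration, not the service's implementation; the clause ids and the constant k = 60 are assumptions.

```python
def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical clause ids as returned by the two retrievers.
bm25_ranking = ["9.3", "4.1", "2.2"]
vector_ranking = ["9.3", "7.5", "4.1"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF uses only rank positions, it does not need BM25 scores and vector similarities to be on the same scale; a clause that both retrievers rank highly rises to the top.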
  19. So why pay the extra cost?

    • Knowledge work productivity: 60%+. A market intelligence firm with 500+ data-quality staff deployed a multi-agent system to detect anomalies, explain market shifts, and synthesize insights. 60%+ productivity gain, $3M+ annual savings. Source: McKinsey, 'Seizing the agentic AI advantage' (2025)
    • Faster legacy modernization: 40–50%. Multi-agent 'squads' (coders, reviewers, integrators) running modernization workflows have delivered 40–50% faster timelines and ~40% lower costs from technical-debt reduction. Source: McKinsey, 'AI for IT modernization' (2024–25)
    • Autonomous resolution at scale: 80%. By 2029, Gartner predicts agentic AI will autonomously resolve 80% of common service issues without human intervention, leading to a 30% reduction in operational costs. Source: Gartner, March 2025
    • Competitive separation by 2028. Gartner predicts organizations using multi-agent AI across 80% of customer-facing processes will materially outperform peers. Specialization + collaboration is the moat. Source: Gartner Strategic Predictions, 2026
  20. Lessons Learnt

    01 Understand the data: Understand file types, limitations & behaviors.
    02 Agents are expensive interns: Every agent is another mouth to feed: tokens, latency, failure surface. Add them only when a single one provably can't handle the job.
    03 Structured output is non-negotiable: Free-form text between agents is where systems go to die. Enforce JSON schemas at every hop.
    04 The coordinator is your bug farm: 80% of production issues will be in routing logic, not in the specialist agents. Invest in traces, replay, and eval suites for the coordinator first.
    05 Keep a human in the loop for now: Compliance agents should flag, not decide. An underwriter reviews. A regulator will ask who approved. Plan the audit trail on day one.
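Lesson 03 can be made concrete with a small validation step at the hop between two agents. The sketch below uses only the Python standard library; the `ComplianceFlag` fields, the allowed severities, and the sample payload are hypothetical, and a production system might use a JSON Schema validator or a model provider's structured-output feature instead.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ComplianceFlag:
    """Hypothetical message shape passed from the compliance agent onward."""
    clause_id: str
    rule_id: str
    severity: str
    rationale: str


ALLOWED_SEVERITIES = {"low", "medium", "high"}


def parse_compliance_flag(raw: str) -> ComplianceFlag:
    """Parse an agent's raw output and reject anything off-schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on free text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name for f in fields(ComplianceFlag)}
    if set(data) != expected:
        raise ValueError(f"missing or unexpected keys: {sorted(set(data) ^ expected)}")
    if not all(isinstance(value, str) for value in data.values()):
        raise ValueError("all fields must be strings")
    if data["severity"] not in ALLOWED_SEVERITIES:
        raise ValueError(f"unknown severity: {data['severity']!r}")
    return ComplianceFlag(**data)


# Hypothetical payload for illustration only.
flag = parse_compliance_flag(
    '{"clause_id": "9.3", "rule_id": "SANCTIONS-01", '
    '"severity": "high", "rationale": "Hypothetical example."}'
)
```

Rejecting off-schema output at the hop turns a silent downstream misreading into an explicit error that the coordinator can log, retry, or escalate to a reviewer.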
  21. Resources

    • Building and Evaluating Advanced RAG Applications: https://www.deeplearning.ai/short-courses/building-evaluating-advanced-rag/
    • Preprocessing Unstructured Data for LLM Applications: https://www.deeplearning.ai/short-courses/preprocessing-unstructured-data-for-llm-applications/
    • Langchain: https://dev.to/pavanbelagatti/learn-how-to-build-reliable-rag-applications-in-2026-1b7p
    • AI Tools for document processing: https://aitoolsatlas.ai/blog/best-ai-tools-document-processing-data-extraction-2026
    • Chunking Strategy: https://aishwaryasrinivasan.substack.com/p/all-you-need-to-know-about-rag-in
    • RAG: https://pub.towardsai.net/building-a-modern-rag-pipeline-in-2026-qwen3-embeddings-and-vector-database-in-qdrant-ebeca2bbe338
    • Foundry Lab: https://github.com/microsoft-foundry/Foundry-Local-Lab
  22. Q/A

  23. How to select LLMs

    • The choice of model determines the intelligence and capabilities of your agent.
    • Choose based on:
      • Complexity of tasks
      • Access to tools/plugins
      • Performance vs. cost

    Reference: https://platform.openai.com/docs/models https://platform.openai.com/docs/pricing