Upgrade to Pro — share decks privately, control downloads, hide ads and more …

[AITour 26] Efficient model customization with ...

[AITour 26] Efficient model customization with Azure AI Foundry

Learn how to customize AI models and optimize their performance for your scenario through targeted fine-tuning methods. This session discusses Distillation, RFT and RAFT approaches using Azure AI Foundry, and shows how you can reduce costs and improve precision with less data and complexity.

Location: Toronto
Date: Oct 1, 2025
Session: https://aitour.microsoft.com/flow/microsoft/toronto26/sessioncatalog/page/sessioncatalog/session/1755310347948001jt2d

Visit the Repo:
https://github.com/microsoft/aitour26-BRK443-efficient-model-customization-with-azure-ai-foundry

Join the Discord:
https://aka.ms/model-mondays/discord

Avatar for Nitya Narasimhan, PhD

Nitya Narasimhan, PhD

October 08, 2025
Tweet

More Decks by Nitya Narasimhan, PhD

Other Decks in Technology

Transcript

  1. What we will cover today Unlock business value with fine-tuning

    Model customization is critical when you need to improve the quality, accuracy, and performance of your AI applications Fine-tuning in Azure AI Foundry When is fine-tuning the right choice for me? Why is Azure AI Foundry the right platform to use? Distillation demo Reduce cost by using larger model to train smaller, cheaper model Hands-on with Hybrid RAG+FT Why do we need hybrid customization approaches? How does RAFT improve precision for RAG usage? RAFT demo Improve precision of RAG by teaching model to be more selective Summary Add fine-tuning to your AI engineering toolkit Try Azure AI Foundry for your fine-tuning tasks
  2. Why does Zava need model customization? Bruno Zhao Zava Customer

    Help me find the right product—for the right price—and I’ll buy it! Make it helpful Customize Cora’s tone & response format to be polite, helpful & conversational Robin Counts Retail Store Manager Help me build loyalty & move inventory— to grow my store sales Make it precise Improve Cora’s use of retrieved knowledge, reducing customer frustration & driving conversions Kian Lambert App Dev Manager Help me operate Cora to be more effective in cost, response quality Make it cheaper Distill Cora’s knowledge into smaller, cheaper models—reducing cost with comparable accuracy
  3. The Zava AI Engineer’s customization journey Question-answering task using natural

    language I’m painting my living room wall. What should I buy? User input Start with prompt engineering Add your data with RAG Tone and style Example responses Intent mapping Query extraction Inventory retrieval Personalization Context engineering Optimize model with fine-tuning Deliver fast, accurate response with less cost Good choice! I recommend our Eggshell Paint. Would you like to know more about color choices? Model adaptation Desired output
  4. What does fine-tuning mean? LLM Fine-tuned LLM Fine-tuning refers to

    customizing a pre-trained LLM with additional training on a specific task or new dataset for enhanced performance, new skills, or improved accuracy
  5. Why should Zava consider fine-tuning? Domain-specific optimization Task-specific optimization Reduced

    token consumption Efficient resource utilization Smaller models, faster response Shorter prompts, improve response Improve quality Reduce cost Reduce latency Example: Zava has a domain-specific focus (retail) and task-specific focus (question-answering). Let’s think about how fine-tuning can help optimization
  6. How can Zava use fine-tuning for optimization? Fine tuning Optimize

    Context RAG Hybrid Fine Tuning + RAG High Low High Low Optimize Model Behavior Do something different Learn something new Deliver optimized experience I want customer responses to use only Zava product data as relevant context I want to improve usage of RAG to be more precise I want faster response to customer queries I want to save token costs with shorter prompts I want to adapt model to reflect Zava tone & style
  7. How can Azure AI Foundry help the Zava AI developer?

    Azure AI Foundry Faster models, simpler methods GPT-4.1, -mini, -nano o4-mini RFT Llama 4 Scout Fine tune anywhere Global training for OAI Datazone for OSS Developer tier Experiment with ease —no hosting fee Enterprise ready ai.azure.com PTU-M, Global GA Extended model support
  8. What are my fine-tuning options in Azure AI Foundry? Supervised

    fine-tuning Module learns from examples Ex: Content generation task Reinforcement fine-tuning Use grader to reinforce CoT Ex: Reasoning tasks Model distillation Transfer learning to cheaper model Optimize for COST Vision fine-tuning Preference fine-tuning Hybrid fine-tuning Improve image understanding Ex: ClassificationTask Provide good & bad examples Ex: Tone adaptation Improve model use of RAG context Optimize for PRECISION
  9. What are the added benefits of fine-tuning here? Built-in safety

    Continual fine-tuning Simplified fine-tuning developer experience Global Training For fine-tuning Developer Tier For model hosting Cost-effective training & testing
  10. How can I fine-tune my model to customize the tone?

    Decide vision and scope Choose base model Choose FT technique Pick enterprise-ready model options Dataset Fine-tuning Evaluation Deploy and monitor Regularly benchmark and iterate!
  11. Demo: Supervised fine-tuning in Zava Bruno Zhao Zava Customer Make

    it helpful Customize Cora’s tone & response format to be polite, helpful & conversational
  12. How can I use distillation to reduce cost of operation?

    Distillation is the process of using a large, general-purpose teacher model to train a smaller student model to perform better at a specific task Task GPT-4.1 Data generation Data Distillation GPT-4.1-nano Training Teacher Model Student Model Distilled Model Evaluation 4.1-nano*
  13. Demo: Using distillation for Zava Kian Lambert App Dev Manager

    Make it cheaper Distill Cora’s knowledge into smaller, cheaper models— reducing cost with comparable accuracy
  14. When search brings you the wrong tools for the job

    Query “What’s the best type of paint to use for a wooden garden bench that stays outside all year?” What often happens Results about painting indoor furniture Results about painting metal benches Articles about wall paint Impact Slower customer help, wasted time, and irrelevant recommendations
  15. Not all matches are created equal Query “What’s the best

    type of paint to use for a wooden garden bench that stays outside all year?” Spray painting a wrought-iron bench How to paint dining chairs Distractor documents Relevant documents Best paint options for outdoor wood How to paint wood bench
  16. What RAFT does Teach the model to focus on relevant

    documents and why RAFT = Retrieval Augmented Finetuning Uses “hard negatives” (distractors) to teach the model: These look similar, but are wrong These are the true matches Uses Chain of Thought to teach the model why
  17. The Oracle context The gold standard answer RAFT aims for

    Exterior acrylic paint is ideal for outdoor wooden benches because it offers excellent resistance to moisture and UV rays. Apply two thin coats, allowing each to dry fully, and finish with a clear exterior-grade polyurethane sealer for maximum protection.
  18. The value to us Why this matters for a hardware

    store Staff find the right product guidance faster Customers get accurate, reliable answers Less time spent filtering irrelevant advice Higher customer trust → higher likelihood of purchase + higher customer loyalty
  19. Demo: Hands-on with RAFT Robin Counts Retail Store Manager Make

    it precise Improve Cora’s use of retrieved knowledge, reducing customer frustration & driving conversions
  20. RAFT Dataset generation Documents Fine-tuning dataset Questions LLM Chunks Chunking

    Training Valid Eval Oracle and distractor chunks Answers CoT LLM CoT CoT CoT CoT
  21. RAFT makes search as reliable as asking the store expert

    By teaching the system to ignore misleading but similar content, RAFT ensures the right answer is always at the top— just like asking the most knowledgeable person in the store Summary
  22. Recap: Model customization unlocks business goals I’m painting my living

    room wall User input System prompt Few shot examples Add my data Grounded responses Prompt engineering + RAG Shorter prompts Supervised fine-tuning Smaller, cheaper models Distillation Better precision Good choice! I recommend our Eggshell Paint. Would you like to know more about color choices? LLM output RAFT
  23. Recap: Azure AI Foundry makes fine-tuning seamless Model choice The

    best models from the best providers Choose serverless or managed compute Reliability 99.9% availability for Azure OpenAI models Latency guarantees with PTU-M Foundry platform Everything you need in one place: models, training, evaluation, deployments, and metrics Scalability Start with low cost DevTier to experiment Scale up with PTU-M for production workloads
  24. Feedback Your feedback is valuable. Please submit your thoughts about

    today’s experiences at aka.ms/MicrosoftAITour/Survey …or use the QR code. Scan QR code to respond