[AITour 26] Efficient model customization with Azure AI Foundry

Efficient model customization with Azure AI Foundry Nitya Narasimhan, PhD
Senior AI Advocate Microsoft

What we will cover today Unlock business value with fine-tuning
Model customization is critical when you need to improve the quality, accuracy, and performance of your AI applications Fine-tuning in Azure AI Foundry When is fine-tuning the right choice for me? Why is Azure AI Foundry the right platform to use? Distillation demo Reduce cost by using larger model to train smaller, cheaper model Hands-on with Hybrid RAG+FT Why do we need hybrid customization approaches? How does RAFT improve precision for RAG usage? RAFT demo Improve precision of RAG by teaching model to be more selective Summary Add fine-tuning to your AI engineering toolkit Try Azure AI Foundry for your fine-tuning tasks

Unlock business value with model optimization

Why does Zava need model customization? Bruno Zhao Zava Customer
Help me find the right product—for the right price—and I’ll buy it! Make it helpful Customize Cora’s tone & response format to be polite, helpful & conversational Robin Counts Retail Store Manager Help me build loyalty & move inventory— to grow my store sales Make it precise Improve Cora’s use of retrieved knowledge, reducing customer frustration & driving conversions Kian Lambert App Dev Manager Help me operate Cora to be more effective in cost, response quality Make it cheaper Distill Cora’s knowledge into smaller, cheaper models—reducing cost with comparable accuracy

The Zava AI Engineer’s customization journey Question-answering task using natural
language I’m painting my living room wall. What should I buy? User input Start with prompt engineering Add your data with RAG Tone and style Example responses Intent mapping Query extraction Inventory retrieval Personalization Context engineering Optimize model with fine-tuning Deliver fast, accurate response with less cost Good choice! I recommend our Eggshell Paint. Would you like to know more about color choices? Model adaptation Desired output

What does fine-tuning mean? LLM Fine-tuned LLM Fine-tuning refers to
customizing a pre-trained LLM with additional training on a specific task or new dataset for enhanced performance, new skills, or improved accuracy

Why should Zava consider fine-tuning? Domain-specific optimization Task-specific optimization Reduced
token consumption Efficient resource utilization Smaller models, faster response Shorter prompts, improve response Improve quality Reduce cost Reduce latency Example: Zava has a domain-specific focus (retail) and task-specific focus (question-answering). Let’s think about how fine-tuning can help optimization

How can Zava use fine-tuning for optimization? Fine tuning Optimize
Context RAG Hybrid Fine Tuning + RAG High Low High Low Optimize Model Behavior Do something different Learn something new Deliver optimized experience I want customer responses to use only Zava product data as relevant context I want to improve usage of RAG to be more precise I want faster response to customer queries I want to save token costs with shorter prompts I want to adapt model to reflect Zava tone & style

How can Azure AI Foundry help the Zava AI developer?
Azure AI Foundry Faster models, simpler methods GPT-4.1, -mini, -nano o4-mini RFT Llama 4 Scout Fine tune anywhere Global training for OAI Datazone for OSS Developer tier Experiment with ease —no hosting fee Enterprise ready ai.azure.com PTU-M, Global GA Extended model support

Customization options in Azure AI Foundry

What are my fine-tuning options in Azure AI Foundry? Supervised
fine-tuning Module learns from examples Ex: Content generation task Reinforcement fine-tuning Use grader to reinforce CoT Ex: Reasoning tasks Model distillation Transfer learning to cheaper model Optimize for COST Vision fine-tuning Preference fine-tuning Hybrid fine-tuning Improve image understanding Ex: ClassificationTask Provide good & bad examples Ex: Tone adaptation Improve model use of RAG context Optimize for PRECISION

What are the added benefits of fine-tuning here? Built-in safety
Continual fine-tuning Simplified fine-tuning developer experience Global Training For fine-tuning Developer Tier For model hosting Cost-effective training & testing

How can I fine-tune my model to customize the tone?
Decide vision and scope Choose base model Choose FT technique Pick enterprise-ready model options Dataset Fine-tuning Evaluation Deploy and monitor Regularly benchmark and iterate!

Demo: Supervised fine-tuning in Zava Bruno Zhao Zava Customer Make
it helpful Customize Cora’s tone & response format to be polite, helpful & conversational

How can I use distillation to reduce cost of operation?
Distillation is the process of using a large, general-purpose teacher model to train a smaller student model to perform better at a specific task Task GPT-4.1 Data generation Data Distillation GPT-4.1-nano Training Teacher Model Student Model Distilled Model Evaluation 4.1-nano*

Demo: Using distillation for Zava Kian Lambert App Dev Manager
Make it cheaper Distill Cora’s knowledge into smaller, cheaper models— reducing cost with comparable accuracy

Retrieval-augmented fine-tuning

When search brings you the wrong tools for the job
Query “What’s the best type of paint to use for a wooden garden bench that stays outside all year?” What often happens Results about painting indoor furniture Results about painting metal benches Articles about wall paint Impact Slower customer help, wasted time, and irrelevant recommendations

Not all matches are created equal Query “What’s the best
type of paint to use for a wooden garden bench that stays outside all year?” Spray painting a wrought-iron bench How to paint dining chairs Distractor documents Relevant documents Best paint options for outdoor wood How to paint wood bench

What RAFT does Teach the model to focus on relevant
documents and why RAFT = Retrieval Augmented Finetuning Uses “hard negatives” (distractors) to teach the model: These look similar, but are wrong These are the true matches Uses Chain of Thought to teach the model why

The Oracle context The gold standard answer RAFT aims for
Exterior acrylic paint is ideal for outdoor wooden benches because it offers excellent resistance to moisture and UV rays. Apply two thin coats, allowing each to dry fully, and finish with a clear exterior-grade polyurethane sealer for maximum protection.

The value to us Why this matters for a hardware
store Staff find the right product guidance faster Customers get accurate, reliable answers Less time spent filtering irrelevant advice Higher customer trust → higher likelihood of purchase + higher customer loyalty

Demo: Hands-on with RAFT Robin Counts Retail Store Manager Make
it precise Improve Cora’s use of retrieved knowledge, reducing customer frustration & driving conversions

RAFT Process

Phase 1/4: Dataset generation

RAFT Dataset generation Documents Fine-tuning dataset Questions LLM Chunks Chunking
Training Valid Eval Oracle and distractor chunks Answers CoT LLM CoT CoT CoT CoT

Phase 2/4: Fine-tuning

Phase 3/4: Model deployment

Phase 4/4: Eval / a - Testing Baseline

Phase 4/4: Eval - a - Testing Student

Phase 4/4: Eval - c - Scoring

Phase 4/4: Eval - d - Results

RAFT makes search as reliable as asking the store expert
By teaching the system to ignore misleading but similar content, RAFT ensures the right answer is always at the top— just like asking the most knowledgeable person in the store Summary

Summary & recap

Recap: Model customization unlocks business goals I’m painting my living
room wall User input System prompt Few shot examples Add my data Grounded responses Prompt engineering + RAG Shorter prompts Supervised fine-tuning Smaller, cheaper models Distillation Better precision Good choice! I recommend our Eggshell Paint. Would you like to know more about color choices? LLM output RAFT

Recap: Azure AI Foundry makes fine-tuning seamless Model choice The
best models from the best providers Choose serverless or managed compute Reliability 99.9% availability for Azure OpenAI models Latency guarantees with PTU-M Foundry platform Everything you need in one place: models, training, evaluation, deployments, and metrics Scalability Start with low cost DevTier to experiment Scale up with PTU-M for production workloads

Learn more: Unlock business value with fine tuning Read Whitepaper
Visit GitHub Repo

Feedback Your feedback is valuable. Please submit your thoughts about
today’s experiences at aka.ms/MicrosoftAITour/Survey …or use the QR code. Scan QR code to respond

aka.ms/MicrosoftAITour/BRK443 Download today’s presentation …or scan the QR code. Scan
QR code to download

aka.ms/BestModelGenAISolution aka.ms/BRK443GHrepo Next steps to advance your AI expertise

aka.ms/aiagentopenhack Free In-Person Hands-on Learning …or scan the QR code.
Scan QR code to download

[AITour 26] Efficient model customization with ...

[AITour 26] Efficient model customization with Azure AI Foundry

More Decks by Nitya Narasimhan, PhD

Other Decks in Technology

Featured

Transcript