AI Customization and Fine Tuning

Slide 1

Slide 1 text

A N A I E N G I N E E R G U I D E T O A I C U S T O M I Z AT I O N & F I N E T U N I N G Nitya Narasimhan, PhD Senior AI Advocate, Microsoft #in/nityan Make Azure AI Real

Slide 2

Slide 2 text

1. Rise of the AI Engineer 2. What is AI Customization? 3. Why is AI Customization Useful? 4. The Model Fine Tuning Process 5. The Azure AI Foundry Platform 6. What’s New in AI Customization? 7. Where Can I Learn More? 8. Summary – Q&A

Slide 3

Slide 3 text

The Rise of the AI Engineer Traditional application development separated model creation roles (data science & ML) from model usage (software engineering) – creating a gap in knowledge.

Slide 4

Slide 4 text

I build foundation AI Models I ship AI Products Generative AI keeps evolving fast. The AI Engineer Role bridges the gap between two existing skillsets • Model Selection • Prompt Engineering • Fine Tuning • Retrieval Augmented Generation • AI-Assisted Evaluation • Agent-Assisted Automation

Slide 5

Slide 5 text

Building my AI Engineer Toolkit – Catalog to Code to Cloud Model Selection How do I pick the right model for my needs? Prompt Engineering How do I design the prompt for optimal responses? RAG Design Architecture How do I ground responses in my data or context Model Fine Tuning How can I customize a pre-trained AI model? Model Evaluation How can I assess the quality of an AI model’s responses? CATALOG CODE Multi-Agent Architecture How do I automate tasks & coordinate complex flows? CLOUD Unified E2EPlatform Rich Developer Tools Model Catalog Code-First SDK AI App Templates

Slide 6

Slide 6 text

Slide 7

Slide 7 text

https://ignite.microsoft.com/sessions/BRK101

Slide 8

Slide 8 text

What is fine tuning? Fine-tuning refers to customizing a pre-trained LLM with additional training on a specific task or new dataset for enhanced performance, new skills, or improved accuracy Curated Data Set LLM Fine-Tuned LLM Azure OpenAI Service uses low rank approximation (LoRA) to fine-tune models. LoRA works by approximating the original high-rank matrix with a lower rank one, only fine-tuning a smaller subset of "important" parameters. This technique reduces the complexity of fine tuning while maintaining performance, making training faster and more affordable.

Slide 9

Slide 9 text

Gen AI journey Plan an iterative path from basic to advanced GenAI leveraging your data Prompt engineering Crafting specialized prompts and pipelines to guide model behavior Retrieval augmented generation (RAG) Combining an LLM/SLM with your enterprise data Fine-tuning Adapting a pre-trained Gen AI model to specific datasets or domains Pre-training Training a GenAI model from scratch Accuracy / Complexity / Compute-Intensive

Slide 10

Slide 10 text

Where does fine-tuning fit in? Fine-tune the LLM to: • Reduce the length of your prompt • Show not tell the model how to behave • Improve the accuracy when you look up information • Improve the model’s handling of retrieved data Will my sleeping bag work for my trip to Patagonia next month? Tone and style Weather lookup Example responses Personalization Intent mapping …and more! User input Prompt engineering Output LLM Basic prompt engineering Retrieval/RAG LLMs are language calculators Yes, your Elite Eco sleeping bag is rated to 21.6F, which is below the average low temperature in Patagonia in September

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Why customize Gen AI models? Scalability Customization enables AI models to scale and adapt to specific enterprise needs. Reducing hallucinations Tailored models are less likely to produce inaccurate or irrelevant responses. Increased reliability Enhances the model's accuracy for domain-specific tasks. Improved efficiency Customization ensures faster and more precise results, saving time and resources. Tailored solutions Models are fine-tuned for specific use cases, providing more relevant and context-aware outcomes.

Slide 13

Slide 13 text

General purpose use cases Reducing prompt length Teaching new skills Improving tool use Domain adaptation

Slide 14

Slide 14 text

Vertical specific use cases Natural language to code Translation and dialects Style and formatting Customer specific knowledge infusion

Slide 15

Slide 15 text

Slide 16

Slide 16 text

Before You Begin – Things to consider 1. Is AI Customization justified? (Benefits & Tradeoffs) 2. Is AI Customization viable? (Model & Data Ready) 3. Is AI Customization successful? (Metrics & Insights) Demo: Model Catalog Demo: GPT-4o-mini FT

Slide 17

Slide 17 text

Fine Tuning Guidance – Using Azure AI Foundry Tooling When to use Azure Open AI Fine Tuning Regional Availability of Fine-Tuning Models Fine Tuning Methods (See: Portal, SDK, REST) There are two unique fine-tuning experiences in Azure AI Foundry portal. Both allow you to fine-tune Azure OpenAI models, but only the Hub/Project view supports fine-tuning non Azure OpenAI models.

Slide 18

Slide 18 text

Fine Tuning for GPT-4o and GPT-4o-mini

Slide 19

Slide 19 text

Vision Fine Tuning with Azure Open AI (Nov 2024)

Slide 20

Slide 20 text

Slide 21

Slide 21 text

Diamonds Demo

Slide 22

Slide 22 text

Slide 23

Slide 23 text

Announcing What’s new in Azure OpenAI fine-tuning? Distillation (Stored Completions + Evals + Fine-tuning) Vision fine-tuning W&B and Gretel integration Provisioned and Global Standard deployments

Slide 24

Slide 24 text

Distillation & Data Generation  Distillation refers to the process of using a large, general purpose teacher model to train a smaller student model to perform well at a specific task.  Distillation is of particular interest for several reasons:  reduce the costs and latency  improve performance.  operate in resource-constrained environments  Distillation typically has three steps:  Data Generation (Stored Completions)  Training (Azure OpenAI Finetuning)  Evaluation (Azure OpenAI Evaluation) From Microsoft Product Terms Azure OpenAI Evaluation Define testing criteria Evaluate your Finetuned model Export data with pass status to fine-tune Fine-tuning Select hyper parameters Finetune a GPT-4o-mini model Stored Completions Log GPT-4o model responses View, query and filter data Export filtered data to fine- tune or evaluation

Slide 25

Slide 25 text

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

Build Your AI Engineer Toolkit – #GenerativeAIForBeginners Model Selection How do I pick the right model for my needs? Prompt Engineering How do I design the prompt for optimal responses? RAG Design Architecture How do I ground responses in my data or context Model Fine Tuning How can I customize a pre-trained AI model? Model Evaluation How can I assess the quality of an AI model’s responses? CATALOG CODE Multi-Agent Architecture How do I automate tasks & coordinate complex flows? CLOUD Unified E2EPlatform Rich Developer Tools Model Catalog Code-First SDK AI App Templates