Finetuning LLMs on consumer GPUs
Aniket Maurya
November 07, 2023
Programming
Transcript
November 2023
1. Finetuning LLMs on consumer GPUs
2. LLM Evaluation framework and datasets
3. Deep Dive into Transformers
4. Effortlessly analyze multifaceted financial documents with LlamaIndex
Finetuning LLMs on custom datasets
Aniket Maurya, Developer Advocate at Lightning AI
November 2023
X.com/aniketmaurya
linkedin.com/in/aniketmaurya
Agenda
• Overview of LLMs
• Parameter efficient finetuning with instruction dataset
• Training on consumer GPUs
Lightning AI ©2023 Proprietary and Confidential. All Rights Reserved.
What are LLMs
Source: Attention is All you Need
*Decoder
Parameter Efficient Finetuning
Source: https://lightning.ai/pages/community/tutorial/lora-llm/
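The slide itself is a figure, so the following is only a minimal sketch of the LoRA idea the linked tutorial covers, not the talk's or Lit-GPT's implementation. It assumes the usual formulation, where a frozen weight W is augmented with a scaled low-rank product B·A and only A and B are trained. The helper names and the toy dimensions are hypothetical; the 4096 x 4096 matrix in the last lines is an illustrative size.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, alpha, rank):
    """Frozen projection plus scaled low-rank update: W x + (alpha / rank) * B (A x)."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]

def lora_param_counts(d_out, d_in, rank):
    """Return (parameters in the full matrix, trainable parameters in A and B)."""
    return d_out * d_in, rank * (d_in + d_out)

random.seed(0)
d_out, d_in, rank, alpha = 6, 8, 2, 4  # toy sizes, chosen only for illustration
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]    # frozen pretrained weight
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]  # trainable, small random init
B = [[0.0] * rank for _ in range(d_out)]                                  # trainable, zero init
x = [random.gauss(0, 1) for _ in range(d_in)]

# With B at zero the adapter contributes nothing, so training starts from the base model.
starts_at_base = lora_forward(W, A, B, x, alpha, rank) == matvec(W, x)

full, lora = lora_param_counts(4096, 4096, 8)
print(f"starts at base model: {starts_at_base}")
print(f"full matrix: {full:,} parameters; LoRA rank 8: {lora:,} trainable ({lora / full:.2%})")
```

The parameter count is the point of the exercise: the adapter trains r·(d_in + d_out) values instead of d_in·d_out, which is what shrinks gradient and optimizer memory.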
Why Finetune LLMs
• Remove untruthfulness and toxicity
• Customize the output and tone of language
• Privacy and control
Finetuning LLMs on instruction dataset
Finetuning LLMs
• Setup model
• Prepare data
• Finetune the model
Lit-GPT
• 4-bit quantized finetuning and inference
• Minimal code, easy to debug and hack
• TPU support
• Flash-Attention 2
Finetuning Llama on instruction dataset
Setup Model
Prepare Dataset
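The dataset slide is an image, so the exact data and script used in the talk are not recoverable here. As a rough sketch of what preparing an instruction dataset usually involves, the code below renders records into the widely used Alpaca-style prompt template. The two records and the function names are hypothetical, and it is an assumption that the talk used this template.

```python
def format_prompt(example):
    """Render one instruction record as an Alpaca-style prompt."""
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input that "
            "provides further context. Write a response that appropriately completes "
            "the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n### Response:\n"
    )

def build_training_text(example):
    """Prompt followed by the target response; this is the text that gets tokenized."""
    return format_prompt(example) + example["output"]

# Hypothetical records in the instruction / input / output layout.
records = [
    {"instruction": "Translate to French.", "input": "Good morning", "output": "Bonjour"},
    {"instruction": "Name a primary color.", "input": "", "output": "Red"},
]

texts = [build_training_text(record) for record in records]
print(texts[0])
```

Records without an input field fall back to the shorter template, so the model never sees an empty "### Input:" section.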
Finetune
CUDA Out Of Memory
Memory Required to load Llama
• Llama 7B, fp32: ~28GB
• Llama 7B, fp16: ~14GB
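The two figures above follow from parameter count times bytes per parameter. The sketch below reproduces that arithmetic, assuming a round 7 billion parameters and 1 GB = 10^9 bytes; the 4-bit row is added for comparison and is not on the slide.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "4-bit": 0.5}

def weight_memory_gb(num_params, precision):
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for precision in ("fp32", "fp16", "4-bit"):
    print(f"Llama 7B, {precision}: ~{weight_memory_gb(7e9, precision):.1f} GB")
```

This covers loading only; training adds gradients, optimizer state, and activations on top.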
Memory Usage
• Activation memory
• Gradient memory
• Optimizer memory
• Model memory
Source: https://tinkerd.net/blog/machine-learning/distributed-training/
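To give a feel for how three of these four terms add up, here is a back-of-the-envelope sketch. It assumes fp32 weights and gradients and an Adam-style optimizer that keeps two fp32 states per trainable parameter (8 bytes). Activation memory is left out because it depends on batch size, context length, and architecture. The function name and the 4 million adapter parameters are hypothetical.

```python
def training_memory_gb(num_params, trainable_params,
                       weight_bytes=4, grad_bytes=4, optimizer_bytes=8):
    """Rough model, gradient, and optimizer memory in GB; activations are excluded."""
    return {
        "model": num_params * weight_bytes / 1e9,
        "gradients": trainable_params * grad_bytes / 1e9,
        "optimizer": trainable_params * optimizer_bytes / 1e9,
    }

# Full finetuning: every one of the 7B parameters is trainable.
full = training_memory_gb(num_params=7e9, trainable_params=7e9)

# Adapter-style finetuning: a hypothetical 4 million trainable parameters.
adapter = training_memory_gb(num_params=7e9, trainable_params=4e6)

for name, estimate in (("full", full), ("adapter", adapter)):
    total = sum(estimate.values())
    print(f"{name}: {estimate} -> ~{total:.1f} GB before activations")
```

Under these assumptions full finetuning needs roughly four times the weight memory before a single activation is stored, while freezing most parameters leaves the model term dominant.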
Avoid OOM
• Reduce the micro batch size
• Reduce the model's context length
• Use lower precision
• 4-bit quantization
72% memory reduction
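As an illustration of what 4-bit storage means, the sketch below quantizes a block of weights to signed 4-bit codes with a single absmax scale and packs two codes per byte. This is a deliberately simplified scheme, not the NF4 format that libraries such as bitsandbytes implement, and it does not attempt to reproduce the 72% figure from the slide. All names and the sample weights are hypothetical.

```python
def quantize_block(values):
    """Map floats to signed 4-bit codes in [-7, 7] using one absmax scale per block."""
    absmax = max(abs(v) for v in values) or 1.0
    scale = absmax / 7
    codes = [max(-7, min(7, round(v / scale))) for v in values]
    return codes, scale

def dequantize_block(codes, scale):
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale for c in codes]

def pack_nibbles(codes):
    """Store two 4-bit codes per byte; codes are offset by 8 to make them unsigned."""
    padded = codes + [0] if len(codes) % 2 else codes
    return bytes(((a + 8) << 4) | (b + 8) for a, b in zip(padded[0::2], padded[1::2]))

def unpack_nibbles(packed, count):
    """Inverse of pack_nibbles, returning the first `count` signed codes."""
    codes = []
    for byte in packed:
        codes.extend(((byte >> 4) - 8, (byte & 0x0F) - 8))
    return codes[:count]

weights = [0.12, -0.53, 0.98, -0.07, 0.31, -0.88, 0.44, 0.05]  # hypothetical weight block
codes, scale = quantize_block(weights)
packed = pack_nibbles(codes)
restored = dequantize_block(unpack_nibbles(packed, len(weights)), scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"{len(weights)} weights stored in {len(packed)} bytes plus one scale")
print(f"max round-trip error: {max_error:.4f} (bound: {scale / 2:.4f})")
```

Eight fp32 weights would take 32 bytes; here they take 4 bytes plus the per-block scale, at the cost of a rounding error of at most half a quantization step.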
Avoid OOM
• Reduce the micro batch size
• Reduce the model's context length
• Use lower precision
• 4-bit quantization
• Do sharding across multiple GPUs
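Reducing the micro batch size lowers peak activation memory, and it is commonly paired with gradient accumulation so that the effective batch size stays the same. The slides do not mention accumulation, so treat that pairing as an assumption. The toy sketch below uses a one-parameter model y = w * x with hypothetical data and shows that averaging micro-batch gradients reproduces the full-batch gradient.

```python
def mse_gradient(w, batch):
    """Gradient of mean squared error for the one-parameter model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_gradient(w, batch, micro_batch_size):
    """Process the batch in equal micro batches and average their gradients."""
    if len(batch) % micro_batch_size:
        raise ValueError("batch size must be a multiple of the micro batch size")
    steps = len(batch) // micro_batch_size
    total = 0.0
    for i in range(steps):
        micro = batch[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Only this micro batch would need to be resident in memory at once.
        total += mse_gradient(w, micro) / steps
    return total

# Hypothetical (x, y) pairs and starting weight.
batch = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8),
         (5.0, 10.1), (6.0, 12.2), (7.0, 13.8), (8.0, 16.1)]
w = 0.5

full_grad = mse_gradient(w, batch)
accum_grad = accumulated_gradient(w, batch, micro_batch_size=2)
print(f"full batch: {full_grad:.6f}, accumulated over micro batches: {accum_grad:.6f}")
```

The two gradients agree up to floating-point rounding, so the optimizer step is unchanged while only one micro batch of activations is held at a time.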
Conclusion
• Lit-GPT with LoRA finetuning
• Lower Precision and 4-bit quantization
• Distributed training and activation checkpointing
Aniket Maurya