Harnessing Backend.AI for AI Model Training in Supply Chain Contexts - LEKSIKOV SERGEY,권용근

Harnessing Backend.AI for AI Model Training in Supply Chain Context
Leksikov Sergey, 권용근

Outline 1. Part 1 – Background and Demo 1. Introduction
2. Background 3. Objective 4. Demo 2. Part 2 - Data Preparation and Processing pipeline 1. Data acquisition for Domain Adaptation 2. Synthetic Data Generation Pipeline 3. Part 3 - Model Development and Evaluation 1. Domain Adaptation 2. Pre-Instruction Tuning (PIT) (using FastTrack) 3. Key-Fact Fine-tuning 4. Fine-tuning & Evaluation

Part 1 – Background and Demo

Introduction  "In business, time is money" o Every minute
wasted on manual processes in buyer-supplier communications is a missed opportunity for efficiency

Background: International Trade (국제무역)  International Trade (국제무역) o Exchange
goods or services across borders, connecting supplier and buyers globally o Key in enabling access to broader markets, reducing costs o Supply Chain Management (SCM) ensures smooth operations by managing  Logistics  Customs  Tarrifs  Deliveries

Background: Buyer-Supplier Interaction – Buyer (Initial Inquiry):  "안녕하세요, 귀사의
제품에 대해 더 많은 정보를 알고 싶습니다. 특히 X 제품의 가격과 배송 기간에 대해 알려주시면 감사하겠습니다." – Supplier (Response):  "안녕하세요. X 제품의 가격은 개당 10,000원이며, 최소 주문 수량은 50개입니다. 배송 기간은 주문 후 약 7일 정도 소요됩니다." – Buyer (Negotiation):  "가격을 조금 조정할 수 있을까요? 100개를 주문하면 할인이 가능한지 알고 싶습니다." – Supplier (Response to Negotiation):  "100개 주문 시 5% 할인을 제공해드릴 수 있습니다. 이 경우 개당 9,500원이 되며, 배송 기간은 동일하게 유지됩니다." – Buyer (Request for Quote):  "최종 조건으로 견적서를 보내주시면 감사하겠습니다. 확인 후 바로 주문하겠습니다." – Supplier (Final Quote):  "첨부된 견적서를 확인해주시기 바랍니다. 기타 문의 사항이 있으시면 언제든지 연락 주십시오."

Background: Quote Document  Quote (견적서) document: o Document from
seller (supplier) to buyer o Contains:  price, quantity, delivery date, shipping options

Background: Quote Document – examples

Background: Key Challenges  Large amounts of data o Ex.
Long email communications  Scattered Information Across Multiple Emails  Manual information extraction o Errors can occur o Missing details o Takes long time  Time-Consuming Quote Generation  Privacy and confidentiality o Can not use OpenAI ChatGPT and other API services to help

Objective  Eliminate manual work via automation  Reduce time
spent for information collection and quote generation  Keep information private and confidential

Backend.AI Solution - "QuoteFlow"  "QuoteFlow" System – Automated email
processing and quote generation pipeline using open-source self-hosted LLM run on Backend.AI  Summarization  Key-Fact Extraction  Tag Assignment  Quote Document Generation

Solution – Inference Pipeline

Large Language Model use case  LoRA continual pre-training LLM
on domain adaptation of international trade domain  Synthetic data generation for email conversation using LLM inference o Email conversations are private data and cannot be obtained  LoRA Fine-tuning LLM on extraction key-facts  Using LLM for quote generation using markdown syntax  We use Gemma 2 – 2b Instruct o Multilingual o Small

Text Inference App Demo on Backend.AI - QuoteFlow

Summary and Tag Extraction

Tag Assignment

Key-Fact Extraction

Quote Generation

Part 2-Data Preparation and Processing Pipeline Diving into more details

Domain Adaptation (DA) - Data collection  Gemma 2 LLM
model, may not know some concepts and definition from International Trade  LoRAcontinual pre-training Gemma 2 on vocabulary definition and domain can better help understand email conversation  Lectures were pre-processed by turning them into detailed summaries Lecture transcript

Instruction Fine-tuning  Continual-pretraining Domain Adaptation => Instruction Finetuning the
model: o Answer questions regarding the domain dataset o To summarize given text

Synthetic Dataset Generation

Data Flow

1. Defining random variables for generation

2. Initialize variable for Scenario Generation

3. Scenario Generation

4. Company Profile Generation

Step 4 - Email conversation generation. 28 • PyAutoGen was
used to instantiate Agent for Buyer and Supplier and make them do conversation

5. Key-Fact extraction using LLM

6. Quote Document Generation

7. PDF file conversion

Preparation dataset for finetuning  Input: Email Conversation  Target:
Key-Facts Total synthetic conversations generated and used for finetuning: 1155 The conversations represented as a json format with 'role', 'content' keys.

Training dataset with Input and Target Train Input Train Target

Formatting dataset for training 34 • Gemma – dialogue template
used to format conversations

Part 3 – Model training and Evaluation

Concept of Domain Adaptation 36

Why Domain Adaptation (vs scratch) 37 Train model from scratch
DAPT No general knowledge Already have general knowledge High cost (Relatively) Low cost (Relatively) Require large domain dataset (At least tens of GB) Relatively require small domain dataset Train domain-adaptive model (Domain-specific PLM) Difficult to make sufficient domain-specific dataset

Training Flow 38 Foundation Model Domain- Adaptive Model Pre- Instructed
Model Fine-tuned Model Pretraining on raw domain dataset Fine-tuning on Q/A dataset (PIT) Fine-tuning on extraction details or Summarization & Tags

Dataset: Pre-Instruction Tuning (PIT) 40 Raw text, speech to text
(lecture) Json format, term-description pair Raw text, question-options-answer

Dataset: Pre-Instruction Tuning (PIT) 41

Dataset: Pre-Instruction Tuning (PIT) 42 Q/A Dataset 25,993 Pairs

Training: Pre-Instruction Tuning (PIT) (using FastTrack) 43

Training: Pre-Instruction Tuning (PIT) (using FastTrack) 44 Skipped Running Todo
Log Checking

Training: Pre-Instruction Tuning (PIT) (using FastTrack) 45 Log Checking Status

Evaluation: Pre-Instruction Tuning (PIT) 46 Base model Trained model 0.62
0.77 1. SemScore • Target response와 Model response간의 의미적 유사도. 1.00 is the best Q: 송장과 포장 명세서의 차이점은 무엇인가요? • Target response: 송장은 수량, 단가, 총 금액을 포함하며, 포장 명세서는 순중량, 총중량, 포장단위, 포장수량을 포함합니다. 송장과 포장 명세서는 대부분 동일한 내용을 가지지만, 포장 명세서는 금액 정보가 없고 포장 관련 정보가 추가됩니다. • Model response: 송장과 포장 명세서는 내용이 유사하지만, 포장 명세서는 물픔의 포장 상태를 구체적으로 기재하여 실제 포장의 확인을 위한 기준으로 사용됩니다. Similarity: 0.723 => Trained model 이 Base model에 비해 의미적 유사도를 더 잘 반영하고 있음

Evaluation: Pre-Instruction Tuning (PIT) 47 2. Truthfulness • LLM에게 Model
response와 Target response를 제공한 뒤, 1-5점 사이의 점수를 이유와 함께 제공받음. Q: 운송이 물류에서 어떤 역할을 하는가요? • Target response: 물류비 절감에 중요한 역할을 합니다. • Model response: 운송은 물류의 핵심이며, 효율적인 운송을 위해서는 적절한 수단과 경로를 선택해야 합니다. • Score: 4 • Reason: 모델의 응답은 대부분 사실에 근거하고 있습니다. 모델은 물류의 핵심 요소로서 운송을 정확히 식별하고, 효율적인 운송을 위해 적절한 수단과 경로를 선택하는 것의 중요성을 강조하고 있습니다. 그러나 비용 절감이라는 특정 측면을 직접적으로 다루지는 않아 감점했습니다. => Training was effective: Low scoring(1-2)이 줄어들고, High scoring(4-5)의 비율이 늘어남.

Fine-tuning: Key Fact Extraction 49 Goal: Extract Key Facts in
Yaml Format from email conversation

Dataset: Key Fact Extraction 50 1,143 Set of synthetic email
conversation dataset

Deepspeed를 활용한 Multi-node training (on Backend.AI Cloud) 51 Training code
Deepspeed Config Hostfile.txt Node IP Num of GPUs

Deepspeed를 활용한 Multi-node training (on Backend.AI Cloud) 52 Main1 Container
Sub1 Container Sub2 Container Sub3 Container

Evaluation: Key Fact Extraction (metric: F1-score) 53 Model Response Answer
Key, Value mismatch

Evaluation: Key Fact Extraction (metric: F1-score) 54 True Positive :
Exact matching True Negative: When both true, prediction are empty False Positive: When prediction is not null & prediction is different from true False Negative: When prediction is null but true has value F1-score F1-score: 0.72

Fine-tuning: Summarization & Tags 56 Goal: Extract summary & tags
from email conversation 1. 초기 문의 및 견적 요청: • 발신자: 이민재 한국에너텍 구매담당 과장. • 긴급성과 규정 준수를 강조하면서 에너지 솔루션에 대한 자세한 견적 요청을 보냈습니다. 예산은 5억 원으로 명시됐습니다. 2. 답변: • 발신자: EnerTech Solutions 수석 영업 관리자 David Park. • 태양광 패널, 풍력 터빈, 에너지 저장 시스템, 설치/유지보수 서비스가 포함된 세부 견적을 제공하며 총 5억 원입니다. • 배송 시 결제, 표준 배송과 빠른 배송 옵션, 규정 준수, 5년 보증 등의 서비스 약관이 포함되어 있습니다. Summary ['거래', '협상', '공급망', '에너지', '기술', '계약', '한국'] Tags

Dataset: Summarization & Tags 57 1143 Set of summary &
tags of email conversation

Evaluation: Summarization 58 FineSurE: Fine-grained Summarization Evaluation using LLMs (2024)
Faithfulness: 문서에 없는 정보를 포함하거나, 문서의 내용과 다른 잘못된 정보를 포함하고 있는가? (Hallucination) Completeness: 모든 핵심 사실을 포함하고 있는가? Conciseness: 모델의 요약에 불필요한 세부 정보가 포함되지 않았는가?

Evaluation: Summarization 59 FineSurE: Fine-grained Summarization Evaluation using LLMs (2024)
0.00% 10.00% 20.00% 30.00% 40.00% 50.00% 60.00% 70.00% 80.00% 90.00% 100.00% Faithfuflness Completeness Conciseness Evaluation using FineSurE for nine LLMs Phi-2 Mixtral-8x7b Mixtral-8x7b-Inst Llama3-70b-Inst Gemini-1-pro GPT-3.5-turbo GPT-4-turbo GPT-4-omni Fine-tuned model (Ours) • High faithfulness • High completeness • Better than gpt-4 turbo’s conciseness

Evaluation: Tags 60 ['에너지 솔루션', '에너지 규정', '신속 배송', '지불
조건', '주문 확인', '배송', '보증', '한국 에너테크 주식회사'] Model Generate Tags ['에너지', '에너지', '신속 배송', '지불 조건', '주문 확인', '협상', '보증', '한국'] Adjusted Model generated Tags ['거래', '협상', '공급망', '에너지', '기술', '계약', '한국'] Target Tags 의미적 유사도 비교 F1-score 계산

Evaluation: Tags 61 • Metric: F1-score • 추출해야할 태그의 수와
범위가 명확하지 않아 모델의 성능이 과소평가 될 수 있음 • 의미적으로 유사한 단어는 올바르게 추출한 것으로 간주 • Base model에 비해 89% 향상된 태그 추출 능력

Conclusion  "QuoteFlow" - Backend.AI system pipeline introduction  Domain
Adaptation training  Instruction Finetuning  Synthetic Dataset Generation  Key-Fact Extraction  Evaluation

Appendix

Harnessing Backend.AI for AI Model Training in ...

Harnessing Backend.AI for AI Model Training in Supply Chain Contexts - LEKSIKOV SERGEY,권용근

More Decks by Lablup Inc.

Other Decks in Technology

Featured

Transcript