Slide 1

Slide 1 text

Implementing a RAG System in Dart
@jaichangpark
*RAG: Retrieval Augmented Generation

Slide 2

Slide 2 text

PARK JAI-CHANG (박제창) @jaichangpark
Dreamus Company
Community:
● Flutter Seoul Organizer
● GDG Golang Korea Organizer

Slide 3

Slide 3 text

Agenda
1. Overview
2. What's RAG (Retrieval Augmented Generation)
3. LangChain
4. Build a RAG App with Dart

Slide 4

Slide 4 text

Overview 4

Slide 5

Slide 5 text

Overview 5

Slide 6

Slide 6 text

NLP (Natural Language Processing) Overview 6

Slide 7

Slide 7 text

Overview: What's a Large Language Model?
● LLM stands for Large Language Model.
● An LLM is a language model based on artificial neural networks, trained on vast amounts of text data.
● LLMs are limited in their ability to reason about the latest data; various methods are being developed to overcome this.
● LLMs are slightly different from search engines (LLMs are more often preferred for tasks requiring nuanced understanding and language processing).
● Examples of usage:
○ Chatbots: can answer questions and engage in conversations.
○ Text summarization: can summarize long documents into shorter versions.
○ Translation: can translate from one language to another.
○ Writing: can perform creative writing and code generation.

Slide 8

Slide 8 text

@source: Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond The evolutionary tree of modern LLMs traces the development of language models in recent years and highlights some of the most well-known models. Models on the same branch have closer relationships. Transformer-based models are shown in non-grey colors: decoder-only models in the blue branch, encoder-only models in the pink branch, and encoder-decoder models in the green branch. The vertical position of the models on the timeline represents their release dates. Open-source models are represented by solid squares, while closed-source models are represented by hollow ones. The stacked bar plot in the bottom right corner shows the number of models from various companies and institutions. Overview The evolutionary tree of modern LLMs 8

Slide 9

Slide 9 text

Overview: Limitations of LLMs in Learning from the Latest Data

Slide 10

Slide 10 text

Overview: Limitations of LLMs in Learning from the Latest Data

Slide 11

Slide 11 text

Overview: Limitations of LLMs in Learning from the Latest Data

Slide 12

Slide 12 text

Overview: LLM Hallucination
3 Causes of LLM Hallucinations
● Data Limitations: The training data may not cover all possible knowledge areas comprehensively, leading the model to fill gaps with fabricated information.
● Model Training: When a model becomes overfitted, it generates outputs that are highly specific to the training data and do not generalize well to new data, which can lead to hallucinations or irrelevant outputs. Training-data bias or inaccuracy and high model complexity also contribute.
● Prompt Ambiguity: Ambiguous or poorly phrased prompts can lead the model to generate responses that diverge from factual accuracy.

Slide 13

Slide 13 text

Top 100: South Korea (2024-05-27)
@source: FLO www.music-flo.com/

Slide 14

Slide 14 text

Top 100: South Korea (2024-05-27)
@source: FLO www.music-flo.com/

Slide 15

Slide 15 text

Overview: Example of LLM Hallucination
@source: https://en.wikipedia.org/wiki/NewJeans
https://chatgpt.com/share/dac11c95-89a7-40ec-88bd-deb93b4c6fd3

Slide 16

Slide 16 text

Overview: Preventing LLM Hallucinations
Example query: "When is NewJeans' Japanese debut date?"
1. Augment the user query with data gathered through web searches.
2. This enables responses to questions about the latest data.
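The augmentation step above can be sketched in plain Dart (no packages needed): retrieved snippets are simply placed into the prompt ahead of the user's question before it is sent to the LLM. The function name and prompt wording here are illustrative, not from the slides.

```dart
// Sketch of retrieval augmentation: prepend retrieved, up-to-date snippets
// to the user's question so the LLM answers from fresh context.
String augmentPrompt(String question, List<String> retrievedSnippets) {
  final context = retrievedSnippets.join('\n');
  return 'Answer the question based only on the following context:\n'
      '$context\n\nQuestion: $question';
}

void main() {
  final prompt = augmentPrompt(
    "When is NewJeans' Japanese debut date?",
    ['(web search snippet about the debut date would go here)'],
  );
  print(prompt);
}
```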

Slide 17

Slide 17 text

What’s RAG Retrieval Augmented Generation 17

Slide 18

Slide 18 text

RAG: Retrieval Augmented Generation
Retrieval-augmented generation (RAG) is a software architecture and technique that integrates large language models with external information sources, enhancing the accuracy and reliability of generative AI models by incorporating specific business data such as documents, SQL databases, and internal applications.
@source: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Slide 19

Slide 19 text

Retrieval Augmented Generation RAG 19 @source: https://arxiv.org/abs/2005.11401

Slide 20

Slide 20 text

RAG: 8-Step Process (steps 1-4). Load documents & save to a vector DB
1. Load: Read data from documents (PDF, Word, XLSX), web pages, Notion, Confluence, etc.
2. Split: Split the loaded document into chunks.
3. Embedding: Convert each chunk into a vector representation.
4. Vector Store: Save the converted vectors in a DB.
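The four ingestion steps above can be sketched end to end with LangChain.dart, using the same APIs that appear on the later slides. This assumes the companion packages langchain_openai and langchain_chroma, a Chroma server running on localhost:8000, and a placeholder OpenAI API key; it is a sketch under those assumptions, not a drop-in implementation.

```dart
import 'package:langchain/langchain.dart';
import 'package:langchain_chroma/langchain_chroma.dart';
import 'package:langchain_openai/langchain_openai.dart';

// Steps 1-4: load -> split -> embed -> store.
// `rawText` is assumed to be text already extracted from a PDF (step 1).
Future<void> ingest(String rawText) async {
  // 2. Split into overlapping chunks so each fits the embedding model.
  const splitter = RecursiveCharacterTextSplitter(
    chunkSize: 2000,
    chunkOverlap: 400,
  );
  final docs = splitter.createDocuments(splitter.splitText(rawText));

  // 3. Embed each chunk as a vector (placeholder API key).
  final embeddings = OpenAIEmbeddings(apiKey: 'YOUR_OPENAI_API_KEY');

  // 4. Persist the vectors in a local Chroma instance.
  final vectorStore = Chroma(
    baseUrl: 'http://localhost:8000',
    embeddings: embeddings,
  );
  await vectorStore.addDocuments(documents: docs);
}
```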

Slide 21

Slide 21 text

RAG: 8-Step Process (steps 5-8). Search documents & get results
5. Retrieval: Similarity search (cosine, MMR).
6. Prompt: A prompt to derive the desired results based on the search results.
7. LLM: Select a model (GPT-4o, GPT-4, GPT-3.5, Gemini, Llama, Gemma, etc.).
8. Output: Output as text, JSON, Markdown, etc.
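The cosine similarity mentioned in step 5 is easy to state in plain Dart: it scores how close a query embedding is to each stored chunk embedding, with 1.0 meaning identical direction and 0.0 meaning orthogonal. This is an illustrative sketch; in the app the search runs inside the vector store.

```dart
import 'dart:math' as math;

// Cosine similarity between two embedding vectors of equal length.
double cosineSimilarity(List<double> a, List<double> b) {
  var dot = 0.0, normA = 0.0, normB = 0.0;
  for (var i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (math.sqrt(normA) * math.sqrt(normB));
}

void main() {
  print(cosineSimilarity([1, 0], [1, 0])); // 1.0 (same direction)
  print(cosineSimilarity([1, 0], [0, 1])); // 0.0 (orthogonal)
}
```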

Slide 22

Slide 22 text

RAG Process 22

Slide 23

Slide 23 text

How to Implement?
1. Build an LLM-based application using the RAG architecture with LangChain.
2. Build an LLM-based app using LlamaIndex.
3. Implement all the steps yourself as a developer.

Slide 24

Slide 24 text

🦜🔗 Langchain 24

Slide 25

Slide 25 text

LangChain: What's LangChain?
● LangChain is a framework for developing applications powered by large language models (LLMs).
a. Open-source libraries: Build your applications using LangChain's modular building blocks and components.
b. Productionization: Inspect, monitor, and evaluate your apps with LangSmith so that you can constantly optimize and deploy with confidence.
c. Deployment: Turn any chain into a REST API with LangServe.
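As a taste of those building blocks in Dart, a minimal prompt -> model -> parser chain with LangChain.dart might look like the sketch below. It assumes the langchain and langchain_ollama packages and a local Ollama server with the llama3 model pulled; the model name and prompt text are illustrative.

```dart
import 'package:langchain/langchain.dart';
import 'package:langchain_ollama/langchain_ollama.dart';

Future<void> main() async {
  // Building blocks: a prompt template, a chat model, and an output parser,
  // composed with pipe() in the LangChain Expression Language style.
  final prompt =
      PromptTemplate.fromTemplate('Tell me a one-line fact about {topic}.');
  final llm = ChatOllama(
    defaultOptions: const ChatOllamaOptions(model: 'llama3'),
  );
  const parser = StringOutputParser();
  final chain = prompt.pipe(llm).pipe(parser);
  print(await chain.invoke({'topic': 'Dart'}));
}
```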

Slide 26

Slide 26 text

LangChain: LangChain Dart
● LangChain.dart is an unofficial Dart port of the popular LangChain Python framework created by Harrison Chase.
@source: https://pub.dev/packages/langchain

Slide 27

Slide 27 text

LangChain Dart: Motivation
● The adoption of LLMs is creating a new tech stack in its wake. However, emerging libraries and tools are predominantly being developed for the Python and JavaScript ecosystems. As a result, the number of applications leveraging LLMs in these ecosystems has grown exponentially.
● In contrast, the Dart / Flutter ecosystem has not experienced similar growth, which can likely be attributed to the scarcity of Dart and Flutter libraries that streamline the complexities associated with working with LLMs.
● LangChain.dart aims to fill this gap by abstracting the intricacies of working with LLMs in Dart and Flutter, enabling developers to harness their combined potential effectively.
@source: https://github.com/davidmigloz/langchain_dart

Slide 28

Slide 28 text

LangChain Dart
Pros:
● Even without knowing Python or JS, you can create an LLM app (client) using only the Dart language.
● The official documentation is well-organized and easy to use: https://langchaindart.dev/#/
Cons:
● Currently, the number of supported third-party libraries is limited.

Slide 29

Slide 29 text

Build a RAG App with Dart 29

Slide 30

Slide 30 text

Architecture 30

Slide 31

Slide 31 text

PDF Loader: 1 Data Loader

dependencies:
  flutter:
    sdk: flutter
  flutter_pdf_text: 0.6.0

Future<void> _pickPDFText() async {
  final filePickerResult = await FilePicker.platform.pickFiles();
  if (filePickerResult != null) {
    _pdfDoc = await PDFDoc.fromPath(filePickerResult.files.single.path!);
    String text = await _pdfDoc!.text;
    setState(() {});
  }
}

Future<void> _fromPDFURL() async {
  if (urlTextController.text.isNotEmpty) {
    _pdfDoc = await PDFDoc.fromURL(urlTextController.text.trim());
    String text = await _pdfDoc!.text;
    setState(() {});
  }
}

Slide 32

Slide 32 text

PDF Loader: 1 Data Loader
Data: K-pop idol groups & Wikipedia
(Same data-loader code as on the previous slide.)

Slide 33

Slide 33 text

PDF Loader: 1 Data Loader (storing the extracted text in state)

dependencies:
  flutter:
    sdk: flutter
  flutter_pdf_text: 0.6.0

Future<void> _pickPDFText() async {
  final filePickerResult = await FilePicker.platform.pickFiles();
  if (filePickerResult != null) {
    _pdfDoc = await PDFDoc.fromPath(filePickerResult.files.single.path!);
    String text = await _pdfDoc!.text;
    setState(() {
      _text = text;
    });
  }
}

Slide 34

Slide 34 text

2 Text Split

dependencies:
  flutter:
    sdk: flutter
  langchain:

List<Document> docs = [];
const splitter = RecursiveCharacterTextSplitter(
  chunkSize: 2000,
  chunkOverlap: 400,
);
final splitLists = splitter.splitText(_text);
final ids = splitLists.map((e) => const Uuid().v4()).toList();
docs = splitter.createDocuments(splitLists, ids: ids);

Slide 35

Slide 35 text

2 Text Split
(Same splitting code as on the previous slide.)

Slide 36

Slide 36 text

2 Text Split
(Diagram: the original text is split into overlapping chunks. Same splitting code as on slide 34.)

Slide 37

Slide 37 text

2 Text Split
(Same splitting code as on slide 34.)

Slide 38

Slide 38 text

3 Embedding

dependencies:
  flutter:
    sdk: flutter
  langchain:

const openaiApiKey = "API KEY";
final embeddings = OpenAIEmbeddings(apiKey: openaiApiKey);

// On the Android emulator, 10.0.2.2 maps to the host machine's localhost.
String baseUrl = "localhost";
if (defaultTargetPlatform == TargetPlatform.android) {
  baseUrl = "10.0.2.2";
}

// Alternative: a local embedding model served by Ollama.
final embeddings = OllamaEmbeddings(model: "nomic-embed-text");

Slide 39

Slide 39 text

4 Vector Store

dependencies:
  flutter:
    sdk: flutter
  langchain:

final vectorStore = Chroma(
  baseUrl: "http://$baseUrl:8000",
  embeddings: embeddings,
);
await vectorStore.addDocuments(documents: docs);

Slide 40

Slide 40 text

4 Vector Store, w/ Text Embedding
(Same vector-store code as on the previous slide.)

Slide 41

Slide 41 text

5 Retriever: Similarity search

final retriever = vectorStore.asRetriever();
final setupAndRetrieval = Runnable.fromMap({
  'context': retriever.pipe(
    Runnable.mapInput((docs) => docs.map((d) => d.pageContent).join('\n')),
  ),
  'question': Runnable.passthrough(),
});

Slide 42

Slide 42 text

5 Retriever: Similarity search for the query
(Same retriever code as on the previous slide.)

Slide 43

Slide 43 text

6 Prompt ("Prompt is an art")

final promptTemplate = PromptTemplate.fromTemplate(
    "Answer the question based on only the following "
    "context:\n{context}\n{question}");

final chatPromptTemplate = ChatPromptTemplate.fromTemplates(
  const [
    (
      ChatMessageType.system,
      'Answer the question based on only the following context:\n{context}'
    ),
    (ChatMessageType.human, "\n{question}"),
  ],
);

Slide 44

Slide 44 text

6 Prompt
(Same prompt code as on the previous slide; {context} is filled with the retriever's search results and {question} with the user query.)

Slide 45

Slide 45 text

7. LLM 45

Slide 46

Slide 46 text

7 LLM

final llm = ChatOllama(
  baseUrl: "http://10.0.2.2:11434/api",
  defaultOptions: ChatOllamaOptions(
    temperature: 0.1,
    model: modelValue, // e.g. aya, llama3, gemma, etc.
  ),
);

Slide 47

Slide 47 text

8 Chain (LCEL) & Output

const outputParser = StringOutputParser();
final chain = setupAndRetrieval
    .pipe(promptTemplate)
    .pipe(llm)
    .pipe(outputParser);

final result = chain.stream(text);
await for (final chunk in result) {
  setState(() {
    resultText += chunk;
  });
}

Slide 48

Slide 48 text

8 Chain (LCEL) & Output
(Same chain code as on the previous slide: the user query goes in, and the streamed output is appended to the UI.)

Slide 49

Slide 49 text

Summary: Implementing a RAG System with the Dart Language
● The RAG architecture is used to mitigate hallucination in LLMs and to obtain accurate answers about personal data and up-to-date information.
● Recent LLM services handle questions about the latest data by performing web searches in advance (Agent).
● The LangChain and LlamaIndex frameworks are mainly used for developing LLM apps.
● LangChain is implemented in Python and JavaScript.
● With the LangChain Dart package, LLM app development can be done easily and quickly.
● We learned how to implement it based on the basic 8 steps of RAG with Dart.

Slide 50

Slide 50 text

THANK YOU! @jaichangpark