Slide 1

Slide 1 text

Harness the Power of Advanced LLM and CI/CD Practices
DSSG x CircleCI
Masahiko Funaki, Principal Developer Advocate, CircleCI JAPAC

Slide 2

Slide 2 text

Speaker
Current Role: Principal Developer Advocate, CircleCI JAPAC
Previous Companies: Microsoft > SAP > Sybase iAnywhere > Dejima (aka US DARPA IRIS or Siri)
Based in: Tokyo, Japan

Slide 3

Slide 3 text

Agenda
1. Introducing CI (Continuous Integration) and CD (Continuous Deployment)
2. Developing and Testing (Evaluating) LLM Applications Using Google Gemini + LangChain + CircleCI
● Automating Evaluations
● Automating Model-Graded Evaluation
● Comprehensive Testing Framework

Slide 4

Slide 4 text

Key Outcomes
● Gain confidence in coding LLM applications using Google Gemini with LangChain.
● Acquire a solid understanding of key evaluation (testing) components.
● Become familiar with automating the evaluation (testing) of LLM applications using CircleCI.

Slide 5

Slide 5 text

Introducing CI and CD

Slide 6

Slide 6 text

DevOps (DEVelopment + OPerationS)
Pipeline: PLAN → CODE → BUILD → TEST → RELEASE → DEPLOY → OPERATE → MONITOR
● PLAN and CODE are human work; the remaining stages can be automated.
● CI (Continuous Integration) covers BUILD and TEST; CD covers RELEASE and DEPLOY.
● A trigger (for example, a code push) kicks off the automated stages.
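The CI and CD halves of the loop map naturally onto a pipeline configuration. As an illustrative sketch only (the job names here are assumptions, not taken from the course), a CircleCI workflow can gate the CD stage on the CI jobs:

```yaml
# Hypothetical sketch: BUILD/TEST (CI) must pass before DEPLOY (CD) runs.
version: 2.1
workflows:
  ci-cd:
    jobs:
      - build-and-test        # CI: triggered by every push
      - deploy:               # CD: runs only after CI succeeds
          requires:
            - build-and-test
          filters:
            branches:
              only: main      # deploy only from the main branch
```

The `requires` key is what turns two independent jobs into the TEST-then-DEPLOY ordering shown in the loop above.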

Slide 7

Slide 7 text

Benefits of CircleCI (compared to other CI/CD tools)
● Wider coverage of CI/CD environments and SSH support
○ GPU (Nvidia Tesla P4/T4/A10G/V100)
○ Linux VM/Container (x86/Arm)
○ macOS (x86/Apple Silicon)
○ Windows
○ Self-hosted Runner
● Intelligent Parallel Test Assignment

Slide 8

Slide 8 text

Wider coverage of CI/CD environments and SSH support

Slide 9

Slide 9 text

Intelligent Parallel Test Assignment
Running tests on CircleCI is faster than on your PC.
Parallel test assignment strategies:
● By past execution durations (intelligent assignment)
● By total file size of the tests (acceptable assignment)
● By filename (controllable assignment)
Example timings and credit costs:
● Parallelism 1: 6:20, ¢4.2
● Parallelism 3: 2:09 (about 1/3 the duration), ¢5.4 (about 1.28x the cost)
● Parallelism 5: 1:44 (more than 1/5 the duration), ¢6.0 (about 1.43x the cost)
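The timing-based split above can be sketched in config. This is a hedged example (the job name and glob pattern are assumptions): with `parallelism` set, `circleci tests split --split-by=timings` assigns test files to containers based on past execution durations.

```yaml
# Hypothetical sketch of intelligent (timing-based) test splitting.
jobs:
  test:
    docker:
      - image: cimg/python:3.10.5
    parallelism: 5            # run five containers in parallel
    steps:
      - checkout
      - run:
          name: Run this container's share of the tests
          command: |
            TESTS=$(circleci tests glob "tests/test_*.py" | circleci tests split --split-by=timings)
            python -m pytest $TESTS
```

Each of the five containers receives a different subset of `$TESTS`, which is how the wall-clock duration drops toward 1/5 while total compute (and cost) rises only modestly.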

Slide 10

Slide 10 text


Slide 11

Slide 11 text

Developing and Testing (Evaluating) LLM Applications Using Google Gemini + LangChain + CircleCI

Slide 12

Slide 12 text

CircleCI: Document Search using LLM

Slide 13

Slide 13 text

CircleCI: Knowledge Base Search using LLM

Slide 14

Slide 14 text

CircleCI: Build Error – CircleCI or App Problems?

Slide 15

Slide 15 text

Repositories
● Automated Per-Commit Evaluations
○ https://github.com/mfunaki-circleci/llmops-course1
● Automating Model-Graded Evaluations
○ https://github.com/mfunaki-circleci/llmops-course2
● Comprehensive Testing Framework
○ https://github.com/mfunaki-circleci/llmops-course3

Slide 16

Slide 16 text

Code

from langchain_google_vertexai import ChatVertexAI

llm = ChatVertexAI(
    project='plucky-agent-412507',
    model_name="gemini-pro",
    convert_system_message_to_human=True,
    temperature=0)
llm.invoke("ask me a quiz")

Result (truncated on the slide)

AIMessage(content='1. What is the capital of France?\n a) Paris\n b) London\n c) Madrid\n d) Rome\n\n\n2. What is the largest country in the world by land area?\n a) China\n b) Brazil\n c) Russia\n d) Canada\n\n\n3. Who was the first president of the United States?\n a) George Washington\n b) Thomas Jefferson\n c) Abraham Lincoln\n d) Franklin D. Roosevelt\n\n\n4. What is the chemical symbol for gold?\n a) Au\n b) Ag\n c) Cu\n d) Fe\n\n\n5. What is the scientific name for the common house cat?\n a) Felis domesticus\n b) Canis lupus familiaris\n c) Bos taurus\n d) Equus caballus\n\n\n6. What planet is known as the "Red Planet"?\n a) Mars\n b) Jupiter\n c) Saturn\n d) Uranus\n\n\n7. What is the name of the largest ocean in the world?\n a) Pacific Ocean\n b) Atlantic Ocean\n c) Indian Ocean\n d) Arctic Ocean\n\n\n8. What is the chemical formula for

ChatPromptTemplate and StrOutputParser
If you act as a teacher, you need to specify:
1) the subject (category)
2) the textbook from which to ask questions (quiz_bank)
3) the number of quizzes and their format (prompt_template)
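The three ingredients listed above can be sketched in plain Python before wiring them into a ChatPromptTemplate. Everything below (the function name, the wording of the template) is illustrative, not the course's exact prompt:

```python
# Illustrative sketch (not the course's exact prompt): combining the
# three ingredients from the slide into one quiz-generation prompt.
def build_quiz_prompt(category: str, quiz_bank: str, num_questions: int = 3) -> str:
    """Assemble subject, source material, and output format into a prompt."""
    return (
        "You are a teacher preparing a quiz.\n"
        f"Subject: {category}\n"
        "Use only facts from the quiz bank below.\n"
        f"Quiz bank:\n{quiz_bank}\n"
        f"Write {num_questions} multiple-choice questions, "
        "each with four options labeled a) to d)."
    )

prompt = build_quiz_prompt("Geography", "Paris is the capital of France.")
```

In the course, the same three pieces feed a ChatPromptTemplate, and a StrOutputParser turns the model's AIMessage into a plain string for testing.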

Slide 17

Slide 17 text

Demo (ChatPromptTemplate)
● Few-shot prompts
● Retrieval-Augmented Generation (RAG) and embeddings
● Demo
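The retrieval step of RAG can be illustrated without calling an embedding API. This toy sketch substitutes bag-of-words cosine similarity for real embeddings, so the overall idea (retrieve the most relevant passage, then include it in the prompt) stays visible; the documents and function names are made up for illustration:

```python
# Toy sketch of the RAG retrieval step. The course uses real embeddings;
# here bag-of-words cosine similarity stands in for them.
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, passages: list[str]) -> str:
    """Return the passage most similar to the query."""
    q = Counter(query.lower().split())
    return max(passages, key=lambda p: cosine(q, Counter(p.lower().split())))

docs = [
    "CircleCI orbs are reusable configuration packages.",
    "Gemini is Google's family of multimodal models.",
]
best = retrieve("what are circleci orbs about", docs)
```

With real embeddings the similarity is computed over dense vectors instead of word counts, but the retrieve-then-prompt flow is the same.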

Slide 18

Slide 18 text

Evaluations (category, quiz_bank, expected_words)
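A per-commit evaluation built from this (category, quiz_bank, expected_words) triple can be sketched as follows. The assistant call is stubbed out here; only the checking logic is shown, and the function name is an assumption, not the course's exact code:

```python
# Sketch of a per-commit eval: the real test calls the LLM assistant for
# each (category, quiz_bank, expected_words) case and checks that the
# generated quiz mentions the expected words.
def evaluate_quiz(response: str, expected_words: list[str]) -> bool:
    """Pass if at least one expected word appears in the response."""
    lower = response.lower()
    return any(word.lower() in lower for word in expected_words)

# In the real test_assistant.py, `response` comes from the LLM.
sample = "1. What is the capital of France?\n a) Paris\n b) London"
result = evaluate_quiz(sample, ["Paris", "Louvre"])
```

Running checks like this under pytest on every commit is what the CircleCI config on the following slide automates.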

Slide 19

Slide 19 text

Automated Evaluation with CircleCI
● Commit and push app.py, test_assistant.py, and config.yml to trigger a CircleCI pipeline

jobs:
  run-commit-evals:
    docker:
      - image: cimg/python:3.10.5
    steps:
      - checkout
      - python/install-packages:
          pkg-manager: pip
      - run:
          name: Run assistant evals.
          command: python -m pytest --junitxml results.xml test_assistant.py
      - store_test_results:
          path: results.xml

Slide 20

Slide 20 text

Demo

Slide 21

Slide 21 text

Thank you.