Harness the Power of Advanced LLM and CI/CD Practices

This presentation was used at DSSG x CircleCI on February 1, 2024.

  1. Harness the Power of Advanced LLM and CI/CD Practices

    DSSG x CircleCI. Masahiko Funaki, Principal Developer Advocate, CircleCI JAPAC
  2. Speaker

    Current Role: Principal Developer Advocate, CircleCI, JAPAC
    Previous Companies: Microsoft > SAP > Sybase iAnywhere > Dejima (aka US DARPA IRIS or Siri)
    Based in: Tokyo, Japan
  3. Agenda

    • Introducing CI (Continuous Integration) and CD (Continuous Deployment)
    • Developing and Testing (Evaluating) LLM Applications using Google Gemini + LangChain + CircleCI
      ◦ Automating Evaluations
      ◦ Automating Model-Graded Evaluation
      ◦ Comprehensive Testing Framework
  4. Key Outcomes

    • Gain confidence in coding LLM applications using Google Gemini with LangChain.
    • Acquire a solid understanding of key evaluation (testing) components.
    • Become familiar with automating the evaluation (testing) of LLM applications using CircleCI.

  5. [Figure: DevOps diagram. Labels: DEVelopment, OPerationS, CI (Continuous Integration), CD, Trigger, Monitor, "Able to Automate Human Works", "Human Works".]
  6. Benefits of CircleCI (compared to other CI/CD tools)

    • Wider coverage of CI/CD environments and SSH support
      ◦ GPU (Nvidia Tesla P4/T4/A10G/V100)
      ◦ Linux VM/Container (x86/Arm)
      ◦ macOS (x86/Apple Silicon)
      ◦ Windows
      ◦ Self-hosted Runner
    • Intelligent Parallel Test Assignment
  7. Intelligent Parallel Test Assignment

    Running tests on CircleCI is faster than on your PC.

    Parallel Test Assignment
    • By past execution durations (intelligent assignment)
    • By total file size of tests (acceptable assignment)
    • By filename (controllable assignment)

    Parallels: 1   6:20   ¢4.2
    Parallels: 3   2:09   ¢5.4   (Duration: 1/3, Cost: x1.28)
    Parallels: 5   1:44   ¢6.0   (Duration: > 1/5, Cost: x1.43)
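The "by past execution durations" strategy above can be sketched as a greedy scheduler: take the slowest test first and always hand it to the node with the least work so far. This is a minimal illustration only, not CircleCI's actual algorithm; the function name `split_by_timing`, the test file names, and the durations are all hypothetical.

```python
import heapq


def split_by_timing(durations, parallelism):
    """Assign tests to `parallelism` buckets, longest test first,
    always placing the next test in the currently lightest bucket."""
    # Heap entries are (total seconds assigned so far, bucket index).
    heap = [(0.0, i) for i in range(parallelism)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(parallelism)]
    # Sort by descending duration; break ties by name for determinism.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (total + seconds, idx))
    return buckets


# Hypothetical past durations in seconds (not taken from the slides).
past_durations = {
    "test_a.py": 120, "test_b.py": 90, "test_c.py": 60,
    "test_d.py": 45, "test_e.py": 40, "test_f.py": 30,
}

buckets = split_by_timing(past_durations, 3)
totals = [sum(past_durations[t] for t in bucket) for bucket in buckets]
for idx, (bucket, total) in enumerate(zip(buckets, totals)):
    print(f"node {idx}: {bucket} ({total}s)")
```

The wall-clock time of a parallel run is set by the slowest node, so balancing on measured durations tends to beat splitting by filename or file size when test times vary widely.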
  9. Repositories

    • Automated Per-Commit Evaluations
      ◦ https://github.com/mfunaki-circleci/llmops-course1
    • Automating Model-Graded Evaluations
      ◦ https://github.com/mfunaki-circleci/llmops-course2
    • Comprehensive Testing Framework
      ◦ https://github.com/mfunaki-circleci/llmops-course3
  10. Result

    AIMessage(content='1. What is the capital of France?\n a) Paris\n b) London\n c) Madrid\n d) Rome\n\n\n2. What is the largest country in the world by land area?\n a) China\n b) Brazil\n c) Russia\n d) Canada\n\n\n3. Who was the first president of the United States?\n a) George Washington\n b) Thomas Jefferson\n c) Abraham Lincoln\n d) Franklin D. Roosevelt\n\n\n4. What is the chemical symbol for gold?\n a) Au\n b) Ag\n c) Cu\n d) Fe\n\n\n5. What is the scientific name for the common house cat?\n a) Felis domesticus\n b) Canis lupus familiaris\n c) Bos taurus\n d) Equus caballus\n\n\n6. What planet is known as the "Red Planet"?\n a) Mars\n b) Jupiter\n c) Saturn\n d) Uranus\n\n\n7. What is the name of the largest ocean in the world?\n a) Pacific Ocean\n b) Atlantic Ocean\n c) Indian Ocean\n d) Arctic Ocean\n\n\n8. What is the chemical formula for

    Code

        from langchain_google_vertexai import ChatVertexAI

        llm = ChatVertexAI(
            project='plucky-agent-412507',
            model_name="gemini-pro",
            convert_system_message_to_human=True,
            temperature=0)
        llm.invoke("ask me a quiz")

    ChatPromptTemplate and StrOutputParser

    If you are a teacher, you need to specify
    1) subject (category)
    2) textbook from which you ask questions (quiz_bank)
    3) the number of quizzes and the format (prompt_template)
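The three inputs listed above (subject, quiz_bank, and the prompt template that fixes the number and format of quizzes) can be pictured as slots in one template. The sketch below uses plain `str.format` as a stand-in for LangChain's `ChatPromptTemplate` so that it runs without any LLM access; `PROMPT_TEMPLATE`, `build_quiz_prompt`, and the quiz bank text are hypothetical and are not taken from the course repositories.

```python
# Hypothetical stand-in for a ChatPromptTemplate: a plain format string
# with one slot per input the "teacher" has to specify.
PROMPT_TEMPLATE = (
    "You are a teacher preparing a quiz.\n"
    "Subject: {subject}\n"
    "Ask questions only from this quiz bank:\n"
    "{quiz_bank}\n"
    "Write {num_quizzes} multiple-choice questions with options a) to d)."
)


def build_quiz_prompt(subject, quiz_bank, num_quizzes=3):
    """Fill the template; this string is what would be sent to the model."""
    if num_quizzes < 1:
        raise ValueError("num_quizzes must be at least 1")
    return PROMPT_TEMPLATE.format(
        subject=subject,
        quiz_bank=quiz_bank.strip(),
        num_quizzes=num_quizzes,
    )


# Hypothetical quiz bank; the facts echo questions from the sample result.
quiz_bank = """
1. Subject: Geography. Fact: Paris is the capital of France.
2. Subject: Science. Fact: The chemical symbol for gold is Au.
"""

prompt = build_quiz_prompt("Science", quiz_bank, num_quizzes=2)
print(prompt)
```

In a LangChain application the same slots would typically live in a prompt template that is piped into the model and then into an output parser, so the caller receives a plain string rather than an `AIMessage`.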
  11. Automated Evaluation with CircleCI

    • Commit and push app.py, test_assistant.py and config.yml to trigger the CircleCI pipeline

        jobs:
          run-commit-evals:
            docker:
              - image: cimg/python:3.10.5
            steps:
              - checkout
              - python/install-packages:
                  pkg-manager: pip
              - run:
                  name: Run assistant evals.
                  command: python -m pytest --junitxml results.xml test_assistant.py
              - store_test_results:
                  path: results.xml
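The job above runs pytest against test_assistant.py on every commit. A minimal sketch of what such a per-commit evaluation file might contain follows. The real tests in the course repositories call the LLM application; here `assistant_chain` is a hypothetical stub returning canned text so the example runs offline, and the helper `eval_expected_words` is likewise an assumption.

```python
def assistant_chain(question: str) -> str:
    """Hypothetical stand-in for the LLM app in app.py; returns canned text."""
    canned = {
        "science": "1. What is the chemical symbol for gold?\n a) Au\n b) Ag\n c) Cu\n d) Fe",
        "geography": "1. What is the capital of France?\n a) Paris\n b) London\n c) Madrid\n d) Rome",
    }
    for topic, quiz in canned.items():
        if topic in question.lower():
            return quiz
    return "I'm sorry, I can only write quizzes about science or geography."


def eval_expected_words(question, expected_words):
    """Rule-based eval: pass if the answer mentions any expected word."""
    answer = assistant_chain(question).lower()
    assert any(word.lower() in answer for word in expected_words), (
        f"Expected one of {expected_words} in the answer, got: {answer!r}"
    )


# pytest collects functions whose names start with "test_".
def test_science_quiz():
    eval_expected_words("Generate a quiz about science.", ["gold", "chemical"])


def test_geography_quiz():
    eval_expected_words("Generate a quiz about geography.", ["Paris", "France"])


def test_refusal_for_unknown_topic():
    answer = assistant_chain("Generate a quiz about Rome.")
    assert "I'm sorry" in answer
```

Because the job writes JUnit XML via `--junitxml` and uploads it with `store_test_results`, each failing evaluation appears as a failed test in the pipeline rather than only as log output.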