
[RubyConf Taiwan 2023] Catching the AI train

Large Language Models (LLMs) have taken application development by storm. Ruby has libraries that help build LLM applications, e.g. Langchain.rb. We need to understand what kinds of LLM applications can be built, how to build them, and what the common pitfalls are when building them. We're going to look at building vector (semantic) search, chatbots, and business process automation solutions. We're going to learn AI/LLM-related concepts, terms, and ideas, learn what AI agents are, and look at some of the latest cutting-edge AI research.

Andrei Bondarev

December 16, 2023


Transcript

  1. About Me
     - 13 years with Ruby ❤️
     - Built software and led teams in the federal government, SaaS consumer products, and B2B enterprise verticals
     - Currently: Architect / Fractional CTO
     - Created the Langchain.rb library (GitHub - andreibondarev/langchainrb: Build LLM-backed Ruby applications)
  2. The Gen AI promise
     - "Generative AI’s impact on productivity could add the equivalent of $2.6 trillion to $4.4 trillion annually in value to the global economy."
     - "75% of the value delivered will be across 4 areas: Customer Operations, Marketing & Sales, Software Engineering, and R&D."
     - "Generative AI will be automating work activities that take up to 60-70% of employees' time today."
     - "Half of today’s work activities could be automated between 2030 and 2060."
  3. Why (not) Ruby?
     - Monoliths are back in fashion
     - Pragmatic community
     - OOP / good software development fundamentals
     - Ruby ~ Python
  4. What is Generative AI?
     - Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio/video, etc.
     - Large Language Models (LLMs): deep-learning artificial neural networks (models) with general-purpose language understanding and generation capabilities.
     - Exploded in popularity after the Attention Is All You Need (2017) research paper, which introduced the Transformer architecture.
  5. LLMs excel at
     - Structuring data: collecting and converting unstructured data into structured data
     - Summarizing data: contextualizing a large body of text and producing a summary
     - Classifying data: bucketing a large body of text into topics
     - …and many other tasks
  6. Problems with LLMs
     - Hallucinations: the model generates incorrect or nonsensical text
     - Outdated data (example: GPT-4 was trained on data up to April 2023)
     - Relevant knowledge is not used
     …there may be a solution… 🤔
  7. Retrieval Augmented Generation (RAG)
     A technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
     1. Generate vector embeddings from the user's question.
     2. Retrieve relevant documents by running a similarity search in a vector database.
     3. Construct the RAG prompt to send to the LLM.
     4. Get the response back from the LLM in natural language.
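The four steps above can be sketched in plain Ruby. This is a toy illustration, not Langchain.rb's API: the `embed` function below is a stub based on word counts (a real application would call an embedding model such as OpenAI's text-embedding-ada-002), and an in-memory array stands in for the vector database.

```ruby
# Toy RAG pipeline: stub embeddings, in-memory "vector DB", prompt assembly.
# In a real app, embedding and the final completion come from an LLM provider.

# Stub embedding: counts of a few hand-picked terms. Real embeddings are
# dense vectors (e.g. 1536 dimensions for text-embedding-ada-002).
VOCAB = %w[ruby rails llm vector agent].freeze

def embed(text)
  words = text.downcase.scan(/\w+/)
  VOCAB.map { |term| words.count(term).to_f }
end

def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  denom = norm.(a) * norm.(b)
  denom.zero? ? 0.0 : dot / denom
end

DOCUMENTS = [
  "Ruby on Rails is a web framework written in Ruby.",
  "A vector database stores embeddings for similarity search.",
  "An LLM agent can call tools to complete multi-step tasks."
].freeze

# 1. Generate vector embeddings from the user's question.
question = "How do I do similarity search with a vector database?"
question_embedding = embed(question)

# 2. Retrieve the most relevant document via similarity search.
context = DOCUMENTS.max_by { |doc| cosine_similarity(embed(doc), question_embedding) }

# 3. Construct the RAG prompt to send to the LLM.
prompt = <<~PROMPT
  Answer the question using only the context below.

  Context: #{context}
  Question: #{question}
PROMPT

# 4. (Stubbed out) send `prompt` to the LLM and return its answer.
puts prompt
```

With the stub embeddings, the vector-database document is correctly retrieved as context because it shares the term "vector" with the question.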
  8. Vector Embeddings
     - A machine learning technique for representing data in an N-dimensional space
     - LLMs encode the meaning behind texts in the embedding space, or "latent space"
     - OpenAI's text-embedding-ada-002 model uses 1536 dimensions
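"Nearby in latent space" can be made concrete with cosine similarity, the usual distance measure for embeddings. The 3-D vectors below are invented for illustration; real models produce much higher-dimensional vectors (1536 for text-embedding-ada-002).

```ruby
# Hand-crafted 3-D "embeddings" for illustration only; real embeddings
# come from a model and have hundreds or thousands of dimensions.
EMBEDDINGS = {
  "cat" => [0.9, 0.1, 0.0],
  "dog" => [0.8, 0.2, 0.0],
  "car" => [0.0, 0.1, 0.9]
}.freeze

# Cosine similarity: 1.0 means same direction (similar meaning),
# values near 0.0 mean unrelated.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Related concepts sit closer together in the latent space.
cat_dog = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"])
cat_car = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"])
```

Here `cat_dog` is close to 1.0 while `cat_car` is close to 0.0, which is exactly the property a vector database exploits when ranking documents against a query embedding.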
  9. RAG Prompt
     - instructions, to enforce a format or style of response
     - context, i.e. the relevant data/documents
     - question, i.e. the user's original question

     prompt = Langchain::Prompt.load_from_path("rag_prompt.yml")
     prompt.format(instructions:, context:, question:)
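The slide loads the prompt template from a YAML file via `Langchain::Prompt`. As a hedged stand-in, the same idea can be shown in plain Ruby with `Kernel#format` and named placeholders; the template text below is invented for illustration, not the contents of `rag_prompt.yml`.

```ruby
# Plain-Ruby stand-in for a RAG prompt template with the three slots
# from the slide: instructions, context, question. Template text is
# illustrative only.
RAG_TEMPLATE = <<~TEMPLATE
  %{instructions}

  Context:
  %{context}

  Question: %{question}
  Answer:
TEMPLATE

def format_prompt(instructions:, context:, question:)
  # Kernel#format substitutes %{name} placeholders from a hash.
  format(RAG_TEMPLATE, instructions: instructions, context: context, question: question)
end

prompt = format_prompt(
  instructions: "Answer using only the context. Reply in one sentence.",
  context: "Langchain.rb pairs any supported vector database with any LLM.",
  question: "Which vector databases can I use with OpenAI?"
)
```

Keeping the template separate from the code (as Langchain.rb does with YAML files) makes it easy to iterate on prompt wording without touching application logic.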
  10. Putting it all together
     McKinsey: Generative AI technology could drive value across an entire organization by revolutionizing internal knowledge management systems. Knowledge workers spend about one day each work week searching for and gathering information.
  11. Optimizing RAG
     - Human evals (👍🏻 / 👎🏼)
     - RAGAS metrics:
       - Faithfulness: ensuring the retrieved context can act as a justification for the generated answer
       - Context relevance: the context is focused, with little to no irrelevant information
       - Answer relevance: the answer addresses the actual question

     ragas = Langchain::Evals::Ragas::Main.new(llm: llm)
     ragas.score(answer: "", question: "", context: "")
     #=> {
     #     ragas_score: 0.6601257446503674,
     #     answer_relevance_score: 0.9573145866787608,
     #     context_relevance_score: 0.6666666666666666,
     #     faithfulness_score: 0.5
     #   }
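The composite `ragas_score` in the sample output is the harmonic mean of the three component metrics, which can be checked directly against the numbers above:

```ruby
# Reproduce the composite ragas_score from the three component metrics
# (values taken from the sample output above). The harmonic mean
# penalizes a low score on any single metric.
scores = {
  answer_relevance:  0.9573145866787608,
  context_relevance: 0.6666666666666666,
  faithfulness:      0.5
}

harmonic_mean = scores.size / scores.values.sum { |v| 1.0 / v }

puts harmonic_mean
```

Because the harmonic mean is dominated by the smallest term, the 0.5 faithfulness score drags the composite down to ~0.66 despite the high answer relevance; that is the property that makes it a stricter aggregate than the arithmetic mean.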
  12. Vector Search DBs X LLMs Matrix
     Pair up any vector search DB with any LLM. Identical APIs. Lower vendor lock-in. Optionality.

                           Chroma   Pgvector   Pinecone   Weaviate   …
     Google Vertex AI        ✅        ✅         ✅         ✅      ✅
     AWS Bedrock             ✅        ✅         ✅         ✅      ✅
     OpenAI                  ✅        ✅         ✅         ✅      ✅
     Local Llama 2           ✅        ✅         ✅         ✅      ✅
     …                       ✅        ✅         ✅         ✅      ✅
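The "identical APIs" point boils down to duck typing: application code targets one interface, so backends can be swapped without changes. The class and method names below are invented for this sketch (see Langchain.rb's Vectorsearch classes for the real interface); the two stores have different internals but the same surface.

```ruby
# Two stub "vector store" clients with different internals but an
# identical interface. Names are invented for illustration; in real
# life these would wrap Chroma, Pgvector, Pinecone, Weaviate, etc.
class ArrayStore
  def initialize
    @docs = []
  end

  def add_texts(texts)
    @docs.concat(texts)
  end

  def similarity_search(query)
    # Substring match stands in for real vector similarity search.
    @docs.select { |doc| doc.include?(query) }
  end
end

class HashStore
  def initialize
    @docs = {}
  end

  def add_texts(texts)
    texts.each { |t| @docs[@docs.size] = t }
  end

  def similarity_search(query)
    @docs.values.select { |doc| doc.include?(query) }
  end
end

# Application code is written once against the shared interface,
# so swapping the backing store requires no changes here.
def ask(store, query)
  store.add_texts(["Ruby loves ducks", "Python loves snakes"])
  store.similarity_search(query)
end

results = ask(ArrayStore.new, "Ruby")
```

Passing `HashStore.new` instead of `ArrayStore.new` yields the same results, which is the optionality and reduced vendor lock-in the matrix is advertising.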
  13. AI Agents
     - Autonomous (semi-autonomous) general-purpose LLM-powered programs
     - Can use tools (APIs, other systems)
     - Work best with powerful LLMs
     - Can be used to automate workflows/business processes and execute multi-step tasks
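The agent control flow can be sketched as a loop: the model either requests a tool call or emits a final answer, and tool results are fed back in. Everything below is a toy under stated assumptions: the "LLM" is a scripted stub, the message format is invented, and the single calculator tool only handles multiplication.

```ruby
# Toy agent loop. A real agent replaces `fake_llm` with a call to an
# actual LLM, but the loop structure is the same: act, observe, repeat.
TOOLS = {
  # Toy calculator that only handles "a * b"; real agents expose richer
  # tools (APIs, search, databases, …).
  "calculator" => lambda do |expr|
    left, right = expr.split("*").map { |s| Integer(s.strip) }
    (left * right).to_s
  end
}.freeze

# Scripted stand-in for the LLM: first asks for a tool call, then
# produces a final answer once an observation is in the history.
def fake_llm(history)
  if history.none? { |msg| msg.start_with?("observation:") }
    "tool:calculator:6 * 7"
  else
    "final:The answer is #{history.last.delete_prefix('observation:')}"
  end
end

def run_agent(question, max_steps: 5)
  history = ["question:#{question}"]
  max_steps.times do
    response = fake_llm(history)
    return response.delete_prefix("final:") if response.start_with?("final:")

    # Execute the requested tool and feed the result back to the model.
    _, tool_name, tool_input = response.split(":", 3)
    history << "observation:#{TOOLS.fetch(tool_name).call(tool_input)}"
  end
  "gave up after #{max_steps} steps"
end

answer = run_agent("What is 6 * 7?")
```

The `max_steps` cap is the important design detail: without it, an agent that keeps requesting tools (or hallucinating tool names) would loop forever.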
  14. Recap / Wrapping up 🫡
     - AI is emerging as the centerpiece of each tech stack
     - Generative AI, its use cases and problems
     - RAG: vector embeddings, similarity search, the RAG prompt (prompt engineering), evals
     - Ruby ought to adapt and address the growing AI needs