
Marco Frodl
November 27, 2024

Advanced RAG AI-driven Retriever Selection with Turbo

Retrieval Augmented Generation (RAG) uses retrievers such as vector databases to find information relevant to answering a query. Complex RAG scenarios often draw on multiple data sources, and the appropriate retriever can be selected via a MultiRoute Chain, in which a Large Language Model (LLM) dynamically picks the semantically best data source.
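The MultiRoute idea can be sketched as follows. This is a minimal sketch, not the talk's actual code: the route names, descriptions, and `stub_llm` heuristic are illustrative stand-ins for a real hosted-model call, which is exactly the slow and costly part.

```python
# Sketch of LLM-driven retriever selection (the MultiRoute Chain idea).
# Route names/descriptions are illustrative; stub_llm stands in for a
# real LLM request, which adds latency and per-token cost.

ROUTES = {
    "hr_docs": "questions about vacation, payroll, and employee benefits",
    "product_docs": "questions about product features and the API",
}

def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call: a naive keyword heuristic."""
    text = prompt.rsplit("Question:", 1)[-1].lower()
    if any(w in text for w in ("vacation", "payroll", "benefits")):
        return "hr_docs"
    return "product_docs"

def select_retriever(question: str) -> str:
    """Build a routing prompt and let the (stubbed) LLM pick a source."""
    route_list = "\n".join(f"- {name}: {desc}" for name, desc in ROUTES.items())
    prompt = (
        "Pick the best data source for the user's question.\n"
        f"Sources:\n{route_list}\n"
        f"Question: {question}\n"
        "Answer with the source name only."
    )
    return stub_llm(prompt)
```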

However, this approach is time-consuming and costly. A faster and more cost-effective alternative is the use of a Semantic Router, which uses an embedding model instead of an LLM for retriever selection. This approach offers comparable quality at significantly lower costs.
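The embedding-based routing step can be sketched like this, assuming routes are defined by example utterances. A toy bag-of-words vector stands in for a real embedding model (in practice you would use an encoder, e.g. via Aurelio AI's semantic-router library), so all names and utterances here are illustrative.

```python
# Minimal sketch of a Semantic Router: each route is defined by example
# utterances, and a query is routed to whichever route's utterance
# embedding is closest. The toy embed() below is a word-count stand-in
# for a real embedding model; route names/utterances are illustrative.
import math
from collections import Counter

ROUTES = {
    "hr_docs": [
        "how many vacation days do I have",
        "when is payroll processed",
    ],
    "product_docs": [
        "how do I authenticate against the API",
        "which features are in the pro plan",
    ],
}

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts. Swap in a real encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def route(question: str) -> str:
    """Pick the route whose best utterance is closest to the question."""
    q = embed(question)
    scores = {
        name: max(cosine(q, embed(u)) for u in utterances)
        for name, utterances in ROUTES.items()
    }
    return max(scores, key=scores.get)
```

Because only embedding calls are involved (no token-by-token generation), this decision is far cheaper and faster than asking an LLM to choose.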

In a live-coding session, a MultiRoute Chain is implemented and then optimized by swapping it for the Semantic Router.
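Conceptually, only the selection step differs between the two variants; the rest of the chain stays the same. A sketch under that assumption, with stubbed retrievers and a stubbed final LLM call (route names, documents, and the `select` signature are illustrative):

```python
# Sketch: the selected route just swaps which retriever feeds the LLM.
# Whether `select` is an LLM-based MultiRoute Chain or an embedding-based
# Semantic Router, downstream retrieval and generation are unchanged.
# Retrievers and the final LLM call are stubbed; names are illustrative.

RETRIEVERS = {
    "hr_docs": lambda q: ["Employees receive 30 vacation days per year."],
    "product_docs": lambda q: ["The API uses OAuth2 bearer tokens."],
}

def rag_answer(question: str, select) -> str:
    """`select` maps a question to a route name (LLM or semantic router)."""
    route_name = select(question)
    context = " ".join(RETRIEVERS[route_name](question))
    # Stub for the final LLM call that would phrase the answer.
    return f"Based on {route_name}: {context}"
```

This separation is what makes the optimization a drop-in change: replacing the LLM-based selector with the embedding-based one leaves the rest of the chain untouched.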



Transcript

  1. Advanced RAG AI-driven Retriever Selection with Turbo. Turbo: https://www.aurelio.ai/semantic-router "Semantic Router is a superfast decision-making layer for your LLMs and agents. Rather than waiting for slow, unreliable LLM generations to make tool-use or safety decisions, we use the magic of semantic vector space — routing our requests using semantic meaning."
  2. (Continues slide 1.) "It’s perfect for: input guarding, topic routing, tool-use decisions."
  3. Turbo in Numbers: In my RAG example, a Semantic Router using remote services is 3.4 times faster than an LLM and 30 times less expensive. A local Semantic Router is 7.7 times faster than an LLM and 60 times less expensive.
  4. About Me: Marco Frodl, Principal Consultant for Generative AI, Thinktecture AG. X: @marcofrodl, E-Mail: [email protected], LinkedIn: https://www.linkedin.com/in/marcofrodl/, https://www.thinktecture.com/thinktects/marco-frodl/
  5. Refresher: What is RAG? "Retrieval-Augmented Generation (RAG) extends the capabilities of LLMs to an organization's internal knowledge, all without the need to retrain the model."
  6. Refresher: What is RAG? (https://aws.amazon.com/what-is/retrieval-augmented-generation/) "Retrieval-Augmented Generation (RAG) extends the capabilities of LLMs to an organization's internal knowledge, all without the need to retrain the model. It references an authoritative knowledge base outside of its training data sources before generating a response."
  7. Ask me anything. Simple RAG workflow: Question → Prepare Search (Embedding Model turns the Question into a Vector) → Search (Vector DB) → Search Results → LLM. Terms: Retriever, Chain. Elements: Embedding Model, Vector DB, Python, LLM, LangChain.
  8. Best source determination before the search. Advanced RAG workflow: Question → Retriever Selection (LLM) → Prepare Search (Embedding Model turns the Question into a Vector) → Search (Vector DB A or Vector DB B) → 0-N Search Results → LLM.
  9. (Repeat of slide 8.)
  10. Best source determination before the search. Advanced RAG with Semantic Router: Question → Retriever Selection (Embedding Model) → Prepare Search (Embedding Model turns the Question into a Vector) → Search (Vector DB A or Vector DB B) → 0-N Search Results → LLM.
  11. Your feedback is important to me: please rate my talk in the conference app, and I look forward to your questions and comments.