Retrieval-Augmented Generation (RAG) uses retrievers such as vector databases to find relevant information for answering queries. Complex RAG scenarios often involve multiple data sources. The appropriate retriever can be selected with a MultiRoute Chain, in which a Large Language Model (LLM) dynamically picks the semantically best-matching data source for each query.
However, this approach is slow and costly, because every routing decision requires an additional LLM call. A faster and more cost-effective alternative is a Semantic Router, which uses an embedding model instead of an LLM to select the retriever. This approach offers comparable quality at significantly lower cost.
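The idea behind embedding-based routing can be illustrated with a minimal sketch. It is not the implementation shown in the talk: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the route names, example utterances, and similarity threshold are all hypothetical. Each route is described by a few example utterances, the query is embedded, and the route with the most similar utterance wins.

```python
import math
import re
from collections import Counter
from typing import Optional


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.
    A real router would call an actual embedding model here (assumption)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical routes: each retriever name maps to example utterances.
ROUTES = {
    "hr_docs": [
        "how many vacation days do I get",
        "what is the parental leave policy",
    ],
    "product_docs": [
        "how do I install the product",
        "which api endpoints does the product expose",
    ],
}

# Embed the example utterances once, so routing a query needs no LLM call.
ROUTE_VECTORS = {
    name: [embed(u) for u in utterances] for name, utterances in ROUTES.items()
}


def route(query: str, threshold: float = 0.2) -> Optional[str]:
    """Return the best-matching route name, or None if nothing is similar
    enough. The threshold value is an arbitrary illustration."""
    q = embed(query)
    best_name, best_score = None, 0.0
    for name, vectors in ROUTE_VECTORS.items():
        score = max(cosine(q, v) for v in vectors)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

In this sketch, a call such as `route("How many vacation days are left?")` selects `hr_docs`, while a query that shares no vocabulary with any route returns `None`, which a caller could treat as a fallback to a default retriever. With a real embedding model the comparison would be semantic rather than lexical, but the routing logic stays the same.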
In a live-coding session, a MultiRoute Chain is implemented and then optimized by replacing its LLM-based routing with a Semantic Router.