Slide 23
MLCon 2025
Beyond LLMs: Using Embedding Models for Input Guarding, Semantic Routing, and Tool Decisions
Speed & Budget in Numbers
SR Remote is 3.4 times faster than the LLM (0.18 s vs 0.62 s)
SR Local is 7.75 times faster than the LLM (0.08 s vs 0.62 s)
SR Remote is 30 times cheaper than the LLM ($0.02 vs $0.60)
SR Local is 60 times cheaper than the LLM ($0.01 vs $0.60)
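
The factors above follow from dividing the LLM figure by the SR figure. A minimal sketch of that arithmetic, using only the numbers on the slide: the dictionary layout and the helper name `ratio_vs_llm` are illustrative assumptions, and the slide does not state what unit of work the latency and cost figures refer to.

```python
# Figures copied from the slide. The table layout and all names here are
# illustrative assumptions, not part of the talk.
LATENCY_S = {"LLM": 0.62, "SR Remote": 0.18, "SR Local": 0.08}
COST_USD = {"LLM": 0.60, "SR Remote": 0.02, "SR Local": 0.01}


def ratio_vs_llm(table, variant):
    """Return how many times smaller the variant's value is than the LLM's."""
    return table["LLM"] / table[variant]


for variant in ("SR Remote", "SR Local"):
    speedup = ratio_vs_llm(LATENCY_S, variant)
    savings = ratio_vs_llm(COST_USD, variant)
    print(f"{variant}: {speedup:.2f}x faster, {savings:.0f}x cheaper than the LLM")
```

Note that the SR Remote speedup is 0.62 / 0.18 ≈ 3.44, which the slide rounds to 3.4.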