


Semantic Image Search in Ruby: Postgres, Redis, or LLM? A CO₂-Conscious Comparison

Postgres, Redis, or LLM? Ask any engineer, and you already know the answer: it depends.
On what, exactly? Usually, there are four things: speed, cost, complexity, and quality. This talk adds a fifth column to that table and shows how to fill it in.
Three real Ruby implementations of semantic image search are run on the same dataset and machine, and compared on grams of CO₂e per query alongside the usual four axes.
The deck includes the measurement method, the formula, the Ruby snippets, and the numbers — but no recommendation.
Just one column you probably didn't have before.

Rubycon.it 2026 — Michele Franzin, SeeSaw.


Michele Franzin

May 08, 2026


Transcript

  1. Michele Franzin. Three engines. Same data. Same machine. Latency. Cents. Carbon. Postgres, Redis, or LLM? Semantic image search in Ruby.
  2. It depends. It always depends. Postgres, Redis, or LLM? …depends on what, exactly? Speed · Cost · Complexity · Quality · gCO₂e.
  3. Embeddings. "sunset beach" → [0.12, -0.34, 0.91, 0.07, ..., -0.22] ↑ 768 numbers. Distance is a similarity score, not metres.
  4. Why approximate? Brute force: O(N), compare with all. Tree-based: breaks in high dimensions, splits get messy. HNSW (graph): sub-linear, the standard. Both pgvector and Redis use HNSW.
  5. Common offline phase. Images: 7,000+. Embedding: 768-D vector. Caption + tags: text metadata. SigLIP 2 · Qwen3.5-9B · manifest.json. Two AI models. One shared file. The starting point of every engine.
  6. Three phases: Prep, Load, Query. Prep (analyze images): once. Load (push data to the engine): every restart. Query (answer the user): every request. manifest.json: built once, upfront, never per request. We measure all three. Query repeats — so it weighs more. Prep and Load are real costs, not free.
  7. First, the query becomes an array… "summer beach party" → [SigLIP 2] → [0.12, -0.34, 0.76, 0.33, …, 0.91]: 768 numbers, in the same space as the images. The engine never sees your text — only its embedding.
  8. pgvector: fashion client, production. Redis HNSW: travel photography platform, production. LLM (Qwen3.5-9B): internal experiment, baseline. Real code. Real clients. Real choice.
  9. pgvector • ACID • Your DBA stays • HNSW + cosine, in SQL. Durable relational baseline.
  10. SQL search: the cosine distance operator; the HNSW index does the heavy lifting.
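A sketch of what that looks like from Ruby. The `<=>` operator is pgvector's cosine distance; the table and column names here (`images`, `embedding`, `caption`) are hypothetical stand-ins, not the deck's actual schema:

```ruby
# Build the pgvector k-NN query. `embedding <=> $1` is pgvector's
# cosine distance; with an HNSW index created via
#   CREATE INDEX ON images USING hnsw (embedding vector_cosine_ops);
# the ORDER BY is served by the index instead of a full scan.
def knn_sql(limit: 10)
  <<~SQL
    SELECT id, caption, embedding <=> $1 AS distance
    FROM images
    ORDER BY embedding <=> $1
    LIMIT #{Integer(limit)}
  SQL
end
```

At runtime you would execute it with something like `conn.exec_params(knn_sql, [vector_literal])` via the `pg` gem, passing the query embedding as the bound parameter.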
  11. Redis HNSW • Sub-millisecond • Same algorithm family as pgvector • FLOAT32 binary. Low-latency in-memory baseline.
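The "FLOAT32 binary" point is worth making concrete: Redis vector fields take raw little-endian 32-bit float bytes, not a text representation. A minimal sketch of the conversion in plain Ruby:

```ruby
# Redis vector fields store raw FLOAT32 bytes, not text.
# "e*" packs each Ruby Float as a little-endian 32-bit float,
# so a 768-D embedding becomes a 3072-byte blob.
def to_redis_blob(embedding)
  embedding.pack("e*")
end

def from_redis_blob(blob)
  blob.unpack("e*")
end
```

The blob is what you would store in the hash field that the HNSW index is declared over; the exact index declaration depends on your Redis client and schema.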
  12. LLM (Qwen3.5-9B) • Same model used in Prep — now used at runtime • No vector index · no Load • O(N) — every query reads everything • ⚠ Not production-ready without a retrieval first stage. Reasoning-heavy baseline.
  13. The prompt + rank pipeline: SYSTEM PROMPT → USER PROMPT → RANK RESULTS. temperature: 0.0 — same query, same answer, every run.
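One way the prompt + rank request might be assembled, assuming LM Studio's OpenAI-compatible chat endpoint; the model id and the prompt wording here are illustrative, not the deck's actual prompts:

```ruby
require "json"

# Hypothetical prompt + rank payload for a local OpenAI-compatible
# endpoint. temperature: 0.0 makes the ranking deterministic:
# same query, same answer, every run.
def rank_payload(query, captions)
  numbered = captions.each_with_index.map { |c, i| "#{i}. #{c}" }.join("\n")
  {
    model: "qwen3.5-9b",  # assumption: whatever id LM Studio exposes
    temperature: 0.0,     # deterministic ranking
    messages: [
      { role: "system",
        content: "Rank the captions by relevance to the query. Reply with indices only." },
      { role: "user",
        content: "Query: #{query}\nCaptions:\n#{numbered}" }
    ]
  }
end
```

The pure-LLM engine would POST this once per query, which is exactly why it reads everything and scales O(N).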
  14. What to do in real-life production queries. STAGE 1, retrieval: vector search (HNSW), O(log N) · fast · cheap → ~50 candidates. STAGE 2, LLM rerank: reasoning over text, O(K) · slow · smart → top-K. We kept the LLM pure to isolate one cost. In production, you'd hybridize.
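The two-stage shape can be sketched in a few lines; `retrieve` and `rerank` are stand-ins for the real engines (HNSW search and the LLM call), not code from the deck:

```ruby
# Two-stage hybrid: cheap vector retrieval narrows the field so the
# expensive reranker only ever sees ~50 candidates instead of N.
def hybrid_search(query, retrieve:, rerank:, candidates: 50, top_k: 10)
  shortlist = retrieve.call(query, candidates)  # STAGE 1: HNSW, O(log N)
  rerank.call(query, shortlist).first(top_k)    # STAGE 2: LLM, O(K)
end
```

The point of the structure is that the O(N) cost disappears: the LLM's latency and carbon now scale with the candidate count K, not the dataset size.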
  15. build_search_result(title:, rows:) — pgvector/search.rb · redis/search.rb · llm/search.rb. Same signature. Same return shape. Different worlds inside.
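A shared contract like that might look as follows; the deck only shows the signature, so the return struct and its field names are assumptions:

```ruby
# The shared contract every engine implements: same keyword
# signature, same return shape. Only the internals differ.
SearchResult = Struct.new(:title, :rows, keyword_init: true)

def build_search_result(title:, rows:)
  SearchResult.new(title: title, rows: rows)
end
```

Keeping the three `search.rb` files behind one signature is what makes the five-axis comparison fair: the benchmark harness can swap engines without changing anything else.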
  16. 3 engines. How do we compare them? No shared method — yet. Speed · Cost · Complexity · Quality · gCO₂e.
  17. How to measure carbon footprint for real.
    🏭 Manufacturing (5+ inputs): Σ(component_embodied_CO₂e × quantity) / lifetime + transport + EOL.
    ⚡ Operational energy (7+ inputs): Σ(P_CPU + P_GPU + P_RAM + P_storage + P_network) × h × util.
    🌬 Datacenter overhead (2 inputs): total_energy × PUE (cooling, UPS, lighting, losses).
    🔌 Grid carbon intensity (3 inputs): Σ(energy_t × gCO₂_per_kWh_t) over time, region-specific.
    ✂ Allocation (4 inputs): your_share = your_resources / total_resources (CPU, RAM, ...).
    Following the ADEME PCR for Datacenter and Cloud services. Requires datacenter sensors, manufacturing certificates, grid data. We have none of that.
  18. Our way: SCI-lite — Software Carbon Intensity, operational only. A public standard from the Green Software Foundation. Reproducible by anyone with a power meter.
    Rigorous way vs SCI-lite: all five layers vs operational only · datacenter access vs power readings on the metal · absolute claims vs relative claims · measurement-grade vs decision-grade.
  19. The recipe: gCO₂e = (Watts × seconds × 215.9) / 3,600,000.
    • 🟰 Same machine for all three → systematic errors cancel.
    • 📊 powermetrics reads the SoC power model → consistent, not precise.
    • 🇮🇹 215.9 gCO₂/kWh = Italian grid average → declared assumption.
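The formula is simple enough to carry around as a one-liner. Watts × seconds gives joules; dividing by 3,600,000 (joules per kWh) converts to kWh, and the declared 215.9 gCO₂/kWh Italian grid average turns that into grams:

```ruby
ITALIAN_GRID_G_PER_KWH = 215.9   # declared assumption from the deck
JOULES_PER_KWH = 3_600_000.0     # 1 kWh = 3.6 million watt-seconds

# Grams of CO2e for a workload drawing `watts` for `seconds`.
def g_co2e(watts:, seconds:)
  watts * seconds * ITALIAN_GRID_G_PER_KWH / JOULES_PER_KWH
end
```

Sanity check: 1,000 W sustained for one hour is exactly 1 kWh, so it should come back as 215.9 g.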
  20. Measuring in Ruby, in-house. CLOCK_MONOTONIC measures durations only: it never jumps backwards on a clock sync.
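In Ruby that clock is exposed through `Process.clock_gettime`; a minimal timing helper in the spirit of the slide (the helper name is ours, not the deck's):

```ruby
# CLOCK_MONOTONIC only measures durations: it never jumps backwards
# when NTP adjusts the wall clock, so it is the safe choice for
# per-query latency measurements.
def measure
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end
```

Using `Time.now` here instead would occasionally produce negative or wildly wrong durations whenever the system clock is stepped mid-benchmark.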
  21. Setup. HARDWARE: Mac mini M4 Pro · 32 GB · bare-metal · macOS native. ENERGY PROBE: /sbin/powermetrics, 5 samples per second. MODELS ON THE METAL: SigLIP 2 · MLX adapter; Qwen3.5-9B · LM Studio · T = 0.0. PROTOCOL: warm-up · 60 s idle baseline · N queries → energy / N.
  22. Honest disclosures. powermetrics ≠ wall-plug: SoC estimates, valid for relative comparison. Apple Neural Engine sometimes reports 0 mW under real load — logged. Training cost of SigLIP 2 / Qwen3.5-9B excluded: not published, not auditable. Runtime overhead (Docker, LM Studio, MLX via Python): a few percent, consistent across engines.
  23. Axis 1: gCO₂e per query (bar chart over p · r · l): 75.39 / 0.90 / 0.87 mgCO₂e per query; as ratios, ×866.5 / ×1.2 / ×1.0. 200-image dataset · 50 queries · sequential · steady state.
  24. Axis 1: gCO₂e per phase (bar chart over Prep phase · Load phase · Add extra image, mgCO₂e, log scale). Prep is shared — same model, same images, same cost. Load and Query are where the engines diverge.
  25. Axis 2: Latency (bar chart over p · r · l): 169.7 / 0.5 / 0.5 seconds.
  26. Axis 3: Cost, a proxy of latency (bar chart over p · r · l): 1,385 / 4 / 3 € per 1000 queries, at a 0.50 €/h fixed rate.
  27. Synthesis (scores 1–5) for pgvector (P), Redis (R), LLM (L):
    Latency: P 5 · R 5 · L 1
    Cost: P 5 · R 5 · L 1
    Quality: P 3/4 · R 3/4 · L 5
    Adoption simplicity: P 5 · R 3 · L 1
    gCO₂e: P 5 · R 5 · L 1
  28. Thanks. Today: Postgres, Redis, or LLM? It depends — but on five axes now, not four. Why: carbon is a real cost, worth a number. Tomorrow: just start asking.