Storing:   App → 🏞 📄 🎶 → Embedding Model (OpenAI, Gemini, in-house, open-source ones, etc.) → [0.5,0.3,...,0.1], [0.6,0.2,...,0.9] → Save embedding in the database
Searching: App → Get query "flying whales" → Embedding Model → [0.4,0.3,...,0.1] → Search embedding in the database
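The "search embedding in the database" step can be pictured as a brute-force scan that returns the stored vector closest to the query. This is only a minimal sketch: the names `squared_l2` and `nearest_embedding` are hypothetical, the embeddings are assumed to be stored row by row in one flat `float` array, and squared L2 distance is assumed as the metric. Real systems use an index rather than a full scan.

```c
#include <assert.h>
#include <stddef.h>

/* Squared Euclidean distance between two vectors of length dim. */
double squared_l2(const float *a, const float *b, size_t dim)
{
    double sum = 0.0;
    for (size_t i = 0; i < dim; i++) {
        double diff = (double) a[i] - (double) b[i];
        sum += diff * diff;
    }
    return sum;
}

/*
 * Brute-force search (hypothetical helper): return the index of the stored
 * embedding closest to the query. `db` holds n embeddings of length dim,
 * laid out row by row. Ties keep the earliest index.
 */
size_t nearest_embedding(const float *db, size_t n, size_t dim,
                         const float *query)
{
    assert(n > 0 && dim > 0);
    size_t best = 0;
    double best_dist = squared_l2(db, query, dim);
    for (size_t i = 1; i < n; i++) {
        double d = squared_l2(db + i * dim, query, dim);
        if (d < best_dist) {
            best_dist = d;
            best = i;
        }
    }
    return best;
}
```

Squared distance is enough here because it ranks candidates in the same order as the true distance, so the square root can be skipped.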
Makefile                      ← how to compile the C code
├─ myextension.control        ← metadata about the extension
├─ sql/
│  └─ myextension--1.0.sql    ← create functions, types...
├─ src/
│  └─ myextension.c           ← implementation of C functions
# name; vector.control file; dest. prefix/share/vector:
EXTENSION = vector

# build a single shared library named vector (e.g. vector.so):
MODULE_big = vector

# get PostgreSQL's inc. dirs, compiler flags, and inst. paths:
PG_CONFIG ?= pg_config

# get the path to PostgreSQL's PGXS Makefile fragment:
PGXS := $(shell $(PG_CONFIG) --pgxs)

# loads the PGXS build system:
include $(PGXS)
CREATE FUNCTION l2_distance(vector, vector) RETURNS float8
    AS 'MODULE_PATHNAME' LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;

// C implementation
FUNCTION_PREFIX PG_FUNCTION_INFO_V1(l2_distance);
Datum
l2_distance(PG_FUNCTION_ARGS)
{
    ...
▣ PostgreSQL runs vector.sql
▣ CREATE TYPE vector… triggers INSERT INTO pg_type…
▣ CREATE FUNCTION… triggers INSERT INTO pg_proc…
▣ CREATE OPERATOR CLASS… triggers INSERT INTO pg_opclass…
▣ And so on…
▣ Metadata stored, but no C code executed yet!
+ vector_l2_ops
▣ Stores metadata in pg_index and pg_class

CREATE INDEX idx ON table USING hnsw (embedding vector_l2_ops);
                                      ^         ^^^^^^^^^^^^^
                                      |         operator class
                                      column
There's an HNSW index!
▣ Executor: Which operator class? → vector_l2_ops
▣ What is FUNCTION 1 of vector_l2_ops? → vector_l2_squared_distance
  □ Looks up pg_proc → probin="/path/vector.so", prosrc="..." etc.
  □ Loads the C function
▣ Performs the HNSW search, calling the L2 function during the search

SELECT * FROM items WHERE embedding <-> '[3,1,2]' < 5;
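The lookup chain above (operator class → FUNCTION 1 → C function → call) can be modeled with a small table of function pointers. This is a toy illustration only, not how PostgreSQL implements its catalogs or its function loader: every `toy_` name is hypothetical, and the "catalog" is a static array standing in for the pg_opclass and pg_proc lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by every distance function in this toy model. */
typedef double (*distance_fn)(const float *a, const float *b, size_t dim);

/* Hypothetical stand-in for the C function the index ends up calling. */
double toy_l2_squared_distance(const float *a, const float *b, size_t dim)
{
    double sum = 0.0;
    for (size_t i = 0; i < dim; i++) {
        double diff = (double) a[i] - (double) b[i];
        sum += diff * diff;
    }
    return sum;
}

/* One row of the toy catalog: operator class + support number -> function. */
struct toy_opclass_entry {
    const char *opclass;
    int support_number;
    distance_fn fn;
};

static const struct toy_opclass_entry toy_catalog[] = {
    { "vector_l2_ops", 1, toy_l2_squared_distance },
};

/* Resolve FUNCTION <support_number> of an operator class, or NULL if absent. */
distance_fn toy_lookup_support_fn(const char *opclass, int support_number)
{
    assert(opclass != NULL);
    size_t n = sizeof(toy_catalog) / sizeof(toy_catalog[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_catalog[i].opclass, opclass) == 0 &&
            toy_catalog[i].support_number == support_number)
            return toy_catalog[i].fn;
    }
    return NULL;
}
```

A caller would resolve the function once with `toy_lookup_support_fn("vector_l2_ops", 1)` and then invoke the returned pointer for each candidate vector visited during the search. This mirrors the point that metadata is stored first and the C code only runs when a query needs it.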