Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Postgres Could Index Itself
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Andrew Kane
September 08, 2017
Programming
0
3.2k
How Postgres Could Index Itself
Andrew Kane
September 08, 2017
Tweet
Share
Other Decks in Programming
See All in Programming
360° Signals in Angular: Signal Forms with SignalStore & Resources @ngLondon 01/2026
manfredsteyer
PRO
0
120
生成AIを使ったコードレビューで定性的に品質カバー
chiilog
1
260
AIによる高速開発をどう制御するか? ガードレール設置で開発速度と品質を両立させたチームの事例
tonkotsuboy_com
7
2.3k
フルサイクルエンジニアリングをAI Agentで全自動化したい 〜構想と現在地〜
kamina_zzz
0
400
izumin5210のプロポーザルのネタ探し #tskaigi_msup
izumin5210
1
110
AI Agent の開発と運用を支える Durable Execution #AgentsInProd
izumin5210
7
2.3k
CSC307 Lecture 06
javiergs
PRO
0
680
Vibe Coding - AI 驅動的軟體開發
mickyp100
0
170
なぜSQLはAIぽく見えるのか/why does SQL look AI like
florets1
0
450
なるべく楽してバックエンドに型をつけたい!(楽とは言ってない)
hibiki_cube
0
140
AIエージェントのキホンから学ぶ「エージェンティックコーディング」実践入門
masahiro_nishimi
5
430
OCaml 5でモダンな並列プログラミングを Enjoyしよう!
haochenx
0
140
Featured
See All Featured
Typedesign – Prime Four
hannesfritz
42
2.9k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.6k
Building an army of robots
kneath
306
46k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
Primal Persuasion: How to Engage the Brain for Learning That Lasts
tmiket
0
250
Visualization
eitanlees
150
17k
svc-hook: hooking system calls on ARM64 by binary rewriting
retrage
1
99
Building a A Zero-Code AI SEO Workflow
portentint
PRO
0
310
Skip the Path - Find Your Career Trail
mkilby
0
54
How To Speak Unicorn (iThemes Webinar)
marktimemedia
1
380
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.4k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Transcript
How Postgres Could Index Itself
None
github.com/ankane
None
Read speed vs Write speed Space
None
v1
Collect queries Analyze queries
pg_stat_statements Query Total Time (ms) Calls Average Time (ms) SELECT
… 40,000 80,000 0.5 SELECT … 30,000 300 100
SELECT * FROM products WHERE store_id = 1
pg query github.com/lfittl/pg_query _
SELECT * FROM products WHERE store_id = 1
SELECT * FROM products WHERE store_id = 1 AND brand_id
= 2
Stores have many products Brands have a few products
id store_id brand_id 1 1 2 2 4 8 3
1 9 4 1 3 fetch store_id = 1 id store_id brand_id 1 1 2 2 4 8 3 1 9 4 1 3 filter brand_id = 2 id store_id brand_id 1 1 2 2 4 8 3 1 9 4 1 3 id store_id brand_id 1 1 2 2 4 8 3 1 9 4 1 3 filter store_id = 1 fetch brand_id = 2
pg_stats n_distinct null_frac
store_id brand_id Rows 100,000 100,000 null_frac 0 0.10 n_distinct 100
9,000 Estimated Rows 1,000 10
store_id brand_id Rows 100,000,000 100,000,000 null_frac 0 0.10 n_distinct 100
9,000 Estimated Rows 1,000,000 10,000
store_id Rows 10,000 null_frac 0 n_distinct 100 Estimated Rows 100
SELECT * FROM products ORDER BY created_at DESC LIMIT 10
SELECT * FROM products WHERE store_id = 1 ORDER BY
created_at DESC LIMIT 10
None
Shortcomings
Single table plus Simple WHERE clause and/or Simple ORDER BY
clause
Duplicating planner logic
pg_stats n_distinct null_frac ✗ most_common_vals ✗ most_common_freqs ✗ histogram_bounds
most_common_vals {2, 5, 1} most_common_freqs {0.9, 0.05, 0.01} store_id =
1 vs store_id = 2
histogram_bounds {0, 9, 25, 60, 99} qty < 5 vs
qty > 5
SELECT * FROM products WHERE store_id = ?
v2
log_min_statement_duration duration: 100 ms statement: SELECT * FROM products WHERE
store_id = 1
Given a query and a set of indexes best indexes
to use
Given a query and all possible indexes best indexes possible
/* Allow a plugin to editorialize on the info we
obtained from the catalogs. Actions might include altering the assumed relation size, removing an index, or adding a hypothetical index to the indexlist. */ get_relation_info_hook 604ffd2
hypopg github.com/dalibo/hypopg
SELECT * FROM products WHERE store_id = 1 AND brand_id
= 2
EXPLAIN Seq Scan on products (cost=0.00..1000.00 rows=100 width=108) Filter: (store_id
= 1 AND brand_id = 2) Final Cost
Cost Hypothetical Indexes Original 1000
Add hypothetical indexes store_id brand_id
EXPLAIN Index Scan using <41072>hypo_btree on products (cost=0.28..50.29 rows=1 width=108)
Index Cond: (brand_id = 2) Filter: (store_id = 1) Final Cost Index
Cost Hypothetical Indexes Original 1000 Single Column 50 brand_id
Add hypothetical indexes store_id, brand_id brand_id, store_id (does not try
different sort orders right now)
Cost Hypothetical Indexes Original 1000 Single Column 50 brand_id Multi
Column 45 brand_id, store_id
Dexter github.com/ankane/dexter
tail -F -n +1 <log-file> | dexter <conn-opts>
--create --exclude big_table --min-time 10
Shortcomings
SELECT * FROM products WHERE a = 1 AND b
= 2 SELECT * FROM products WHERE b = 2
B-TREE Only No Expressions No Partial
SELECT * FROM products WHERE qty = 0
DROP INDEX Unused indexes
HypoPG Extension Support
None
pg_query HypoPG
Get Involved github.com/ankane/dexter