MySQLとインデックスとPHPer

MySQLとインデックスとPHPer PHPが本職でもMySQLを手懐けるために 2021/10/03 yoku0825 PHP Conference Japan 2021

＼おはようございます／ yoku0825＠とある企業のDBA オラクれない ‐ ポスグれない ‐ マイエスキューエる ‐ 生息域 Twitter:
@yoku0825 ‐ Blog: 日々の覚書 ‐ 日本MySQLユーザ会 ‐ MySQL Casual ‐ 1/98

はじめに昨日のまみーさんのセッションは聞きましたか？おすすめです。 PHPer が知るべき MySQL クエリチューニング by まみー ‐
PHPer が知るべき MySQL クエリチューニング ‐ まみーさんの話よりもMySQL寄りなはなし MySQLの気持ちになってSQLレビューをうまく乗りこなそう 2/98

(この資料における)インデックス #とは MySQLのInnoDBストレージエンジンの話です。RDBMSやストレージエンジンが変わると多少事情は変わりますイメージしやすくするための「たとえ」がほとんどなので、内部実装的な話はさらに事情が変わります ‐ PHPはHello, Worldレベルなので、文法とか間違っていても大目に見てください ‐
MySQLで一番よく使われる、B+Treeインデックスのことについて説明します InnoDBにおける「インデックス」はPRIMARY KEYのことも含みます。InnoDBでは「行データ」は「PRIMARY KEYのB+Treeのリーフ」にぶら下がっています 3/98

インデックス #とは「主に検索のために」「行データの部分集合を」「あらかじめソートして複製したもの」「ロックの単位でもある」 4/98

インデックス #とは「主に検索のために」「行データの部分集合を」「あらかじめソートして複製したもの」「ロックの単位でもある」これを上手く扱うことで、MySQLの性能問題はかなりの割合で改善できる 5/98

たとえばINDEX(Continent) PHPerのみなさんにはこう表現するとわかりやすいかしらんセカンダリーインデックスのリーフの要素にはそのテーブルの「プライマリーキーの値」と「前の要素のポインタ」「次の要素のポインタ」が入っています ‐ // ALTER TABLE country ADD
INDEX (continent); $index_continent["Asia"] = array( "AFG" => array("continent" => "Asia", "code" => "AFG"), "ARE" => array("continent" => "Asia", "code" => "ARE"), "ARM" => array("continent" => "Asia", "code" => "ARM"), "AZE" => array("continent" => "Asia", "code" => "AZE"), "BGD" => array("continent" => "Asia", "code" => "BGD"), .. ); $index_continent["Europe"]= array( "ALB" => array("continent" => "Europe", "code" => "ALB"), "AND" => array("continent" => "Europe", "code" => "AND"), "AUT" => array("continent" => "Europe", "code" => "AUT"), "BEL" => array("continent" => "Europe", "code" => "BEL"), "BGR" => array("continent" => "Europe", "code" => "BGR"), .. ); .. 6/98

InnoDBにおける行の構造行データそのものはPRIMARY KEYの要素として取り扱われる $clustered_index["JPN"]= array( "code" => "JPN", "name" =>
"Japan", "continent" => "Asia", "region" => "Eastern Asia", .. ); $clustered_index["USA"]= array( "code" => "USA", "name" => "United States", "continent" => "North America", "Region" => "North America", .. ); 7/98

参考: MySQLとインデックスと私 8/98

インデックスを使った場合の「二度引き」 // SELECT name FROM country WHERE continent = 'Asia'
$asia= $index_continent["Asia"]; $result_buffer= []; foreach ($asia AS $code => $detail) { // $index_continent["Asia"] = array( // "AFG" => array("continent" => "Asia", "code" => "AFG"), // .. // ); // インデックス(continent)は"name"カラムの要素を知らないので、行本体から引く $real_row= $clustered_index[$code]; $name= $real_row["name"]; array_push($result_buffer, array("name" => $name)); } 9/98

たとえばINDEX(Continent, Population) 複合インデックスになると配列が深くなるイメージ // ALTER TABLE country ADD INDEX(continent, population);
$index_continent_population["Asia"]= array( 286000 => array("MDV" => array("continent" => "Asia", populatin => 286000, "code" => "MDV")), 328000 => array("BRN" => array("continent" => "Asia", populatin => 328000, "code" => "BRN")), 473000 => array("MAC" => array("continent" => "Asia", populatin => 473000, "code" => "MAC"), "ZZZ" => array("continent" => "Asia", populatin => 473000, "code" => "ZZZ")), .. ); $index_continent_population["Europe"]= array( 1000 => array("VAT" => array("continent" => "Europe", populatin => 1000, "code" => "VAT")), 3200 => array("SJM" => array("continent" => "Europe", populatin => 3200, "code" => "SJM")), 25000 => array("GIB" => array("continent" => "Europe", populatin => 25000, "code" => "GIB")), .. ); .. 10/98

たとえばINDEX(Continent, Population) // WHERE continent = 'Europe' には使える var_dump($index_continent_population["Europe"]); //
WHERE continent = 'Europe' AND population = 1000にも使える var_dump($index_continent_population["Europe"][1000]); 11/98

たとえばINDEX(Continent, Population) これがいわゆる「左側からしか使えない」いや本物のPHPにはスマートな書き方があるかも知れませんが、まあこんな感じで使いにくいっていうことで… ‐ // WHERE population =
1000には使いにくい $result_buffer= []; foreach ($index_continent_population as $one_continent => $population_and_code) { // 親要素を全部ループして刈り込むくらいしかできない if (array_key_exists(1000, $population_and_code)) array_push($result_buffer, $population_and_code[1000]); } var_dump($result_buffer); 12/98

たとえばINDEX(Continent, Population) NOT演算は効率が悪い // WHERE continent = 'Asia' AND population
<> 1000 $result_buffer= []; foreach ($index_continent_population["Asia"] as $population => $code) { // 結局全部さらって != 比較しないといけない if ($population != 1000) array_push($result_buffer, $code); } var_dump($result_buffer); 13/98

たとえばINDEX(Continent, Population) レンジ検索はできるこのへんからPHPのコードは怪しくなってきます ‐ // WHERE continent = 'Asia'
AND population > 1000 $current_key= $index_continent_population["Asia"][1000]->next; $result_buffer= []; while ($current_key) { if ($current_key["continent"] != "Asia") break; 14/98

たとえばINDEX(Continent, Population) 左側で範囲検索を使ってしまうと右側の要素で一発では引けない // WHERE continent > 'Asia' AND population
= 1000 // ここで "Asia" の次のキーにカーソルを合わせることはできるけど $current_key= $index_continent_population["Asia"]->next; $result_buffer= []; while ($current_key) { // "Asia" 以降の全ての国に対して == 1000のフィルタリングが必要、 // whileループの中の数が減らない if ($current_key["population"] == 1000) array_push($result_buffer, $current_key); $current_key= $current_key->next; } 15/98

たとえばINDEX(Population, Continent) ←さっきと逆 = で仕留めきれるものは右側に効果が波及するので使える // WHERE continent > 'Asia'
AND population = 1000 // populationが左なら刈り込んでからループに入れる $current_key= $index_population_continent[1000]["Asia"]->next; $result_buffer= []; while ($current_key) { // whileをループする回数がそもそも必要最低限になる array_push($result_buffer, $current_key); $current_key= $current_key->next; } var_dump($result_buffer); 16/98

練習1 SELECT * FROM country WHERE continent = 'Asia' AND
name = 'Japan'; 17/98

name = 'Japan'; /* ↓のどっちかで仕留めれば絞り込みが済むから $index["Asia"]["Japan"] $index["Japan"]["Asia"] */ 18/98

name = 'Japan'; -- たとえば$index["Japan"]["Asia"]狙いの方 ALTER TABLE country ADD INDEX idx_name_continent(name, continent); 19/98

name = 'Japan'; 20/98

indepyear < 0 AND region = 'Eastern Asia' AND population > 1000000; 21/98

indepyear < 0 AND region = 'Eastern Asia' AND population > 1000000; /* 等価検索しているのは (continent, region)、indepyearとpopulationは範囲検索 $index["Asia"]["Eastern Asia"] (またはその逆)を population > 1000000 で絞るか、indepyear < 0で絞るか ALTER TABLE country ADD INDEX idx_continent_region_population(continent, region, population), ADD INDEX idx_continent_region_indepyear(continent, region, indepyear); */ 22/98

indepyear < 0 AND region = 'Eastern Asia' AND population > 1000000; 23/98

余談この命名規則気に入ってる 24/98

ここまでのまとめインデックスはキー部分がソート済の連想配列の連想配列みたいなもの…だと思えば、「インデックスは左から順番に使わないと効果がない」理由がｶﾞｯﾃﾝしていただけたでしょうか範囲検索もイテレーターを使い始める感じになるので、範囲検索が始まるとそれ以降の刈り込みは発生しなくなる 25/98

はい次 26/98

WHEREとORDER BYの連携 // WHERE continent = 'Asia' ORDER BY gnp
ASC $asia= $index_continent_population["Asia"]; // WHEREには左端なので効く $sort_buffer= []; foreach ($asia as $population => $code) { // gnpカラムの値はインデックスには含まれていないので、行をフェッチして値を取る $real_row= $clustered_index[$code["code"]]; $sort_buffer[$real_row["code"]]= $real_row["gnp"]; } uasort($sort_buffer, ..); // ソートバッファのソート var_dump($sort_buffer); 27/98

WHEREとORDER BYの連携 // WHERE continent = 'Asia' ORDER BY population
ASC // continentで刈り込みを入れる $current_key= $index_continent_population["Asia"]->first; $result_buffer= []; while ($current_key) { if ($current_key["continent"] != "Asia") break; // 既にpopulationの昇順に入っているので、 // nextのポインタをたどって順番に入れていくだけでORDER BYが完成している 28/98

WHEREと ORDER BYの連携 // /* WHERE句なし */ ORDER BY continent
ASC, population ASC $result_buffer= []; $current_key= $index_continent_population->first; while ($current_key) { array_push($result_buffer, $code); $current_key= $current_key->next; } // 追加ソート不要 var_dump($result_buffer); 29/98

練習3 SELECT * FROM country WHERE continent = 'Asia' ORDER
BY name; 30/98

BY name; /* $index["Asia"]->first; が効かせられるように、 WHEREのカラムが先、 $current_key->next が効かせられるようにORDER BYのカラム、の順で狙う ALTER TABLE country ADD INDEX idx_continent_name(continent, name) */ 31/98

BY name; 32/98

region = 'Eastern Asia' ORDER BY indepyear; さっき作ったインデックスではcontinentの刈り込みは効いているけど追加のソートと追加のフィルタリングが発生している ‐ 33/98

region = 'Eastern Asia' ORDER BY indepyear; /* $index[$continent][$region] (または逆) ->first; $current_key->next がindepyearの順になるように ALTER TABLE country ADD INDEX idx_continent_region_indepyear(continent, region, indepyear) */ 34/98

region = 'Eastern Asia' ORDER BY indepyear; 35/98

練習4改一度さっき作った idx_continent_region_indepyear を引っぺがして、 idx_continent_indepyear を作ってみます WHERE, ORDER BYの順番だけどWHEREの間が抜けてるパターン ‐
continentで刈り込んでループに入って、ループの中で追加のフィルタリングをかけつつ追加ソートは発生しないパターン ‐ ALTER TABLE country DROP KEY idx_continent_region_indepyear, ADD KEY idx_continent_indepyear(continent, indepyear); 36/98

練習4改 // WHERE continent = 'Asia' AND region = 'Eastern
Asia' ORDER BY indepyear; // continentで刈り込みを入れる $current_key= $index_continent_indepyear["Asia"]->first; $result_buffer= []; foreach ($index_continent_indepyear["Asia"] as $indpyear => $code) { // 行本体をフェッチして追加フィルタリングだけど、 // 順番は満たされているので追加のソートは不要 $real_row= $cluster_index[$current_key["code"]]; if ($real_row["region"] == "Eastern Asia") 37/98

練習4改 SELECT * FROM country WHERE continent = 'Asia' AND
region = 'Eastern Asia' ORDER BY indepyear; 38/98

ここまでのまとめ WHEREとORDER BYを複合カラムインデックスでカバーさせる場合、 WHERE, ORDER BYの順 ‐ ORDER BY, ORDER
BYの順 ‐ で左から使い切らなくてはならない範囲検索が入るとそれより右のORDER BYはソートを早抜けできない 39/98

はい次 40/98

WHEREとORDER BYの連携実際ソートのコストだけで ORDER BY が劇的に速くなることはそんなにない典型的に効いてくるのは ORDER BY ..
LIMIT .. のケースと GROUP BY のケースいきなりGROUP BYが出てくるのは不思議かも知れないけれど ‐ 41/98

WHEREとORDER BY LIMITの連携 // WHERE continent = 'Asia' ORDER BY
gnp ASC LIMIT 5 $asia= $index_continent_population["Asia"]; /* snip */ uasort($sort_buffer, ..); // ソートバッファのソート // ソートが終わってからでないと LIMIT 5 が確定しない var_dump(array($sort_buffer[0], $sort_buffer[1], $sort_buffer[2], $sort_buffer[3], $sort_ buffer[4])); 42/98

WHEREとORDER BY LIMITの連携 // WHERE continent = 'Asia' ORDER BY
population ASC LIMIT 5 $n= 0; $result_buffer= []; foreach ($index_continent_population["Asia"] as $population => $code) { // populationの順番に並んでいることが保証されている array_push($result_buffer, $code); // 事前にトップ5が確定しているので、LIMIT 5を満たしたらループを抜けられる if (++$n == 5) break; } var_dump($result_buffer); 43/98

WHEREとGROUP BYの連携 MySQLのGROUP BYは「GROUP BYで指定されたカラムであらかじめソート済みでなければテンポラリーテーブルを作る」という制約がある https://dev.mysql.com/doc/refman/8.0/ja/group-by-optimization.html ‐ coreutils の
sort と uniq で ` .. | sort | uniq -c` みたいなやつと一緒 ‐ 「あらかじめソート済み」とは、インデックスを使って(ORDER BYの時のような) ソートのかっ飛ばしが成立すること集約関数の結果に対する HAVING と ORDER BY は「GROUP BYを計算し終えてからでないと確定しない」のでインデックスの刈り込みは不可能（GROUP BYでテンポラリーテーブルを避けるところまでしかインデックスで高速化できない） 44/98

BY population LIMIT 5; 45/98

練習5 FLUSH STATUS; SELECT * FROM country WHERE continent =
'Asia' ORDER BY population LIMIT 5; SHOW SESSION STATUS LIKE 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 4 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+-------+ 7 rows in set (0.00 sec) 46/98

BY population LIMIT 5; /* ORDER BY LIMITの早抜けができないように、 `idx_continent_population_indepyear` を消して `id x_continent` だけを足す ALTER TABLE country DROP INDEX idx_continent_population_indepyear, ADD INDEX idx_continent(continent); */ 47/98

BY population LIMIT 5; 48/98

練習6 早抜けができないので1 + 51行フェッチしてからソートして上から5行を取ってるのがおわかりいただけるだろうか FLUSH STATUS; SELECT * FROM
country WHERE continent = 'Asia' ORDER BY population LIMIT 5; SHOW SESSION STATUS LIKE 'Handler_read%'; +-----------------------+-------+ | Variable_name | Value | +-----------------------+-------+ | Handler_read_first | 0 | | Handler_read_key | 1 | | Handler_read_last | 0 | | Handler_read_next | 51 | | Handler_read_prev | 0 | | Handler_read_rnd | 0 | | Handler_read_rnd_next | 0 | +-----------------------+-------+ 49/98

練習7 SELECT region, COUNT(*) FROM country WHERE continent = 'Asia'
GROUP BY region; 50/98

GROUP BY region; /* $index["Asia"] の刈り込みはできているけど、regionがソート済みでないのでテンポラリーテーブルが発生している ALTER TABLE country ADD INDEX idx_continent_region(continent, region); */ 51/98

GROUP BY region; 52/98

ここまでのまとめ ORDER BY LIMITの早抜けはWHEREで刈り込んだ後の件数が多くなれば多くなるほど効果を発揮する WHEREで絞り込んでも1000万行……とかでも、LIMITの件数だけで完結することがある ‐ 範囲検索が挟まっている場合でも、ORDER BY LIMITの早抜けを使わせて追加フィルタリング
の方が速いことがあるこの場合、追加フィルタリングでフィルターされない行が多ければ多いほど早抜けがキマる ‐ 53/98

ここまでのまとめ GROUP BYのテンポラリーテーブルを避けるにはORDER BYと同じノリでインデックスを使わせる HAVINGとORDER BYはGROUP BY処理の後なので、GROUP BY処理までがインデックスの勝負どころ
そこから先は如何に「追加のフィルタリング、追加のソートのコストを避けるか」 = 「如何に GROUP BY終了時点の結果サイズを小さくするか」の勝負になる ‐ 54/98

はい次 55/98

Nested Loop Join // FROM city JOIN country ON city.id
= country.capital $result_buffer= []; foreach ($rows_from_city as $outer_row) { foreach ($rows_from_country as $inner_row) { if ($outer_row["id"] == $inner_row["capital]) array_push($result_buffer, array($outer_row, $inner_row)); } } var_dump($result_buffer); 56/98

Nested Loop Join INNER JOIN は換可能な演算どっちのループがより回数が少なくなるかを評価するのがオプティマイザー ‐ // FROM
city JOIN country ON city.id = country.capital $result_buffer= []; foreach ($rows_from_country as $outer_row) { // city.idはPKなのでここはインデックスで刈り込める foreach ($index_city_id[$outer_row["capital"]] as $inner_row) { array_push($result_buffer, array($outer_row, $inner_row)); } } var_dump($result_buffer); 57/98

Nested Loop Join country を外側にする戦略を選んだ SELECT city.* FROM city JOIN
country ON city.id = country.capital; ↓ 58/98

Nested Loop Join LEFT OUTER JOINは交換不可能な演算 // FROM city LEFT
JOIN country ON city.id = country.capital $result_buffer= []; foreach ($rows_from_city as $outer_row) { foreach ($rows_from_country as $inner_row) { if ($outer_row["id"] == $inner_row["capital"]) array_push($result_buffer, array($outer_row, $inner_row)); else { // 一致する行が無かったらNULLを詰めた行を結合しないといけないから array_push($result_buffer, array($outer_row, NULL_ROW)); } } } var_dump($result_buffer); 59/98

練習8 SELECT city.* FROM city JOIN country ON city.id =
country.capital; 60/98

練習8 SELECT city.* FROM city JOIN country ON city.id =
country.capital; /* 最低限、ループの内側をインデックスで刈り込めるようにcountry側に idx_capital が必要だろうが、 countryテーブルの件数がcityテーブルより十分少なくて、city.idはPKなのでひっくり返ることはない */ 61/98

練習9 SELECT city.* FROM city LEFT JOIN country ON city.id
= country.capital; 62/98

= country.capital; /* LEFT JOINなのでcountryを外側にできない。さっきの話題の通りcountry.idx_capitalで戦うしかない ALTER TABLE country ADD INDEX idx_capital(capital); */ 63/98

= country.capital; 64/98

ここまでのまとめ INNER JOINは外側と内側のテーブルを入れ替えられるため最適化が効きやすい ONで指定されたカラムを内側テーブルのインデックスで仕留める ‐ どちらが外側、内側になってもいいように、ONの両端のカラムでそれぞれのテーブルにインデックスがあると良いんじゃないかしら（完璧ではない） ‐ LEFT JOINは交換不可能なため、外側のループが固定される
外側のテーブルにWHERE句がない場合は当然外側はテーブルスキャン ‐ 65/98

はい続き 66/98

Nested Loop Join // FROM city INNER JOIN country ON
city.id = country.capital // WHERE country.continent = 'Asia' $result_buffer= []; foreach ($idx_continent["Asia"] as $outer_row) // ここでWHEREの刈り込みが効く { // city.idはPKなのでここはインデックスで刈り込める foreach ($index_city_id[$outer_row["capital"]] as $inner_row) { array_push($result_buffer, array($outer_row, $inner_row)); } } var_dump($result_buffer); 67/98

Nested Loop Join JOINの数が増えてもネストが深くなるだけ、めんどくさくはなるけど考えることはできる // FROM city JOIN //
country ON city.id = country.capital JOIN // countrylanguage ON country.code = countrylanguage.countrycode foreach ($city_table as $most_outer_row) { foreach ($country_table as $second_outer_row) { if ($most_outer_row["id"] == $second_outer_row["capital"]) { foreach ($countrylanguage_table as $inner_row) { if ($second_outer_row["code"] == $inner_row["counterycode"]) { array_push($result_buffer, array($most_outer_row, $second_outer_row, $inner_row)); } } } } } 68/98

ここまでのまとめ JOINの数が増えると目XPLAINは面倒になるけど、面倒なだけで不可能じゃない Nested Loopのイメージがあれば一気に考えやすくなるはず ‐ オプティマイザーは案外賢いというか、組み合わせを考えて雑に計算するだけだったら人間よりも機械の方が得意 ‐ 69/98

余談これ見やすい 70/98

余談これちょっと見にくい重複してないカラムのテーブル修飾を取っただけ ‐ 71/98

余談これすごく見にくいテーブルエイリアスつけた ‐ 72/98

名前大事 73/98

はい次 74/98

Nested Loop JoinとORDER BY ORDER BY/GROUP BYのカラムが「一番外側のループだけにある」 && インデックスでORDER
BY/GROUP BYまで解決できる時に限って、追加のソートを無効化できる追加のソートを無効化した状態ならLIMITによるループの早抜けやGROUP BYのテンポラリーテーブルかっ飛ばしが有効 75/98

Nested Loop JoinとORDER BY country.idx_continent と countrylanguage.idx_countrycode_percentage のインデックス //
FROM country JOIN countrylanguage ON country.code= countrylanguage.countrycode // WHERE country.continent = 'Asia' // ORDER BY countrylanguage.percentage LIMIT 5; $sort_buffer= []; foreach ($idx_continent["Asia"] as $outer_row) { // idx_countrycode_percentageならWHERE .. ORDER BY連携に使える気がしなくもないけれど foreach ($idx_countrycode_percentage[$outer_row["code"]] as $percentage => $countrycode_pk) { // このループの中ではpercentage昇順が保証されるけど <snip> } } // ここのループが戻るたびにpopulationは巻き戻るので、結局最後に追加のソートが必要 usort($sort_buffer, ..); 76/98

Nested Loop JoinとORDER BY 77/98

Nested Loop JoinとORDER BY country.idx_code_continent と countrylanguage.idx_percentage の組み合わせ // FROM
country JOIN countrylanguage ON country.code= countrylanguage.countrycode // WHERE country.continent = 'Asia' // ORDER BY countrylanguage.percentage LIMIT 5; $n= 0; $result_buffer= []; // countrylanguageはpercentageの昇順にだけ foreach ($idx_percentage as $outer_row) { foreach ($idx_code_continent[$outer_row["countrycode"]["Asia"] as $inner_row) { array_push($result_buffer, array($outer_row, $inner_row)); // $idx_percentage によって順番が保証されているのでLIMITの早抜けができる 83/98

練習11 SELECT country.name, countrylanguage.language, country.population, countrylanguage.percentage FROM country JOIN countrylanguage
ON country.code= countrylanguage.countrycode WHERE country.continent = 'Asia' AND countrylanguage.language = 'English' ORDER BY countrylanguage.percentage LIMIT 5; ↓ 84/98

練習11 SELECT country.name, countrylanguage.language, country.population, countrylanguage.percen tage FROM country JOIN
countrylanguage ON country.code= countrylanguage.countrycode WHERE country.continent = 'Asia' AND countrylanguage.language = 'English' ORDER BY countrylanguage.percentage LIMIT 5; /* 外側をcountrylanguageにしたいので、WHERE, ORDER BYの順で idx_language_percentage 内側がcountryになるのでONとWHEREの idx_code_continent ALTER TABLE countrylanguage ADD INDEX idx_language_percentage(language, percentage); ALTER TABLE country ADD INDEX idx_code_continent(code, continent); */ 85/98

練習11 code がもともとPRIMARY KEYだったから使わなかった… とはいえ目的のORDER BY LIMITの最適化はできている ‐ SELECT country.name,
countrylanguage.language, country.population, countrylanguage.percentage FROM country JOIN countrylanguage ON country.code= countrylanguage.countrycode WHERE country.continent = 'Asia' AND countrylanguage.language = 'English' ORDER BY countrylanguage.percentage LIMIT 5; ↓ 86/98

練習12 SELECT country.name, countrylanguage.language, country.population, countrylanguage.percentage FROM country LEFT JOIN
countrylanguage ON country.code= countrylanguage.countrycode WHERE country.continent = 'Asia' ORDER BY countrylanguage.percentage LIMIT 5; ↓ 87/98

練習12 SELECT country.name, countrylanguage.language, country.population, countrylanguage.percent age FROM country LEFT
JOIN countrylanguage ON country.code= countrylanguage.countrycode WHERE country.continent = 'Asia' ORDER BY countrylanguage.percentage LIMIT 5; /* LEFT JOINでcountryが外側に固定されるため、ORDER BYの追加ソートは「回避できない」外側テーブルを刈り込む country.idx_continent , ONで使う countryranguage.countrycode (PRIM ARY KEYの一部)だけが有効 */ 88/98

練習13 SELECT country.continent, SUM(country.population) FROM country JOIN countrylanguage ON countrylanguage.countrycode
= country.code WHERE countrylanguage.language = 'Japanese' GROUP BY country.continent; ↓ 89/98

練習13 SELECT country.continent, SUM(country.population) FROM country JOIN countrylanguage ON countrylanguage.countrycode
= country.code WHERE countrylanguage.language = 'Japanese' GROUP BY country.continent; /* 追加ソートを無効化するためにはcoutryが外側になければならない、絞り込みに使えそうなカラムはないのでGROUP BYだけカバーする idx_continent 内側のcountrylanguageはPKが (countrycode, language) なのでそのまま使えそうが、そのままでは選んでくれていないのでFORCE INDEXとJOIN_ORDERで外側のテーブルを固定する */ 90/98

練習13 SELECT /*+ JOIN_ORDER(country, countrylanguage) */ country.continent, SUM(country.population) FROM country
FORCE INDEX(idx_continent) JOIN countrylanguage ON countrylanguage.countrycode = co untry.code WHERE countrylanguage.language = 'Japanese' GROUP BY country.continent; ↓ 91/98

練習13 左が FORCE INDEX, 右がそのままこの件数だときもち速いかな…程度 ‐ 92/98

ここまでのまとめ JOINでもORDER BY LIMITやGROUP BYの高速化が狙えるけど、一番外側のテーブルでインデックスを使わせないといけない往々にして「JOINが重い理由」はここにある ‐ 絞り込んだ行数が十分小さければ、無理にGROUP BYでテンポラリーテーブルを飛
ばす必要はない WHEREとONで十分に絞り込むことをまずは考えて ‐ 93/98

以上 94/98

インデックス #とは「主に検索のために」「行データの部分集合を」「あらかじめソートして複製したもの」「ロックの単位でもある」これを上手く扱うことで、MySQLの性能問題はかなりの割合で改善できる 95/98

MySQLの気持ちを知って、楽しいSQL レビューライフを！ 96/98

参考片手間MySQLチューニング戦略 MySQLとインデックスと私 Where狙いのキー、order by狙いのキー PHPer が知るべき MySQL クエリチューニング MySQL
データベースの負荷対策/パフォーマンスチューニング備忘録インデックスの基礎〜実践 97/98

Any Questions and/ or Suggestions? 98/98

MySQLとインデックスとPHPer

MySQLとインデックスとPHPer

More Decks by yoku0825

Other Decks in Technology

Featured

Transcript