look? • Use performance monitoring that can eind your 99th percentile slow transactions ‣ e.g. NewRelic, Skylight • Look at your database’s slow query log • Look at your error logs for timeouts • Listen to your best users
have is going to greatly affect your query performance • The load on the database’s CPU from other queries will also affect performance • You can get good test results locally, but nothing beats production results
apply to most RDBMS • We’re using an index. This is good. • extras has using 2ilesort. This is bad. • rows_examined is quite large. This is bad, sometimes.
rows are stored on disk as… ‣ a B-tree in som databases MySQL (InnoDB) ‣ a Heap table in others (e.g. Postgres) • Indexes are (also) B-trees where the leaf nodes point to the table rows
given user_id using the index (fast) • Retrieves ALL records from the table ‣ despite the LIMIT/OFFSET • Sorts the records by visited_at • Returns LIMIT records at OFFSET
“visited_at”] +----+-------------+-----------+------+---------+-------+------+-------------+ | id | select_type | table | type | key_len | ref | rows | Extra | +----+-------------+-----------+------+---------+-------+------+-------------+ | 1 | SIMPLE | bookmarks | ref | 5 | const | 2523 | Using where | +----+-------------+-----------+------+---------+-------+------+------------- • This eliminates the eilesort, and now its fast
1 6, Feb 3 6, Feb 5 6, Mar 12 13, Feb 22 32 3 99 42 12 Branch Node Leaf Node For an index like (user_id, visited_at), you can visualise the sorting on the second column as above
a smell • As the application developer, if you know what queries you make, specifying composite indexes improves index usage • Composite indexes are also used by subsets of the key, counting from the left
delete operations. It’s a tradeoff ‣ Therefore, only optimise when necessary • Applying indexes to a large table can lock the table ‣ You may need to use a tool like Percona Online Schema Change (pt-osc)
:counter_cache This option can be used to configure a custom named :counter_cache. You only need this option when you customized the name of your :counter_cache on the belongs_to association. • Turns out, in cases like this where Rails can’t guess the inverse_relation, we can be explicit
count >= LIKES_REQUIRED_FOR_PUBLISH end • All we want is to eind out if a recipe has been liked by several followees… • But… what if there are lots of followees?
INNER JOIN `follows` ON `users`.`id` = `follows`.`followee_id` WHERE `follows`.`follower_id` = ? ORDER BY follows.id DESC SELECT COUNT(*) FROM `recipe_likes` WHERE `recipe_likes`.`recipe_id` = ? AND `recipe_likes`.`user_id` IN (?+);
IN clause as being similar to an OR operation • Each element of the IN means a separate index lookup • A better index won’t help… we can only avoid using IN
the intersection between followers and “likers” • It’s easy to add more app servers • It’s harder to add more database instances (even slaves) • In this case, even with (say) 1 million ids, doing it in Ruby is still fast
time not spent improving the product ‣ Have good hygiene (understand indexes, N+1s, etc) • Rails makes it easy to write clean code, but problems can be hard to detect ‣ But, clean code can make diagnosis easier • Prefer quick and easy solutions