Slide 1

Slide 1 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Using Database to pull your application’s weight

Slide 2

Slide 2 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application { “name” => "Harisankar P S", “email” => ”[email protected]”, “twitter” => "coderhs", “facebook" => "coderhs", “github” => “coderhs”, “linkedin” => “coderhs”, } What I do: I write Ruby code for a living

Slide 3

Slide 3 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application One thing you notice about me is. I love Stickers!!

Slide 4

Slide 4 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application So if you have stickers give them to me!! If you want stickers meet me after my talk.

Slide 5

Slide 5 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application I work at Ruby on Rails dev shop https://redpanthers.co

Slide 6

Slide 6 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application We are from KOCHI, Kerala

Slide 7

Slide 7 text

PGConf India 2017 So let me tell a story

Slide 8

Slide 8 text

PGConf India 2017 of a novice developer

Slide 9

Slide 9 text

PGConf India 2017 who wrote the first line of code

Slide 10

Slide 10 text

PGConf India 2017 for a web app that was meant to process a

Slide 11

Slide 11 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Couple of 1000 rows of data, less than 10 users

Slide 12

Slide 12 text

PGConf India 2017 but grew so BIG that it was process GB’s of data every hour.

Slide 13

Slide 13 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application

Slide 14

Slide 14 text

PGConf India 2017 This talk is about all the things that I learned Which helped me Scale the application without having to spend a fortune in Hardware

Slide 15

Slide 15 text

PGConf India 2017 So what are the tools that I work with

Slide 16

Slide 16 text

PGConf India 2017 In my universe these are my tools called

Slide 17

Slide 17 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Really awesome when they work together

Slide 18

Slide 18 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Ruby is Captain America

Slide 19

Slide 19 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Rails is Iron man

Slide 20

Slide 20 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application PostgreSQL is the hulk

Slide 21

Slide 21 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application The Hulk can smash, anything

Slide 22

Slide 22 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application But we make him carry our suitcase

Slide 23

Slide 23 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application This talk is about how we can offload couple of the jobs done by Rails to Database. You have a HULK then don’t feel scared to USE it.

Slide 24

Slide 24 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Today we are going to talk about • Query Planner • Indexing • Attribute Preloading • Materialised Views • Generating JSON • Synchronous Commit

Slide 25

Slide 25 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Query Planner

Slide 26

Slide 26 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Database is a General Purpose Software

Slide 27

Slide 27 text

PGConf India 2017 A database is not build for a single use case or industry.

Slide 28

Slide 28 text

PGConf India 2017 Then how does it handle all the scenarios?

Slide 29

Slide 29 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Truth is, it doesn’t!

Slide 30

Slide 30 text

PGConf India 2017 DB doesn’t know what all scenarios its put under, its upto us to nudge and optimise it.

Slide 31

Slide 31 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • SQL syntax is all about how the results should be • What you want in your result - SELECT id, name • Or some information about data like - SELECT average(price), max(price), min(price) Where is the decision on how the data should be fetch made.

Slide 32

Slide 32 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Well thats what Query planner is all about.

Slide 33

Slide 33 text

PGConf India 2017 Its the Brain of your DB

Slide 34

Slide 34 text

PGConf India 2017 We need to understand how the system work before we can improve its performance

Slide 35

Slide 35 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • A query plan is created by the DB before the query you gave is executed. • Plan is the cost of running the query. The DB chooses the one with the least cost. • Query Plan assumes the plan it has is the ideal one

Slide 36

Slide 36 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application So we need to see what the query planner see Active Record has .explain method to help us there

Slide 37

Slide 37 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Asset.where(asset_id: 1).explain User.where(id: 1).explain

Slide 38

Slide 38 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application So we check the query plan find where we are slowing down and then fix them and make the plan choose the faster method.

Slide 39

Slide 39 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application I have done all these in production =), so you don’t to feel scared to run this.

Slide 40

Slide 40 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Sounds Simple Doesn't it =)

Slide 41

Slide 41 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application So lets see how we can do that.

Slide 42

Slide 42 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Tip: We can make the query plan display in JSON, YML & XML formats as well EXPLAIN (format YAML) select * from users

Slide 43

Slide 43 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Indexing

Slide 44

Slide 44 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Okay..Lets not do that.

Slide 45

Slide 45 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Indexes are a special lookup table that the database search engine can use to speed up data retrieval.

Slide 46

Slide 46 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application An Index is like a pointer to a particular row of a table. Where all the fields in the table are ordered.

Slide 47

Slide 47 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application But you know something?

Slide 48

Slide 48 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Databases are smart

Slide 49

Slide 49 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Even if you have indexes if it find the sequential search to be cost less then it would go for that one.

Slide 50

Slide 50 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Example: Lets say you have a column with a 10,000 rows, but the the content is either short, medium, long. The database don’t use index as it finds sequential is faster

Slide 51

Slide 51 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application We should Index • Index Primary key • Index Foreign key • Index all columns you would be passing into where clause • Index the keys used to Join tables • Index the date column (if you are going to call it frequent, like rankings of a particular date) • Add partial index to scopes

Slide 52

Slide 52 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Do not Index • Do not index tables with a lot of read, write • Do not index tables you know that will remain small, all through out its life time • Do not index columns where you will be manipulating lot of its values.

Slide 53

Slide 53 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Attribute Preloading

Slide 54

Slide 54 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application A good use of Postgres Array

Slide 55

Slide 55 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Rails Way

Slide 56

Slide 56 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application select * from tasks select * from tags inner join tasks_tags on tags.id = tasks_tags.tag_id where tasks_tags.task_id in (1,2,3,..) tasks = Task.find(:all, :include => :tags) 2 Queries Object for each tasks Rails Code

Slide 57

Slide 57 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Fast Postgres Arrays: tasks = Task.find(:all, :select => "*, array(select tags.name from tags inner join tasks_tags on (tags.id = tasks_tags.tag_id) where tasks_tasks.task_id=tasks.id) as tag_names") 1 SQL query Rails doesn't have to create objects >3x faster

Slide 58

Slide 58 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Materialised View

Slide 59

Slide 59 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Database views? Database views are like the view in our rails. A rails view(an html page) shows data from multiple model in a single page Similarly we can show data from multiple table as a single table using the concept called views Why would we do that? Because it makes life easier

Slide 60

Slide 60 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Instead of doing Every time you want the managers SELECT id, name, email FROM companies where role=‘manager’

Slide 61

Slide 61 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application CREATE VIEW company_managers AS SELECT id, name, email FROM companies WHERE role='manager'; You can create a view And simple do SELECT * FROM company_managers;

Slide 62

Slide 62 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Note: • A schema of view lives in memory of a DB • The result is not stored in memory • Its is actually running our query to get the results • They are called pseudo tables

Slide 63

Slide 63 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Materialised views are the next evolution of Database views. We store the result as well in a table • This was first introduced by Oracle • But now found in PostgreSQL, MicrosoftSQL, IBM DB2, etc. • MySQL doesn’t have it you can create it using open source extensions.

Slide 64

Slide 64 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application How can we use it in Ruby? Thanks to ActiveRecord its easy to access such pseudo tables

Slide 65

Slide 65 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Create a migration to record the Materialised view We need a bit of SQL here class CreateAllTimesSalesMatView < ActiveRecord::Migration def up execute <<-SQL CREATE MATERIALIZED VIEW all_time_sales_mat_view AS SELECT sum(amount) as total_sale, DATE_TRUNC('day', invoice_adte) as date_of_sale FROM sales GROUP BY DATE_TRUNC('day', invoice_adte) SQL end def down execute("DROP MATERIALIZED VIEW IF EXISTS all_time_sales_view") end end

Slide 66

Slide 66 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Create Active Record model I place these views at the location app/models/views class AllTimeSalesMatView < ActiveRecord::Base self.table_name = 'all_time_sales_mat_view' def readonly? true end def self.refresh ActiveRecord::Base.connection.execute('REFRESH MATERIALIZED VIEW CONCURRENTLY all_time_sales_mat_view') end end

Slide 67

Slide 67 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Now we can do AllTimeSalesMatView.select(:name) AllTimeSalesMatView.where(email: '[email protected]')

Slide 68

Slide 68 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application First, Last and Find • They don’t work in your view as they operate on your tables primary key and a view doesn’t have it • If you want to use it then you need to one of the fields in your table as primary key class Model < ActiveRecord::Base self.primary_key = :id end

Slide 69

Slide 69 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Benchmark • I created a table with 1 million random sales and random dates in a year. (Dates where bookmarked as well)

Slide 70

Slide 70 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Take Away • Faster to fetch data. • Capture commonly used joins & filters. • Push data intensive processing from Ruby to Database. • Allow fast and live filtering of complex associations or calculation .fields. • We can index various fields in the table.

Slide 71

Slide 71 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Pain Points • We will be using more RAM and Storage • Requires Postgres 9.3 for MatView • Requires Postgres 9.4 to refresh concurrently • Can’t have Live data • You can fix this by creating your own table and 
 updating it with the latest information

Slide 72

Slide 72 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application JSON generation in DB

Slide 73

Slide 73 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • Websites with simple HTML and plain javascript based AJAX is coming to an end • Its the era of new modern day JS frameworks • JSON is the glue that binds the fronted and our backend • So its natural to find more and more DB supporting the generation and storage of JSON.

Slide 74

Slide 74 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application To convert a single row to JSON select row_to_json(users) from users where id = 1 we use row_to_json() method in SQL

Slide 75

Slide 75 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application {“id":1,"email":"[email protected]", "encrypted_password":"iwillbecrazytodisplaythat", "reset_password_token":null,"reset_password_sent_at":null, "remember_created_at":"2016-11-06T08:39:47.983222", "sign_in_count": 11,"current_sign_in_at":"2016-11-18T11:47:01.946542", "last_sign_in_at":"2016-11-16T20:46:31.110257", "current_sign_in_ip":"::1","last_sign_in_ip":"::1", "created_at":"2016-11-06T08:38:46.193417", "updated_at":"2016-11-18T11:47:01.956152", "first_name":"Super","last_name":"Admin","role":3}

Slide 76

Slide 76 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application But for more practical use we write queries like select row_to_json(results) from ( select id, email from users ) as results {"id":1,"email":"[email protected]"}

Slide 77

Slide 77 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application A more complex one select row_to_json(result) from ( select id, email, ( select array_to_json(array_agg(row_to_json(user_projects))) from ( select id, name from projects where user_id=users.id order by created_at asc ) user_projects ) as projects from users where id = 1 ) result

Slide 78

Slide 78 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application { “id":1,"email":"[email protected]", "project":["id": 3, "name": “CSnipp"] } We did data preloading as well, instead of having the need to run another query separate from the first one. We got the data about projects as well.

Slide 79

Slide 79 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application json_build_object • Added in PostgreSQL 9.4 to make JSON creation a bit more simpler select json_build_object('foo',1,'bar',2); {"foo": 1, "bar": 2}

Slide 80

Slide 80 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application So where is Ruby?

Slide 81

Slide 81 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • For simple JSON creation you can use a gem called Surus • https://github.com/jackc/surus

Slide 82

Slide 82 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Which lets you write code like User.find_json 1 User.find_json 1, columns: [:id, :name, :email] Post.find_json 1, include: :author User.find_json(user.id, include: {posts: {columns: [:id, :subject]}}) User.all_json User.where(admin: true).all_json User.all_json(columns: [:id, :name, :email], include: {posts: {columns: [:id, :subject]}}) Post.all_json(include: [:forum, :post])

Slide 83

Slide 83 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application But for more complicated queries you might still end up writing SQL

Slide 84

Slide 84 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application But Like me if you want to keep as much stuff as possible in Ruby then. Create a materialised view for your complicated query And then use the gem to generate JSON =)

Slide 85

Slide 85 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Benchmarks • In our case we saw request to a (.json) url which used to take 2 seconds, coming down to <= 200ms • Some benchmarks I found online mentions

Slide 86

Slide 86 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • Simple query • More complicate Query • Source: https://github.com/JackC/json_api_bench

Slide 87

Slide 87 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Synchronous Commit

Slide 88

Slide 88 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • PostgreSQL sacrifices speed for durability and reliability • PostgreSQL is known for its slow writes and faster readers • It has slow writes as it waits for confirmation that what we inserted has been recorded to the Hard Disk. • You can disable this confirmation check to speed up your inserts if you are inserting a lot of rows every second

Slide 89

Slide 89 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application User.transaction do User.synchronous_commit false @user.save end Surus Gem Provides

Slide 90

Slide 90 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • Only issue now, is incase your DB crash it can’t recover the lost data not saved to Hard Disk • It won’t corrupt the data, but you might loose some rows of your data • Not to be used in cases when you want data integrity to be 100% • Use it where you don’t mind loosing some information or where you can rebuild it from outside your DB. Like logs, or raw information.

Slide 91

Slide 91 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Postgres Config that we use # How much memory we have to cache the database, RAM_FOR_DATABASE * 3/4 effective_cache_size = <%= ram_for_database.to_i * 3/4 %>MB # Shared memory to hold data in RAM, RAM_FOR_DATABASE/4 shared_buffers = <%= ram_for_database.to_i / 3 %>MB # Work memory for queries (RAM_FOR_DATABASE/max_connections) ROUND DOWN 2^x work_mem = <%= 2**(Math.log(ram_for_database.to_i / expected_max_active_connections.to_i)/Math.log(2)).floor %>MB # Memory for vacuum, autovacuum, index creation, RAM/16 ROUND DOWN 2^x maintenance_work_mem = <%= 2**(Math.log(ram_for_database.to_i / 16)/Math.log(2)).floor %>MB # To ensure that we don't lose data, always fsync after commit synchronous_commit = on

Slide 92

Slide 92 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application # Size of WAL on disk, recommended setting: 16 checkpoint_segments = 16 # WAL memory buffer wal_buffers = 8MB # Ensure autovacuum is always turned on autovacuum = on # Set the number of concurrent disk I/O operations that PostgreSQL # expects can be executed simultaneously. effective_io_concurrency = 4

Slide 93

Slide 93 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Summarize

Slide 94

Slide 94 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application • Index data so that we don’t end up scanning the whole DB • Use arrays for data preloading • Simplify the way you fetch data from the DB using views • Move complicated JSON generation to the Databases • Disable synchronous commit when you feel like it won’t cause a problem

Slide 95

Slide 95 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Conclusions • Know your tech stack • We should have control over all our moving parts • Try to bring about the best with your tech stack before you start throwing more money at it • SQL has been around for 40 years and its planning to say for a while longer =) • There is no golden rule. What worked for me might not work for your specific use case.

Slide 96

Slide 96 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application I blogged about this in detail. • http://blog.redpanthers.co/materialized-views- caching-database-query/ • http://blog.redpanthers.co/create-json-response- using-postgresql-instead-rails/ • http://blog.redpanthers.co/different-types-index- postgresql/ • http://blog.redpanthers.co/optimising-postgresql- database-query-using-indexes/

Slide 97

Slide 97 text

PGConf India 2017 Performance Optimisation in Postgres for Web Application Thank You. Harisankar P S, Chief Solution Artchitect/CEO, Red Panthers I blog about PostgreSQL and Ruby on Rails at http://blog.redpanthers.co