The Elephant in the Room

The Elephant in the Room PostgreSQL For Perl Programmers Brad
Lhotsky http://twitter.com/reyjrar http://github.com/reyjrar

Given NoSQL and MySQL Why bother with PostgreSQL?

NoSQL Redis Cassandra MongoDB CouchDB etc.. •Distributed •Free Form Data
Store •Document Storage •No need for intensive normalization

MySQL for structured data. It's Web Scale.

Fact or Fiction MySQL is faster than PostgreSQL.

Surprisingly, using a technology without knowing how to use it
yields poor performance

Fact or Fiction MySQL is faster than PostgreSQL.

MySQL & PostgreSQL SELECT grade, risk_score, risk_adjusted_value, SUM(risk_adjusted_value) FROM risky_mortgages
GROUP BY grade or "Understanding the Mortgage Crisis of 2008"

testdb=#; ERROR: column "risky_mortgages.risk_score" must appear in the GROUP BY
clause or be used in an aggregate function LINE 1: SELECT grade, risk_score, risk_adjusted_value, ... ^ PostgreSQL

mysql>; +-------+------------+---------------------+--------------------------+ | grade | risk_score | risk_adjusted_value | SUM(risk_adjusted_value)
| +-------+------------+---------------------+--------------------------+ | A | 90 | 90000 | 182000 | | A+ | 95 | 95000 | 95000 | | B | 80 | 80000 | 409000 | | C- | 60 | 60000 | 169000 | +-------+------------+---------------------+--------------------------+ MySQL

MySQL makes SQL easy. That's why it's so successful!

MySQL •Bought by Sun •Sun bought by Oracle •Oracle has
a shining reputation with the Open Source Community, right? ! •What the Fork?!@$ •MariaDB •XtraDB •OurDelta •Drizzle

PostgreSQL learned a lot from MySQL’s success.

* with commercial support available! Open Source and BSD Licensed!

PostgreSQL Features •Reliability •Excellent native types •Constraints •Stored Procedures •Triggers
• Views •Explain / Analyze •Extensible!

Features: Reliability •ACID •Atomicity •Consistency •Isolation •Durability ! •It’s turtles
all the way down.

•Numeric •Strings •Dates, Times, Intervals •Boolean •Array •Geometric •Networking! •JSON!
Features: Native Data Types

SELECT * FROM network WHERE ip << inet ‘10.0.0.0/8’ SELECT
* FROM documents WHERE json->>'priority' > 10 SELECT * FROM events WHERE event_ts > NOW() - interval '30 minutes'

•Simple or complex •Best to keep simple, use triggers for
complex •Validation at point of storage Features: Constraints

CREATE TABLE positivity ( id SERIAL, happy_int INTEGER CHECK (
happy_int > 0 ), entry_ts TIMESTAMP DEFAULT NOW() );

INSERT INTO positivity ( happy_int ) VALUES ( -2 );
ERROR: new row for relation "positivity" violates check constraint "positivity_happy_int_check"

Features: Stored Procedures •PL/PgSQL •Oracle Compatible •Powerful and Transactional! •PL/Perl
•Like mod_perl, shared interpreter •PL/Perl, no ﬁlesystem access •PL/PerlU, ﬁlesystem access •DBD::PgSPI •PL/R •Powerful Statistical Programming inside of Procedures

CREATE FUNCTION lookup_dev_id(character varying) RETURNS integer AS $$ ! DECLARE
in_ip inet := CAST($1 AS inet); out_dev_id INTEGER; ! BEGIN SELECT dev_id INTO out_dev_id FROM devices WHERE ip = in_ip; ! IF NOT FOUND THEN RAISE EXCEPTION 'Unable to locate IP: %', in_ip; END IF; ! RETURN out_dev_id; END$$ LANGUAGE plpgsql;

SELECT lookup_dev_ip(‘12.34.56.78’);

•Use Stored Procedures to do things when something happens •Modify
before or after change happens •Validation at the point of storage •DDL Changes can ﬁre triggers Features: Triggers

CREATE OR REPLACE FUNCTION email_signature() RETURNS trigger AS $$ !
use MIME::Lite; ! my $subject = q{New Signature: } . qq{$_TD->{new}{facility}:$_TD->{new}{native_sig_id}}; ! my $msg = new MIME::Lite( From => ‘[email protected]’, To => ‘[email protected]’, Subject => $subject, Type => 'TEXT', ); $msg->data(<<"EOB"); Description: $_TD->{new}{description} EOB ! $msg->send; return; $$ LANGUAGE plperlu;

CREATE TRIGGER new_signature_email AFTER INSERT ON security_signatures FOR EACH ROW
EXECUTE PROCEDURE email_signature();

•Read only •Use for Reports or Pages •Writable via Triggers
•Static or Dynamic Features: Views

Features: pg_stats •Tracks Usage Data •Used by optimizer •Viewable by
mortals

CREATE VIEW v_admin_index_usage AS SELECT t.relname AS “table”, c.relname AS
index_name, c.relpages, i.idx_scan, t.seq_scan FROM pg_class c JOIN pg_stat_user_indexes i ON c.relname = i.indexrelname JOIN pg_stat_user_tables t ON i.relname = t.relname;

SELECT client.id, client.ip, SUM(queries) AS queries, SUM(nx) AS nx, SUM(answers)
AS answers, SUM(errors) AS errors, COUNT(distinct day) AS days_active FROM client INNER JOIN client_stats ON client.id = client_stats.client_id GROUP BY client.id, client.ip HAVING COUNT(distinct day) > 2

QUERY PLAN ! GroupAggregate (cost=33.25..41.30 rows=230 width=35) Filter: (count(DISTINCT client_stats.day)
> 2) -> Sort (cost=33.25..33.82 rows=230 width=35) Sort Key: client.id, client.ip -> Hash Join (cost=15.77..24.23 rows=230 width=35) Hash Cond: (client_stats.client_id = client.id) -> Seq Scan on client_stats (cost=0.00..5.30 rows=230 width=28) -> Hash (cost=10.34..10.34 rows=434 width=15) -> Seq Scan on client (cost=0.00..10.34 rows=434 width=15) EXPLAIN - What You Think You'll Do

GroupAggregate (cost=33.25..41.30 rows=230 width=35) (actual time=4.498..6.117 rows=8 loops=1) Filter: (count(DISTINCT
client_stats.day) > 2) -> Sort (cost=33.25..33.82 rows=230 width=35) (actual time=4.417..4.925 rows=234 loops=1) Sort Key: client.id, client.ip Sort Method: quicksort Memory: 43kB -> Hash Join (cost=15.77..24.23 rows=230 width=35) (actual time=2.085..3.697 rows=234 loops=1) Hash Cond: (client_stats.client_id = client.id) -> Seq Scan on client_stats (cost=0.00..5.30 rows=230 width=28) (actual time=0.014..0.468 rows=234 loops=1) -> Hash (cost=10.34..10.34 rows=434 width=15) (actual time=2.049..2.049 rows=434 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 20kB -> Seq Scan on client (cost=0.00..10.34 rows=434 width=15) (actual time=0.011..0.995 rows=434 loops=1) Total runtime: 6.242 ms EXPLAIN ANALYZE- What You Did

•Tons of PL/* Languages •PostGIS (GeoSpatial) •tsearch2 full text search
•ltree for representing tree structure in SQL ! •PGXN by David Wheeler •Based on the CPAN Features: Extensibility

Major Features: 8.4 •Windowing Functions SELECT person, dept, salary, avg(salary)
OVER( partition by dept ) FROM employees

SELECT * FROM employees; ! id | name | dept
| gender | salary ----+---------+------------+--------+-------- 1 | bob | it | m | 45000 2 | alice | it | f | 42000 3 | charlie | it | m | 55000 4 | bill | it | m | 46000 5 | jill | it | f | 35000 6 | rob | accounting | m | 35000 7 | jane | accounting | f | 30000 8 | janice | accounting | f | 37000 9 | jack | accounting | m | 40000 (9 rows)

name | dept | salary | salary_dept ---------+------------+--------+------------- rob |
accounting | 35000 | 35500 jane | accounting | 30000 | 35500 janice | accounting | 37000 | 35500 jack | accounting | 40000 | 35500 bob | it | 45000 | 44600 jill | it | 35000 | 44600 alice | it | 42000 | 44600 charlie | it | 55000 | 44600 bill | it | 46000 | 44600 (9 rows) SELECT name, dept, salary, AVG(salary) OVER (partition by dept)::int AS salary_dept FROM employees;

SELECT dept, gender, salary, AVG(salary) OVER (partition by dept)::int AS
salary_dept, AVG(salary) OVER (partition by gender)::int AS salary_gender, AVG(salary) OVER (partition by dept, gender)::int AS salary_dept_gender FROM employees dept | gender | salary | salary_dept | salary_gender | salary_dept_gender ------------+--------+--------+-------------+---------------+-------------------- accounting | f | 30000 | 35500 | 36000 | 33500 accounting | f | 37000 | 35500 | 36000 | 33500 accounting | m | 35000 | 35500 | 44200 | 37500 accounting | m | 40000 | 35500 | 44200 | 37500 it | f | 35000 | 44600 | 36000 | 38500 it | f | 42000 | 44600 | 36000 | 38500 it | m | 46000 | 44600 | 44200 | 48667 it | m | 45000 | 44600 | 44200 | 48667 it | m | 55000 | 44600 | 44200 | 48667 (9 rows)

Major Features: 9.0 •Hot Standby •Streaming Replication •Trigger Improvements •WHEN
•Column-based triggers •Deferrable Constraints •Anonymous Functions •Named Parameter Calls •Exclusion Constraints

CREATE TABLE reservations ( room_id INTEGER NOT NULL, guest_id INTEGER
NOT NULL, during DATERANGE NOT NULL ) ALTER TABLE reservations ADD CONSTRAINT exclude_room_period EXCLUDE USING gist( room_id WITH =, during WITH && )

Major Features: 9.1 •Synchronous Replication •PGXN Support •UNLOGGED tables •SE/Linux
Integration •SQL/MED (Management of External Data) •MySQL, Oracle, Sybase, ODBC, JDBC, Informix •CouchDB, MongoDB, Redis, Neo4j •LDAP, Twitter, S3, Storm, WWW •ETC

Major Features: 9.2 •Index only scans •Native JSON datatype •Native
Range types

Major Features: 9.3 •Custom Background Workers •Native Materialized Views •LATERAL
JOIN() •Native JSON Operators •Recursive View Support •Automatic Updatable Views

Also noteworthy .. •hstore for Key/Value pairs like NoSQL •Custom
Data Types •Composite Types •Partial Indices •Namespaces •Table Partitioning via Views/Triggers

The Elephant in the Room

The Elephant in the Room

More Decks by Brad Lhotsky

Other Decks in Technology

Featured

Transcript