Slide 1

Slide 1 text

Bringing PostgreSQL closer to your Application Layer 1

Slide 2

Slide 2 text

I used to work at an Investment Bank 2

Slide 3

Slide 3 text

...definitely not this! 3

Slide 4

Slide 4 text

...more like this 4

Slide 5

Slide 5 text

...specifically 5

Slide 6

Slide 6 text

Hi I’m Huss Co-founder & CTO @ 9fin Investment Banking Technology ● Equities Trading ● Bespoke Trader Desk tools ● Risk and Control systems Aeronautical Engineering ● Lots of aeroplane stuff ● .. But no Databases! :( 9fin.com @9finHQ angel.co/9fin 6 @HUSSTECH

Slide 7

Slide 7 text

Good choice! Where did it go? DBAs and Devs!? Never the twain shall meet 7 Bringing PostgreSQL closer to your Application Layer

Slide 8

Slide 8 text

Software Archeology ● BigCo EE: Silo groups with big databases ● “Fat Client” architecture ● 3 tier relatively recent ○ Typically endpoints on enterprise message bus ○ SOAP! 8

Slide 9

Slide 9 text

BigCo Org Structure ● Hierarchical ● Long communication and decision chains ● “shared services” idea emerges 2008+ ● DBAs and Sysadmins moved here ○ Away from business aligned teams ● Now hard to diagnose issues quickly ○ Everyone is not sitting together anymore 9

Slide 10

Slide 10 text

WebCo 2.0 ● Flatter structure - mixed teams, practitioners of many technologies ● Databases are out-of-the-box “good enough” ○ Heroku, RDS, et al ● The DBA function merged into Sr. Devs usually ● Explosion in dynamic languages with rich tooling for web development 10

Slide 11

Slide 11 text

Object Relational Mapping ● Model your entities using an ORM library for your application ● Takes care of data retrieval and persistence ● Great for bootstrapping ○ Focus on app code not storage ○ Strong control of your business objects ● Rich libraries in many languages 11

Slide 12

Slide 12 text

That awkward moment when you need SQL ● At some point you’re going to want to take control of the query generation ○ Performance tune ○ Force an index use ○ Use advanced DB features ■ WITH RECURSIVE, Subqueries/CTEs, PARTITION OVER, JSON ● Raw SQL support in ORMs varies Example: Cumulative total of revenue made at a fictional DVD rental business? 12 from flask import Flask from flask_sqlalchemy import SQLAlchemy from sqlalchemy.sql import func # setup SQLAlchemy and model object app = Flask(__name__) app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://xxx' db = SQLAlchemy(app) class Payment(db.Model): __tablename__ = 'payment' payment_id = db.Column(db.Integer, nullable=False, primary_key=True) rental_id = db.Column(db.SmallInteger, nullable=False) amount = db.Column(db.Numeric(scale=(4, 2), asdecimal=True), nullable=False) payment_date = db.Column(db.DateTime, nullable=False)

Slide 13

Slide 13 text

That awkward moment when you need SQL 13 # set up two query objects one for a subquery one for the window function subquery = db.session.query(func.DATE(Payment.payment_date).label('payment_date'), func.sum(Payment.amount).label('amount') ).group_by(func.DATE(Payment.payment_date))\ .order_by(func.DATE(Payment.payment_date))\ .subquery('s') window = db.session.query(subquery.c.payment_date, subquery.c.amount, func.sum(subquery.c.amount).over( order_by=subquery.c.payment_date) ).select_entity_from(subquery).order_by(subquery.c.payment_date) ● 10+ (readable) lines ● This is quite nice syntax actually ● But the semantics are halfway between Python object orientation and SQL declarative style How to do the same in SQL? SELECT payment_date, amount, sum(amount) OVER (ORDER BY payment_date) FROM ( SELECT CAST(payment_date AS DATE) AS payment_date, SUM(amount) AS amount FROM payment GROUP BY CAST(payment_date AS DATE)) p ORDER BY payment_date;

Slide 14

Slide 14 text

SQLpy - a (not) new idea ● Write SQL *next* to your application code ● No query strings dotted all over source files ● A light wrapper around your DB2.0 API ● Bases on YeSQL by Kris Jenkins 14

Slide 15

Slide 15 text

SQLpy + Models ● Still like linking behaviours and actions via models? ● Use the DAO pattern - define the models how you want ○ Add extra methods to serialize or validate 15 class Hello(object): def __init__(self, queryfunc, cur, args): self._query = queryfunc self._result = self.result(cur, args) @property def result(self): return self._result @result.setter def result(self, cur, args): self._result = self._query(cur, 0, args) def refresh(self, cur, args): return self._query(cur, 0, args) def jsonout(self): return json.dumps(self.result)

Slide 16

Slide 16 text

SQLpy + Functions ● Prefer your app only make calls to stored procedures? ● At least the list of sprocs used is all in one place 16 -- name: cool_proc SELECT public.cool_proc(); -- name: cooler_proc SELECT public.cooler_proc(%s, %s); sql = sqlpy.Queries('queries.sql') .... resultsA = sql.COOL_PROC(cur, 0) resultsB = sql.COOLER_PROC(cur, 0, args)

Slide 17

Slide 17 text

SQLpy + Composition ● Constructing SQL queries is legitimately dangerous! ● How to do this in SQLpy, but not destroy the valid SQL syntax in queries file? ● BUILT SQL ○ Clauses added based on parameters passed to wrapping function ○ Transparent and lightweight ● Query still passes through the DB2.0 API sanitiser for safe execution 17

Slide 18

Slide 18 text

SQLpy + Composition ● Constructing SQL queries is legitimately dangerous! ● How to do this in SQLpy, but not destroy the valid SQL syntax in queries file? ● BUILT SQL ○ Clauses added based on parameters passed to wrapping function ○ Transparent and lightweight ● Query still passes through the DB2.0 API sanitiser for safe execution 18 -- name: built_sql_statement$ -- a built up sql statement SELECT * FROM hello WHERE 1=1 AND id = %(id_low)s OR id = %(id_high)s AND message = %(msg)s; sql = sqlpy.Queries('queries.sql') .... kwargs = {'id_low': 1} results = sql.BUILT_SQL_STATEMENT(cur, 0, **kwargs) SELECT * FROM hello WHERE 1=1 AND id = 1;

Slide 19

Slide 19 text

SQLpy + Your SQL ● Essentially this is the one central idea behind SQLpy ● Let’s play with PostgreSQL ○ Correlate pg_stat_statements with the queries in SQLpy ○ Because you know what was logged (pg_stat_statements) and what was run (sqlpy) ○ See how your database is actually being used ○ Critical data for indexing and optimization decisions 19 SQLpy pg_stat_statements REGEX All the query context!

Slide 20

Slide 20 text

sqlpy.readthedocs.io ● github.com/9fin/sqlpy (MIT) ● Made for PostgreSQL first ○ Other DBs working in theory, but not tested! ● Still

Slide 21

Slide 21 text

Thank You! References and Credits [slide#] [3] image: By Raiffeisenverband Salzburg reg. Gen. m. b. H., Schwarzstr. 13-15, 5024 Salzburg - Raiffeisenverband Salzburg reg. Gen. m. b. H., Schwarzstr. 13-15, 5024 Salzburg (transferred from de:Image:RVS Handelsraum.jpg), CC BY 2.0 at, https://commons.wikimedia.org/w/index.php?curid=5009720 [4] image: CC0 license [5] image: An unnamed colleague 21