Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Playing Nice with Your Friends: Database Diversity with Postgres FDWs

Playing Nice with Your Friends: Database Diversity with Postgres FDWs

You want to use Postgres. Your data lives outside of Postgres, and you're not ready to migrate it. Or, you want to leverage performance features of other databases like Redis, so you can't migrate it to Postgres just yet. In either case, you can still use Postgres! Foreign Data Wrappers (FDWs) give you the ability to query external data from within Postgres, just as if it were a regular Postgres table. This talk will discuss all about FDWs and how you can leverage this feature for performance and stability.

As presented at PGDay Chicago 2023-04-20

Richard Yen

April 20, 2023
Tweet

More Decks by Richard Yen

Other Decks in Technology

Transcript

  1. © EnterpriseDB Corporation 2023 - All Rights Reserved
    Playing Nice with
    Your Friends
    PGDay Chicago


    2023.04.20


    Richard Yen
    Database Diversity with Postgres FDWs
    1

    View Slide

  2. © EnterpriseDB Corporation 2023 - All Rights Reserved
    2
    ● Software Developer/Support Engineer @ EDB since 2015


    ● Previously a DBA and Web Developer


    ● Been using PostgreSQL since v. 7.4
    About Me

    View Slide

  3. © EnterpriseDB Corporation 2023 - All Rights Reserved
    3
    ● Not ready to go 100% PostgreSQL


    ● 100% PostgreSQL is not working out


    ● Want to leverage features in other engines


    ● Organizational structure does not make 100% PostgreSQL feasible


    ● Don't want to put all your eggs into one basket


    ● Want to leverage more hardware
    Is this you?

    View Slide

  4. © EnterpriseDB Corporation 2023 - All Rights Reserved
    4
    ● What are Foreign Data Wrappers?


    ● Varieties and use cases for FDWs


    ● Setting up FDWs


    ● Demo


    ● Things to consider when implementing FDWs into your infrastructure


    ● How FDWs help achieve diversity


    ● Creating your own FDW


    ● Demo
    Our Roadmap

    View Slide

  5. © EnterpriseDB Corporation 2023 - All Rights Reserved
    5
    ● Extensions that allow access to data outside of a PostgreSQL database
    cluster


    ● Could be another PostgreSQL database


    ● Another RDBMS/NoSQL database


    ● Flat files, REST APIs, etc.


    ● Data from the external data source is represented as a table (called a foreign
    table)


    ● Foreign tables can be used in SQL queries just like ordinary tables


    ● In many cases, foreign tables can receive DML
    What are Foreign Data Wrappers?

    View Slide

  6. © EnterpriseDB Corporation 2023 - All Rights Reserved
    6
    ● Oracle DBLink


    ● PG 7.2 implemented its own version of DBLink


    ● Access to data was not very intuitive (used dblink-specific functions)
    A bit of history
    postgres=# create view foreign_view as select * from dblink('host=pg1 user=postgres dbname=postgres', 'select
    aid,bid,abalance,
    fi
    ller from pgbench_accounts') as t1 (aid int,bid int,abalance int,
    fi
    ller text);


    CREATE VIEW


    postgres=# select * from foreign_view where aid = 1;


    aid | bid | abalance |
    fi
    ller


    -----+-----+----------+--------------------------------------------------------------------------------------


    1 | 1 | 3 |


    (1 row)
    postgres=# select * from dblink('host=pg1 user=postgres dbname=postgres', 'select aid,bid,abalance,
    fi
    ller
    from pgbench_accounts') as t1 (aid int,bid int,abalance int,
    fi
    ller text) where aid = 1;


    aid | bid | abalance |
    fi
    ller


    -----+-----+----------+--------------------------------------------------------------------------------------


    1 | 1 | 3 |


    (1 row)

    View Slide

  7. © EnterpriseDB Corporation 2023 - All Rights Reserved
    7
    ● Oracle DBLink


    ● PG 7.2 implemented its own version of DBLink


    ● Access to data was not very intuitive (used dblink-specific functions)


    ● SQL/MED (SQL Management of External Data) standard introduced in 2003


    ● PG 9.1 introduced file_fdw


    ● PG 9.3 introduced postgres_fdw


    ● Very few database engines implement SQL/MED (and Oracle is not one of
    them!)


    ● Most other database engines don't support as many external sources as PG
    A bit of history

    View Slide

  8. © EnterpriseDB Corporation 2023 - All Rights Reserved
    8
    FDW is Smarter and Faster
    postgres=# explain (analyze, verbose) select * from dblink('host=pg1 user=postgres dbname=postgres', 'select
    aid,bid,abalance,
    fi
    ller from pgbench_accounts') as t1 (aid int,bid int,abalance int,
    fi
    ller text) where aid = 1;


    QUERY PLAN


    ----------------------------------------------------------------------------------------------


    Function Scan on public.dblink t1 (cost=0.00..12.50 rows=5 width=44) (actual time=276.996..315.074 rows=1 loops=1)


    Output: aid, bid, abalance,
    fi
    ller


    Function Call: dblink('host=pg1 user=postgres dbname=postgres'::text, 'select aid,bid,abalance,
    fi
    ller from
    pgbench_accounts'::text)


    Filter: (t1.aid = 1)


    Rows Removed by Filter: 99999


    (...)


    Time: 319.799 ms


    postgres=# explain (analyze, verbose) select * from foreign_table where aid = 1;


    QUERY PLAN


    -----------------------------------------------------------------------------------------------------


    Foreign Scan on public.foreign_table (cost=100.00..4629.02 rows=1 width=97) (actual time=1.220..1.222 rows=1 loops=1)


    Output: aid, bid, abalance,
    fi
    ller


    Remote SQL: SELECT aid, bid, abalance,
    fi
    ller FROM public.pgbench_accounts WHERE ((aid = 1))


    (...)


    Time: 4.259 ms

    View Slide

  9. © EnterpriseDB Corporation 2023 - All Rights Reserved
    9
    FDW Today

    View Slide

  10. © EnterpriseDB Corporation 2023 - All Rights Reserved
    10
    ● Install the extension


    ● Create a foreign server


    ● Create a user mapping (if necessary)


    ● Create foreign tables (or import foreign schema)
    Getting Started with FDW
    postgres=# CREATE EXTENSION postgres_fdw;


    CREATE EXTENSION


    postgres=# CREATE SERVER foreign_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'pg1', port '5432', dbname 'postgres');


    CREATE SERVER


    postgres=# CREATE USER MAPPING FOR postgres SERVER foreign_server OPTIONS (user 'postgres', password 'password');


    CREATE USER MAPPING


    postgres=# CREATE FOREIGN TABLE foreign_table (aid integer NOT NULL, bid integer, abalance integer,
    fi
    ller text) SERVER foreign_server
    OPTIONS (schema_name 'public', table_name 'pgbench_accounts');


    CREATE FOREIGN TABLE


    -- Another option for creating foreign tables


    postgres=# IMPORT FOREIGN SCHEMA public FROM SERVER foreign_server INTO public;


    IMPORT FOREIGN SCHEMA

    View Slide

  11. © EnterpriseDB Corporation 2023 - All Rights Reserved
    11
    ● Useful commands:


    ● \des -- describe external (foreign) servers


    ● \deu -- describe eternal user mappings


    ● \det -- describe external tables
    Getting Started with FDW

    View Slide

  12. © EnterpriseDB Corporation 2023 - All Rights Reserved
    12
    Demo

    View Slide

  13. © EnterpriseDB Corporation 2023 - All Rights Reserved
    13
    Tips
    ● Be mindful of what gets pushed down to the remote server
    postgres=# explain (verbose, analyze) select * from foreign_table ft join pgbench_accounts a on a.aid=ft.aid
    where a.abalance = 10 or ft.abalance = 9;


    QUERY PLAN
    ------------------------------------------------------------------------------------------------------------
    --------------


    Hash Join (cost=7192.00..18422.50 rows=14243 width=194) (actual time=109.947..963.111 rows=14502 loops=1)


    (...)


    -> Foreign Scan on public.foreign_table ft (cost=100.00..6379.00 rows=100000 width=97) (actual
    time=3.742..612.295 rows=100000 loops=1)


    Output: ft.aid, ft.bid, ft.abalance, ft.
    fi
    ller


    Remote SQL: SELECT aid, bid, abalance,
    fi
    ller FROM public.pgbench_accounts


    -> Hash (cost=4279.00..4279.00 rows=100000 width=97) (actual time=105.742..105.744 rows=100000 loops=1)


    (...)


    Planning Time: 0.402 ms


    Execution Time: 966.560 ms


    (16 rows)

    View Slide

  14. © EnterpriseDB Corporation 2023 - All Rights Reserved
    14
    Tips
    ● Be mindful of what gets pushed down to the remote server
    postgres=# explain (verbose, analyze) select * from foreign_table ft join pgbench_accounts a on
    a.aid=ft.aid where ft.abalance = 10 or ft.abalance = 9;


    QUERY PLAN
    ------------------------------------------------------------------------------------------------------
    -------------


    Hash Join (cost=5352.20..10151.80 rows=14560 width=194) (actual time=130.925..193.293 rows=15156
    loops=1)


    (...)


    -> Hash (cost=5170.20..5170.20 rows=14560 width=97) (actual time=128.999..129.003 rows=15156
    loops=1)


    Output: ft.aid, ft.bid, ft.abalance, ft.
    fi
    ller


    Buckets: 16384 Batches: 1 Memory Usage: 2038kB


    -> Foreign Scan on public.foreign_table ft (cost=100.00..5170.20 rows=14560 width=97)
    (actual time=3.912..123.474 rows=15156 loops=1)


    Output: ft.aid, ft.bid, ft.abalance, ft.
    fi
    ller


    Remote SQL: SELECT aid, bid, abalance,
    fi
    ller FROM public.pgbench_accounts WHERE
    (((abalance = 10) OR (abalance = 9)))


    Planning Time: 0.437 ms


    Execution Time: 196.240 ms


    (13 rows)

    View Slide

  15. © EnterpriseDB Corporation 2023 - All Rights Reserved
    15
    Tips
    ● Network will always be your biggest bottleneck, so minimize the amount of
    data sent from the remote server


    ● Performance can depend on foreign server's hardware


    ● DDL changes are not automatically propagated


    ● Upgrades need more care and attention, in case of any incompatibilities


    ● User mappings require thoughtfulness regarding security


    ● Backups of database do not include data in foreign tables
    TL;DR - remember that you're no longer dealing
    with just one machine, but two or more

    View Slide

  16. © EnterpriseDB Corporation 2023 - All Rights Reserved
    16
    Data Diversity
    ● Because FDWs allow you to access external data for 50+ sources, you can
    leverage PostgreSQL to connect two other sources


    ● Good for ETL Pipelines


    ● Good for varied infrastructure


    ● Prevents having to migrate data from one place to another


    ● Demo: MySQL + SQLite

    View Slide

  17. © EnterpriseDB Corporation 2023 - All Rights Reserved
    17
    Building your own FDW
    ● Native


    ● Requires C programming knowledge


    ● See the PostgreSQL docs


    ● Multicorn (https://multicorn.org/)


    ● Wrapper interface for Python apps


    ● Not actively developed, but still works


    ● FDW APIs exist for other languages (i.e., Holycorn for Ruby")

    View Slide

  18. © EnterpriseDB Corporation 2023 - All Rights Reserved
    18
    Building your own FDW

    View Slide

  19. © EnterpriseDB Corporation 2023 - All Rights Reserved
    19
    Demo

    View Slide

  20. © EnterpriseDB Corporation 2023 - All Rights Reserved
    20
    THANK YOU
    6
    Enjoy your stay at PGDay Chicago!

    View Slide