Slide 1

Slide 1 text

Oracle to PostgreSQL Migration Tutorial Alvaro Hernandez

Slide 2

Slide 2 text

ALVARO HERNANDEZ @ahachete DBA and Java Software Developer OnGres and ToroDB Founder PostgreSQL Community active member Database Frequent Speaker

Slide 3

Slide 3 text

About OnGres IT firm specialized on R&D on Databases, more specifically PostgreSQL: ‣ Training ‣ Consulting and development ‣ PostgreSQL Support Developers of ToroDB (www.torodb.com), an open-source, document-store database that works on top of PostgreSQL and is compatible with MongoDB. Partners of www.pythian.com, reference multinational company providing database support and data services. Oracle to PostgreSQL Migration

Slide 4

Slide 4 text

Tutorial Agenda Why migrate to PostgreSQL? Migration assessment. Perf evaluation. Showstoppers DBA differences SQL differences Schema migration Data migration. Offline and online PL/SQL to plpgsql migration Real use-case: BBM 150M+ users live migration Oracle to PostgreSQL Migration

Slide 5

Slide 5 text

Why migrate to PostgreSQL? Cost, cost, cost Licensing terms (“men in black”) Don’t want to store core data on proprietary RDBMS Oracle unavailable as-a-service on most clouds PostgreSQL is a natural (best) fit for Oracle migrations: plpgsql, DBA similar ops, advanced SQL support Oracle to PostgreSQL Migration

Slide 6

Slide 6 text

Migration assessment Performance evaluation Showstoppers Oracle to PostgreSQL Migration

Slide 7

Slide 7 text

Migration Assessment Analysis of the Oracle environment. Goals: ✓Understand environment and what Oracle functionalities are being used ✓Identify potential showstoppers ✓Estimate complexity (time, effort) of the migration Start with a questionnaire to gather information about the database, applications and technologies Use a tool + manual processing for object inspection and cost estimation Use AWR to understand performance Oracle to PostgreSQL Migration

Slide 8

Slide 8 text

Ora2pg assessment Oracle to PostgreSQL Migration

Slide 9

Slide 9 text

AWS SCT (Schema Converstion Tool) Oracle to PostgreSQL Migration Java-based, standalone tool

Slide 10

Slide 10 text

AWR (Automatic Workload Report) Oracle to PostgreSQL Migration

Slide 11

Slide 11 text

Showstoppers Transparent data encryption (unless using Fujitsu PG) Applications without source code or that cannot be changed Non-showstoppers: ✓RAC. PostgreSQL does not have master write scalability (except XL), but it’s rarely needed. HA is implemented externally. ✓Exadata. Unless is the top-of-the-line, properly tuned and configured PostgreSQL can handle the load. Oracle to PostgreSQL Migration

Slide 12

Slide 12 text

DBA differences Oracle to PostgreSQL Migration

Slide 13

Slide 13 text

Architectural and tooling differences Externally, both databases are quite different. Migration requires re-training. It’s hard to deploy PostgreSQL without both of: ✓Deep configuration tuning. ✓Connection pooling. Oracle “stack” is Oracle-centric, and very prolific. PostgreSQL “stack” is ecosystem-centric, and often not easy to grasp (which component to pick, and how to make them cooperate well together). And significantly less developed. Oracle to PostgreSQL Migration

Slide 14

Slide 14 text

Miscelanea SGA and shared_buffers are not tuned similarly. REDO logs and max_wal_size are not the same. Don’t create table spaces unless you have different media. You probably don’t need multi-master à la RAC. No, really! Oracle to PostgreSQL Migration

Slide 15

Slide 15 text

SQL differences Oracle to PostgreSQL Migration

Slide 16

Slide 16 text

Similar SQL Both are SQL databases ✓rich set of datatypes ✓strong SQL ✓strong procedural language Modern SQL is almost “similar” obsolete syntax is problem ✓old + outer join syntax ✓use DECODE instead CASE ✓ROWCOUNT Oracle to PostgreSQL Migration

Slide 17

Slide 17 text

Types Oracle generic types (date, number...) should be mapped to Postgres more specific types date, time, timestamp, int, bigint, numeric, double precision… Oracle clobs should not be used - Postgres varlena types have 1GB limit. Oracle and Postgres are type strict databases. However Oracle has some loose implicit conversions, on operations over types date, number, varchar2, that are not type strict. Oracle to PostgreSQL Migration

Slide 18

Slide 18 text

Empty string is NULL ‘’ IS NULL --> true in Oracle (false in Postgres) ‘’ = ‘’ --> false in Oracle (true in Postgres) fx(int) fx(‘’) --> allowed in Oracle (disallowed in Postgres) “Although Oracle treats zero-length character strings as nulls, concatenating a zero-length character string with another operand always results in the other operand, so null can result only from the concatenation of two null strings. However, this may not continue to be true in future versions of Oracle Database.” Oracle to PostgreSQL Migration

Slide 19

Slide 19 text

Optimizer The result of optimizer is similar, but the default configuration of optimizers in PG and Oracle are different. Oracle optimizer is “slower” - searching bigger space, but has a plan cache. PostgreSQL optimizer is limited (see from_collapse_limit and join_collapse_limit) - so it is faster (and can work without transparent plan cache). Plan caching via prepared statements should be used. Oracle to PostgreSQL Migration

Slide 20

Slide 20 text

Built-in functions and libraries Postgres has large set of built-in functions, but it cannot be compared with the large set of Oracle’s packages. Instead use a extension or move some functionality to application (communication) server. Database should not to touch directly to internet services, send emails… Orafce - can emulate lot of base functions, types, packages. Suppported by Ora2pg. Not designed to do 100% compatibility. Oracle to PostgreSQL Migration

Slide 21

Slide 21 text

Case Oracle folds all identifiers to UPPERCASE. PostgreSQL folds them to lowercase. PostgreSQL allows for quoted “UPPERCASE” identifiers. But it is not a good idea to carry this oracleism after porting the application, unless you really need to. Oracle to PostgreSQL Migration

Slide 22

Slide 22 text

Indexes There’s no 1:1 mapping between indexes in both databases. Also PostgreSQL does not support IOTs. Bitmap indexes are implicit. PostgreSQL creates implicit indexes on UNIQUE (and obviously PK) fields, do not duplicate! You might require to use FTS or specialized indexes for searches on some text fields: ✓… USING gin (xxx gin_trgrm_ops) ✓Consider citext (can be used in expression index) PostgreSQL has many more index types! ✓B-tree, Hash, BRIN, GIN, GIST, SP-GIST, Oracle to PostgreSQL Migration

Slide 23

Slide 23 text

Miscelanea DUAL table does not exist –easy to replace (remove). SYNONYMS do not exist. Easy to replace with schemas + search_path or views (in public). Database link can be replaced by either: ✓postgres_fdw ✓Schema within the same database ROWNUM: normally a bad practice (try to replace). Otherwise you may use window functions. CONNECT BY is basically CTE “WITH RECURSIVE”. Oracle to PostgreSQL Migration

Slide 24

Slide 24 text

Schema and Offline data migration Oracle to PostgreSQL Migration

Slide 25

Slide 25 text

Migration strategy One time migration - after migration the development will be continued on Postgres only. Continual (repeated) migration - after migration the development continue on Oracle, but the cost of migration to postgres is minimized. Both strategies are possible. Probably there are not good tools for migration from Postgres to Oracle (do we care?). Oracle to PostgreSQL Migration

Slide 26

Slide 26 text

The basics Start by oracle_fdw. Let’s you easily import schema and start exploring the data. For the migration, basically use ora2pg. It’s quite good and complete tool: ✓Schema, views, mat views, functions, triggers… ✓Transformation can be to native Postgres or to Orafce Schema import requires post-processing. Don’t do it manually (non-reproducible). Use instead ora2pg flags to adapt your desired conversion rules, effectively automating the whole process. Oracle to PostgreSQL Migration

Slide 27

Slide 27 text

ORA-01555 Snapshot Too Old Your enemy. Oracle’s capability to keep a consistent snapshot of the data is limited by REDO space. If your data copy is not fast enough and the database is active, you may encounter ORA-01555. You need to restart from scratch. Tuning the initial data dump is critical. You may require careful planning of (and ora2pg supports both of): ✓Table parallelism. Dump several tables in parallel. May not be enough (one big table dominates). ✓Intra-table parallelism. Several processes dumping the same table (will result in not physically ordered table in PG). Oracle to PostgreSQL Migration

Slide 28

Slide 28 text

Live data migration Oracle to PostgreSQL Migration

Slide 29

Slide 29 text

Live data migration No downtime requirement discards dump/restore solutions (like ora2pg) ETL is not an option either: requires created_at, updated_at fields; works in batches; requires expensive extra indexes or slow scans A CDC (Change Data Capture) solution is required that supports Oracle as source and PostgreSQL as destination Oracle to PostgreSQL Migration

Slide 30

Slide 30 text

CDC: Change Data Capture Select an arbitrary point in the (source) database timeline (SCN in Oracle, LSN in PG) Phase 1: data load. Obtain a snapshot of the database at the selected SCN. Copy all the data from all tables to destination. Phase 2. Deltas. Database (or external software) mines the logs (changes to the database) happened since that SCN and extracts them to the destination database (as SQL or equivalent) Oracle to PostgreSQL Migration

Slide 31

Slide 31 text

DDL changes for CDC DDL cannot be changed while replication happens. However: Before initial load: DROP constraints, FKs, indexes to speed up load. After initial load: re-create all the missing DDL (indexes may be created concurrently). Create a temporal tombstone table to signal the end of data copy. Oracle to PostgreSQL Migration

Slide 32

Slide 32 text

Sequence management Sequences are not propagated. During cut-over, once tombstone is propagated, query sequence state in Oracle and apply to sequences in PostgreSQL. Needs to be scripted and fully automated. Optional: increment sequence numbers to a rounded number, to identify when migration happened. Oracle to PostgreSQL Migration

Slide 33

Slide 33 text

Oracle Golden Gate Supporting Oracle to PostgreSQL CDC Oracle to PostgreSQL Migration

Slide 34

Slide 34 text

Golden Gate Implements CDC pattern, supports many srcs and destinations Wait, Oracle sw to migrate off of Oracle? $17,500 / core. Typically only acceptable if there’s site-wide license Oracle to PostgreSQL Migration

Slide 35

Slide 35 text

GoldenGate (GG) gotchas PITA to get it working. Took experienced GG admin 2 full- time weeks to get this setup working. Documentation output → 12 pages of commands and config files. Oracle to PostgreSQL use case poorly documented (if at all). Blogs/posts almost all do batch uploads or only deltas, but not real-time migration. Uses ODBC (by Progress), which brings its own gotchas (e.g.: timestamp(6) to timestamp to make it work). Oracle to PostgreSQL Migration

Slide 36

Slide 36 text

GoldenGate (GG) gotchas (II) Synchronization point (between initial data load and CDC) is not enforced. Easy to overlap or miss data. Failure means mostly retry, which requires a lot of manual commands. Ended up fully scripting and automating GG operations. Poor debugging. Key to make it work is a hack: Add (where SCN == CSN)“, sqlpredicate 'AS OF SCN XXXXXXXX';”to table copy definitions config to do initial snapshot. Oracle to PostgreSQL Migration

Slide 37

Slide 37 text

GG configuration snippet examples Oracle to PostgreSQL Migration extract live userid $user password $password rmthost localhost, mgrport 7810 mttrail ./dirdat/ac table accountservice.account; extract load userid $user password $password rmthost localhost, mgrport 7810 rmttask replicat, group load table schema.table, sqlpredicate 'AS OF SCN XXXXXXXX’;

Slide 38

Slide 38 text

Other CDC software Amazon DMS: ✓Not all stories about it are good ✓Only valid if either source or destination are AWS DbVisit. Only very recent PostgreSQL Support. Uses ora2pg internally Attunity. Build your own? ✓(Simpler) Use the LogMiner interface ✓(Harder) Mine Oracle REDO and archive logs Oracle to PostgreSQL Migration

Slide 39

Slide 39 text

Rollback mechanism: PostgreSQL’s CDC What if after cut-over migration fails (i.e. PostgreSQL is not performant enough, application bugs, etc) but writes are already happening in PostgreSQL? A “rollback” (switchback) mechanism is required. PostgreSQL’s logical decoding can be used. raw_decoding plugin. Output to file. If rollback, execute the changes via sqlplus back into Oracle. Required code to adjust timestamp and other data types syntax. Oracle to PostgreSQL Migration

Slide 40

Slide 40 text

PL/SQL to plpgsql migration Oracle to PostgreSQL Migration

Slide 41

Slide 41 text

Procedural languages PL/SQL is ADA + SQL plpgsql is reduced PL/SQL They look similar – but totally different implementation PL/SQL is much more like client code executed on server - clear separation between procedural runtime and SQL engine plpgsql is 100% integrated to SQL engine Oracle to PostgreSQL Migration

Slide 42

Slide 42 text

plpgsql Shares process with SQL engine. Shares datatypes with SQL engine. Every plpgsql expression is SELECT. Protected blocks are implemented with subtransactions of SQL engine. Oracle to PostgreSQL Migration

Slide 43

Slide 43 text

Plpgsql architecture consequences Faster data processing - no inter-process communication, no data type conversion Slower intensive numeric, string calculation - no optimizations Exception are more expensive in plpgsql - implemented with subtransactions Up until PostgreSQL 11 (and only with procedures) it was not possible to use subtransactions (ROLLBACKS, COMMITS were not allowed). Oracle to PostgreSQL Migration

Slide 44

Slide 44 text

Functions PostgreSQL default “volatile” functions can be used like procedures - data can be modified there. Oracle default function is stable function in PG. Oracle deterministic function is immutable function in PG. Oracle to PostgreSQL Migration

Slide 45

Slide 45 text

Plpgsql has no packages There are no a global (session) variables. It is possible to emulate via GUC. Schema in Postgres are different than schema in Oracle. This can be leveraged to emulate packages in Postgres (see Orafce). PostgreSQL schema ≠ Oracle schema PostgreSQL schema ~ Oracle packages ✓Not related to the user ✓Not related to storage area Oracle to PostgreSQL Migration

Slide 46

Slide 46 text

Tales from the trenches well written code is more portable datevar := '1.1.' || TO_CHAR(sysdate, 'YYYY'); (bad) datevar := trunc(sysdate, 'YEAR'); (good) intvar := TO_CHAR(sysdate, ‘MM');(bad) intvar := EXTRACT(MONTH FROM sysdate); (good) Postgres don’t like implicit text<->int conversions Oracle to PostgreSQL Migration

Slide 47

Slide 47 text

How to migrate It normally requires manual intervention and dominates migration costs if LOCs are high. Proportional to LOCs and complexity of the code (assessment). Create a set of unit tests (plpgunit) to reproduce actual Oracle behavior (even with bugs!) before porting the code. Ora2pg may help here – but it’s limited. Use plpgsql_check to verify the generated code. Oracle to PostgreSQL Migration

Slide 48

Slide 48 text

plpgsql_check https://github.com/okbob/plpgsql_check Two modes: Passive. Checks procedures before start. Requires to load extension. Active. Call plpgsql_check_function() Oracle to PostgreSQL Migration

Slide 49

Slide 49 text

Real use case: BBM 150M+ users migration with near-zero downtime Oracle to PostgreSQL Migration

Slide 50

Slide 50 text

How we Live Migrated Millions of BBM Users & its Infrastructure Across the Pacific Mohan Krishnan (Emtek CTO)Google Cloud Next '18 https://www.youtube.com/watch?v=QsvSr3kGxnQ

Slide 51

Slide 51 text

BBM: not your father’s BBM No longer depending on BlackBerry Android & iOS Modernized, being constantly improved Similar to WeChat: a platform, including payments 150M+ users Oracle to PostgreSQL Migration

Slide 52

Slide 52 text

Some facts about BBM’s platform ~ 6.000 servers Previous infra running on-prem in Canada. Current in GCP in Asia region Oracle/PostgreSQL core part of the infrastructure: id (user login and auth services) DB size some TB, high trafficked, OLTP pattern Cassandra used for user messages (+400 nodes) Oracle to PostgreSQL Migration

Slide 53

Slide 53 text

BBM migration requirements No downtime (near-zero downtime) No data loss (obviously). Data verified/checksummed Across the Pacific (200-250ms latency) Long testing period with real datall PostgreSQL stack architecture on GCP:HA, monitoring, backups, DR, configuration tuning.Fully automated deployment Full PostgreSQL stack architecture on GCP:HA, monitoring, backups, DR, configuration tuning.Fully automated deployment Oracle to PostgreSQL Migration

Slide 54

Slide 54 text

Migration steps DDL migration PL/SQL stored procedures (just a few dozen, direct translation) PostgreSQL architecture on GCP Live data migration Rollback mechanism Testing and production cut-o Oracle to PostgreSQL Migration

Slide 55

Slide 55 text

DDL migration Main application was using Hibernate, so application needed few changes Used ora2pg for DDL migration However, required performance-sensitive changes: number in Oracle is a numeric in PostgreSQL. But it was really an integer. Timestamps are painful (well, always!) Oracle to PostgreSQL Migration

Slide 56

Slide 56 text

DDL migration (II) Carefully adjust users and permissions Data constraints, default values from sequences could be refactored from triggers to normal constraints Cleanup time! (remove unused indexes…) Ensure all tables have a PK (required for live data migration) Oracle to PostgreSQL Migration

Slide 57

Slide 57 text

DDL migration (III): partitioning Data in Oracle was not partitioned. Wanted to use the migration to partition Partitioning code and functions/triggers created (9.6, no native partitioning as in 10/11) Transparent to data migration as long as schema is compatible between source/destination Data clean up Oracle to PostgreSQL Migration

Slide 58

Slide 58 text

PostgreSQL Architecture on GCP Unmanaged: GCE instances, not CloudSQL Fully automated deployment with Ansible (+shell, gcloud API for deployment). Env tested with Vagrant Heavily tuned config file(learn more: PostgreSQL Configuration for Humans) PgBouncer for connection pooling Patroni for HA. Consul for DCS. WAL-e for backup and DR, DataDog for monitoring Oracle to PostgreSQL Migration

Slide 59

Slide 59 text

PostgreSQL Architecture on GCP (II) Oracle to PostgreSQL Migration

Slide 60

Slide 60 text

Migration architecture Oracle to PostgreSQL Migration

Slide 61

Slide 61 text

Testing and production cut-over Live migration was running cross-DC (Canada to Asia) for more than 2 months. Re-started frequently for application testing on PG on destination. Cut-over fully scripted, tested 3-5 times and timed. Database-contributed downtime: 5 mins real, another 40 for data checksumming. Oracle to PostgreSQL Migration

Slide 62

Slide 62 text

1177 Avenue of the Americas, Suite 500 New York, 10036, NY United States of America +1 (646) 452 7168 [email protected] Carretera de Fuencarral, 44, Edificio 4B, Loft 33 Alcobendas, 28108, MD España +34 918 675 554 [email protected] www.ongres.com