Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Oracle to PostgreSQL Migration Tutorial

OnGres
February 04, 2019

Oracle to PostgreSQL Migration Tutorial

Stop paying for Larry’s boat!

PostgreSQL is a fully-featured, enterprise-grade and open source database, named two years in a row database of the year! And it’s also the best candidate to migrate off of Oracle, supporting very advanced SQL, easy administration and pl/pgsql, a replacement for Oracle’s PL/SQL. Stop spending hundreds of thousands or millions of $ on database licenses and reinvest them in your team, infrastructure and software.

Join this half-day tutorial to learn the best practices, tricks and tools to perform a successful Oracle to PostgreSQL migration. Learn from the team that performed 150M users Blackberry migration to PostgreSQL on GCP with near-zero downtime, and the key author behind orafce, a set of Oracle compatible functions on top of PostgreSQL.

This talk was presented at PostgreSQL Conference Russia, 2019: https://pgconf.ru/en/2019/248906

OnGres

February 04, 2019
Tweet

More Decks by OnGres

Other Decks in Technology

Transcript

  1. ALVARO HERNANDEZ @ahachete DBA and Java Software Developer OnGres and

    ToroDB Founder PostgreSQL Community active member Database Frequent Speaker
  2. About OnGres IT firm specialized on R&D on Databases, more

    specifically PostgreSQL: ‣ Training ‣ Consulting and development ‣ PostgreSQL Support Developers of ToroDB (www.torodb.com), an open-source, document-store database that works on top of PostgreSQL and is compatible with MongoDB. Partners of www.pythian.com, reference multinational company providing database support and data services. Oracle to PostgreSQL Migration
  3. Tutorial Agenda Why migrate to PostgreSQL? Migration assessment. Perf evaluation.

    Showstoppers DBA differences SQL differences Schema migration Data migration. Offline and online PL/SQL to plpgsql migration Real use-case: BBM 150M+ users live migration Oracle to PostgreSQL Migration
  4. Why migrate to PostgreSQL? Cost, cost, cost Licensing terms (“men

    in black”) Don’t want to store core data on proprietary RDBMS Oracle unavailable as-a-service on most clouds PostgreSQL is a natural (best) fit for Oracle migrations: plpgsql, DBA similar ops, advanced SQL support Oracle to PostgreSQL Migration
  5. Migration Assessment Analysis of the Oracle environment. Goals: ✓Understand environment

    and what Oracle functionalities are being used ✓Identify potential showstoppers ✓Estimate complexity (time, effort) of the migration Start with a questionnaire to gather information about the database, applications and technologies Use a tool + manual processing for object inspection and cost estimation Use AWR to understand performance Oracle to PostgreSQL Migration
  6. Showstoppers Transparent data encryption (unless using Fujitsu PG) Applications without

    source code or that cannot be changed Non-showstoppers: ✓RAC. PostgreSQL does not have master write scalability (except XL), but it’s rarely needed. HA is implemented externally. ✓Exadata. Unless is the top-of-the-line, properly tuned and configured PostgreSQL can handle the load. Oracle to PostgreSQL Migration
  7. Architectural and tooling differences Externally, both databases are quite different.

    Migration requires re-training. It’s hard to deploy PostgreSQL without both of: ✓Deep configuration tuning. ✓Connection pooling. Oracle “stack” is Oracle-centric, and very prolific. PostgreSQL “stack” is ecosystem-centric, and often not easy to grasp (which component to pick, and how to make them cooperate well together). And significantly less developed. Oracle to PostgreSQL Migration
  8. Miscelanea SGA and shared_buffers are not tuned similarly. REDO logs

    and max_wal_size are not the same. Don’t create table spaces unless you have different media. You probably don’t need multi-master à la RAC. No, really! Oracle to PostgreSQL Migration
  9. Similar SQL Both are SQL databases ✓rich set of datatypes

    ✓strong SQL ✓strong procedural language Modern SQL is almost “similar” obsolete syntax is problem ✓old + outer join syntax ✓use DECODE instead CASE ✓ROWCOUNT Oracle to PostgreSQL Migration
  10. Types Oracle generic types (date, number...) should be mapped to

    Postgres more specific types date, time, timestamp, int, bigint, numeric, double precision… Oracle clobs should not be used - Postgres varlena types have 1GB limit. Oracle and Postgres are type strict databases. However Oracle has some loose implicit conversions, on operations over types date, number, varchar2, that are not type strict. Oracle to PostgreSQL Migration
  11. Empty string is NULL ‘’ IS NULL --> true in

    Oracle (false in Postgres) ‘’ = ‘’ --> false in Oracle (true in Postgres) fx(int) fx(‘’) --> allowed in Oracle (disallowed in Postgres) “Although Oracle treats zero-length character strings as nulls, concatenating a zero-length character string with another operand always results in the other operand, so null can result only from the concatenation of two null strings. However, this may not continue to be true in future versions of Oracle Database.” Oracle to PostgreSQL Migration
  12. Optimizer The result of optimizer is similar, but the default

    configuration of optimizers in PG and Oracle are different. Oracle optimizer is “slower” - searching bigger space, but has a plan cache. PostgreSQL optimizer is limited (see from_collapse_limit and join_collapse_limit) - so it is faster (and can work without transparent plan cache). Plan caching via prepared statements should be used. Oracle to PostgreSQL Migration
  13. Built-in functions and libraries Postgres has large set of built-in

    functions, but it cannot be compared with the large set of Oracle’s packages. Instead use a extension or move some functionality to application (communication) server. Database should not to touch directly to internet services, send emails… Orafce - can emulate lot of base functions, types, packages. Suppported by Ora2pg. Not designed to do 100% compatibility. Oracle to PostgreSQL Migration
  14. Case Oracle folds all identifiers to UPPERCASE. PostgreSQL folds them

    to lowercase. PostgreSQL allows for quoted “UPPERCASE” identifiers. But it is not a good idea to carry this oracleism after porting the application, unless you really need to. Oracle to PostgreSQL Migration
  15. Indexes There’s no 1:1 mapping between indexes in both databases.

    Also PostgreSQL does not support IOTs. Bitmap indexes are implicit. PostgreSQL creates implicit indexes on UNIQUE (and obviously PK) fields, do not duplicate! You might require to use FTS or specialized indexes for searches on some text fields: ✓… USING gin (xxx gin_trgrm_ops) ✓Consider citext (can be used in expression index) PostgreSQL has many more index types! ✓B-tree, Hash, BRIN, GIN, GIST, SP-GIST, <your_own> Oracle to PostgreSQL Migration
  16. Miscelanea DUAL table does not exist –easy to replace (remove).

    SYNONYMS do not exist. Easy to replace with schemas + search_path or views (in public). Database link can be replaced by either: ✓postgres_fdw ✓Schema within the same database ROWNUM: normally a bad practice (try to replace). Otherwise you may use window functions. CONNECT BY is basically CTE “WITH RECURSIVE”. Oracle to PostgreSQL Migration
  17. Migration strategy One time migration - after migration the development

    will be continued on Postgres only. Continual (repeated) migration - after migration the development continue on Oracle, but the cost of migration to postgres is minimized. Both strategies are possible. Probably there are not good tools for migration from Postgres to Oracle (do we care?). Oracle to PostgreSQL Migration
  18. The basics Start by oracle_fdw. Let’s you easily import schema

    and start exploring the data. For the migration, basically use ora2pg. It’s quite good and complete tool: ✓Schema, views, mat views, functions, triggers… ✓Transformation can be to native Postgres or to Orafce Schema import requires post-processing. Don’t do it manually (non-reproducible). Use instead ora2pg flags to adapt your desired conversion rules, effectively automating the whole process. Oracle to PostgreSQL Migration
  19. ORA-01555 Snapshot Too Old Your enemy. Oracle’s capability to keep

    a consistent snapshot of the data is limited by REDO space. If your data copy is not fast enough and the database is active, you may encounter ORA-01555. You need to restart from scratch. Tuning the initial data dump is critical. You may require careful planning of (and ora2pg supports both of): ✓Table parallelism. Dump several tables in parallel. May not be enough (one big table dominates). ✓Intra-table parallelism. Several processes dumping the same table (will result in not physically ordered table in PG). Oracle to PostgreSQL Migration
  20. Live data migration No downtime requirement discards dump/restore solutions (like

    ora2pg) ETL is not an option either: requires created_at, updated_at fields; works in batches; requires expensive extra indexes or slow scans A CDC (Change Data Capture) solution is required that supports Oracle as source and PostgreSQL as destination Oracle to PostgreSQL Migration
  21. CDC: Change Data Capture Select an arbitrary point in the

    (source) database timeline (SCN in Oracle, LSN in PG) Phase 1: data load. Obtain a snapshot of the database at the selected SCN. Copy all the data from all tables to destination. Phase 2. Deltas. Database (or external software) mines the logs (changes to the database) happened since that SCN and extracts them to the destination database (as SQL or equivalent) Oracle to PostgreSQL Migration
  22. DDL changes for CDC DDL cannot be changed while replication

    happens. However: Before initial load: DROP constraints, FKs, indexes to speed up load. After initial load: re-create all the missing DDL (indexes may be created concurrently). Create a temporal tombstone table to signal the end of data copy. Oracle to PostgreSQL Migration
  23. Sequence management Sequences are not propagated. During cut-over, once tombstone

    is propagated, query sequence state in Oracle and apply to sequences in PostgreSQL. Needs to be scripted and fully automated. Optional: increment sequence numbers to a rounded number, to identify when migration happened. Oracle to PostgreSQL Migration
  24. Golden Gate Implements CDC pattern, supports many srcs and destinations

    Wait, Oracle sw to migrate off of Oracle? $17,500 / core. Typically only acceptable if there’s site-wide license Oracle to PostgreSQL Migration
  25. GoldenGate (GG) gotchas PITA to get it working. Took experienced

    GG admin 2 full- time weeks to get this setup working. Documentation output → 12 pages of commands and config files. Oracle to PostgreSQL use case poorly documented (if at all). Blogs/posts almost all do batch uploads or only deltas, but not real-time migration. Uses ODBC (by Progress), which brings its own gotchas (e.g.: timestamp(6) to timestamp to make it work). Oracle to PostgreSQL Migration
  26. GoldenGate (GG) gotchas (II) Synchronization point (between initial data load

    and CDC) is not enforced. Easy to overlap or miss data. Failure means mostly retry, which requires a lot of manual commands. Ended up fully scripting and automating GG operations. Poor debugging. Key to make it work is a hack: Add (where SCN == CSN)“, sqlpredicate 'AS OF SCN XXXXXXXX';”to table copy definitions config to do initial snapshot. Oracle to PostgreSQL Migration
  27. GG configuration snippet examples Oracle to PostgreSQL Migration extract live

    userid $user password $password rmthost localhost, mgrport 7810 mttrail ./dirdat/ac table accountservice.account; extract load userid $user password $password rmthost localhost, mgrport 7810 rmttask replicat, group load table schema.table, sqlpredicate 'AS OF SCN XXXXXXXX’;
  28. Other CDC software Amazon DMS: ✓Not all stories about it

    are good ✓Only valid if either source or destination are AWS DbVisit. Only very recent PostgreSQL Support. Uses ora2pg internally Attunity. Build your own? ✓(Simpler) Use the LogMiner interface ✓(Harder) Mine Oracle REDO and archive logs Oracle to PostgreSQL Migration
  29. Rollback mechanism: PostgreSQL’s CDC What if after cut-over migration fails

    (i.e. PostgreSQL is not performant enough, application bugs, etc) but writes are already happening in PostgreSQL? A “rollback” (switchback) mechanism is required. PostgreSQL’s logical decoding can be used. raw_decoding plugin. Output to file. If rollback, execute the changes via sqlplus back into Oracle. Required code to adjust timestamp and other data types syntax. Oracle to PostgreSQL Migration
  30. Procedural languages PL/SQL is ADA + SQL plpgsql is reduced

    PL/SQL They look similar – but totally different implementation PL/SQL is much more like client code executed on server - clear separation between procedural runtime and SQL engine plpgsql is 100% integrated to SQL engine Oracle to PostgreSQL Migration
  31. plpgsql Shares process with SQL engine. Shares datatypes with SQL

    engine. Every plpgsql expression is SELECT. Protected blocks are implemented with subtransactions of SQL engine. Oracle to PostgreSQL Migration
  32. Plpgsql architecture consequences Faster data processing - no inter-process communication,

    no data type conversion Slower intensive numeric, string calculation - no optimizations Exception are more expensive in plpgsql - implemented with subtransactions Up until PostgreSQL 11 (and only with procedures) it was not possible to use subtransactions (ROLLBACKS, COMMITS were not allowed). Oracle to PostgreSQL Migration
  33. Functions PostgreSQL default “volatile” functions can be used like procedures

    - data can be modified there. Oracle default function is stable function in PG. Oracle deterministic function is immutable function in PG. Oracle to PostgreSQL Migration
  34. Plpgsql has no packages There are no a global (session)

    variables. It is possible to emulate via GUC. Schema in Postgres are different than schema in Oracle. This can be leveraged to emulate packages in Postgres (see Orafce). PostgreSQL schema ≠ Oracle schema PostgreSQL schema ~ Oracle packages ✓Not related to the user ✓Not related to storage area Oracle to PostgreSQL Migration
  35. Tales from the trenches well written code is more portable

    datevar := '1.1.' || TO_CHAR(sysdate, 'YYYY'); (bad) datevar := trunc(sysdate, 'YEAR'); (good) intvar := TO_CHAR(sysdate, ‘MM');(bad) intvar := EXTRACT(MONTH FROM sysdate); (good) Postgres don’t like implicit text<->int conversions Oracle to PostgreSQL Migration
  36. How to migrate It normally requires manual intervention and dominates

    migration costs if LOCs are high. Proportional to LOCs and complexity of the code (assessment). Create a set of unit tests (plpgunit) to reproduce actual Oracle behavior (even with bugs!) before porting the code. Ora2pg may help here – but it’s limited. Use plpgsql_check to verify the generated code. Oracle to PostgreSQL Migration
  37. plpgsql_check https://github.com/okbob/plpgsql_check Two modes: Passive. Checks procedures before start. Requires

    to load extension. Active. Call plpgsql_check_function() Oracle to PostgreSQL Migration
  38. How we Live Migrated Millions of BBM Users & its

    Infrastructure Across the Pacific Mohan Krishnan (Emtek CTO)Google Cloud Next '18 https://www.youtube.com/watch?v=QsvSr3kGxnQ
  39. BBM: not your father’s BBM No longer depending on BlackBerry

    Android & iOS Modernized, being constantly improved Similar to WeChat: a platform, including payments 150M+ users Oracle to PostgreSQL Migration
  40. Some facts about BBM’s platform ~ 6.000 servers Previous infra

    running on-prem in Canada. Current in GCP in Asia region Oracle/PostgreSQL core part of the infrastructure: id (user login and auth services) DB size some TB, high trafficked, OLTP pattern Cassandra used for user messages (+400 nodes) Oracle to PostgreSQL Migration
  41. BBM migration requirements No downtime (near-zero downtime) No data loss

    (obviously). Data verified/checksummed Across the Pacific (200-250ms latency) Long testing period with real datall PostgreSQL stack architecture on GCP:HA, monitoring, backups, DR, configuration tuning.Fully automated deployment Full PostgreSQL stack architecture on GCP:HA, monitoring, backups, DR, configuration tuning.Fully automated deployment Oracle to PostgreSQL Migration
  42. Migration steps DDL migration PL/SQL stored procedures (just a few

    dozen, direct translation) PostgreSQL architecture on GCP Live data migration Rollback mechanism Testing and production cut-o Oracle to PostgreSQL Migration
  43. DDL migration Main application was using Hibernate, so application needed

    few changes Used ora2pg for DDL migration However, required performance-sensitive changes: number in Oracle is a numeric in PostgreSQL. But it was really an integer. Timestamps are painful (well, always!) Oracle to PostgreSQL Migration
  44. DDL migration (II) Carefully adjust users and permissions Data constraints,

    default values from sequences could be refactored from triggers to normal constraints Cleanup time! (remove unused indexes…) Ensure all tables have a PK (required for live data migration) Oracle to PostgreSQL Migration
  45. DDL migration (III): partitioning Data in Oracle was not partitioned.

    Wanted to use the migration to partition Partitioning code and functions/triggers created (9.6, no native partitioning as in 10/11) Transparent to data migration as long as schema is compatible between source/destination Data clean up Oracle to PostgreSQL Migration
  46. PostgreSQL Architecture on GCP Unmanaged: GCE instances, not CloudSQL Fully

    automated deployment with Ansible (+shell, gcloud API for deployment). Env tested with Vagrant Heavily tuned config file(learn more: PostgreSQL Configuration for Humans) PgBouncer for connection pooling Patroni for HA. Consul for DCS. WAL-e for backup and DR, DataDog for monitoring Oracle to PostgreSQL Migration
  47. Testing and production cut-over Live migration was running cross-DC (Canada

    to Asia) for more than 2 months. Re-started frequently for application testing on PG on destination. Cut-over fully scripted, tested 3-5 times and timed. Database-contributed downtime: 5 mins real, another 40 for data checksumming. Oracle to PostgreSQL Migration
  48. 1177 Avenue of the Americas, Suite 500 New York, 10036,

    NY United States of America +1 (646) 452 7168 [email protected] Carretera de Fuencarral, 44, Edificio 4B, Loft 33 Alcobendas, 28108, MD España +34 918 675 554 [email protected] www.ongres.com