What's Missing From Your RMAN Backup?

Oracle Recovery Manager is the gold standard for database backup and recovery. It simplifies and streamlines the effort of creating, validating, and managing backups and helps guarantee database continuity and recoverability.

While that sounds comprehensive, RMAN is not a complete backup solution. It does not fully protect all of the files necessary to recover a database or protect them from disaster!

In this session, you'll delve into the practical aspects of RMAN, learning what is and isn't included in an RMAN backup, techniques for identifying and protecting all of the components critical for full database recovery, and lesser-known, less-understood RMAN configurations and settings that improve backup reliability and usability.

Sean Scott

February 13, 2025

Transcript

  1. What's Missing From Your RMAN Backup?

    Sean Scott
    Oracle ACE Director
    Managing Principal Consultant, Viscosity North America
    RMOUG Training Days 2025
  2. Database Reliability Engineering

    MAA ⁘ RAC ⁘ RMAN
    Data Guard ⁘ Sharding ⁘ Partitioning
    Information Lifecycle Management
    Exadata & Engineered Systems

    Database Modernization
    Upgrades ⁘ Patching ⁘ Migrations
    Cloud ⁘ Hybrid

    Automation
    DevOps ⁘ IaC ⁘ Containers ⁘ Terraform
    Vagrant ⁘ Ansible

    Observability
    AHF ⁘ TFA ⁘ CHA ⁘ CHM
  3. www.viscosityna.com @ViscosityNA

    Oracle on Docker: Running Oracle Databases in Linux Containers
    Free sample chapter: https://oraclesean.com
  4. Data Protection Concepts

    Data Protection vs. High Availability
    • Data protection primarily protects against data loss
      • RMAN
      • Data Pump
      • User-managed backup
  5. Data Protection Concepts

    Data Protection vs. High Availability
    • High availability primarily protects against system or site failure
    • High availability may also offer data protection
      • Data Guard
      • GoldenGate
  6. Data Protection Concepts

    Events that require database recovery: Hardware failure
    • Database servers
    • Storage
    • Networking
    • Data centers
  7. Data Protection Concepts

    Events that require database recovery: Physical data loss or corruption
    • Intra-block corruption only
    • Corruption is confined to individual block(s)
    • Mismatched block header and/or footer
    • Invalid checksum
    • Empty block
    • Examples: damaged media, block overwritten with zeroes
  8. Data Protection Concepts

    Events that require database recovery: Logical data loss or corruption
    • Intra-block and inter-block corruption
      • Intra-block: Confined to one block
      • Inter-block: Exists between blocks
    • Block header & footer match
    • Checksum is valid
    • Yet data is logically inconsistent
    • Example: lost write
  9. Data Protection Concepts

    Events that require database recovery: User and application error
    • Database level:
      • Bad DDL or DML
      • Bad application logic
      • Software bugs
    • OS-level:
      • Deleting/changing database files
      • May be unintentional or malicious
  10. Data Protection Concepts

    Events that require database recovery: User and application error
    RMAN backups can address scenarios beyond "vanilla" full database recovery. Recovery KPIs and enterprise risks may require a layered backup approach beyond the standard weekly level 0/daily level 1/hourly archivelogs.
  11. Cool! "We have documentation"

    Is it clear and understandable?
    How current is it?
    Is it still accurate and relevant?
    Has it been tested?
    If so, how recently?
    Was it useful?
  12. Cool! "We have documentation"

    Does it cover scenarios besides a full restore?
    Restore individual datafiles/tablespaces
    Restore SYSTEM tablespace, control file, etc.
    Tablespace or PDB point-in-time recovery
    Restore individual PDB
  13. Cool! "We have documentation"

    Does it cover scenarios besides a full restore?
    Restore/recover from different media (disk, tape)
    Restore to a different host, change DB ID/name
    Data Guard switchover, failover
    Reinstantiate a Data Guard standby
  14. "We have backups"

    Backups ≠ Recovery!
    • Are they tested and validated regularly?
    • Are the recovery procedures clear and well-documented?
    • Do they support multiple restore/recovery scenarios?
    • Do they meet standards for RTO/RPO?
    • What are the essential dependencies?
    • What factors might affect recovery time?
    • Is database recovery validation an isolated or coordinated exercise?
  15. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings in DR/HA plans
    • Over-reliant on teams "instinctively" knowing what to do
    • Over-optimistic RTO/RPO without empirical basis
    • Narrow focus (full restore only)
    • Isolated scope (DBA steps only)
    • Strong cognitive bias/blindness
    • Written once and rarely/never updated
  16. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings of recovery tests
    • Low criteria for success
      • Database is up
      • Meets minimal status checks
      • Establish basic connectivity
    • Rarely considers
      • Infrastructure availability, provisioning steps
      • Consistent topology
      • OS/software reinstallation, patching, configuration
  17. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings of recovery tests
    • Intangible and human elements
      • Chaos
      • Communication
      • Chain of command
      • Dependencies
      • Cross-training
  18. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings of recovery tests
    • Post-recovery performance
    • Alternate recovery requirements
      • PDB/Datafile/Tablespace only
      • Block repair
      • PITR, Flashback
  19. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings of recovery tests
    • Post-recovery reconfiguration
      • Hostname, IP, client changes
      • Networking, firewall reconfiguration
      • Security lists, certificates
    • Does everything still work?
      • cron jobs
      • Backups
  20. Lessons Learned from DR/HA Postmortems

    Common oversights/shortcomings of recovery tests
    • Run critical post-recovery steps
      • Capture new database backups!
      • Validate the system:
        • ORAchk, DBSAT
      • Validate HA
        • Data Guard
        • GoldenGate
      • Validate ETL/integrations
  21. What RMAN Doesn't Back Up

    Important database configuration files!
    • Password file
    • Data Guard Broker configurations
    • GoldenGate files
    • Block change tracking file
    • /etc/oratab
  22. What RMAN Doesn't Back Up

    Important database configuration files!
    • Networking files
      • tnsnames.ora
      • listener.ora
      • sqlnet.ora
    • Wallets, certificates
    • Temporary configurations
      • e.g., parameter files for starting RMAN duplication
  23. What RMAN Doesn't Back Up

    Data! Code!
    • Contents of dba_directories, utl_file_dir including:
      • External tables
      • BFILE data
      • Data Pump parameter, log and dump files
      • SQL*Loader control files
    • Compiled Pro*C/C++, Pro*COBOL, etc.
    • External procedures (e.g. EXTPROC)
    • OS files/executables called via UTL_FILE
  24. What RMAN Doesn't Back Up

    Recovery Area!
    • Flashback logs
    • When using BACKUP RECOVERY AREA:
      • Current control file
      • Online redo logs
  25. What RMAN Doesn't Back Up

    Important non-database files!
    • Scripts
    • cron jobs
    • Passwords
    • .profile, .bashrc, .bash_profile
    • Environment files and configurations
  26. What RMAN Doesn't Back Up

    Diagnostic data
    • diagnostic_dest
    • audit_file_dest
    • background_dump_dest
    • core_dump_dest
    • user_dump_dest
  27. What RMAN Doesn't Back Up

    Cluster Ready Services files, contents of GRID_HOME
    • ASM storage configurations, locations (Oracle Cluster Registry: OCR)
    • Node-specific resources (Oracle Local Registry: OLR)
    • Networking files
      • tnsnames.ora
      • listener.ora
      • sqlnet.ora
  28. What RMAN Doesn't Back Up

    CRS setup
    • srvctl configuration:
      • Database and instance settings
      • Services and service configurations
      • Environment variables (e.g., TNS_ADMIN for EBS)
      • Listener configurations and endpoints
  29. What RMAN Doesn't Back Up

    Software and inventory
    • oraInventory
    • Database software and patches, gold images
    • Patch manifests
    • .patch_storage directories
    • Media Management Libraries (MML) and drivers
    • Client software
    • AHF, OEM, GoldenGate, etc.
  30. What RMAN Doesn't Back Up

    Operating system
    • Operating system software and patches
    • Kernel settings and host configurations
    • Application software and configuration
    • Agent software
  31. Disaster Recovery Procedures

    Aim to simplify
    • Complexity increases the potential of:
      • Failure
      • Exceptions/variations between prod/non-prod
      • Meaningful configuration and parameter differences
    • Simple procedures limit the scope of QA
  32. Disaster Recovery Procedures

    Aim to automate
    • Automation reduces manual effort
    • Cognitive load is a finite resource
    • Automation takes care of the (technical) "How to"
    • Allows teams to focus on "What, Why, and (conceptual) How"
    • Automation addresses
      • Dependencies, sequence
      • Trivial activities (easily undervalued/missed/run out of order)
  33. Disaster Recovery Procedures

    Aim to automate
    • Add sanity checks
      • Confirm environments
      • "Are you sure..." checks
    • Preserve output and timing via logging
    • Include measurable pass/fail checks
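
The checks on this slide can be sketched as three small bash functions. This is an illustration under stated assumptions, not the presenter's tooling: the function names `confirm_host`, `confirm_target`, and `run_step` are hypothetical, and the host and database names in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Confirm the environment: pass only if this host is the expected one.
confirm_host() {
  local expected=$1 actual
  actual=$(uname -n)
  if [ "$actual" != "$expected" ]; then
    printf 'FAIL: on host %s, expected %s\n' "$actual" "$expected" >&2
    return 1
  fi
  printf 'PASS: host is %s\n' "$actual"
}

# "Are you sure..." check: the operator must type the target name back.
confirm_target() {
  local target=$1 reply
  printf 'Type "%s" to continue: ' "$target" >&2
  read -r reply || return 1
  [ "$reply" = "$target" ]
}

# Run one step, preserve its output and timing in a log file,
# and print a measurable PASS/FAIL line based on the exit status.
run_step() {
  local log=$1; shift
  local start=$SECONDS rc=0
  printf '=== %s START: %s\n' "$(date '+%F %T')" "$*" >> "$log"
  "$@" >> "$log" 2>&1 || rc=$?
  printf '=== %s END rc=%d elapsed=%ds\n' \
    "$(date '+%F %T')" "$rc" "$((SECONDS - start))" >> "$log"
  if [ "$rc" -eq 0 ]; then echo "PASS: $*"; else echo "FAIL: $*"; fi
  return "$rc"
}

# Example flow (placeholder names):
#   confirm_host dr-host-01        || exit 1
#   confirm_target "$__db_name"    || exit 1
#   run_step /tmp/recovery.log srvctl stop database -d "$__db_name"
```

Typing the target name back, instead of answering y/n, makes the confirmation harder to satisfy by reflex.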
  34. Disaster Recovery Procedures

    Write abstract documentation
    • Don't use "common" variables in scripts
      • Avoid the dreaded "Oops! I ran that in the wrong window!"
    • Make scripts and documentation "copy/paste-proof"
    • Copy/paste should work correctly:
      • ...for every command!
      • ...in every database!
      • ...in every environment!
  35. Don't use "common" variables in scripts/documents

    # This:
    srvctl stop database -d $__db_name
    # Not:
    srvctl stop database -d $ORACLE_SID
  36. Don't include complete variable declarations

    # This:
    # Set recovery parameters with the correct values
    export __db_name= # Add the database name
    # Not:
    # Set recovery parameters; change values as needed
    export __db_name=test # Change the database name!
  37. Make documentation "copy/paste-proof"

    # This:
    srvctl stop database -d $__db_name
    # Not:
    srvctl stop database -d testdb # Harder to spot values that must be changed!
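
One way to make the empty declaration from these slides fail safely is to check the parameters before any command uses them. This is a sketch of that idea, not something shown in the deck: `require_vars` is a hypothetical helper, and it relies on bash indirect expansion (`${!name}`), so it assumes bash rather than a plain POSIX shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail fast when a required parameter is empty or unset.
# Usage: require_vars <variable_name> [<variable_name> ...]
require_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name} expands to the value of the variable whose name is in $name.
    if [ -z "${!name:-}" ]; then
      printf 'ERROR: %s is not set\n' "$name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Intended use at the top of a recovery script:
#   export __db_name=        # Add the database name
#   require_vars __db_name || exit 1
#   srvctl stop database -d "$__db_name"
```

If the declaration is pasted without being filled in, the script stops with a named error instead of running srvctl with a missing argument.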
  38. Disaster Recovery Procedures

    Tips
    • Consider adding metadata to shell prompts:
      • user
      • pid
      • date/time
      • $PWD
      • $ORACLE_SID/$ORACLE_PDB
    • Send session output to a file
    • Increase SSH terminal scrollback/history
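
A prompt carrying the metadata listed on this slide might look like the sketch below. It assumes bash with its default `promptvars` behavior, so the command substitution in `PS1` is re-evaluated each time the prompt is drawn. The function name `prompt_info` and the exact layout are illustrative choices, not the presenter's.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print user, shell pid, date/time, working directory,
# and the Oracle SID/PDB currently set in the environment.
prompt_info() {
  printf '[%s pid=%s %s %s sid=%s pdb=%s]' \
    "$(id -un)" "$$" "$(date '+%F %T')" "$PWD" \
    "${ORACLE_SID:-unset}" "${ORACLE_PDB:-unset}"
}

# Single quotes defer the call so it runs for every prompt, not once at
# assignment time. The metadata goes on its own line above the "$ ".
PS1='$(prompt_info)\n\$ '
```

Because every prompt is timestamped, a saved session (for example one recorded with the `script` utility, where it is installed) also shows when each command was entered and against which SID.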
  39. Disaster Recovery Procedures

    Tips
    • Include base configuration settings in scripts
      • env | sort
      • whoami
      • date
      • $PWD
      • ...etc.
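
The commands on this slide can be gathered into a single function that each script calls first, so every log begins with the same context. A minimal sketch, where the name `log_context` and the output layout are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print who ran the script, when, from where,
# and with what environment.
log_context() {
  echo "=== context: $(date '+%F %T') ==="
  echo "user: $(whoami)"
  echo "pwd:  $PWD"
  echo "--- environment ---"
  env | sort
}

# Typical use at the top of a script, appending to that script's log:
#   log_context >> "$logfile"
```

Note that `env` output can include passwords or other secrets held in environment variables, so the log file deserves the same protection as the scripts themselves.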
  40. Disaster Recovery Procedures

    Consolidate and parameterize
    • Things are easier to manage when they're the same
    • Differentiate systems via parameters only
    • Procedures that work across multiple environments are easier to:
      • Test
      • Validate
      • Practice