Data Guard Best Practices

Data Guard Best Practices June 26, 2025 Sean Scott Managing
Principal Consultant — Viscosity North America Oracle ACE Director

Database Reliability Engineering MAA ⁘ RAC ⁘ RMAN Data Guard
⁘ Sharding ⁘ Partitioning Information Lifecycle Management Exadata & Engineered Systems Database Modernization Upgrades ⁘ Patching ⁘ Migrations Cloud ⁘ Hybrid Automation DevOps ⁘ IaC ⁘ Containers ⁘ Terraform Vagrant ⁘ Ansible Observability AHF ⁘ TFA ⁘ CHA ⁘ CHM

www.viscosityna.com @ViscosityNA Oracle on Docker Running Oracle Databases in Linux
Containers Free sample chapter: https://oraclesean.com

www.viscosityna.com @ViscosityNA Are You a Good DBA?

www.viscosityna.com @ViscosityNA Background

www.viscosityna.com @ViscosityNA Planning Data Guard Implementations

www.viscosityna.com @ViscosityNA Never name a database "Standby" Roles are temporary.
db_unique_name is forever.

Oracle, you OK?

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name As a
general rule, use the SID + a location identifier ORACLE_SID + LOCATION

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name But be
careful when selecting the location. Where is this database located? PRODDWSTL

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name Where is
this database located? Seattle? St. Louis? PRODDWSTL

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name How about
now? Would an outsider understand the difference? PRODDWSTL - Primary PRODDWSEA - Standby

this database? PRODDWSAN

this database? San Diego? San Francisco? San Jose? San Antonio? PRODDWSAN

this database? Say it's San Francisco. PRODDWSAN

this database? What's an obvious name for a new data center in San Diego? PRODDWSAN - San Francisco PRODDW??? - San Diego

general rule, airport codes make good location identifiers. PRODDWSAN - San Diego PRODDWSFO - San Francisco

general rule, airport codes make good location identifiers. These are easily identified at a glance (different last letters) PRODDWSAN - San Diego PRODDWSFO - San Francisco

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name What if
the data centers are in Los Angeles and Phoenix? The unique names aren't very unique! PRODTXPHX - Phoenix PRODTXLAX - Los Angeles

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name What about
this: PRODTX_PHX - Phoenix PRODTX_LAX - Los Angeles

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name Still, an
airport code isn't always the best choice. PRODTX_PHX - Phoenix PRODTX_PDX - Portland

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name Which is
better? PRODTX_PHX or PRODTX_AZ - Phoenix PRODTX_PDX or PRODTX_OR - Portland

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name What about
numbers? PRODTX1 PRODTX2

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name OMF wasn't
in use An RMAN duplicate attempted to restore the control files ORA-01503: CREATE CONTROLFILE failed ORA-00200: control file could not be created ORA-00202: control file: '/u03/SID/controlfile/control01.ctl' ORA-27038: created file already exists Additional information: 1

www.viscosityna.com @ViscosityNA Make Documentation "Copy/Paste-Proof" # This: switchover to $target_db
switchover to <target_db> # Not: switchover to prod_stdby # Harder to spot values that must be changed!

www.viscosityna.com @ViscosityNA Choose a "Good" DB Unique Name But zip
codes? PRODTX_10012 - New York PRODTX_90120 - Beverly Hills

www.viscosityna.com @ViscosityNA Set an Archive Log Deletion Policy Set a
policy that includes both: • An applied on/shipped to clause • A backed up to device condition configure archivelog deletion policy to [applied on|shipped to] all standby backed up 2 times to <device>;

www.viscosityna.com @ViscosityNA shipped to alone is not a guarantee! •
Daily log backups on primary • Hourly log purge with 'sysdate - 1/24' on primary, standby • All logs less than an hour old were deleted • Standby apply was suspended for several hours • Logs were shipped but not applied • The standby couldn't recover locally • Log backups weren't available on the primary Set an Archive Log Deletion Policy

www.viscosityna.com @ViscosityNA Standby redo logs are not online redo logs!
• They can be pulled from the primary • (Data Guard does this automatically) • Assuming there's an appropriate archive log deletion policy! • Multiplexing standby redo logs makes Data Guard slower! Don't Multiplex Standby Redo Logs

www.viscosityna.com @ViscosityNA Most common recommendation from DB review: Increase redo
log size • Strive for no more than four log switches per hour • Every log switch creates a corresponding action on the standby • For RAC environments using a single standby apply instance: • Frequent switching @ primary can saturate the standby Size Redo Logs Properly

www.viscosityna.com @ViscosityNA • Checkpoints can overload a standby • If
the alert log frequently "Checkpoint not complete" messages • Add redo log groups • Check storage performance Checkpoints and fast_start_mttr_target

www.viscosityna.com @ViscosityNA Low values for fast_start_mttr_target increase checkpoint activity •
Sets the desired mean time to recover • Limits roll-forward after instance failure by issuing checkpoints • Forces DBWR to flush dirty blocks to disk • If set to 0, like issuing a checkpoint after every commit! • Oracle recommendations (from MOS Note 1095774.1) • Minimum of 300 • 3600 or the desired Recovery Time Objective Checkpoints and fast_start_mttr_target

www.viscosityna.com @ViscosityNA • When primary & standby lose contact, the
primary still advances • Initiating a failover will incur data loss (redo not shipped) • Reinstating the failed primary requires rolling back lost transactions • ...by using Flashback • ...by completely rebuilding the former primary • Flashback is a Data Guard requirement! Turn on Flashback!

www.viscosityna.com @ViscosityNA • Without Flashback, a partial restore may be
possible if: • Primary and standby were at the same SCN at failover • No changes were made at the primary after failover Recovering from a Failover

www.viscosityna.com @ViscosityNA Recovering from a Failover Identify the failover SCN
for performing a partial incremental backup Without a format directive, the backup will use ORACLE_HOME/dbs backup device type disk incremental from scn XXXXXX database format '/tmp/incr_standby_%U';

www.viscosityna.com @ViscosityNA Recovering from a Failover Copy and catalog the
backupset to the failed database catalog start with '/some/directory/';

www.viscosityna.com @ViscosityNA Recovering from a Failover Recover the failed database
using the noredo option RMAN automatically uses the special incremental backup recover database noredo;

www.viscosityna.com @ViscosityNA Multiplex Broker Files Locate Broker configurations on separate
disk groups! dg_broker_config_file1 dg_broker_config_file2

www.viscosityna.com @ViscosityNA Comment Parameter Changes Helps people understand the parameter
is Data Guard related Future you thanks you! alter system set parameter=value comment='For Data Guard' scope=both sid='*';

www.viscosityna.com @ViscosityNA Use the Force Broker, Luke!

www.viscosityna.com @ViscosityNA Use the Force Broker, Luke! edit database DB_UNQNAME
set state='APPLY-OFF'; alter database recover managed standby database cancel;

www.viscosityna.com @ViscosityNA Use the Force Broker, Luke! Configure the environment
for improved script output: set time on set echo on

www.viscosityna.com @ViscosityNA Manage Configurations

www.viscosityna.com @ViscosityNA • Changes to a configuration aren't immediately active
• They must be applied via enable configuration Enable the Configuration Following Changes

www.viscosityna.com @ViscosityNA Export, Edit, & Import Configurations as XML DGMGRL>
export configuration to my_config.txt DGMGRL> import configuration my_config.txt

www.viscosityna.com @ViscosityNA Export, Edit, & Import Configurations as XML File
location is the Diagnostic trace directory (same as alert log). You cannot change this! DGMGRL> export configuration to '/home/oracle/dgcdb.txt'; ORA-16540: invalid argument ORA-06512: at "SYS.DBMS_DRS", line 1947 ORA-06512: at line 1

www.viscosityna.com @ViscosityNA <?xml version="1.0" encoding="UTF-8"?> <DRC Version="19.0.0.0.0" CurrentPath="True" Name="dgcdb"> <DefaultState>ONLINE</DefaultState>
<DRC_UNIQUE_ID>705151955</DRC_UNIQUE_ID> ... <Member MemberID="1" CurrentPath="True" Enabled="True" MultiInstanced="True" Name="dgcdb_san"> <IntendedState>PRIMARY</IntendedState> <DefaultState>PRIMARY</DefaultState> <Status> <Severity>Success</Severity> <Error>0</Error> <Timestamp>1750881160</Timestamp> </Status> <StandbyType>PhysicalStandby</StandbyType> <DGConnectIdentifier>DGSAN.orapub.com:1521/dgcdb_san</DGConnectIdentifier> <DbDomain>orapub.com</DbDomain> <ResourceType>Database</ResourceType> <Instance InstanceID="1" CurrentPath="True" Enabled="True" MultiInstanced="True" DefaultWriteOnce="True" Name="dgcdb"> <PlannedState/> <HostName Default="True">DGSAN.orapub.com</HostName> <StaticConnectIdentifier Default="True">(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=DGSAN.orapub.com)(PORT=1521)) (CONNECT_DATA=(SERVICE_NAME=dgcdb_san_DGMGRL.orapub.com)(INSTANCE_NAME=dgcdb)(SERVER=DEDICATED))) </StaticConnectIdentifier>

www.viscosityna.com @ViscosityNA Data Guard and Networking

www.viscosityna.com @ViscosityNA Shipping and apply continue working; role transitions fail
• Using the wrong Connect Identifiers • TNS changes to aliases used for Connect Identifiers • Incorrect static listener definitions • Creating TNS aliases for reserved Data Guard services • Manually configuring or changing log_archive_dest_n parameters • Adding Data Guard services to Clusterware • Changes to the local_listener parameter How to Quietly Break Data Guard

www.viscosityna.com @ViscosityNA Instance-speci fi c setting Data Guard uses to
start databases • Used for switchover, convert, and reinstate operations • This connection is exclusively for the Data Guard Broker! • Oracle sets StaticConnectIdentifier using local_listener • ...appends _DGMGRL to the service • Oracle automatically manages this value unless users change: • The Broker's StaticConnectIdentifier • The local_listener parameter StaticConnectIdentifier

www.viscosityna.com @ViscosityNA • NEVER use the StaticConnectIdentifier for other purposes,
including: • The DGConnectIdentifier • RMAN connections (eg, duplicate) • OEM or monitoring • NEVER define static entries for the Broker's _DGB service! • NEVER create TNS entries for the _DGB or _DGMGRL services! • NEVER register Data Guard connections with a SCAN listener! Data Guard Connection No-Nos

www.viscosityna.com @ViscosityNA Rules for the DGConnectIdentifier • All members must
be able to reach every other member • Members must be able to reach themselves • Allow connections to all instances of RAC databases • Uses a service that dynamically registers with listeners • Allows connect-time failover on RAC • Must not be defined or managed by Clusterware • Failover attributes allow Redo Transport to ship to any RAC instance DGConnectIdentifier

www.viscosityna.com @ViscosityNA Risks associated with using TNS aliases in Data
Guard environments • Using TNS aliases adds a dependency on the local tnsnames.ora • Users may not realize/remember the aliases used by Data Guard • tnsnames.ora may be updated frequencly, not under change control • Transport may survive changes; role transitions may not • iFiles add another layer of risk • local_listener changes prevent Broker from maintaining StaticConnectIdentifier Data Guard and EZConnect

www.viscosityna.com @ViscosityNA Data Guard and EZConnect EZConnect strings disconnect Broker
configurations from TNS DGMGRL> show database verbose dgcdb_san Database - dgcdb_san Role: PRIMARY Intended State: TRANSPORT-ON Instance(s): dgcdb Properties: DGConnectIdentifier = 'DGSAN.orapub.com:1521/dgcdb_san' ...

www.viscosityna.com @ViscosityNA Monitoring and Validation

www.viscosityna.com @ViscosityNA Useful show commands show configuration verbose; show configuration
lag verbose; show configuration when primary is <DB_UNQNAME>; show database verbose <DB_UNQNAME>; show database <DB_UNQNAME> logxptstatus; show database <DB_UNQNAME> InconsistentProperties; show database <DB_UNQNAME> InconsistentLogXptProps; show instance verbose '<INSTANCE>' on database <DB_UNQNAME>;

www.viscosityna.com @ViscosityNA show configuration vs. show configuration lag DGMGRL> show
configuration verbose; Configuration - dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database DGMGRL> show configuration lag verbose; Configuration - dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database Transport Lag: 0 seconds (computed 0 seconds ago) Apply Lag: 0 seconds (computed 0 seconds ago)

www.viscosityna.com @ViscosityNA show configuration when primary is ... DGMGRL> show
configuration when primary is dgcdb_san Configuration when dgcdb_san is primary - dgcdb Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database DGMGRL> show configuration when primary is dgcdb_lax Configuration when dgcdb_lax is primary - dgcdb Members: dgcdb_lax - Primary database dgcdb_san - Physical standby database

www.viscosityna.com @ViscosityNA Useful validate commands validate database verbose <PRIMARY>; validate
database verbose <STANDBY>; validate database verbose <STANDBY> spfile; validate network configuration for all; validate static connect identifier for all; validate dgconnectidentifier <CONNECT_IDENTIFIER>;

www.viscosityna.com @ViscosityNA validate database verbose <primary> DGMGRL> validate database verbose
dgcdb_san Database Role: Primary database Ready for Switchover: Yes Flashback Database Status: dgcdb_san: On Capacity Information: Database Instances Threads dgcdb_san 1 1 Managed by Clusterware: dgcdb_san: NO Validating static connect identifier for the primary database dgcdb_san... Connecting to instance "dgcdb_san" on database "dgcdb_san" ... Connected to "dgcdb_san" Succeeded.

www.viscosityna.com @ViscosityNA validate database verbose <standby> DGMGRL> validate database verbose
dgcdb_lax Database Role: Physical standby database Primary Database: dgcdb_san Ready for Switchover: Yes Ready for Failover: Yes (Primary Running) Flashback Database Status: dgcdb_san: On dgcdb_lax: On ... Standby Apply-Related Information: Apply State: Running Apply Lag: 0 seconds (computed 1 second ago) Apply Delay: 0 minutes

www.viscosityna.com @ViscosityNA validate database verbose <standby> (cont) Transport-Related Information: Transport
On: Yes Gap Status: No Gap Transport Lag: 0 seconds (computed 1 second ago) Transport Status: Success ... Current Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_san) (dgcdb_lax) 1 7 8 Sufficient SRLs Future Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_lax) (dgcdb_san) 1 7 8 Sufficient SRLs

www.viscosityna.com @ViscosityNA validate database verbose <standby> spfile DGMGRL> validate database
verbose dgcdb_lax spfile; Command requires a connection that uses database or external credentials. DGMGRL> connect [email protected]:1521/dgcdb_lax Password: Connected to "dgcdb_lax" Connected as SYSDG.

www.viscosityna.com @ViscosityNA validate database verbose <standby> spfile DGMGRL> validate database
verbose dgcdb_lax spfile; Connecting to "dgcdb_san". Connected to "dgcdb_san" Connecting to "dgcdb_lax". Connected to "dgcdb_lax" Parameter Settings: audit_file_dest: dgcdb_san (PRIMARY) : /u01/app/oracle/admin/dgcdb_san/adump dgcdb_lax : /u01/app/oracle/admin/dgcdb_lax/adump audit_sys_operations: dgcdb_san (PRIMARY) : false dgcdb_lax : false ...

www.viscosityna.com @ViscosityNA • Data Guard is broken if show/validate reports
anything but SUCCESS • There is no situation where a status of WARNING or ERROR is OK! • SUCCESS from individual components ≠ SUCCESS from all • A configuration can succeed even though a database doesn't • You must check status in the Broker for ALL members! • SUCCESS in the Broker still doesn't guarantee everything is OK! Accept Nothing Less Than SUCCESS

www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show configuration Configuration
- dgcdb Protection Mode: MaxPerformance Members: dgcdb_san - Primary database dgcdb_lax - Physical standby database Fast-Start Failover: Disabled Configuration Status: SUCCESS (status updated 47 seconds ago)

www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show database dgcdb_lax
Database - dgcdb_lax Role: PHYSICAL STANDBY Intended State: APPLY-ON Transport Lag: 0 seconds (computed 0 seconds ago) Apply Lag: 0 seconds (computed 0 seconds ago) Average Apply Rate: 125.00 KByte/s Real Time Query: OFF Instance(s): dgcdb Database Status: SUCCESS

www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE DGMGRL> show database verbose
dgcdb_san Database - dgcdb_san ... Log file locations: Alert log : /u01/app/oracle/diag/rdbms/dgcdb_san/dgcdb/trace/alert_dgcdb.log Data Guard Broker log : /u01/app/oracle/diag/rdbms/dgcdb_san/dgcdb/trace/drcdgcdb.log Database Status: SUCCESS

www.viscosityna.com @ViscosityNA When SUCCESS = FAILURE Yet the Broker log
shows: Data Guard Broker Status Summary: Type Name Severity Status Configuration dgcdb Warning ORA-16608: one or more members have warnings Primary Database dgcdb_san Success ORA-0: normal, successful completion Physical Standby Database dgcdb_lax Warning ORA-16809: multiple warnings detected for the member

www.viscosityna.com @ViscosityNA • Redo shipping and apply continues normally, without
lag • Usually occurs following: • Host migration • Database upgrade • Changes to: • global_name, domain_name • log_archive_dest_* show = SUCCESS, log = FAILURE?

www.viscosityna.com @ViscosityNA Solution • validate database verbose <standby> spfile •
Check, correct inconsistencies • Export the configuration • Check for bad entries in the XML, import fixes, enable the config • Drop and recreate the configuration • Dropping the config shouldn't remove redo routes; ship/apply will continue show = SUCCESS, log = FAILURE?

www.viscosityna.com @ViscosityNA An example of why you can't trust everything
on the web! • Customer running EBS with Active Data Guard for eight years • They performed semi-annual switchover test • Switched to standby and immediately returned • Confirmed connections with simple tests • No data validation performed in the new database Why Robust Tests are Essential for Data Guard

www.viscosityna.com @ViscosityNA Why Robust Tests are Essential for Data Guard
-- From the standby alert log: Fri Mar 15 15:43:13 2019 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION Attempt to start background Managed Standby Recovery process (xxx) Fri Mar 15 15:43:13 2019 MRP0 started with pid=72, OS id=9446 MRP0: Background Managed Standby Recovery process started (xxx) started logmerger process Fri Mar 15 15:43:18 2019 Managed Standby Recovery not using Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy <snip> Media Recovery Log /u03/app/oracle/fast_recovery_area/XXXXXX/archivelog/2019_03_15/o1_mf_1_24590_g8r20y3w_.arc Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION

Tue Jun 25 13:40:46 2019 alter database open Data Guard Broker initializing... Data Guard - stopping apply to allow Active Data Guard enabled database to open ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Tue Jun 25 13:40:46 2019 MRP0: Background Media Recovery cancelled with status 16037 <snip> Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Tue Jun 25 13:40:55 2019 MRP0: Background Media Recovery process shutdown (xxx) Tue Jun 25 13:40:56 2019 Managed Standby Recovery Canceled (xxx) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Data Guard Broker initialization complete

Beginning Standby Crash Recovery. Serial Media Recovery started Managed Standby Recovery starting Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy Standby database will not open with this datafile online! Standby Crash Recovery aborted due to error 10554. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10554: Media recovery failed to bring datafile 1 to a consistent point ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' Completed Standby Crash Recovery. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' ORA-10458 signalled during: alter database open ... Tue Jun 25 13:41:07 2019 alter database open resetlogs Data Guard Broker initializing... Data Guard Broker initialization complete ORA-1666 signalled during: alter database open resetlogs...

www.viscosityna.com @ViscosityNA An example of why you can't trust everything
on the web! • Switchover and connection tests "worked" • TNS errors in the alert log were in a monitoring ignore list • The "fuzzy datafile" messages aren't "ORA-XXXXX" errors • The Broker log wasn't monitored or checked • DBAs weren't using the Broker's built-in lag monitoring • Instead, they monitored lag using a query from a blog post • See: https://oraclesean.com/blog/how-not-to-find-data-guard-gaps Why Robust Tests are Essential for Data Guard

www.viscosityna.com @ViscosityNA Make Documentation "copy/paste-proof" # This: switchover to $target_db
switchover to <target_db> # Not: switchover to prod_stdby # Harder to spot values that must be changed!

www.viscosityna.com @ViscosityNA Resources • Oracle Data Guard Broker and Static
Service Registration (1387859.1) • Oracle 12.2 – Simplified OBSERVER Management for Multiple Fast-Start Failover Configurations (2285891.1) • Health Check Alert: Set FAST_START_MTTR_TARGET greater than or equal to 300 (1095774.1) • Using RMAN Effectively In A Dataguard Environment (848716.1) • https://docs.oracle.com/en/database/oracle/oracle-database/19/dgbkr/oracle-data-guard- broker-properties.html

Questions? Contact Me! [email protected] https://linktr.ee/oraclesean www.viscosityna.com

Data Guard Best Practices

Data Guard Best Practices

More Decks by Seán Scott

Other Decks in Technology

Featured

Transcript