don't teach in Data Guard school) CASOUG 25 - Tashkent 27 October 2025 Seàn Scott Oracle ACE Director Managing Principal Consultant Viscosity North America
general rule, airport codes make good location identifiers. These are easily identified at a glance (different last letters) PRODDWSAN - San Diego PRODDWSFO - San Francisco
help separate prefix from suffix This is helpful in mixed/overlapping environments PRODDWSAN PRODDWSFO PRODTXSAN PRODTXSFO PRODDW_SAN PRODDW_SFO PRODTX_SAN PRODTX_SFO San Diego Data Warehouse San Francisco Data Warehouse San Diego OLTP San Francisco OLTP
in use An RMAN duplicate attempted to restore the control files DBA thought SID2 was standby; deleted the "offending" controlfiles ORA-01503: CREATE CONTROLFILE failed ORA-00200: control file could not be created ORA-00202: control file: '/u03/SID2/controlfile/control01.ctl' ORA-27038: created file already exists Additional information: 1
policy that includes both: • An applied on/shipped to clause • A backed up to device condition configure archivelog deletion policy to [applied on|shipped to] all standby backed up 2 times to <device>;
Daily log backups on primary • Hourly log purge with 'sysdate - 1/24' on primary, standby • All logs less than an hour old were deleted • Standby apply was suspended for several hours • Logs were shipped but not applied • The standby couldn't recover locally • Log backups weren't available on the primary Set an Archive Log Deletion Policy
• They can be pulled from the primary • (Data Guard does this automatically) • Assuming there's an appropriate archive log deletion policy! • Multiplexing standby redo logs makes Data Guard slower! Don't Multiplex Standby Redo Logs
log size • Strive for no more than four log switches per hour • Every log switch creates a corresponding action on the standby • For RAC environments using a single standby apply instance: • Frequent switching @ primary can saturate the standby Size Redo Logs Properly
Sets the desired mean time to recover • Limits roll-forward after instance failure by issuing checkpoints • Forces DBWR to flush dirty blocks to disk • If set to 0, like issuing a checkpoint after every commit! • Oracle recommendations (from MOS Note 1095774.1) • Minimum of 300 • 3600 or the desired Recovery Time Objective Checkpoints and fast_start_mttr_target
primary still advances • Initiating a failover will incur data loss (redo not shipped) • Reinstating the failed primary requires rolling back lost transactions • ...by using Flashback • ...by completely rebuilding the former primary • Flashback is a Data Guard requirement! Turn on Flashback!
possible if: • Primary and standby were at the same SCN at failover • No changes were made at the primary after failover Recovering from a Failover without Flashback
failover SCN for performing a partial incremental backup Without a format directive, the backup will use ORACLE_HOME/dbs backup device type disk incremental from scn XXXXXX database format '/tmp/incr_standby_%U';
location is the Diagnostic trace directory (same as alert log). You cannot change this! DGMGRL> export configuration to '/home/oracle/dgcdb.txt'; ORA-16540: invalid argument ORA-06512: at "SYS.DBMS_DRS", line 1947 ORA-06512: at line 1
be able to reach every other member • Members must be able to reach themselves • Allow connections to all instances of RAC databases • Uses a service that dynamically registers with listeners • Allows connect-time failover on RAC • Must not be defined or managed by Clusterware • Failover attributes allow Redo Transport to ship to any RAC instance DGConnectIdentifier
start databases • Used for switchover, convert, and reinstate operations • This connection is exclusively for the Data Guard Broker! • Oracle sets StaticConnectIdentifier using local_listener • ...appends _DGMGRL to the service • Oracle automatically manages this value unless users change: • The Broker's StaticConnectIdentifier • The local_listener parameter StaticConnectIdentifier
Guard environments • Using TNS aliases adds a dependency on the local tnsnames.ora • Users may not realize/remember the aliases used by Data Guard • tnsnames.ora may be updated frequencly, not under change control • Transport may survive changes; role transitions may not • iFiles add another layer of risk • local_listener changes prevent Broker from maintaining StaticConnectIdentifier Data Guard and EZConnect
including: • The DGConnectIdentifier • RMAN connections (eg, duplicate) • OEM or monitoring • NEVER define static entries for the Broker's _DGB service! • NEVER create TNS entries for the _DGB or _DGMGRL services! • NEVER register Data Guard connections with a SCAN listener! Data Guard Connection No-Nos
• Using the wrong Connect Identifiers • TNS changes to aliases used for Connect Identifiers • Incorrect static listener definitions • Creating TNS aliases for reserved Data Guard services • Manually configuring or changing log_archive_dest_n parameters • Adding Data Guard services to Clusterware • Changes to the local_listener parameter How to Quietly Break Data Guard
lag verbose; show configuration when primary is <DB_UNQNAME>; show database verbose <DB_UNQNAME>; show database <DB_UNQNAME> logxptstatus; show database <DB_UNQNAME> InconsistentProperties; show database <DB_UNQNAME> InconsistentLogXptProps; show instance verbose '<INSTANCE>' on database <DB_UNQNAME>;
On: Yes Gap Status: No Gap Transport Lag: 0 seconds (computed 1 second ago) Transport Status: Success ... Current Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_san) (dgcdb_lax) 1 7 8 Sufficient SRLs Future Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_lax) (dgcdb_san) 1 7 8 Sufficient SRLs
verbose dgcdb_lax spfile; Command requires a connection that uses database or external credentials. DGMGRL> connect [email protected]:1521/dgcdb_lax Password: Connected to "dgcdb_lax" Connected as SYSDG.
anything but SUCCESS • There is no situation where a status of WARNING or ERROR is OK! • SUCCESS from individual components ≠ SUCCESS from all • A configuration can succeed even though a database doesn't • You must check status in the Broker for ALL members! • SUCCESS in the Broker still doesn't guarantee everything is OK! Accept Nothing Less Than SUCCESS
shows: Data Guard Broker Status Summary: Type Name Severity Status Configuration dgcdb Warning ORA-16608: one or more members have warnings Primary Database dgcdb_san Success ORA-0: normal, successful completion Physical Standby Database dgcdb_lax Warning ORA-16809: multiple warnings detected for the member
Check and correct inconsistencies • Export the configuration • Check for bad entries in the XML, import fixes, enable the config • Drop and recreate the configuration • Dropping the config shouldn't remove redo routes; ship/apply will continue show = SUCCESS, log = FAILURE?
on the web! • Customer running EBS with Active Data Guard for eight years • They performed semi-annual switchover test • Switched to standby and immediately returned • Confirmed connections with simple tests • No data validation performed in the new database Don't Reinvent the Wheel!
alert log: Fri Mar 15 15:43:13 2019 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION Attempt to start background Managed Standby Recovery process (xxx) Fri Mar 15 15:43:13 2019 MRP0 started with pid=72, OS id=9446 MRP0: Background Managed Standby Recovery process started (xxx) started logmerger process Fri Mar 15 15:43:18 2019 Managed Standby Recovery not using Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy <snip> Media Recovery Log /u03/app/oracle/fast_recovery_area/XXXXXX/archivelog/2019_03_15/o1_mf_1_24590_g8r20y3w_.arc Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
2019 alter database open Data Guard Broker initializing... Data Guard - stopping apply to allow Active Data Guard enabled database to open ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Tue Jun 25 13:40:46 2019 MRP0: Background Media Recovery cancelled with status 16037 <snip> Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Tue Jun 25 13:40:55 2019 MRP0: Background Media Recovery process shutdown (xxx) Tue Jun 25 13:40:56 2019 Managed Standby Recovery Canceled (xxx) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Data Guard Broker initialization complete
Serial Media Recovery started Managed Standby Recovery starting Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy Standby database will not open with this datafile online! Standby Crash Recovery aborted due to error 10554. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10554: Media recovery failed to bring datafile 1 to a consistent point ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' Completed Standby Crash Recovery. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' ORA-10458 signalled during: alter database open ... Tue Jun 25 13:41:07 2019 alter database open resetlogs Data Guard Broker initializing... Data Guard Broker initialization complete ORA-1666 signalled during: alter database open resetlogs...
on the web! • Switchover and connection tests "worked" • TNS errors in the alert log were in a monitoring ignore list • The "fuzzy datafile" messages aren't "ORA-XXXXX" errors • The Broker log wasn't monitored or checked • DBAs weren't using the Broker's built-in lag monitoring • Instead, they monitored lag using a query from a blog post • See: https://oraclesean.com/blog/how-not-to-find-data-guard-gaps Don't Reinvent the Wheel!