general rule, airport codes make good location identifiers. These are easily identified at a glance (different last letters) PRODDWSAN - San Diego PRODDWSFO - San Francisco
in use An RMAN duplicate attempted to restore the control files ORA-01503: CREATE CONTROLFILE failed ORA-00200: control file could not be created ORA-00202: control file: '/u03/SID/controlfile/control01.ctl' ORA-27038: created file already exists Additional information: 1
policy that includes both: • An applied on/shipped to clause • A backed up to device condition configure archivelog deletion policy to [applied on|shipped to] all standby backed up 2 times to <device>;
Daily log backups on primary • Hourly log purge with 'sysdate - 1/24' on primary, standby • All logs less than an hour old were deleted • Standby apply was suspended for several hours • Logs were shipped but not applied • The standby couldn't recover locally • Log backups weren't available on the primary Set an Archive Log Deletion Policy
• They can be pulled from the primary • (Data Guard does this automatically) • Assuming there's an appropriate archive log deletion policy! • Multiplexing standby redo logs makes Data Guard slower! Don't Multiplex Standby Redo Logs
log size • Strive for no more than four log switches per hour • Every log switch creates a corresponding action on the standby • For RAC environments using a single standby apply instance: • Frequent switching @ primary can saturate the standby Size Redo Logs Properly
Sets the desired mean time to recover • Limits roll-forward after instance failure by issuing checkpoints • Forces DBWR to flush dirty blocks to disk • If set to 0, like issuing a checkpoint after every commit! • Oracle recommendations (from MOS Note 1095774.1) • Minimum of 300 • 3600 or the desired Recovery Time Objective Checkpoints and fast_start_mttr_target
primary still advances • Initiating a failover will incur data loss (redo not shipped) • Reinstating the failed primary requires rolling back lost transactions • ...by using Flashback • ...by completely rebuilding the former primary • Flashback is a Data Guard requirement! Turn on Flashback!
for performing a partial incremental backup Without a format directive, the backup will use ORACLE_HOME/dbs backup device type disk incremental from scn XXXXXX database format '/tmp/incr_standby_%U';
location is the Diagnostic trace directory (same as alert log). You cannot change this! DGMGRL> export configuration to '/home/oracle/dgcdb.txt'; ORA-16540: invalid argument ORA-06512: at "SYS.DBMS_DRS", line 1947 ORA-06512: at line 1
• Using the wrong Connect Identifiers • TNS changes to aliases used for Connect Identifiers • Incorrect static listener definitions • Creating TNS aliases for reserved Data Guard services • Manually configuring or changing log_archive_dest_n parameters • Adding Data Guard services to Clusterware • Changes to the local_listener parameter How to Quietly Break Data Guard
start databases • Used for switchover, convert, and reinstate operations • This connection is exclusively for the Data Guard Broker! • Oracle sets StaticConnectIdentifier using local_listener • ...appends _DGMGRL to the service • Oracle automatically manages this value unless users change: • The Broker's StaticConnectIdentifier • The local_listener parameter StaticConnectIdentifier
including: • The DGConnectIdentifier • RMAN connections (eg, duplicate) • OEM or monitoring • NEVER define static entries for the Broker's _DGB service! • NEVER create TNS entries for the _DGB or _DGMGRL services! • NEVER register Data Guard connections with a SCAN listener! Data Guard Connection No-Nos
be able to reach every other member • Members must be able to reach themselves • Allow connections to all instances of RAC databases • Uses a service that dynamically registers with listeners • Allows connect-time failover on RAC • Must not be defined or managed by Clusterware • Failover attributes allow Redo Transport to ship to any RAC instance DGConnectIdentifier
Guard environments • Using TNS aliases adds a dependency on the local tnsnames.ora • Users may not realize/remember the aliases used by Data Guard • tnsnames.ora may be updated frequencly, not under change control • Transport may survive changes; role transitions may not • iFiles add another layer of risk • local_listener changes prevent Broker from maintaining StaticConnectIdentifier Data Guard and EZConnect
lag verbose; show configuration when primary is <DB_UNQNAME>; show database verbose <DB_UNQNAME>; show database <DB_UNQNAME> logxptstatus; show database <DB_UNQNAME> InconsistentProperties; show database <DB_UNQNAME> InconsistentLogXptProps; show instance verbose '<INSTANCE>' on database <DB_UNQNAME>;
On: Yes Gap Status: No Gap Transport Lag: 0 seconds (computed 1 second ago) Transport Status: Success ... Current Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_san) (dgcdb_lax) 1 7 8 Sufficient SRLs Future Log File Groups Configuration: Thread # Online Redo Log Groups Standby Redo Log Groups Status (dgcdb_lax) (dgcdb_san) 1 7 8 Sufficient SRLs
verbose dgcdb_lax spfile; Command requires a connection that uses database or external credentials. DGMGRL> connect [email protected]:1521/dgcdb_lax Password: Connected to "dgcdb_lax" Connected as SYSDG.
anything but SUCCESS • There is no situation where a status of WARNING or ERROR is OK! • SUCCESS from individual components ≠ SUCCESS from all • A configuration can succeed even though a database doesn't • You must check status in the Broker for ALL members! • SUCCESS in the Broker still doesn't guarantee everything is OK! Accept Nothing Less Than SUCCESS
shows: Data Guard Broker Status Summary: Type Name Severity Status Configuration dgcdb Warning ORA-16608: one or more members have warnings Primary Database dgcdb_san Success ORA-0: normal, successful completion Physical Standby Database dgcdb_lax Warning ORA-16809: multiple warnings detected for the member
Check, correct inconsistencies • Export the configuration • Check for bad entries in the XML, import fixes, enable the config • Drop and recreate the configuration • Dropping the config shouldn't remove redo routes; ship/apply will continue show = SUCCESS, log = FAILURE?
on the web! • Customer running EBS with Active Data Guard for eight years • They performed semi-annual switchover test • Switched to standby and immediately returned • Confirmed connections with simple tests • No data validation performed in the new database Why Robust Tests are Essential for Data Guard
-- From the standby alert log: Fri Mar 15 15:43:13 2019 ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION Attempt to start background Managed Standby Recovery process (xxx) Fri Mar 15 15:43:13 2019 MRP0 started with pid=72, OS id=9446 MRP0: Background Managed Standby Recovery process started (xxx) started logmerger process Fri Mar 15 15:43:18 2019 Managed Standby Recovery not using Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy <snip> Media Recovery Log /u03/app/oracle/fast_recovery_area/XXXXXX/archivelog/2019_03_15/o1_mf_1_24590_g8r20y3w_.arc Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION
Tue Jun 25 13:40:46 2019 alter database open Data Guard Broker initializing... Data Guard - stopping apply to allow Active Data Guard enabled database to open ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Tue Jun 25 13:40:46 2019 MRP0: Background Media Recovery cancelled with status 16037 <snip> Some recovered datafiles maybe left media fuzzy Media recovery may continue but open resetlogs may fail Tue Jun 25 13:40:55 2019 MRP0: Background Media Recovery process shutdown (xxx) Tue Jun 25 13:40:56 2019 Managed Standby Recovery Canceled (xxx) Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL Data Guard Broker initialization complete
Beginning Standby Crash Recovery. Serial Media Recovery started Managed Standby Recovery starting Real Time Apply Warning: Datafile 1 (/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf) is infinitely media recovery fuzzy Standby database will not open with this datafile online! Standby Crash Recovery aborted due to error 10554. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10554: Media recovery failed to bring datafile 1 to a consistent point ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' Completed Standby Crash Recovery. Errors in file /logs/ora/diag/rdbms/xxx/xxx/trace/xxx_ora_725.trc: ORA-10458: standby database requires recovery ORA-01196: file 1 is inconsistent due to a failed media recovery session ORA-01110: data file 1: '/u02/app/oracle/oradata/xxx/XXX/datafile/o1_mf_system_u3thdhe7_.dbf' ORA-10458 signalled during: alter database open ... Tue Jun 25 13:41:07 2019 alter database open resetlogs Data Guard Broker initializing... Data Guard Broker initialization complete ORA-1666 signalled during: alter database open resetlogs...
on the web! • Switchover and connection tests "worked" • TNS errors in the alert log were in a monitoring ignore list • The "fuzzy datafile" messages aren't "ORA-XXXXX" errors • The Broker log wasn't monitored or checked • DBAs weren't using the Broker's built-in lag monitoring • Instead, they monitored lag using a query from a blog post • See: https://oraclesean.com/blog/how-not-to-find-data-guard-gaps Why Robust Tests are Essential for Data Guard
Service Registration (1387859.1) • Oracle 12.2 – Simplified OBSERVER Management for Multiple Fast-Start Failover Configurations (2285891.1) • Health Check Alert: Set FAST_START_MTTR_TARGET greater than or equal to 300 (1095774.1) • Using RMAN Effectively In A Dataguard Environment (848716.1) • https://docs.oracle.com/en/database/oracle/oracle-database/19/dgbkr/oracle-data-guard- broker-properties.html