Skip to main content
Delphix

Delphix Snapsync of an Oracle RAC dSource fails reporting ORA-01138 (KBA1778)

 

 

KBA

KBA#1778

Applicable Delphix Versions

This article applies to the following versions of the Delphix Engine:

Major Release

All Sub Releases

5.2

5.2.2.0. 5.2.2.1, 5.2.3.0 

5.1

5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0

5.0

5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1 ,5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4

Issue 

During Delphix Snapsync operations against Oracle RAC dSources, RMAN is used to capture an SCN based incremental backup or a full backup at the first snapsync. In RAC environments RMAN requires all instances in the cluster be in the same startup state.

Oracle database can be in one of 4 states:

  • Stopped: the instance is completely down.
  • No Mount: the instance is started, processes are seen running and the shared memory segment is created.
  • Mounted: the instance is started, the controlfile has been opened and the datafile locations identified.
  • Open: The database is completely open and available to end users.

In order for RMAN to attach to an Oracle database and perform the backup associated with the dSource snapsync the database must be mounted or opened.  In a RAC database all instances must in the same state, all must be mounted OR all must be open.

  • For a non-standby dSource the database  and all its instances will be in the open state.
  • For standby database dSources running Active Data Guard all RAC instances would open read only with one instance performing managed recovery
  • For regular standby database dSources (no active Data Guard) all RAC instances would be sitting in the mounted state.

Where RAC is running and the instances are not in the same startup phase or state and an RMAN backup is attempted RMAN will report the error ORA-01138.

Oracle describes this error as the following:

$ oerr ora 1138
01138, 00000, "database must either be open in this instance or not at all"
// *Cause:  The requested operation can not be done when the database is
//          mounted but not open in this instance, and another instance has
//          the database open.
// *Action: Execute the operation in an open instance, open the database in
//          this instance, or close the database in the other instances.

The following SQL can be used to determine the state (STATUS) of each instance in the RAC database.  The error appears when one instance's status does not match the others in the cluster.  

SQL> select instance_number,instance_number,instance_name,status,database_status from gv$instance;

INSTANCE_NUMBER INSTANCE_NUMBER INSTANCE_NAME     STATUS       DATABASE_STATUS
--------------- --------------- ---------------- ------------ -----------------
          1           1         orcl121          MOUNTED       ACTIVE
          2           2         orcl122          OPEN          ACTIVE

Troubleshooting 

Reproducing the behaviour best demonstrates the problem, diagnostics for troubleshooting it and how the issue can be overcome.

A bit of background information

  • A 2 node RAC dSource is linked to the Delphix Engine.
  • The first Snapsync/snapshot is captured.
  • A shutdown of instance 1 was performed and the instance is restarted to mount (either deliberately or by mistake).
  • The database user in place within Delphix is the SYS user (as we need to be able to connect to mounted instances and only SYSDBA users can do this)

A Delphix snapsyc operation is initiated against the dSource database and fails reporting the following error:

clipboard_e5fa02eb2a30171ce758b7e128e33169c.png

Executing the following SQL against one of the RAC databases instances shows the problem where the status does no match.

SQL> select instance_number,instance_number,instance_name,status,database_status from gv$instance;

INSTANCE_NUMBER INSTANCE_NUMBER INSTANCE_NAME     STATUS       DATABASE_STATUS
--------------- --------------- ---------------- ------------ -----------------
          1           1          orcl121          MOUNTED       ACTIVE
          2           2          orcl122          OPEN          ACTIVE

Resolution 

To overcome the failure in snapsync one of the database instances must be restarted to a state (STATUS) that matches the other.  In the example provided instance 2 is shutdown and restarted to mount (MOUNTED) to match instance 1.  The direction taken to achieve the same state is dependent on the requirements of the site.  It may be that this is a primary database where the intended state is OPEN or it could be a standby site where the intended state is MOUNTED.

In this case srvctl is used to stop instance 2 (orcl122) and restart it to mount.

$ srvctl stop instance -i orcl122 -d orcl12 -o immediate
$ srvctl start instance -i orcl122 -d orcl12 -o mount

To ensure the next snapsync attempt is successful checking the state of each instance is performed.

$ sqlplus / as sysdba

SQL*Plus: Release 12.1.0.2.0 Production on Wed Sep 19 14:22:18 2018

Copyright (c) 1982, 2014, Oracle.  All rights reserved.

Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options

SQL> select instance_number,instance_number,instance_name,status,database_status from gv$instance;

INSTANCE_NUMBER INSTANCE_NUMBER INSTANCE_NAME     STATUS       DATABASE_STATUS
--------------- --------------- ---------------- ------------ -----------------
          1           1         orcl121           MOUNTED      ACTIVE
          2           2         orcl122           MOUNTED      ACTIVE

Now that the STATUS of both instances is aligned, in this case to MOUNTED, the dSource snapsync can be attempted once more and should succeed without reporting ORA-01138.

clipboard_ebb3e8e21ab257943253c520473dc39ca.png