
Oracle virtual database (VDB) provisioning or refresh fails reporting "Failed to recreate control file"

Issue

Virtual database (VDB) provisioning and VDB refreshes fail with the following errors:

Failed to recreate control file.
Review the Oracle alert log for more details.

and

SQL*Plus: Release 11.2.0.4.0 Production on Thu Nov 17 15:04:32 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

SQL> ORACLE instance shut down.
SQL> ORACLE instance started.

Total System Global Area 7.4826E+10 bytes
Fixed Size                  2261048 bytes
Variable Size            6442454984 bytes
Database Buffers         6.8183E+10 bytes
Redo Buffers              199049216 bytes
SQL>

ERROR at line 12:
ORA-01967: invalid option for CREATE CONTROLFILE

Troubleshooting

The VDB's alert log shows that the recovery Delphix performs during a provision or refresh operation did not complete successfully.

During the recovery phase of the provision, the instance terminates abnormally:

Thu Nov 17 15:04:06 2016
Media Recovery Log /var/opt/delphix/delphix_mount/vplb/source-archive/arch_1_480925_799509181.log
Media Recovery Log /var/opt/delphix/delphix_mount/vplb/source-archive/arch_2_483606_799509181.log
Media Recovery Log /var/opt/delphix/delphix_mount/vplb/source-archive/arch_2_483607_799509181.log
Thu Nov 17 15:04:27 2016
PMON (ospid: 7159): terminating the instance due to error 471
Thu Nov 17 15:04:27 2016
System state dump requested by (instance=1, osid=7159 (PMON)), summary=[abnormal instance termination].
System State dumped to trace file /prod/sys/oracle/software/base1120406/diag/rdbms/vplb/vplb/trace/vplb_diag_7169_20161117150427.trc
Dumping diagnostic data in directory=[cdmp_20161117150427], requested by (instance=1, osid=7159 (PMON)), summary=[abnormal instance termination].
Instance terminated by PMON, pid = 7159

When recovery of the VDB completes successfully, messages similar to the following appear in the alert log.

The exact messages depend on the provision type and the Oracle release.

Recovery completed through change 9645137 time 11/15/2016 22:29:27
Media Recovery Complete (vplb)
Completed: alter database recover if needed
 start until change 9645155
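
The comparison between the two alert log excerpts above can be automated. The following is a minimal sketch, not a Delphix utility: the function name `recovery_status` is hypothetical, and the two marker strings are taken from the excerpts in this article, so they may need adjusting for other provision types or Oracle releases.

```python
# Marker strings copied from the alert log excerpts in this article.
# Assumption: other Oracle releases may word these messages differently.
COMPLETED_MARKER = "Media Recovery Complete"
TERMINATED_MARKER = "terminating the instance due to error"


def recovery_status(alert_lines):
    """Return the most recent recovery event found in the alert log lines.

    Result is "completed", "terminated", or "unknown" when neither
    marker appears.
    """
    status = "unknown"
    for line in alert_lines:
        if COMPLETED_MARKER in line:
            status = "completed"
        elif TERMINATED_MARKER in line:
            status = "terminated"
    return status


failed_excerpt = [
    "Media Recovery Log /var/opt/delphix/delphix_mount/vplb/source-archive/arch_2_483607_799509181.log",
    "Thu Nov 17 15:04:27 2016",
    "PMON (ospid: 7159): terminating the instance due to error 471",
]
good_excerpt = [
    "Recovery completed through change 9645137 time 11/15/2016 22:29:27",
    "Media Recovery Complete (vplb)",
]

print(recovery_status(failed_excerpt))
print(recovery_status(good_excerpt))
```

To check a real alert log, pass an open file object for the alert log in place of the sample lists.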

The provision or refresh attempt is failing because the instance terminates during recovery, so the next step is to find out what is causing the termination.

PMON terminating the instance typically means that a critical background process has been killed or has died. Error 471 corresponds to ORA-00471 (DBWR process terminated with error).

The Linux OS messages log (/var/log/messages) shows that Linux is terminating Oracle processes through its Out Of Memory (OOM) killer:

Nov 17 15:04:27 db20p03dx kernel: [10202] 32989 10202    59880      431   1       0             0 oracle
Nov 17 15:04:27 db20p03dx kernel: [10204]     0 10204    56208    11289   3       0             0 TaniumClient
Nov 17 15:04:27 db20p03dx kernel: Out of memory: Kill process 7179 (oracle) score 78 or sacrifice child
Nov 17 15:04:27 db20p03dx kernel: Killed process 7179, UID 32989, (oracle) total-vm:73763960kB, anon-rss:101920kB, file-rss:13149952kB
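
A quick way to look for this signature is to scan the messages log for "Killed process" lines that name oracle. The sketch below assumes the kernel message format shown above, which varies between kernel versions; the function name `find_oom_kills` is hypothetical.

```python
import re

# Matches the kernel's "Killed process <pid>... (<name>)" message.
# Assumption: the format shown in this article; other kernels may differ.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+),? (?:UID \d+,? )?\((?P<name>[^)]+)\)"
)


def find_oom_kills(lines, name="oracle"):
    """Return the PIDs of processes called `name` that the OOM killer killed."""
    pids = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match and match.group("name") == name:
            pids.append(int(match.group("pid")))
    return pids


sample = [
    "Nov 17 15:04:27 db20p03dx kernel: Out of memory: Kill process 7179 (oracle) score 78 or sacrifice child",
    "Nov 17 15:04:27 db20p03dx kernel: Killed process 7179, UID 32989, (oracle) total-vm:73763960kB, anon-rss:101920kB, file-rss:13149952kB",
]

print(find_oom_kills(sample))
```

On a target host, the same function can be given an open file object for /var/log/messages, which usually requires root privileges to read.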

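The Linux kernel invokes the OOM killer when it can no longer satisfy memory demands. Whether that situation can arise depends in part on the value of sysctl vm.overcommit_memory.

When overcommit is permitted (vm.overcommit_memory is set to 0 or 1), programs are allowed to allocate more memory than is actually available within the OS. With 0, the default, the kernel applies a heuristic and refuses only obviously excessive allocations; with 1, it always allows the allocation.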
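
The current setting can be read from /proc/sys/vm/overcommit_memory on the target host. The following is a small illustrative sketch; the helper names are hypothetical, and the mode descriptions follow the Linux kernel documentation for this sysctl.

```python
from pathlib import Path

# Meaning of each vm.overcommit_memory value per the kernel documentation.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the commit limit",
}


def describe_overcommit(value):
    """Translate a vm.overcommit_memory value into a description."""
    return OVERCOMMIT_MODES.get(int(value), "unknown value")


def read_overcommit(path="/proc/sys/vm/overcommit_memory"):
    """Read the current setting from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


print(describe_overcommit(0))
print(describe_overcommit(2))
```

On a Linux host, `describe_overcommit(read_overcommit())` reports the mode currently in effect.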

When memory on the target node is exhausted to the point where the shortage threatens the stability of the system, the Out Of Memory (OOM) killer takes action. Its role is to kill processes until enough memory has been freed for the OS to operate normally.

Because Oracle System Global Areas (SGAs) and Oracle processes are typically large consumers of memory, they are likely targets for the OOM killer.

Resolution

Allocate sufficient memory to the target node to ensure memory exhaustion does not occur during the provision process. The memory available to a target node must be greater than the sum of the SGAs of all Oracle VDBs that are expected to run concurrently on the host.
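
The sizing rule above can be checked with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the SGA size is the "Total System Global Area" value from the SQL*Plus output in this article, and the MemTotal figure is an invented example rather than a value from the affected host.

```python
def parse_mem_total_bytes(meminfo_text):
    """Extract MemTotal from /proc/meminfo content and convert kB to bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal not found")


def sga_headroom_bytes(sga_sizes_bytes, mem_total_bytes):
    """Return host memory minus the combined SGA size of concurrent VDBs.

    A negative result means the SGAs alone exceed physical memory.
    """
    return mem_total_bytes - sum(sga_sizes_bytes)


# Hypothetical host with roughly 125 GiB of RAM (assumed value).
example_meminfo = "MemTotal:       131072000 kB\nMemFree:         1024000 kB\n"
mem_total = parse_mem_total_bytes(example_meminfo)

# 7.4826E+10 bytes is the SGA size reported in the SQL*Plus output above.
one_vdb = [74_826_000_000]
two_vdbs = [74_826_000_000, 74_826_000_000]

print(sga_headroom_bytes(one_vdb, mem_total))
print(sga_headroom_bytes(two_vdbs, mem_total))
```

A positive result is necessary but not sufficient, since Oracle process memory and the OS itself also need room, so the headroom should be comfortably above zero.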

External Links

Linux Out of Memory Killer