Delphix

TB034 Delphix Engine May Fail to Restart Following Device Removal

 

 

Alert Type

Availability

Impact

The Delphix Engine includes the Delphix Storage Migration feature that allows for the removal of storage devices if they are no longer needed. Once a device is removed using this feature, the storage configuration of the Delphix Engine may be left in an inconsistent state. This may result in a failure of an affected Delphix Engine to start properly.

The failure to start will extend any Virtual Database (VDB) service outage associated with the restart or crash, and will render the Delphix Engine administrative interfaces inoperative: the Command Line Interface (CLI), the Web API, and the Graphical User Interface (GUI).

Contributing Factors

The problem can only occur in the following Delphix Engine releases:

  • Delphix Engine 4.2.0.0 and 4.2.0.3

  • Delphix Engine 4.2.1.0 and 4.2.1.1

  • Delphix Engine 4.2.2.0 and 4.2.2.1

  • Delphix Engine 4.2.3.0

  • Delphix Engine 4.2.4.0

  • Delphix Engine 4.2.5.0 and 4.2.5.1

  • Delphix Engine 4.3.1.0

  • Delphix Engine 4.3.2.0 and 4.3.2.1

  • Delphix Engine 4.3.3.0

  • Delphix Engine 4.3.4.0 and 4.3.4.1

  • Delphix Engine 4.3.5.0

  • Delphix Engine 5.0.1.0 and 5.0.1.1

  • Delphix Engine 5.0.2.0, 5.0.2.1, and 5.0.2.2

The problem will only occur following a restart or crash of a Delphix Engine appliance from which a storage device has been successfully removed using the Storage Migration feature.

If the conditions specified in this section are satisfied, there is a high probability of experiencing the problem. 
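
The release condition above can be checked mechanically. The following is a minimal Python sketch, not a Delphix tool: the function name `is_affected_release` and the idea of passing the release as a plain string are assumptions made for illustration, and the release list is copied from this bulletin. A match only means the release condition holds; the problem still requires a completed device removal followed by a restart or crash.

```python
# Releases listed under "Contributing Factors" in this bulletin.
AFFECTED_RELEASES = frozenset({
    "4.2.0.0", "4.2.0.3",
    "4.2.1.0", "4.2.1.1",
    "4.2.2.0", "4.2.2.1",
    "4.2.3.0",
    "4.2.4.0",
    "4.2.5.0", "4.2.5.1",
    "4.3.1.0",
    "4.3.2.0", "4.3.2.1",
    "4.3.3.0",
    "4.3.4.0", "4.3.4.1",
    "4.3.5.0",
    "5.0.1.0", "5.0.1.1",
    "5.0.2.0", "5.0.2.1", "5.0.2.2",
})


def is_affected_release(version: str) -> bool:
    """Return True if `version` is one of the releases listed in this bulletin.

    Hypothetical helper: it does an exact string match after trimming
    whitespace and makes no claim about releases the bulletin does not list.
    """
    return version.strip() in AFFECTED_RELEASES


if __name__ == "__main__":
    for release in ("4.3.1.0", "5.0.3.0"):
        status = "listed as affected" if is_affected_release(release) else "not listed"
        print(f"{release}: {status}")
```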

Symptoms

  • The Delphix Engine will begin the normal boot / restart process but will not fully progress to a normal state. When viewing the Delphix Engine guest console (for example, through ESXi), the Delphix Management Service and the Delphix Boot Service will persistently show as offline, as below:

[Screenshot: Delphix Engine guest console showing the Delphix Management Service and Delphix Boot Service offline]

  • The system may still appear reachable on the network (for example, a ping command might get a successful response), but it will otherwise be unresponsive.
  • If VDBs were not shut down prior to the incident, they may hang or crash. On affected Oracle target hosts, messages like the following may be seen on the console or in the system log:

    NFS server <ip address> not responding
  • Attempts to initiate SSH connections to an affected Delphix Engine will fail with "connection refused."
  • Attempts to initiate new browser connections to an affected Delphix Engine will fail with "This site cannot be reached", "refused to connect", or "connection refused" errors.
  • Web API calls will time out or fail.
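
The connection symptoms above can be observed from a remote host with a simple TCP probe. The following Python sketch is illustrative only: the helper names (`probe_port`, `probe_engine`, `matches_bulletin_symptoms`) and the default ports 22, 80, and 443 (the standard SSH, HTTP, and HTTPS ports) are assumptions, not values taken from this bulletin. A result in which no port accepts a connection is consistent with the symptoms listed here but does not by itself confirm this problem.

```python
import socket


def probe_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        # Any other network failure (unreachable host, DNS failure, ...).
        return "error"


def probe_engine(host: str, ports=(22, 80, 443)) -> dict:
    """Probe each port once. The default ports are assumed, not documented here."""
    return {port: probe_port(host, port) for port in ports}


def matches_bulletin_symptoms(results: dict) -> bool:
    """True when at least one port was probed and none accepted a connection."""
    return bool(results) and all(state != "open" for state in results.values())


if __name__ == "__main__":
    # "delphix-engine.example.com" is a placeholder host name.
    results = probe_engine("delphix-engine.example.com")
    print(results, matches_bulletin_symptoms(results))
```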

Relief/Workaround

Avoid using the Delphix Storage Migration feature.

If device removal has been completed, but the problem has not occurred because the Delphix Engine has not been restarted, defer any planned restarts/reboots of the affected Delphix Engine and contact Delphix Support. 

If this problem has already occurred, contact Delphix Support for remediation. Further reboots or restarts of the Delphix Engine will not be effective.

Resolution

The issue is fully resolved in Delphix OS version 5.0.2016.04.29 and later OS releases. This OS version is included with Delphix Engine release 5.0.3.0.

Fresh Installations and Full Upgrades to Delphix Engine 5.0.3.0 will run OS Version 5.0.2016.04.29.

Deferred OS upgrades to Delphix 5.0.3.0 will run a prior version of the Delphix OS and are still susceptible to the problem described in this bulletin. 

See the "Deferred OS Upgrade" section of "Upgrading to a New Version of the Delphix Engine" for information about how to determine the current OS version of a Delphix Engine.

Delphix Engine Release    Included OS Version    Minimum OS Version
5.0.3.0                   5.0.2016.04.29         5.0.2016.01.28
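
Once the current OS version is known, it can be compared against the fixed version named in this section. The sketch below is a hypothetical helper, not part of any Delphix interface. It assumes that OS version strings are dot-separated integers in the form shown in this bulletin, and that a "later" release is one that compares greater component by component. How to obtain the OS version from an engine is covered by the upgrade documentation referenced above and is not shown here.

```python
# Fixed OS version stated in this bulletin.
FIXED_OS_VERSION = "5.0.2016.04.29"


def parse_os_version(version: str) -> tuple:
    """Split a dotted version string into a tuple of integers.

    Assumes every component is numeric, e.g. "5.0.2016.04.29" -> (5, 0, 2016, 4, 29).
    """
    return tuple(int(part) for part in version.strip().split("."))


def os_has_fix(os_version: str) -> bool:
    """True if `os_version` is the fixed OS version or compares later than it."""
    return parse_os_version(os_version) >= parse_os_version(FIXED_OS_VERSION)


if __name__ == "__main__":
    for candidate in ("5.0.2016.01.28", "5.0.2016.04.29"):
        print(candidate, "has fix" if os_has_fix(candidate) else "still susceptible")
```

For example, an engine upgraded to 5.0.3.0 with a deferred OS upgrade that still runs the minimum OS version 5.0.2016.01.28 would be reported as still susceptible.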

Additional Information

Delphix Storage Migration