
TB062 Delphix Engine May Crash After Device Removal

 

 

This article applies to the following versions of the Delphix Engine:

Major Release    All Sub Releases
5.3              5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.3.0, 5.3.3.1, 5.3.4.0

Alert Type

 

Availability

Impact

In rare instances, following one or more device removal operations, a Delphix Engine may crash and enter a persistent crash loop preventing the affected system from restarting.

Any Delphix Virtualization jobs running at the time of the crash will be abnormally terminated. This includes, but is not limited to, Refresh, SnapSync, and Replication jobs. Virtual Databases (VDBs) running at the time of a Delphix Engine crash will hang, and may crash.

The issue may result in a protracted and continuous outage requiring Delphix Support intervention to recover.

Contributing Factors

The issue can only occur when using one of the following Delphix Engine releases:

Major Release    All Sub Releases
5.3              5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.3.0, 5.3.3.1, 5.3.4.0

This issue does not affect Delphix Engines being used for Masking. The issue can only occur during or after the Delphix Storage Migration feature has been used to remove one or more storage pool devices. The device removal need not have been performed on a 5.3 release; it may have occurred at any time in the past.

The issue will only occur at the time a VDB is being deleted, either through an explicit delete operation, a VDB refresh, or Self-Service operations that perform a delete or refresh.
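The contributing factors above can be combined into a simple screening check. The following is a minimal sketch only: the function name and its three parameters are hypothetical inputs that an operator would gather by hand, and nothing here queries a Delphix Engine or replaces screening by Delphix Support.

```python
# Affected releases, copied from the table in this bulletin.
AFFECTED_RELEASES = frozenset({
    "5.3.0.0", "5.3.0.1", "5.3.0.2", "5.3.0.3",
    "5.3.1.0", "5.3.1.1", "5.3.1.2",
    "5.3.3.0", "5.3.3.1", "5.3.4.0",
})


def is_susceptible(version: str, masking_only: bool, device_removed: bool) -> bool:
    """Return True when every contributing factor listed above is present.

    version        -- engine release string, e.g. "5.3.1.0" (hypothetical input)
    masking_only   -- True if the engine is used only for Masking
    device_removed -- True if Storage Migration has ever removed a device
    """
    if masking_only:
        # Engines used for Masking are not affected.
        return False
    if not device_removed:
        # The issue requires at least one past device removal.
        return False
    return version.strip() in AFFECTED_RELEASES
```

An engine that passes this check is only at risk when a VDB is deleted or refreshed, as described above.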

Symptoms

You may see the following error when navigating with a browser to the Delphix Engine, or when an existing Delphix Admin or Server Setup session is disrupted by the issue:

Delphix Engine Communication Error

When an Engine crash occurs, Oracle target hosts may report messages in their system log, console, or tty output such as:

NFS server <ip address> not responding

where <ip address> is the network address of the affected Delphix Engine. Attempts to access files under the mount points of Delphix-hosted remote filesystems may hang.
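To check a target host's log for this symptom, the message can be matched with a regular expression. This is a rough sketch under stated assumptions: the pattern is based on the message form quoted above plus an optional colon after "nfs", since the exact wording differs between operating systems, and the function name is hypothetical.

```python
import re

# Assumed pattern: the exact message text varies by operating system,
# so this matches "NFS server <addr> not responding" case-insensitively,
# with an optional colon after "nfs".
NFS_TIMEOUT = re.compile(r"nfs:?\s+server\s+(\S+)\s+not responding", re.IGNORECASE)


def unresponsive_nfs_servers(log_lines):
    """Return the set of server addresses reported as not responding."""
    servers = set()
    for line in log_lines:
        match = NFS_TIMEOUT.search(line)
        if match:
            servers.add(match.group(1))
    return servers
```

If the returned set contains the address of a Delphix Engine, that is consistent with the symptom described above, but it does not by itself confirm this issue; any NFS server outage produces the same message.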

On hypervisor platforms (e.g. VMware’s ESXi) where virtual console access is available, the console will show recurring and continuous crashes and reboots.

Relief/Workaround

Defer use of the Delphix Storage Migration feature. 

If the feature has already been used on one or more of the susceptible Delphix versions, open a support case with Delphix Support for additional screening. Delphix Support can use a support bundle to confirm whether the Delphix Storage Migration feature has been used and to provide a workaround.

Once the issue has occurred, the only remedy is to open a support case with Delphix Support to help recover the system.

Resolution

The issue will be resolved in a future Delphix Engine release.

Additional Information

The issue is related to a product defect that may cause a filesystem freelist to become corrupted. There is no risk of customer data being lost or corrupted as a result of the issue.
