Availability, Data Corruption
In rare instances, following one or more device removal operations, a Delphix Engine may crash and enter a persistent crash loop preventing the affected system from restarting.
Any Delphix Virtualization jobs running at the time of the crash will be abnormally terminated. This includes, but is not limited to, Refresh, SnapSync, and Replication jobs. Virtual Databases (VDBs) running at the time of a Delphix Engine crash will hang, and may crash.
The issue may result in a protracted and continuous outage requiring Delphix Support intervention. Although unlikely, unrecoverable data corruption or data loss is possible.
The issue can only occur when using one of the following Delphix Engine releases:
All Sub Releases
18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52, 184.108.40.206, 220.127.116.11, 18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52, 184.108.40.206
This issue does not affect Delphix Engines being used only for Masking. The issue can only occur during or after the Delphix Storage Migration feature has been used to remove one or more storage pool devices. The removal of a storage device need not have been performed with a 5.3 version but may have occurred at any time in the past.
The issue will only occur at the time a VDB is being deleted, either through an explicit delete operation, a VDB refresh, or Self-Service operations that perform a delete or refresh.
You may see the following error when navigating with a browser to the Delphix Engine, or when an open Delphix Admin or Server Setup application session is disrupted by the issue:
Delphix Engine Communication Error
When an Engine crash occurs, Oracle target hosts may report messages in their system log, console, or tty output such as:
NFS server <ip address> not responding
where <ip address> is the network address of the affected Delphix server. Attempts to access files under the mount point for Delphix-hosted remote filesystems may hang.
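As a rough aid for spotting this symptom on a target host, the sketch below scans log lines for the "not responding" message and keeps only those that name a given Delphix Engine address. It is an illustration, not a Delphix tool: the function name `find_nfs_timeouts`, the sample addresses, and the exact message wording accepted by the pattern (which varies by operating system) are all assumptions.

```python
import re

# Assumed message shapes: "NFS server <addr> not responding" and the
# "nfs: server <addr> not responding" variant. Adjust for your platform.
NFS_TIMEOUT = re.compile(r"nfs:? server (\S+?) not responding", re.IGNORECASE)


def find_nfs_timeouts(lines, engine_address):
    """Return the log lines reporting that engine_address is not responding."""
    hits = []
    for line in lines:
        match = NFS_TIMEOUT.search(line)
        # Keep only messages that name the Delphix Engine in question.
        if match and match.group(1) == engine_address:
            hits.append(line.rstrip("\n"))
    return hits


# Hypothetical sample data; real input would come from the host's system log.
sample_log = [
    "Jan 10 12:00:01 dbhost kernel: nfs: server 10.0.0.5 not responding, still trying",
    "Jan 10 12:00:02 dbhost kernel: NFS server 10.0.0.9 not responding",
    "Jan 10 12:00:03 dbhost sshd[1]: Accepted publickey for oracle",
]

for entry in find_nfs_timeouts(sample_log, "10.0.0.5"):
    print(entry)
```

A match only indicates that the host lost contact with the NFS server. It does not by itself confirm that the Engine has hit this issue, so it is best treated as a prompt to check the Engine console.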
On hypervisor platforms (e.g. VMware’s ESXi) where virtual console access is available, the console will show recurring and continuous crashes and reboots.
Defer use of the Delphix Storage Migration feature.
If the feature has already been used on one or more of the susceptible Delphix versions, open a support case with Delphix Support for additional screening. Delphix Support can use a support bundle to confirm whether the Delphix Storage Migration feature has been used and to provide a workaround.
Once the issue has occurred, the only remedy is to open a support case with Delphix Support to help recover the system.
This issue is fully resolved in Delphix 220.127.116.11 and later releases.
The issue is caused by a product defect that may corrupt a filesystem freelist. Depending on the extent of the corruption, there is a risk of customer data being lost or corrupted.