Delphix

TB114 Delphix Engine May Fail to Reboot Following Upgrade

Alert Type

Availability

Impact

An affected Delphix Engine (Continuous Data Engine or Continuous Compliance Engine) may fail to reboot following an upgrade. The upgrade process may corrupt the boot loader, and recovery is an involved manual process that requires deploying an additional Delphix Engine. This may result in a protracted outage of the affected system.

Contributing Factors

This article applies to the following versions of the Delphix Engine:

Date          Release
Dec 20, 2023  18.0.0.0
Nov 21, 2023  17.0.0.0

The issue can only occur as the result of a Delphix Engine upgrade to one of the affected versions listed above.

Delphix Engines deployed on the Microsoft Azure platform are thought to be more susceptible than other platforms. 

The issue can only occur after performing an Apply Now type of upgrade in which a Delphix Engine reboot occurs immediately following the application of the upgrade. The issue will not occur for a Delay the Reboot type of upgrade. 

Once a Delphix Engine has successfully rebooted following an upgrade, there is no chance that the issue will occur on a subsequent reboot. A Delphix Engine that has already been upgraded to an affected release is therefore not at risk.
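The contributing factors above can be sketched as a simple pre-upgrade check. This is a minimal illustration only, not a Delphix tool: the function name `upgrade_at_risk` and the `reboot_immediately` flag (True for an Apply Now upgrade, False for Delay the Reboot) are hypothetical names chosen for this example.

```python
# Releases named in this bulletin as affected upgrade targets.
AFFECTED_RELEASES = {"17.0.0.0", "18.0.0.0"}


def upgrade_at_risk(target_version: str, reboot_immediately: bool) -> bool:
    """Return True when a planned upgrade matches the contributing factors.

    target_version     -- release the engine is being upgraded to
    reboot_immediately -- hypothetical flag: True for an Apply Now upgrade,
                          False for a Delay the Reboot upgrade
    """
    targets_affected_release = target_version.strip() in AFFECTED_RELEASES
    return targets_affected_release and reboot_immediately


if __name__ == "__main__":
    # Apply Now upgrade to an affected release: exposed to the issue.
    print(upgrade_at_risk("18.0.0.0", reboot_immediately=True))
    # Delay the Reboot upgrade to the same release: not exposed.
    print(upgrade_at_risk("18.0.0.0", reboot_immediately=False))
```

The check says nothing about platform; per this bulletin, Azure deployments are only thought to be more susceptible, so platform is not treated as a condition here.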

Symptoms

During a reboot following an upgrade, an affected Delphix Engine may fail to start.  On the system console, the following message may be seen:

error: file 'bufio.mod' not found
Entering rescue mode_
grub rescue>
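If console output is captured as text (for example, from a hypervisor or cloud serial console), the symptom above could be recognized with a sketch like the following. The function name `shows_boot_loader_symptom` is hypothetical, and the match strings are taken only from the message quoted in this bulletin; exact console wording may vary.

```python
def shows_boot_loader_symptom(console_text: str) -> bool:
    """Return True if captured console text contains the failure signature
    quoted in this bulletin: the missing bufio.mod error and the GRUB
    rescue prompt."""
    lowered = console_text.lower()
    missing_module = "bufio.mod' not found" in lowered
    rescue_prompt = "grub rescue>" in lowered
    return missing_module and rescue_prompt


if __name__ == "__main__":
    sample = (
        "error: file 'bufio.mod' not found\n"
        "Entering rescue mode_\n"
        "grub rescue>\n"
    )
    print(shows_boot_loader_symptom(sample))
```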

Relief/Workaround

If you have already downloaded an upgrade image for an affected release but have not yet upgraded, you can avoid the issue by selecting a Delay the Reboot type of upgrade. Otherwise, defer upgrading until 18.0.0.1 or a later release.

Resolution

Resolved in DevOps Data Platform 18.0.0.1 and later releases for both the Continuous Data Engine and the Continuous Compliance Engine.
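Whether a given release includes the fix can be checked by comparing version numbers. The sketch below assumes release strings are four dot-separated integers, as in the versions listed in this bulletin; the helper names `parse_version` and `includes_fix` are hypothetical.

```python
# First release stated in this bulletin to contain the fix.
FIRST_FIXED_RELEASE = (18, 0, 0, 1)


def parse_version(version: str) -> tuple:
    """Convert a dotted release string such as '18.0.0.1' to a tuple of ints.

    Assumes every component is a plain integer.
    """
    return tuple(int(part) for part in version.strip().split("."))


def includes_fix(version: str) -> bool:
    """Return True if the release is 18.0.0.1 or later."""
    return parse_version(version) >= FIRST_FIXED_RELEASE


if __name__ == "__main__":
    for release in ("17.0.0.0", "18.0.0.0", "18.0.0.1"):
        print(release, includes_fix(release))
```

A False result does not by itself mean a release is affected: releases earlier than 17.0.0.0 predate the fix but are not listed as affected in this bulletin.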

Additional Information

The issue occurs rarely. Even on the systems thought to be most susceptible, that is, Delphix Engines deployed on Azure, the issue appears to occur only about once in ten upgrade events.

Although the issue is rare, its impact is severe enough that Delphix has removed the upgrade images for the affected releases from the download.delphix.com site.