Delphix

Delphix Engine Offline After Reboot When Disks Are Improperly Configured in VMware (KBA4680)

 


KBA# 4680

 

Issue

In some VMware configurations, a Delphix Engine may not come online successfully after a reboot. The problem described in this article generally only occurs after the first Engine reboot post-deployment, or the first reboot after a new disk is added to the Engine.

The datastore may also run out of free space, which causes VMware to log events and may hang the Delphix VM at the same time.

Prerequisites

The details in this article apply only to VMware deployments, though other hypervisor platforms raise similar configuration concerns if non-persistent storage is used.

Applicable Delphix Versions

Major Release  Sub Releases
5.3            5.3.0.0, 5.3.0.1, 5.3.0.2, 5.3.0.3, 5.3.1.0, 5.3.1.1, 5.3.1.2, 5.3.2.0, 5.3.3.0, 5.3.4.0, 5.3.5.0, 5.3.6.0
5.2            5.2.2.0, 5.2.2.1, 5.2.3.0, 5.2.4.0, 5.2.5.0, 5.2.5.1, 5.2.6.0, 5.2.6.1
5.1            5.1.0.0, 5.1.1.0, 5.1.2.0, 5.1.3.0, 5.1.4.0, 5.1.5.0, 5.1.5.1, 5.1.6.0, 5.1.7.0, 5.1.8.0, 5.1.8.1, 5.1.9.0, 5.1.10.0
5.0            5.0.1.0, 5.0.1.1, 5.0.2.0, 5.0.2.1, 5.0.2.2, 5.0.2.3, 5.0.3.0, 5.0.3.1, 5.0.4.0, 5.0.4.1, 5.0.5.0, 5.0.5.1, 5.0.5.2, 5.0.5.3, 5.0.5.4
4.3            4.3.1.0, 4.3.2.0, 4.3.2.1, 4.3.3.0, 4.3.4.0, 4.3.4.1, 4.3.5.0
4.2            4.2.0.0, 4.2.0.3, 4.2.1.0, 4.2.1.1, 4.2.2.0, 4.2.2.1, 4.2.3.0, 4.2.4.0, 4.2.5.0, 4.2.5.1
4.1            4.1.0.0, 4.1.2.0, 4.1.3.0, 4.1.3.1, 4.1.3.2, 4.1.4.0, 4.1.5.0, 4.1.6.0
4.0            4.0.0.0, 4.0.0.1, 4.0.1.0, 4.0.2.0, 4.0.3.0, 4.0.4.0, 4.0.5.0, 4.0.6.0, 4.0.6.1
3.2            3.2.0.0, 3.2.1.0, 3.2.2.0, 3.2.2.1, 3.2.3.0, 3.2.4.0, 3.2.4.1, 3.2.4.2, 3.2.5.0, 3.2.5.1, 3.2.6.0, 3.2.7.0, 3.2.7.1
3.1            3.1.0.1, 3.1.1.0, 3.1.2.0, 3.1.2.1, 3.1.3.0, 3.1.3.1, 3.1.3.2, 3.1.4.0, 3.1.5.0, 3.1.6.0
3.0            3.0.0.3, 3.0.0.4, 3.0.1.0, 3.0.1.1, 3.0.1.2, 3.0.1.3, 3.0.2.0, 3.0.2.1, 3.0.3.0, 3.0.3.1, 3.0.4.0, 3.0.4.1, 3.0.5.0, 3.0.6.0, 3.0.6.1

Troubleshooting

The Delphix VM console appears red, indicating that one or more critical services cannot start: PostgreSQL is shown in maintenance, and the Delphix Management Service is shown offline.

Additional reboot attempts will result in the same status.

A review of the VMware disk configuration shows that one or more disk devices allocated to the Delphix VM are configured as "Independent - Nonpersistent". The Disk Mode configuration options are highlighted below in the legacy vSphere desktop (C#) client and in the web app.

[Screenshots: Disk Mode options in the legacy vSphere desktop client and in the web app]
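The disk mode can also be checked outside the client UI. The following is a minimal sketch, not a Delphix or VMware tool. It assumes the VM's .vmx file records each disk's mode on a line of the form `scsi0:1.mode = "independent-nonpersistent"`, and that a disk with no such line uses the default (dependent) mode. The sample text and the function name are hypothetical.

```python
import re

# Assumed .vmx syntax: <controller><bus>:<unit>.mode = "<disk mode>"
MODE_LINE = re.compile(
    r'^\s*((?:scsi|sata|ide|nvme)\d+:\d+)\.mode\s*=\s*"([^"]*)"',
    re.IGNORECASE,
)


def nonpersistent_disks(vmx_text):
    """Return {device: mode} for disks whose mode ends in 'nonpersistent'."""
    flagged = {}
    for line in vmx_text.splitlines():
        match = MODE_LINE.match(line)
        if match and match.group(2).lower().endswith("nonpersistent"):
            flagged[match.group(1)] = match.group(2)
    return flagged


# Hypothetical excerpt of a .vmx file, for illustration only.
SAMPLE_VMX = """
scsi0:0.fileName = "engine.vmdk"
scsi0:1.fileName = "engine_1.vmdk"
scsi0:1.mode = "independent-nonpersistent"
scsi0:2.fileName = "engine_2.vmdk"
scsi0:2.mode = "independent-persistent"
"""

print(nonpersistent_disks(SAMPLE_VMX))
```

Any device reported by this check would lose its changes at the next power-off or reset, so it should be reviewed before the Engine is rebooted.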

As described by VMware: "Changes to disks in nonpersistent mode are discarded when you turn off or reset the virtual machine. With nonpersistent mode, you can restart the virtual machine with a virtual disk in the same state every time. Changes to the disk are written to and read from a redo log file that is deleted when you turn off or reset the virtual machine."

As a consequence, redo log files can also be observed in the datastore:

[Screenshot: redo log files in the datastore]
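A directory listing of the VM's datastore folder can be scanned for such files. This sketch assumes that redo logs can be recognized by a `REDO` marker in the file name. That naming pattern is an assumption and is not confirmed by this article, so the marker is left as a parameter. The file names in the listing are hypothetical.

```python
def find_redo_logs(file_names, marker="REDO"):
    """Return the file names containing the marker (case-insensitive)."""
    marker = marker.lower()
    return [name for name in file_names if marker in name.lower()]


# Hypothetical datastore folder listing, for illustration only.
listing = [
    "engine.vmx",
    "engine.vmdk",
    "engine_1.vmdk",
    "engine_1.vmdk.REDO_example",
]

print(find_redo_logs(listing))
```

A non-empty result for a Delphix VM suggests that at least one disk is running in a nonpersistent mode and that the datastore is holding changes that will be discarded.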

Resolution

To resolve or avoid this issue, every disk allocated to the Delphix Engine must be set to a persistent mode: either Dependent or Independent Persistent. The choice between the two is up to the VMware administrator, based on whether snapshot capabilities will be needed in the future.
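The rule above can be expressed as a check over a VM's device list. This is a sketch under stated assumptions: it expects objects shaped like the vSphere API's virtual devices, where a disk exposes `backing.diskMode` and `deviceInfo.label`, and it treats the API values `persistent` (Dependent) and `independent_persistent` as acceptable. The stand-in objects below are hypothetical. In a real script the list would come from an API client such as pyVmomi, which is not used here.

```python
from types import SimpleNamespace

# vSphere API disk modes that keep writes across power cycles.
PERSISTENT_MODES = {"persistent", "independent_persistent"}


def disks_needing_change(devices):
    """Return (label, mode) for each disk that is not in a persistent mode."""
    flagged = []
    for dev in devices:
        backing = getattr(dev, "backing", None)
        mode = getattr(backing, "diskMode", None)
        if mode is None:
            continue  # not a disk, or no disk mode reported
        if mode not in PERSISTENT_MODES:
            flagged.append((dev.deviceInfo.label, mode))
    return flagged


def make_disk(label, mode):
    """Build a hypothetical stand-in for a virtual disk device."""
    return SimpleNamespace(
        deviceInfo=SimpleNamespace(label=label),
        backing=SimpleNamespace(diskMode=mode),
    )


devices = [
    make_disk("Hard disk 1", "persistent"),
    make_disk("Hard disk 2", "independent_nonpersistent"),
    make_disk("Hard disk 3", "independent_persistent"),
]

print(disks_needing_change(devices))
```

Running a check like this before the first reboot after deployment, and again after each new disk is added, covers the two points at which the problem typically appears.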

Once this behavior has occurred, there is no recovery option for the Delphix Engine unless VMDK snapshots or other backups of the disk devices exist. The platform does not offer data redundancy, so the loss of one or more data devices is catastrophic once the disk has been added to the Engine as a DATA device in the sysadmin (System Setup) interface. In that case, the Engine must be redeployed as a new installation.

If you believe this issue has been encountered, please contact Delphix Customer Support for confirmation before destroying and redeploying the Engine.