TB028 Delphix Virtualization Engine Admin Interfaces May Become Unresponsive
The internal application on the Delphix Engine that supports the browser interface (GUI), command line interface (CLI), and Web APIs may become very slow. In pathological cases, administrative interfaces may hang completely. Virtual Databases (VDBs) will continue to run, but it may become impossible to administer the Delphix Engine. Scheduled jobs, including SnapSync jobs that generate new snapshots from dSources, may be delayed, may not run, or may not complete. Other administrative functions of the Delphix Engine may be similarly affected.
The issue may occur in the following Delphix Engine Releases:
Delphix Engine 220.127.116.11
Delphix Engine 18.104.22.168 and 22.214.171.124
Both the prevalence and the severity of the issue are related to the number of TCP network connections to and from the Delphix Engine. Delphix Engines with a large number of environments (hosts) and/or a large number of dSources and VDBs are more likely to be impacted, and to be impacted more severely. The probability of experiencing the issue increases over time.
Customers may experience some or all of the following:
- Excessive or unexpected CPU utilization by the Delphix Engine
- The CLI prompt will take a very long time to appear after connecting to the Delphix Engine.
- Using tab completion in the CLI will appear to hang the session.
- Commands issued in CLI may not complete and appear to hang.
- When entering the URL for the Delphix GUI in a web browser a blank or empty screen appears while the browser spins. Eventually the dialogue box for user and password may appear.
- Logging in after entering the username and password will cause the browser to spin and entering the GUI may take a long time.
- Navigating within the GUI will be very slow and sluggish.
The workaround below prevents the problem from occurring and improves operation even if symptoms have already appeared.
If the CLI is already non-responsive, contact Delphix Support for an alternate workaround that can be implemented with Delphix Support assistance.
The workaround for this issue consists of two parts, the second of which is optional but recommended. There are two roles defined on the Delphix Engine. The delphix_admin role is used for managing dSources and VDBs, and the sysadmin role is used for managing the system administration console.
Disable the collection of TCP statistics in the Delphix Engine.
Disabling TCP statistics is persistent, meaning once disabled statistics will continue to be disabled even after a Delphix Engine restart. Disabling statistics does not have any negative impact on the use of the product.
- ssh into the Delphix Engine as a user with the delphix_admin role. (Windows users may use an ssh utility such as PuTTY.)
Run the following command to pause the collection of statistics:
analytics select 'default.tcp' pause ; commit
Confirm that statistics are paused by running the following command:
analytics select 'default.tcp' get state
Output similar to the following should be seen:
delphix> analytics select 'default.tcp' get state
    PAUSED
delphix>
where "delphix" is the name of the Delphix Engine being used.
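The confirmation step above can be checked mechanically when the CLI session is captured to text. The following is a minimal sketch, not a Delphix-provided tool: the function name `tcp_stats_state` is hypothetical, and it assumes the transcript has the shape of the sample output above, with prompt lines containing `>` and the state printed on a line of its own.

```python
from typing import Optional


def tcp_stats_state(transcript: str) -> Optional[str]:
    """Return the state line from a captured `get state` session, or None.

    Assumption: prompt and echoed-command lines contain '>' (for example
    "delphix> ..."), and the state value appears alone on its own line.
    """
    for line in transcript.splitlines():
        text = line.strip()
        # Skip blank lines and prompt/echo lines.
        if not text or ">" in text:
            continue
        return text.upper()
    return None


# Sample transcript copied from the expected output in this bulletin.
sample = """delphix> analytics select 'default.tcp' get state
    PAUSED
delphix>"""

if tcp_stats_state(sample) == "PAUSED":
    print("TCP statistics collection is paused")
else:
    print("TCP statistics collection does not appear to be paused")
```

A result of `None` means no state line was found in the captured text, which should be treated as "unknown" rather than as confirmation.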
Restart the delphix_admin application (optional)
Performing this step may interrupt any running Delphix jobs, like SnapSync or Provisioning jobs, and such jobs will not automatically be restarted. VDBs that are already running will not be impacted.
- ssh into the Delphix Engine as a user with the sysadmin role. (Windows users may use an ssh utility such as PuTTY.)
Run the following CLI command:
system restart ; commit
Command output similar to the following should be seen:
delphix> system restart ; commit
Restarting the management service. The current session will be re-established once the service is available.
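For administrators who need to apply the workaround to several engines, the commands above can be collected into a reviewable plan before anything is run. The sketch below only builds and prints ssh command lines; it does not connect to anything. It rests on assumptions this bulletin does not confirm: that the engine CLI accepts a command string passed on the ssh command line, that the login names match the role names `delphix_admin` and `sysadmin`, and that `engine.example.com` stands in for a real engine hostname. The helper names are hypothetical.

```python
import shlex

# CLI commands taken verbatim from this bulletin, paired with the role
# that runs them. Using the role name as the login name is an assumption.
WORKAROUND_STEPS = [
    ("delphix_admin", "analytics select 'default.tcp' pause ; commit"),
    ("delphix_admin", "analytics select 'default.tcp' get state"),
    ("sysadmin", "system restart ; commit"),  # optional second part
]


def build_ssh_argv(user, host, cli_command):
    """Build an argv list for ssh; passing the CLI command this way is an assumption."""
    return ["ssh", f"{user}@{host}", cli_command]


def render_plan(host, include_restart=False):
    """Return shell-quoted command lines for the workaround, in order."""
    lines = []
    for user, cli_command in WORKAROUND_STEPS:
        # The restart may interrupt running jobs, so it is opt-in.
        if user == "sysadmin" and not include_restart:
            continue
        lines.append(shlex.join(build_ssh_argv(user, host, cli_command)))
    return lines


if __name__ == "__main__":
    for line in render_plan("engine.example.com", include_restart=True):
        print(line)
```

Keeping the restart opt-in mirrors the bulletin's guidance that the second part is optional and may interrupt running SnapSync or provisioning jobs.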
The issue is fully resolved in Delphix Engine 126.96.36.199 and later releases.