Delphix

TB105 Delphix Cloud Engines May Experience Data Loss and Hangs

Alert Type

Data Loss, Availability

Impact

Delphix Cloud Engines whose storage is provisioned on AWS S3 or Azure Blob Storage may, in rare cases, encounter an issue that leads to data corruption or loss. If the problem is triggered, data loss and protracted engine outages can occur.

If corruption occurs on an affected engine, the only available recovery mechanism is to restore the engine from backup.

Contributing Factors

This article applies to the following versions of Delphix Continuous Data:

Date            Release
Apr 13, 2023    10.0.0.0
Mar 20, 2023    9.0.0.1
Mar 13, 2023    9.0.0.0
Feb 13, 2023    8.0.0.0

The issue can occur only on Continuous Data Cloud Engines, that is, engines whose persistent storage uses object storage (Amazon S3 or Azure Blob Storage). Engines that use only traditional block storage, such as Amazon EBS volumes, are not impacted.
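The two conditions above (an affected release and object storage in use) can be combined into a quick check. The following is a minimal sketch in Python, not a Delphix tool: the function name `is_susceptible` and its parameters are hypothetical, and the version list is copied from the table in this bulletin.

```python
# Affected releases, as listed in this bulletin.
AFFECTED_VERSIONS = {"8.0.0.0", "9.0.0.0", "9.0.0.1", "10.0.0.0"}


def is_susceptible(version: str, uses_object_storage: bool) -> bool:
    """Return True if an engine matches both contributing factors.

    `version` is the engine release string (for example "9.0.0.1").
    `uses_object_storage` is True for Cloud Engines, i.e. engines whose
    persistent storage is on Amazon S3 or Azure Blob Storage.
    Both inputs are assumed to be gathered manually by the administrator.
    """
    return uses_object_storage and version.strip() in AFFECTED_VERSIONS


if __name__ == "__main__":
    # A 9.0.0.1 Cloud Engine is susceptible; the same release on
    # block storage only (for example Amazon EBS) is not.
    print(is_susceptible("9.0.0.1", True))
    print(is_susceptible("9.0.0.1", False))
    # 10.0.0.1 contains the fix.
    print(is_susceptible("10.0.0.1", True))
```

An exact-match set is used rather than a version range because the bulletin lists specific releases; any release not in the table is treated as unaffected by this sketch.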

The issue is thought to occur only rarely. Susceptible systems with a low number of write I/O operations may be at increased risk of encountering it.

Restarting, rebooting, powering off, or upgrading a susceptible system can significantly increase the risk of encountering the issue.

Symptoms

If data corruption occurs, there may be no symptoms as long as the corrupted blocks are not subsequently accessed.

If corrupted blocks are accessed, an affected engine will become unresponsive:

  • The Delphix Admin application will not respond

  • Virtual Databases (VDBs) may become unresponsive and crash

Relief

Avoid upgrades to any affected Delphix releases.

Do not perform any operations on susceptible engines that may trigger the problem, including engine reboots, restarts, or powering off.  

Resolution

The issue is resolved in DevOps Data Platform 10.0.0.1 and later releases for Continuous Data Engines.

Additional Information

To check if a Delphix Engine is configured as a Cloud Engine:

Either:

  1. Navigate to the Engine using a browser.

  2. Log in to the SETUP app using sysadmin credentials.

  3. Examine the center panel labeled Storage and note the Object Storage for Data parameter.  If this parameter shows Enabled, then the Engine is configured as a Cloud Engine. 

Or:

  1. Connect to the engine using ssh, PuTTY, or an equivalent utility, with credentials that have the sysadmin role.

  2. Enter the command “storage objectStorage ls” at the command prompt. If the engine is configured as a Cloud Engine, the displayed properties will have non-null values, for example:

sample.acme.com> storage objectStorage ls

Properties
    type: S3ObjectStore
    accessCredentials:
        type: S3ObjectStoreAccessInstanceProfile
    bucket: smpl-prod-dlpx-10000-qar-80192-27a4593a
    cacheDevices: xvdb
    configured: true
    endpoint: https://s3.us-west-2.amazonaws.com
    region: us-west-2
    size: 12TB
Operations
update
testConnection
cacheHitsReport
clearCacheHits
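
When many engines need to be checked, the output of the command above can be captured and inspected with a script. The following is a rough sketch in Python that parses text in the format of the sample output. The helper names `parse_properties` and `is_cloud_engine` are hypothetical, and treating `configured: true` as the indicator of a Cloud Engine is an assumption based only on the sample shown here; the exact output of an engine without object storage is not covered by this bulletin.

```python
SAMPLE = """sample.acme.com> storage objectStorage ls
Properties
    type: S3ObjectStore
    accessCredentials:
        type: S3ObjectStoreAccessInstanceProfile
    bucket: smpl-prod-dlpx-10000-qar-80192-27a4593a
    cacheDevices: xvdb
    configured: true
    endpoint: https://s3.us-west-2.amazonaws.com
    region: us-west-2
    size: 12TB
Operations
update
testConnection
"""


def parse_properties(cli_output: str) -> dict:
    """Collect the key/value pairs between 'Properties' and 'Operations'.

    Nested properties are flattened as 'parent.child'.
    """
    props = {}
    in_properties = False
    parent = None
    for raw in cli_output.splitlines():
        stripped = raw.strip()
        if not stripped:
            continue  # ignore blank lines
        if stripped == "Properties":
            in_properties = True
            continue
        if stripped == "Operations":
            break
        if not in_properties or ":" not in stripped:
            continue
        indent = len(raw) - len(raw.lstrip())
        # Split on the first colon only, so URLs in values stay intact.
        key, _, value = stripped.partition(":")
        value = value.strip()
        if indent <= 4:
            # A top-level key with no value introduces a nested group.
            parent = key if not value else None
            props[key] = value
        elif parent:
            props[f"{parent}.{key}"] = value
    return props


def is_cloud_engine(cli_output: str) -> bool:
    # Assumption: 'configured: true' marks an engine using object storage.
    return parse_properties(cli_output).get("configured") == "true"


if __name__ == "__main__":
    print(is_cloud_engine(SAMPLE))
```

The script only reads text that has already been captured; it does not connect to an engine or run any operation on it.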

 

Related Documents

Continuous Data Installation and Setup Configurations (see Delphix Cloud Engines section)

Starting, Stopping, and Restarting Your Engine