NFSv4 and Concerns Regarding recover_lost_locks Parameter (KBA9973)
KBA# 9973
Issue
With the introduction of NFSv4 support in the Continuous Data Engine, the documentation requires that recover_lost_locks be enabled on each target Environment. This changes the default NFSv4 behavior on most modern platforms: the Continuous Data Engine platform requires the client to attempt to reclaim its file locks after a lease expires. The setting is needed because the NFS grace time is 20 seconds, so NFSv4 mounts must renew their lock leases within that period. If a network disruption lasts longer than 20 seconds, the NFSv4 client should retry the lease renewal rather than return an I/O error to the application layer and risk crashing the database.
Ultimately, this behavior is intended to mitigate problems that can arise during a transient network disruption.
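The reasoning above can be summarized as a small decision model. This is only an illustrative sketch: the 20-second window comes from this article, while the function name, the `Outcome` type, and the wording of each result are hypothetical and do not represent NFS client code.

```python
from dataclasses import dataclass

# Window taken from the article; treated as a fixed constant for illustration.
LEASE_WINDOW_SECONDS = 20


@dataclass
class Outcome:
    lease_expired: bool
    action: str


def client_outcome(outage_seconds: float, recover_lost_locks: bool) -> Outcome:
    """Hypothetical model of what happens once connectivity returns."""
    # A disruption within the window lets the client renew its lease in time.
    if outage_seconds <= LEASE_WINDOW_SECONDS:
        return Outcome(False, "lease renewed in time; locks remain valid")
    # The lease has expired; the parameter decides what the client does next.
    if recover_lost_locks:
        return Outcome(True, "client attempts to reclaim the lost locks")
    return Outcome(True, "lock treated as lost; I/O error returned to the application")


if __name__ == "__main__":
    for outage in (5, 45):
        for enabled in (False, True):
            result = client_outcome(outage, enabled)
            print(f"outage={outage}s recover_lost_locks={enabled}: {result.action}")
```

Only the combination of a long outage and a disabled parameter leads to the application seeing an I/O error, which is the case the requirement is meant to avoid.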
Some System Administrators may raise concerns, because some sources describe this behavior as undesirable in certain Environments; https://access.redhat.com/solutions/1179643 is one such example. Although the concern is valid in some Environments, Delphix is not a generic NFS file-sharing platform. The nature of Delphix operations is such that this configuration parameter can be enabled safely, without concern for file corruption.
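Administrators who want to confirm the current setting on a Linux target host could use something like the sketch below. It assumes the parameter is exposed by the nfs kernel module at /sys/module/nfs/parameters/recover_lost_locks, which this article does not state and which should be verified for your distribution. The function names are illustrative only.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the parameter on a Linux host with the nfs module loaded.
PARAM_PATH = Path("/sys/module/nfs/parameters/recover_lost_locks")


def parse_bool_param(raw: str) -> bool:
    """Interpret a boolean module parameter value ("Y"/"N" or "1"/"0")."""
    value = raw.strip().upper()
    if value in ("Y", "1"):
        return True
    if value in ("N", "0"):
        return False
    raise ValueError(f"unexpected parameter value: {raw!r}")


def recover_lost_locks_enabled(path: Path = PARAM_PATH) -> Optional[bool]:
    """Return the setting, or None if the parameter file does not exist."""
    try:
        return parse_bool_param(path.read_text())
    except FileNotFoundError:
        # Module not loaded, or the host does not expose this path.
        return None


if __name__ == "__main__":
    state = recover_lost_locks_enabled()
    if state is None:
        print("recover_lost_locks: parameter file not found")
    else:
        print(f"recover_lost_locks enabled: {state}")
```

The script only reads the value; how the parameter is set persistently depends on the host operating system and is covered by the target Environment requirements.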
In a non-clustered Virtual Database (VDB) scenario, only one NFS client accesses and locks the datafiles on a given NFS share. The concern about a lease expiring and another node then obtaining the file lock therefore does not apply, because no other client is permitted to lock and write to the file during this transient state.
In a clustered Environment, where the same NFS share is made accessible to multiple nodes, Delphix relies on the application (Oracle clusterware, for example) to coordinate file access and prevent file corruption. Any instance in which a node corrupts a file through a bad write would ultimately be a bug in that application.
There have been no indications of instability, data loss or data corruption in practice on the Continuous Data Engine platform as a result of this change, and the parameter has undergone extensive testing in every Continuous Data Engine release.
Applicable Delphix Versions
Date: Release
- Apr 13, 2023: 10.0.0.0
- Mar 13, 2023: 9.0.0.0, 9.0.0.1
- Feb 13, 2023: 8.0.0.0
- Jan 12, 2023: 7.0.0.0
Releases Prior to 2023
Major Release: All Sub Releases
- 6.0: 6.0.2.0, 6.0.2.1, 6.0.3.0, 6.0.3.1, 6.0.4.0, 6.0.4.1, 6.0.4.2, 6.0.5.0, 6.0.6.0, 6.0.6.1, 6.0.7.0, 6.0.8.0, 6.0.8.1, 6.0.9.0, 6.0.10.0, 6.0.10.1, 6.0.11.0, 6.0.12.0, 6.0.12.1, 6.0.13.0, 6.0.13.1, 6.0.14.0, 6.0.15.0, 6.0.16.0, 6.0.17.0, 6.0.17.1, 6.0.17.2
Related Articles
The following articles may provide more or related information:
- Red Hat Knowledge Base - https://access.redhat.com/solutions/1179643