Oracle Virtual Databases (VDBs) running on Delphix use the Network File System (NFS) to store and access Oracle-related files such as tablespace data files, control files, and redo logs. In rare circumstances, writes to Oracle 19 database files over NFS with Direct NFS (dNFS) enabled may corrupt database blocks. Over time, multiple blocks and multiple databases could be affected.
After data block corruption occurs, Oracle database queries, DML commands, or DDL commands may fail with an ORA-01578 error. This may disrupt applications that use the affected databases and could result in lost data.
The following table lists the Delphix Engine versions that exhibit the issue:
|Major Release|All Sub Releases|
|6.0|184.108.40.206, 220.127.116.11, 18.104.22.168, 22.214.171.124|
The issue only occurs with VDBs running Oracle Database Release 19 versions. Delphix Engines deployed on ESX, Google Cloud (GCP), and Oracle Cloud (OCI) are impacted. Note that Delphix Engines deployed on AWS and Azure are not impacted.
Versions later than Oracle Database Release 19 have not been tested and are not yet supported with Delphix.
The issue can only occur when using the Direct NFS (dNFS) feature with Oracle Database (RDBMS) software.
- Both NFSv3 and NFSv4 versions of NFS are susceptible when used with dNFS
The issue is more likely to occur on Oracle 19 VDBs with a write-intensive workload.
Affected VDB alert logs may contain one or more messages similar to the following:
Hex dump of (file <file number>, block <block number>) in trace file /oracle/admin/product/19c/diag/rdbms/dbname/DBNAME/trace/DBNAME_j000_28697.trc
Corrupt block relative dba: 0x9e0ad97c (file <file number>, block <block number>)
Fractured block found during user buffer read
Data in bad block:
 type: 6 format: 2 rdba: 0x9e0ad97c
 last change scn: 0x0000.0dc8.e39dfdf1 seq: 0x2 flg: 0x06
 spare3: 0x0
 consistency value in tail: 0x0018000a
 check value in block header: 0x91e5
 computed block checksum: 0x32f6
Errors in file /oracle/admin/product/19c/diag/rdbms/dbname/DBNAME/trace/DBNAME_j001_2178552.trc
ORA-01578: ORACLE data block corrupted (file # <file number>, block #<block number>)
ORA-01110: data file <file number>: '<file path name>'
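As a rough aid for triage, the ORA-01578 messages shown above can be pulled out of alert log text with a small script. The following is a minimal sketch, not a supported tool: the function name `find_corrupt_blocks` and the sample values are hypothetical, and the pattern is based only on the message format shown in this bulletin, so real alert logs may need a looser match.

```python
import re

# Pattern modeled on the ORA-01578 line shown in this bulletin (assumption:
# other Oracle versions may space or word the message slightly differently).
ORA_01578 = re.compile(
    r"ORA-01578: ORACLE data block corrupted \(file # ?(\d+), block # ?(\d+)\)"
)


def find_corrupt_blocks(alert_log_text):
    """Return sorted, de-duplicated (file_number, block_number) pairs."""
    pairs = {(int(f), int(b)) for f, b in ORA_01578.findall(alert_log_text)}
    return sorted(pairs)


# Hypothetical sample text; the file and block numbers are made up.
sample = (
    "ORA-01578: ORACLE data block corrupted (file # 7, block #1234)\n"
    "ORA-01110: data file 7: '/mnt/provision/vdb/datafile/users01.dbf'\n"
    "ORA-01578: ORACLE data block corrupted (file # 7, block #1234)\n"
    "ORA-01578: ORACLE data block corrupted (file # 3, block # 42)\n"
)

for file_no, block_no in find_corrupt_blocks(sample):
    print(f"file {file_no}, block {block_no}")
```

Collapsing repeats into unique pairs matters because the same block is typically reported every time a session reads it.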
One or more Oracle DBWR trace files may be generated containing errors similar to the following:
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839973 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839974 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839975 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839976 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839977 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839978 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839979 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839980 ch->order 21842199 ch 0x68651098
kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839981 ch->order 21842199 ch 0x68651098
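To gauge how often these messages appear in a DBWR trace file, they can be counted per channel. This is a minimal sketch under the assumption that the message format matches the lines above; the function name `count_out_of_order_sends` is hypothetical, and the script only counts messages, it does not interpret them.

```python
import re
from collections import Counter

# Pattern modeled on the trace lines shown in this bulletin.
OUT_OF_ORDER = re.compile(
    r"kgnfs_flushmsg: CH OUT of ORDER SEND "
    r"m->order (\d+) ch->order (\d+) ch (0x[0-9a-fA-F]+)"
)


def count_out_of_order_sends(trace_text):
    """Count out-of-order send messages, keyed by channel address."""
    return Counter(m.group(3) for m in OUT_OF_ORDER.finditer(trace_text))


# Two lines taken from the example above.
sample = (
    "kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839973 "
    "ch->order 21842199 ch 0x68651098\n"
    "kgnfs_flushmsg: CH OUT of ORDER SEND m->order 21839974 "
    "ch->order 21842199 ch 0x68651098\n"
)

for channel, count in count_out_of_order_sends(sample).items():
    print(f"channel {channel}: {count} out-of-order sends")
```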
Running the Oracle dbv command can identify one or more corrupted data blocks, for example:
$ dbv userid=... file=<file name> blocksize=8192
DBVERIFY: Release 126.96.36.199.0 - Production on Fri, Apr 30 11:21:52 2021
Copyright (c) 1982, 2019, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = <file name>
Page <block number> is influx - most likely media corrupt
Corrupt block relative dba: 0xa628d086 (file <file number>, block <block number>)
Fractured block found during dbv:
Data in bad block:
 type: 6 format: 2 rdba: 0xa628d086
 last change scn: 0x0000.0332.8d105da7 seq: 0x2 flg: 0x00
 spare3: 0x0
 consistency value in tail: 0x78124205
 check value in block header: 0x0
 block checksum disabled
...
DBVERIFY - Verification complete
...
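When dbv is run against many data files, the affected pages can be collected from its saved output. The sketch below assumes the output has been captured to text and that flagged pages are reported with the "is influx" wording shown above; the function name `influx_pages` and the page numbers in the sample are hypothetical.

```python
import re

# Matches the "Page N is influx" line shown in the dbv example above.
INFLUX_PAGE = re.compile(r"Page (\d+) is influx - most likely media corrupt")


def influx_pages(dbv_output):
    """Return the page (block) numbers that dbv reported as influx."""
    return [int(n) for n in INFLUX_PAGE.findall(dbv_output)]


# Hypothetical captured output; the page numbers are made up.
sample = (
    "DBVERIFY - Verification starting : FILE = users01.dbf\n"
    "Page 2674822 is influx - most likely media corrupt\n"
    "Page 2674901 is influx - most likely media corrupt\n"
    "DBVERIFY - Verification complete\n"
)

print(influx_pages(sample))
```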
Disable use of the dNFS feature for susceptible Oracle 19 VDBs.
The issue is fully resolved in Delphix software release 188.8.131.52.
Oracle bug/patch 32931941 can also be applied to resolve the issue.
When running Oracle and using NFS for storage of Oracle Database files, there are two options for the NFS client that will be used by Oracle. One is to use the host operating system's (for example, Linux, AIX, or Oracle Solaris) NFS client. The other is to use the NFS client built into the Oracle database itself. This built-in client is referred to as Direct NFS (dNFS).
The performance of dNFS can surpass the performance of the host operating system's NFS client in many circumstances, depending on the specific OS platform and database workload. The dNFS client can utilize more TCP/IP network connections when communicating with the NFS server, potentially enabling a higher degree of parallelization.
The issue described in this bulletin results from an interoperability problem between the Oracle dNFS client and the NFS server in Delphix. It occurs when the Delphix NFS server responds to an NFS write request with a message to retry the operation later. When the dNFS client retries the operation, it clobbers a different in-progress write request, which leads to the block corruption.
How to Tell if an Oracle Database is Configured with dNFS
Examining the alert log for a VDB will show a message similar to:
Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 6.0
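Checking many VDB alert logs by hand is tedious, so the same check can be scripted. This is a minimal sketch that looks for the message shown above; the function name `alert_log_shows_dnfs` is hypothetical, and the match deliberately ignores the trailing version number since it may differ between installations.

```python
# Marker text taken from the alert log message shown above, without the
# version suffix (assumption: the prefix is stable across ODM versions).
DNFS_MARKER = "Oracle instance running with ODM: Oracle Direct NFS ODM Library"


def alert_log_shows_dnfs(lines):
    """Return True if any line contains the dNFS startup message."""
    return any(DNFS_MARKER in line for line in lines)


# Hypothetical alert log excerpts.
with_dnfs = [
    "Starting ORACLE instance (normal)",
    "Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 6.0",
]
without_dnfs = ["Starting ORACLE instance (normal)"]

print(alert_log_shows_dnfs(with_dnfs))
print(alert_log_shows_dnfs(without_dnfs))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example `alert_log_shows_dnfs(open(path, errors="replace"))`, where `path` is the location of the VDB's alert log.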
It's also possible to query dNFS-related views in running VDBs, for example:
SQL> select count(*) from v$dnfs_files;

COUNT(*)
--------
    1013

SQL> select count(*) from v$dnfs_servers;

COUNT(*)
--------
       2
The presence of rows in the dNFS-related views shown above demonstrates that dNFS is actively in use.
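The same check can be automated from a script. The sketch below assumes a Python DB-API 2.0 cursor that is already connected to the running VDB with sufficient privileges to read the views; how that connection is made (driver, credentials) is left out, and the function name `dnfs_in_use` is hypothetical.

```python
def dnfs_in_use(cursor):
    """Return True if either dNFS view queried above contains rows.

    `cursor` is assumed to be a DB-API 2.0 cursor connected to the VDB.
    """
    for view in ("v$dnfs_files", "v$dnfs_servers"):
        # View names are fixed constants, so string formatting is safe here.
        cursor.execute(f"select count(*) from {view}")
        (count,) = cursor.fetchone()
        if count > 0:
            return True
    return False
```

A result of `False` only means that no rows were present at the time of the query, so it is worth cross-checking against the alert log message before concluding that dNFS is disabled.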
Enabling and Disabling Direct NFS Client Control of NFS (external link)
About Direct NFS Client Storage Mounts to NFS Storage Devices (external link)