Oracle 10g databases running on Red Hat Linux v5.5, with database files exported over NFS, could experience suboptimal read performance due to a Linux bug. This issue will cause multi-block scattered reads to be penalized with high latency, potentially having a negative impact on application performance.
Oracle relies on two types of requests for reading in large chunks of data – Direct Path Reads (DPR) and Scattered Reads (SCAT). Scattered Read request is issued by Oracle to bring in large chunks of data into the buffer cache, assuming data will be re-used or is not large enough to bypass the cache. Direct Path Reads do not bring data into the Cache and are designed for data which Oracle believes will not have re-use. If you observe performance issues on your Oracle 10g databases, caused specifically due to db file scattered read requests showing high IO latency, you could be effected by this bug.
The block size used for multiblock reads is determined by the following two parameters:
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_file_multiblock_read_count integer 16
db_block_size integer 8192
The number of blocks requested for each multi-block read is determined by “db_file_multiblock_read_count”. So, for the above database, with db_block_size of 8KB, the multi-block read requests will be 128KB. The maximum values allowed for db_file_multiblock_read_count is 128 blocks. For a typical Oracle installation with block size of 8KB, multiblock reads can be as high as 1MB in size.
For Oracle 10g databases running on R.H Linux 5.5 with datafiles mounted over NFS, a multi-block SCAT read request issued by Oracle is broken into individual block reads by the time it reaches the NFS server. Oracle uses the readv(2) system call with a vector of blocks to be fetched into the SGA. The blocks are likely contiguous on disk but non-contiguous in memory, so a readv(2) system call is designed to perform a scatter-gather operation and fetch all the blocks as a single read request. In Red Hat Linux 5.5, it is observed that the readv(2) system call is issuing individual requests for each block. So the break of the large block read into multiple small block reads was occurring at the Linux OS. Even if the NFS server latency for the individual 8k reads is low, network and RPC latency for each request accumulate to make the total response time for Oracle significantly higher compared to the actual disk read latency. For example, even if NFS server responses never exceed 1ms, the Oracle latency for SCAT requests reached 20+ms.
This issue has been verified to be caused by Linux and not the NFS client or Oracle 10g. When running the same environment, Oracle 10G database on RHEL 6.4, it was observed that SCAT requests of size 128KB were issued as-is by the NFS Client to the NFS server. There was almost a 20x improvement in overall latency as the 128KB blocks were being serviced by the NFS server within 1ms.
Please note that direct path reads are not affected by this bug.
Ways to mitigate this issue
Upgrade to a newer version of Linux. (Issue is not present when using Red Hat Linux 6.4)
Force multi-block reads to use direct path rather than scattered read
Set “_serial_direct_read” to TRUE.
This forces all serial scans that exceed the small table threshold to use direct path read
This forces 11g style read semantics on the 10g databases
Change the Oracle init parameter “filesystemio_options”
The problem manifests only when this parameter is set to “SETALL”. Using “NONE” or “ASYNC” will mitigate the problem.
“ASYNC” is the preferred setting in the absence of “SETALL”
Users should evaluate the three options and choose the best one suited for their environments.
The following Linux bugs discuss this issue in Linux 5.5 and the fixes made earlier this year: