NFS Performance Checking and Optimizing
- Mounting parameters
- soft/hard
hard: If the NFS service on the server becomes unavailable after the client has mounted the share, the client retries the request indefinitely. Programs that access the file system (for example, through the cd, ls, or df command) therefore hang and do not respond. After the NFS service is restarted on the server, the client resumes and, after a short wait, receives the result of the pending operation.
soft: The client returns an error to the calling program after it has retried the request a limited number of times.
Note: If this parameter is not set, hard is used by default.
- timeo/retrans
timeo: specifies how long the client waits for a response before it retransmits a request, in tenths of a second. retrans: specifies how many times the client retransmits a request that has failed. Together, the two parameters control the behavior of the client after a request times out. retrans limits the number of retries only when it is used together with soft; with hard, the client keeps retrying.
- wsize
wsize: the maximum number of bytes carried in each WRITE request that the client sends to the server. If this parameter is not set, the value is negotiated between the client and server.
Note: Generally, the default (negotiated) value is used. On a congested low-speed network, you can reduce wsize so that smaller request packets are sent to the server, which can improve NFS performance. On a high-speed network, you can increase wsize to reduce the number of request packets sent to the server and improve performance.
- rsize
rsize: the maximum number of bytes that the client requests from the server in each READ request. If this parameter is not set, the value is negotiated between the client and server.
Note: Generally, the default (negotiated) value is used. On a congested low-speed network, you can reduce rsize so that each READ transfers less data, which can improve NFS performance. On a high-speed network, you can increase rsize to reduce the number of request packets sent to the server and improve performance.
- ac/noac
ac/noac: To improve performance, the NFS client caches file attributes (ac is the default), periodically checks them against the server, and updates them. When ac is used, the acregmin, acregmax, acdirmin, acdirmax, and actimeo parameters can also be set.
acregmin/acregmax: specify the minimum and maximum time (in seconds) for which the NFS client caches the attributes of regular files. After the time expires, the attributes are refreshed from the server. By default, the minimum is 3s and the maximum is 60s.
acdirmin/acdirmax: specify the minimum and maximum time (in seconds) for which the NFS client caches the attributes of directories. After the time expires, the attributes are refreshed from the server. By default, the minimum is 30s and the maximum is 60s.
actimeo: sets acregmin, acregmax, acdirmin, and acdirmax to the same value, in seconds.
Note: When the attributes of the shared file on the server are frequently changed by multiple clients, you are advised to use the noac option, or ac with a smaller acregmin, acregmax, acdirmin, or acdirmax value to achieve better attribute consistency. If the attributes of the shared files on the server are not frequently changed, for example, the file sharing is read-only or the network delivers good performance, you are advised to use the default ac option and set acregmin, acregmax, acdirmin, or acdirmax based on the actual network status.
- sharecache/nosharecache
sharecache/nosharecache: When a client mounts the same NFS share on several local directories and sharecache is used, all of these mount points share one NFS data cache.
With nosharecache, each mount point keeps its own cache, so a file can have multiple cached copies, which can lead to data inconsistency.
Note: This parameter matters only when a client mounts the same shared directory multiple times. You are advised to use the default sharecache option.
- lookupcache=mode
lookupcache=mode: The value of mode can be all, none, or pos. all caches both positive results (the file exists) and negative results (the file does not exist). Therefore, when the same file is looked up a second time, the LOOKUP request is not sent again because the client has cached the result. pos caches only positive results.
none caches neither kind of result, so every lookup is sent to the server. This option quickly detects files created or deleted by other clients, but affects server performance.
Note: The LOOKUP request converts a file name into a file handle. If multiple clients frequently create or delete files, none is recommended. In other cases, all or pos is recommended.
- cto/nocto
cto/nocto: Linux implements the cache consistency feature "close-to-open" by comparing the GETATTR query results when a file is closed and opened next time (http://nfs.sourceforge.net/). If the query results are the same, the cache data on the client is still valid. Otherwise, the cache data should be cleared.
Using cto to mount and read the same file: Before the file is read for the second time, the client sends GETATTR to obtain the attribute and compares it with the cached result and finds that the file does not change. Therefore, the client does not send READ for the second time and directly reads the file content from its cache. If the file content is changed on the server before the file is read for the second time, the client sends GETATTR and compares the result with the cached result. As the change is detected, the client sends READ for the second time to read the file content from the server. cto ensures data consistency.
Using nocto to mount and read the same file: Before the file is read for the second time, the client does not send GETATTR and directly reads the file content from its cache. If the file content is changed on the server before the file is read for the second time, the client does not detect the change. Therefore, the client reads outdated data for the second time. After a period of time, the client sends a READ message to the server to read the file again and obtain the new file content.
Note: If the file content seldom changes, for example, when the server exports the file system to clients as read-only, you are advised to use the nocto option to improve performance. If the file content changes frequently and the client has strict requirements on file cache consistency, you are advised to use the cto option.
- tcp/udp
tcp/udp: TCP provides reliable, ordered delivery with congestion control. UDP has lower protocol overhead and can respond faster on a reliable network, but it does not recover lost packets by itself.
Note: In an unstable or complex network environment, you are advised to use tcp. In a stable network environment, you can use udp where the protocol version supports it. NFSv2 and NFSv3 can run over udp or tcp, while NFSv4 does not support udp.
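The mounting parameters above are combined into one comma-separated option string. The following is a minimal dry-run sketch: the server name, export path, mount point, and every option value are hypothetical and chosen only for illustration, and the function prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical values for illustration only; replace them with your own.
SERVER="nfs.example.com"
EXPORT="/export/data"
MOUNTPOINT="/mnt/data"

# hard: retry indefinitely. timeo is given in tenths of a second (600 = 60 s).
# rsize/wsize are upper bounds; the server may negotiate smaller values.
OPTS="hard,timeo=600,retrans=2,rsize=1048576,wsize=1048576,actimeo=30,lookupcache=all,proto=tcp"

build_mount_cmd() {
    printf 'mount -t nfs -o %s %s:%s %s\n' "$OPTS" "$SERVER" "$EXPORT" "$MOUNTPOINT"
}

# Dry run: print the command instead of executing it.
build_mount_cmd
```

After a real mount, the options that were actually negotiated can be compared with the requested ones in /proc/mounts.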
- NFS performance tuning options
- Protocol options
Maximum block size
Specifies the size of an NFS packet block, that is, the maximum payload that a protocol client can send in one packet. The default value is 1 MB (1,048,576 bytes), which is also the maximum value.

Run the echo 1048576 > /proc/fs/nfsd/max_block_size command to configure this parameter. The setting is not persistent and is lost after the device restarts.
Number of communication threads
Sets the number of protocol communication threads. Increase the value of this parameter when the processing capability of the communication thread is insufficient.

Run the echo 32 > /proc/fs/nfsd/threads command to configure this parameter.
To check whether the number of communication threads is insufficient, use the netstat tool to see whether data is accumulating in the send and receive buffers of the NFS connection. Check the TCP connections (TCP is used by default) on port 2049, which is the NFS service port.


Number of service threads
Sets the number of protocol service threads. Increase the value of this parameter when the processing capability of the service thread is insufficient.

Run the echo 32 > /proc/fs/nfsd/pool_threads command to configure this parameter.
- TCP/IP options
net.ipv4.tcp_rmem = 10000000 20000000 40000000
Size of the TCP receive buffer, which determines the TCP receive window. The three values are the minimum, default, and maximum sizes in bytes. For a 10GE NIC, you are advised to set them to 10 MB, 20 MB, and 40 MB.
echo 10000000 20000000 40000000 > /proc/sys/net/ipv4/tcp_rmem
net.ipv4.tcp_wmem = 10000000 20000000 40000000
Size of the TCP send buffer. The three values are the minimum, default, and maximum sizes in bytes. For a 10GE NIC, you are advised to set them to 10 MB, 20 MB, and 40 MB.
echo 10000000 20000000 40000000 > /proc/sys/net/ipv4/tcp_wmem
net.ipv4.tcp_mem = 400000 800000 1600000
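Number of memory pages that the TCP protocol stack can use (low, pressure, and high thresholds). Each page is 4 KB, so the values above correspond to about 1.6 GB, 3.2 GB, and 6.4 GB. To limit TCP memory to 400 MB, 800 MB, and 1.6 GB, use 100000 200000 400000 instead.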
net.ipv4.tcp_window_scaling = 1
Enables TCP window scaling. TCP windows larger than 64 KB are supported only when this option is enabled.
Note: The values of tcp_rmem and tcp_wmem need to be changed only on the client.
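Because tcp_mem is expressed in pages while tcp_rmem and tcp_wmem are expressed in bytes, it is easy to mix up the units. The helper below is a small sketch that converts a memory budget in MiB into a page count. The function name is made up for this example, and the 4096-byte default page size is an assumption that should be checked with getconf PAGESIZE on the target system.

```shell
#!/bin/sh
# Convert a memory budget in MiB to a page count for net.ipv4.tcp_mem.
# The page size defaults to 4096 bytes (an assumption; verify with `getconf PAGESIZE`).
mib_to_pages() {
    mib=$1
    page_size=${2:-4096}
    echo $(( mib * 1024 * 1024 / page_size ))
}

# Budgets of 400 MiB, 800 MiB, and 1600 MiB expressed in pages:
echo "net.ipv4.tcp_mem = $(mib_to_pages 400) $(mib_to_pages 800) $(mib_to_pages 1600)"
```

The results use binary units, so they differ slightly from round decimal estimates.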
- NIC options
- Enable jumbo frames.
ip link set dev xxx mtu 9000
or: ifconfig xxx mtu 9000
- Enable flow control.
ethtool -A xxx rx on
ethtool -A xxx tx on
- Interrupt aggregation.
ethtool -C xxx rx-usecs 32
- System options
- Number of concurrent RPC requests.
echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries
Note: After the value is changed, unmount the file system and remount it for the change to take effect.
- Locating and analyzing performance problems
- Latency analysis
The following figure shows the layers of the NFS application. The left part indicates the client side, and the right part indicates the server side.

Numbers 1 to 8 have corresponding delay statistics. You can obtain the delay between two adjacent points to accurately locate the time-consuming layer.
Application layer statistics
Application-level tools (such as Vdbench) report statistics such as latency, throughput, and OPS.
Client NFS/SunRPC statistics
- nfsiostat
The output is real-time statistics, but only the read and write command words can be collected.

- op/s: the number of operations per second.
- rpc bklog: the length of the backlog queue.
- kB/s: the number of KB written/read per second.
- kB/op: the number of KB written/read per operation.
- retrans: the number of retransmissions.
- avg RTT (ms): the time between the client's kernel sending the RPC request and receiving the reply.
- avg exe (ms): the time between the NFS client handing the RPC request to its kernel and the request being completed. It includes the RTT above.
- mountstats
The output is cumulative statistics and covers all operations. The command output includes a detailed description of each field. For each operation, pay attention to the backlog wait, RTT, and total execute time.

- /proc/1/mountstats
Source data of the nfsiostat and mountstats commands. Pay attention to the statistics of Xprt and statistics corresponding to commands.

- Xprt statistics
xprt: tcp 734 0 1 0 0 2173669 2173156 0 904604335 0 10 1165672 2493580
1. srcport: Ephemeral port
2. bind_count: Number of rpcbind operations
3. connect_count: Number of TCP connects
4. connect_time: Time taken by connects
5. idle_time: Transport idle duration
6. rpcsends: Number of RPC requests sent
7. rpcrecvs: Number of RPC replies received
8. badxids: Number of unmatchable XIDs received
9. req_u: Average requests on the wire (slot table utilization)
10. bklog_u: Backlog queue utilization (average length of backlog queue)
11. max_slots: Max rpc_slots used
12. sending_u: Send queue utilization
13. pending_u: Pending queue utilization
Item 10 divided by item 6 gives the average number of queued requests.
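As a rough illustration of the ratio of item 10 to item 6, the following sketch parses the sample xprt line with awk. The function name is made up for this example, and the field positions assume the 13-item TCP layout listed above, which can differ between kernel versions.

```shell
#!/bin/sh
# Reads "xprt: tcp ..." lines on stdin and prints bklog_u / rpcsends.
# Assumes the 13-item TCP layout described above; verify it on your kernel.
xprt_avg_backlog() {
    awk '$1 == "xprt:" && $2 == "tcp" {
        sends = $8 + 0     # item 6: rpcsends
        bklog = $12 + 0    # item 10: bklog_u
        if (sends > 0) printf "avg_backlog=%.2f\n", bklog / sends
    }'
}

# The sample line from this section:
echo "xprt: tcp 734 0 1 0 0 2173669 2173156 0 904604335 0 10 1165672 2493580" | xprt_avg_backlog
```

On a live client, the same function could be fed from the mountstats file, for example with grep 'xprt:' /proc/1/mountstats | xprt_avg_backlog.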
- per-op statistics
READ: 276305 276305 0 54882824 197516342140 16993798 857262 18130986
1. operations: Number of requests done for the operation.
2. transmissions: Number of times that an RPC request is actually transmitted for the operation. This can exceed the operation count because of timeouts and retries.
3. major timeouts: Number of times that a request has a major timeout. The "nfs: server X not responding, still trying" message is displayed upon major timeouts. Timeouts and retries can occur without major timeouts.
4. bytes sent: Includes not only the RPC payload but also the RPC headers, and closely matches the on-the-wire size.
5. bytes received: Like bytes sent, this is the full size.
6. cumulative queue time: Time (in milliseconds) that all requests spent queued for transmission before they were sent.
7. cumulative response time: Time (in milliseconds) taken to get a reply back after the request is transmitted. The kernel comments call this the RPC RTT.
8. cumulative total request time: Total time (in milliseconds) for all requests, from when each request is first queued until it is completely handled. The kernel calls this the RPC execution time.
Item 8 divided by item 1: the average total processing latency per operation.
Item 7 divided by item 1: the average latency between the client sending a request and receiving the response.
Item 6 divided by item 1: the average queuing latency on the client.
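The three per-operation averages can be worked out from the sample READ line. The sketch below uses awk and assumes the eight-field layout described above; the function name is made up for this example.

```shell
#!/bin/sh
# Reads a per-op statistics line (for example "READ: ...") on stdin and prints
# the average queue, RTT, and execution time per operation in milliseconds.
per_op_latency() {
    awk '{
        ops = $2 + 0                       # item 1: operations
        if (ops == 0) { print "no operations"; next }
        printf "queue=%.2f rtt=%.2f exe=%.2f\n", $7 / ops, $8 / ops, $9 / ops
    }'
}

# The sample line from this section:
echo "READ: 276305 276305 0 54882824 197516342140 16993798 857262 18130986" | per_op_latency
# prints: queue=61.50 rtt=3.10 exe=65.62
```

In this sample almost all of the 65.62 ms average execution time is spent queuing on the client (61.50 ms) rather than waiting for the server (3.10 ms), which points to a concurrency limit on the client rather than a slow server or network.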
Client ETH statistics
Use tcpdump to capture packets, and then use Wireshark to analyze the packets.
- Concurrent analysis
The Linux NFS client limits the number of concurrent NFS requests. If the limit is too low, I/O performance deteriorates. The limit is set in /proc/sys/sunrpc/tcp_slot_table_entries; the default value is 2 on Ubuntu 18.04.
Run the echo 128 > /proc/sys/sunrpc/tcp_slot_table_entries command to configure this parameter.
Note: After the configuration is complete, mount the NFS share again.
- Network analysis
- netstat -nap
Pay attention to the second and third columns (Recv-Q and Send-Q), which show the sizes of the receive queue and send queue in bytes. If a queue stays at the TCP buffer size for a long time, the lower layer may be faulty. In this case, check the system or the NIC module for faults.

- netstat -s
Pay attention to the Abort and Drop counters. If the values keep increasing while services are running, check the system for faults.
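To pick the NFS connections out of the netstat output, a small filter can help. The sketch below keeps only TCP lines whose local or remote port is 2049 and whose receive or send queue is not empty. The function name and the sample addresses are hypothetical, and the column positions assume the usual Linux netstat layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address).

```shell
#!/bin/sh
# Reads netstat output on stdin and prints NFS (port 2049) TCP connections
# that have a non-empty receive or send queue.
nfs_queue_backlog() {
    awk '$1 ~ /^tcp/ && ($4 ~ /:2049$/ || $5 ~ /:2049$/) && ($2 + 0 > 0 || $3 + 0 > 0) {
        print $4 " -> " $5 " recvq=" $2 " sendq=" $3
    }'
}

# Sample input with hypothetical addresses; only the second line is an NFS
# connection with queued data.
printf '%s\n' \
    'tcp        0      0 192.0.2.10:22      192.0.2.99:51000   ESTABLISHED' \
    'tcp        0  52416 192.0.2.10:876     192.0.2.20:2049    ESTABLISHED' | nfs_queue_backlog
```

On a live system, the input could come from netstat -nat | nfs_queue_backlog, run several times to see whether the queues stay full.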


