Performance Tuning
You can enable the CPU high-performance mode, Transparent Huge Pages (THP), and jemalloc optimization to improve performance. The three optimizations are independent of each other, so you can enable any combination of them.
When a 192-core server processes low-concurrency, long-sequence jobs, the CPU load tends to be high. The CPU then becomes the system bottleneck, which causes time per output token (TPOT) to fluctuate and deteriorate. In this scenario, you are advised to apply the optimizations described in this section.
Enabling the CPU High-Performance Mode and THP
Run the following commands on the bare-metal server (BMS) to enable high-performance CPU mode and THP to improve performance.
- Enabling high-performance CPU mode increases TPS by approximately 3% while maintaining the same latency constraint.
cpupower -c all frequency-set -g performance
- Enabling THP results in more stable throughput, as demonstrated by multiple tests.
echo always > /sys/kernel/mm/transparent_hugepage/enabled
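After running the two commands above, it can help to confirm that the settings took effect. The following is a minimal read-only sketch, assuming the usual Linux sysfs locations for the cpufreq governor and THP. The helper name thp_active_mode is hypothetical and not part of any tool. The kernel marks the active THP mode with square brackets, for example "[always] madvise never".

```shell
# Read-only check of the CPU governor and THP settings (nothing is changed).

# Print the active THP mode from a line such as "always [madvise] never".
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Assumed sysfs paths; they exist only when the cpufreq driver and THP
# support are present on the machine.
gov_file=/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
thp_file=/sys/kernel/mm/transparent_hugepage/enabled

if [ -r "$gov_file" ]; then
    echo "cpu0 governor: $(cat "$gov_file")"
else
    echo "cpu0 governor: not available"
fi

if [ -r "$thp_file" ]; then
    echo "THP mode: $(thp_active_mode "$(cat "$thp_file")")"
else
    echo "THP mode: not available"
fi
```

If both optimizations are active, the governor is expected to read performance and the THP mode always. The sketch inspects only cpu0; other cores can be checked the same way.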
The service process may compete with the model execution processes for CPU resources, which causes performance and latency to fluctuate. To reduce this contention, you can manually bind the service process to the CPU cores of an odd-numbered NUMA node when starting the service. The procedure is as follows:
- Run the lscpu command to view the CPU configuration of the system.
lscpu
Information similar to the following is displayed:
NUMA:
NUMA node(s): 8
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
- Run the taskset -c command to start the service process bound to the CPU cores of an odd-numbered NUMA node.
taskset -c $cpus ./bin/mindieservice_daemon
Set $cpus to the CPU range listed for node1, node3, node5, or node7 in the lscpu output, for example, 24-47 for node1.
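The two steps above can be combined so that the CPU range is read from the lscpu output instead of being copied by hand. The following is a minimal sketch. The helper name numa_node_cpus is hypothetical, and the sketch assumes that lscpu prints lines of the form "NUMA nodeN CPU(s): <range>", as in the sample output.

```shell
# Print the CPU list of one NUMA node, reading lscpu-style text on stdin.
# $1 is the node number. Leading whitespace on the line is tolerated.
numa_node_cpus() {
    awk -v key="NUMA node$1 CPU(s):" 'index($0, key) > 0 { print $NF }'
}

# Demonstration on sample text taken from the output shown above.
sample='NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47'
printf '%s\n' "$sample" | numa_node_cpus 1   # prints 24-47

# Intended use on the server (shown as comments, not executed here):
#   cpus=$(lscpu | numa_node_cpus 1)
#   taskset -c "$cpus" ./bin/mindieservice_daemon
```

Matching on the full "NUMA nodeN CPU(s):" label keeps node1 from also matching node10 or higher on larger systems.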
Enabling jemalloc Optimization
To enable jemalloc optimization, compile the jemalloc dynamic link library and preload it in the environment from which the service is started. The procedure is as follows:
- Download the jemalloc source code, then compile and install it by following the INSTALL.md file in the source package.
- Before starting the service, preload the jemalloc dynamic link library in the environment by running the following command:
export LD_PRELOAD="${path_to_lib}/libjemalloc.so:$LD_PRELOAD"
In the preceding command, path_to_lib indicates the directory that contains libjemalloc.so.
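A slightly more defensive variant of the export command is sketched below. It checks that the library file exists before changing LD_PRELOAD, and it adds the colon separator only when LD_PRELOAD already has a value. The helper name prepend_preload is hypothetical, and /usr/local/lib is only an assumed fallback (the default prefix of a source install); set path_to_lib to the actual directory on your system.

```shell
# Build an LD_PRELOAD value with the given library placed first.
# $1: library path, $2: existing LD_PRELOAD value (may be empty).
prepend_preload() {
    if [ -n "$2" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$1"
    fi
}

# path_to_lib is a placeholder, as in the command above.
# /usr/local/lib is an assumption, not a value confirmed by this guide.
path_to_lib=${path_to_lib:-/usr/local/lib}
lib="$path_to_lib/libjemalloc.so"

if [ -f "$lib" ]; then
    LD_PRELOAD=$(prepend_preload "$lib" "${LD_PRELOAD:-}")
    export LD_PRELOAD
    echo "LD_PRELOAD=$LD_PRELOAD"
else
    echo "libjemalloc.so not found in $path_to_lib; LD_PRELOAD left unchanged" >&2
fi
```

Run the snippet in the same shell session that later starts the service so that the exported variable is inherited by the service process.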