Memory Allocator Configuration
jemalloc is a general-purpose memory allocator designed to outperform traditional allocators such as the one in glibc. It reduces memory fragmentation and improves allocation efficiency under multi-threaded, high-concurrency workloads, which lets the system make fuller use of multi-core architectures.
During memory allocation, lock contention forces threads to wait, which degrades performance. jemalloc avoids this contention by using thread-specific variables: each thread maintains its own dedicated memory manager, and allocations are served locally within the thread, so threads do not need to compete with one another for locks.
Enabling or Disabling jemalloc
You can configure environment variables to enable or disable the jemalloc memory allocator.
Enable jemalloc:
export LD_PRELOAD=/usr/local/Ascend/cann/lib64/libjemalloc.so
Disable jemalloc:
unset LD_PRELOAD
The example uses the default Toolkit installation path, which is /usr/local/Ascend/cann for the root user and ${HOME}/Ascend/cann for a non-root user. Replace /usr/local/Ascend/cann with the actual CANN software installation path.
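Because the library path depends on where CANN is installed, it can help to check that the file exists before preloading it. The following is a minimal sketch, not part of the CANN tooling: the function names enable_jemalloc and disable_jemalloc are hypothetical, and the default argument assumes the root-user Toolkit path described above.

```shell
# Hypothetical helper: preload jemalloc from a given CANN installation root.
# $1 is the installation root; it defaults to the root-user Toolkit path.
enable_jemalloc() {
    jemalloc_lib="${1:-/usr/local/Ascend/cann}/lib64/libjemalloc.so"
    if [ -f "$jemalloc_lib" ]; then
        # Same setting as the export command above, applied only if the file exists.
        export LD_PRELOAD="$jemalloc_lib"
        echo "jemalloc enabled: $jemalloc_lib"
    else
        echo "libjemalloc.so not found: $jemalloc_lib" >&2
        return 1
    fi
}

# Hypothetical helper: restore the default allocator for new processes.
disable_jemalloc() {
    unset LD_PRELOAD
}
```

A non-root user would call enable_jemalloc "${HOME}/Ascend/cann". LD_PRELOAD is read when a process starts, so the setting affects only processes launched after it is exported, not ones that are already running.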
Applicable Scenarios
According to developer testing, jemalloc provides a modest improvement in inference performance in MindIE-based inference scenarios. The following table shows model performance results measured with the MindIE Benchmark tool. Because hardware and software configurations vary, the test data is for reference only and should not be treated as a performance standard.
| Model | Concurrency | Input Length | Experiment No. | jemalloc Disabled (Tokens/s) | jemalloc Enabled (Tokens/s) | Performance Gains (%) |
|---|---|---|---|---|---|---|
| Qwen2-7B | 1 | 128 | Experiment 1 | 151.123 | 155.3789 | 2.82 |
| Qwen2-7B | 1 | 128 | Experiment 2 | 151.2461 | 155.362 | 2.72 |
| Qwen2-7B | 1 | 128 | Experiment 3 | 151.5677 | 154.433 | 1.89 |
| Qwen2-7B | 1 | 128 | Average value | 151.3122667 | 155.0579667 | 2.48 |
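The Performance Gains column is consistent with the relative change in throughput, (enabled - disabled) / disabled × 100. The sketch below recomputes it from the table values so that you can apply the same calculation to your own measurements. The function name perf_gain is hypothetical, and the formula is inferred from the table values rather than stated by the MindIE Benchmark tool.

```shell
# Hypothetical helper: percentage gain of the jemalloc-enabled throughput ($2)
# over the jemalloc-disabled throughput ($1), rounded to two decimal places.
perf_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (after - before) / before * 100 }'
}

perf_gain 151.123 155.3789          # Experiment 1
perf_gain 151.2461 155.362          # Experiment 2
perf_gain 151.5677 154.433          # Experiment 3
perf_gain 151.3122667 155.0579667   # Average value
```

Comparing the output with the table is a quick way to confirm that your own runs are being evaluated the same way.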