ACLNN_CACHE_LIMIT
Description
Sets the number of operator information entries cached on the host for a single-operator API. The cached operator information includes the workspace size, operator executor, and tiling information.
The value range is [1,10000000]. The default value is 10000.
Manually setting ACLNN_CACHE_LIMIT is not recommended in common scenarios; retain the default value. In dynamic-shape scenarios where the operator shape range is large, you can increase the value of this environment variable to improve scheduling performance, at the cost of higher host memory overhead. For details, see Restrictions.
Example
export ACLNN_CACHE_LIMIT=10000
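The value range given in Description can be checked before the variable is exported. The following is a minimal sketch, assuming a POSIX-compatible shell; the function name set_aclnn_cache_limit is hypothetical and is not part of any toolkit.

```shell
# Hypothetical helper (not a toolkit command): export ACLNN_CACHE_LIMIT only
# if the argument is an integer in the documented range [1,10000000].
set_aclnn_cache_limit() {
    value="$1"
    # Reject empty input and anything containing a non-digit character.
    case "$value" in
        ''|*[!0-9]*)
            echo "ACLNN_CACHE_LIMIT must be a positive integer: '$value'" >&2
            return 1
            ;;
    esac
    # 10000000 has 8 digits; longer inputs are rejected before the numeric
    # comparison so that very large numbers cannot overflow it.
    if [ "${#value}" -gt 8 ] || [ "$value" -lt 1 ] || [ "$value" -gt 10000000 ]; then
        echo "ACLNN_CACHE_LIMIT out of range [1,10000000]: '$value'" >&2
        return 1
    fi
    export ACLNN_CACHE_LIMIT="$value"
}

set_aclnn_cache_limit 10000
```

Because the function runs export in the current shell, define or source it in the same shell session that launches the application.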
Restrictions
- The single-operator cache is managed per thread. Each thread has its own cache, independent of the others. ACLNN_CACHE_LIMIT specifies the number of operator cache entries for each thread, so the total number of cache entries grows with the number of threads.
Each operator cache entry occupies about 2 KB of host memory. The total single-operator cache is calculated as follows: ACLNN_CACHE_LIMIT x Number of threads x 2 KB.
For example, with 10 threads and ACLNN_CACHE_LIMIT set to 100000, the total single-operator cache is 10 x 100000 x 2 KB = 2,000,000 KB, or about 2 GB.
- The cache of a fusion operator (large kernel operator) is managed in an independent memory pool at the process level. Each cache entry occupies about 20 KB of host memory. The total fusion operator cache is calculated as follows: ACLNN_CACHE_LIMIT x 20 KB.
- You are advised to set ACLNN_CACHE_LIMIT based on the total host memory size, the number of threads, and the size of each operator cache entry. An excessively large value of ACLNN_CACHE_LIMIT may cause high host memory usage and degrade scheduling performance.
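The two formulas above can be combined into a rough estimate before choosing a value. The following is a minimal sketch, assuming a shell with POSIX arithmetic expansion; the function name estimate_aclnn_cache_kb is hypothetical, the per-entry sizes are the approximate figures given above, and adding the two pools together as a worst case (both caches full) is an assumption.

```shell
# Hypothetical helper: estimate worst-case host memory used by operator caches.
# Assumed sizes: about 2 KB per single-operator entry (one cache per thread)
# and about 20 KB per fusion-operator entry (one pool per process).
estimate_aclnn_cache_kb() {
    limit="$1"     # candidate ACLNN_CACHE_LIMIT value
    threads="$2"   # number of threads that call single-operator APIs
    single_kb=$((limit * threads * 2))
    fusion_kb=$((limit * 20))
    echo "single_operator_kb=$single_kb fusion_operator_kb=$fusion_kb total_kb=$((single_kb + fusion_kb))"
}

estimate_aclnn_cache_kb 100000 10
# prints: single_operator_kb=2000000 fusion_operator_kb=2000000 total_kb=4000000
```

Compare the estimate with the available host memory (for example, the MemAvailable field in /proc/meminfo on Linux) before raising the limit.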