ACLNN_CACHE_LIMIT

Description

Sets the number of operator information entries cached on the host for an aclnn API. The cached operator information includes the workspace size, operator executor, and tiling information.

The value range is [1,10000000]. The default value is 10000.

In common scenarios, use the default value of ACLNN_CACHE_LIMIT. In dynamic shape scenarios where the operator shape range is large, you can increase the value of this environment variable to improve scheduling performance, at the cost of higher host memory overhead. For details, see Restrictions.

Example

export ACLNN_CACHE_LIMIT=10000
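
When the value is raised for a dynamic shape workload, it can help to check it against the documented range [1,10000000] before exporting it. The following is a minimal sketch of such a check; set_aclnn_cache_limit is a hypothetical helper written for illustration and is not part of CANN, and 50000 is an arbitrary example value.

```shell
# Hypothetical helper (not part of CANN): validate a value against the
# documented range [1,10000000] and export it only if it is valid.
set_aclnn_cache_limit() {
    local value="$1"

    # Reject empty input and anything that is not a plain decimal integer.
    case "$value" in
        ''|*[!0-9]*)
            echo "ACLNN_CACHE_LIMIT must be a positive integer" >&2
            return 1
            ;;
    esac

    # 10000000 has 8 digits; longer inputs are out of range
    # (this also avoids integer overflow in the comparison below).
    if [ "${#value}" -gt 8 ] || [ "$value" -lt 1 ] || [ "$value" -gt 10000000 ]; then
        echo "ACLNN_CACHE_LIMIT must be in the range [1,10000000]" >&2
        return 1
    fi

    export ACLNN_CACHE_LIMIT="$value"
}

# Example value chosen for illustration only.
set_aclnn_cache_limit 50000
```

If the value is rejected, the function returns a non-zero status and leaves any previously exported ACLNN_CACHE_LIMIT unchanged.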

Restrictions

  • The single-operator cache is managed per thread. Each thread has its own cache, independent of the caches of other threads. ACLNN_CACHE_LIMIT specifies the number of operator cache entries for each thread, so the total number of operator cache entries grows with the number of threads.

    Each operator cache entry occupies about 2 KB of host memory. The total single-operator cache size is calculated as follows: ACLNN_CACHE_LIMIT × Number of threads × 2 KB.

    For example, if there are 10 threads and ACLNN_CACHE_LIMIT is set to 100000, the total single-operator cache size is 10 × 100000 × 2 KB = 2,000,000 KB, or about 2 GB.

  • The cache of a fusion operator (large kernel operator) is managed in an independent, process-level memory pool. Each cache entry occupies about 20 KB of host memory. The total fusion operator cache size is calculated as follows: ACLNN_CACHE_LIMIT × 20 KB.
  • Set ACLNN_CACHE_LIMIT based on the total host memory size, the number of threads, and the size of each operator cache entry. An excessively large value of ACLNN_CACHE_LIMIT may cause high host memory usage and degrade scheduling performance.
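
The two formulas above can be evaluated before choosing a value. The following is a rough sketch that only restates them in shell arithmetic. The function names and the thread count are assumptions made for illustration, and the 2 KB and 20 KB entry sizes are the approximate figures given above, so the results are estimates rather than measured memory usage.

```shell
# Hypothetical helpers: estimate host memory used by the operator caches.
# The per-entry sizes (about 2 KB and about 20 KB) are the approximate
# figures from the restrictions above.

# Single-operator cache, in KB: ACLNN_CACHE_LIMIT x number of threads x 2 KB.
estimate_single_op_cache_kb() {
    local limit="$1"
    local threads="$2"
    echo $(( limit * threads * 2 ))
}

# Fusion operator cache, in KB: ACLNN_CACHE_LIMIT x 20 KB (process-level pool).
estimate_fusion_op_cache_kb() {
    local limit="$1"
    echo $(( limit * 20 ))
}

limit="${ACLNN_CACHE_LIMIT:-10000}"   # fall back to the documented default
threads=10                            # assumed thread count, for illustration

echo "single-operator cache: about $(estimate_single_op_cache_kb "$limit" "$threads") KB"
echo "fusion operator cache: about $(estimate_fusion_op_cache_kb "$limit") KB"
```

Comparing both estimates with the available host memory gives a quick check on whether a candidate value is reasonable for a given number of threads.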

Applicability

Atlas inference products

Atlas training products

Atlas A2 training products/Atlas A2 inference products

Atlas A3 training products/Atlas A3 inference products