Setting Shape Data Cache to Improve Performance
To improve inference performance for models with dynamic-shape inputs, set the environment variable HOST_CACHE_CAPACITY to configure the data cache used during dynamic-shape execution. The default value is 0, which disables the data cache. If the variable is set to a positive integer, for example 10, the system caches execution data for the 10 most frequently occurring input shapes. When one of those shapes appears again, the cache is hit, which improves host execution performance. However, host memory usage increases in proportion to the value of the environment variable and the model size.
export HOST_CACHE_CAPACITY=10
Note that, to enable the cache, HOST_CACHE_CAPACITY must be an integer in the range [1, 2147483647], where 2147483647 is the maximum value of the INT32 type. If the value exceeds this maximum, the data cache is disabled.
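Because a value above the INT32 maximum disables the cache rather than raising an error, it can be useful to check the value before exporting it. The following is a minimal sketch, not part of the toolkit: the function name set_host_cache_capacity is hypothetical, and rejecting leading zeros and non-numeric input is a conservative assumption rather than documented behavior.

```shell
# Hypothetical helper: validate a candidate value, then export it.
# Only the [1, 2147483647] range and the meaning of 0 come from the text
# above; everything else here is an illustrative assumption.
set_host_cache_capacity() {
    # Reject empty input and anything that is not plain decimal digits.
    case $1 in
        '' | *[!0-9]*)
            echo "HOST_CACHE_CAPACITY must be a non-negative integer: '$1'" >&2
            return 1
            ;;
    esac

    # Reject leading zeros (assumption: avoids ambiguous parsing).
    case $1 in
        0?*)
            echo "HOST_CACHE_CAPACITY must not have leading zeros: '$1'" >&2
            return 1
            ;;
    esac

    # Check the digit count first so that very long inputs never reach the
    # numeric comparison; 2147483647 is the INT32 maximum.
    if [ "${#1}" -gt 10 ] || [ "$1" -gt 2147483647 ]; then
        echo "HOST_CACHE_CAPACITY exceeds 2147483647; the cache would be disabled: '$1'" >&2
        return 1
    fi

    # 0 is accepted: it is the default and leaves the cache disabled.
    export HOST_CACHE_CAPACITY="$1"
}
```

For example, `set_host_cache_capacity 10` exports the value, while `set_host_cache_capacity 2147483648` prints an error and leaves the environment unchanged. Define the function in the same shell session that launches the inference process so that the exported variable is inherited.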