Issue Description

Migrating a model from another (non-Ascend) device to Ascend devices for inference can expose performance problems. These differ from the problems encountered during training and usually concern out-of-the-box performance optimization. The typical symptom is that model inference on Ascend devices performs poorly, for example with lower throughput than on other products.

The possible causes fall into two categories: computation issues and scheduling issues.

  • Computation issues: Compute time on some cards is far above the normal range, typically because the model is complex or the amount of data being processed is large.
  • Scheduling issues: A compute card shows high free (idle) time, which indicates abnormal data transfer from the host to the device. Possible causes include limited CPU capacity or background processes consuming CPU resources while the model runs.
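
The two categories above can be told apart from per-card timing data. The following is a minimal sketch, assuming you have already extracted each card's compute time and free time per step from a profiling run. The record layout (DeviceStep), the function name (classify), and both thresholds are hypothetical and chosen only for illustration; they are not part of any Ascend tool or API.

```python
from dataclasses import dataclass


@dataclass
class DeviceStep:
    """Per-step timing for one card, in milliseconds (hypothetical layout)."""
    device_id: int
    compute_ms: float  # time spent executing operators
    free_ms: float     # time the card sat idle waiting for work


def classify(step, baseline_compute_ms, slow_factor=1.5, free_ratio_limit=0.2):
    """Return 'computation', 'scheduling', 'both', or 'normal' for one card.

    slow_factor and free_ratio_limit are illustrative thresholds only,
    not values recommended by any Ascend documentation.
    """
    total = step.compute_ms + step.free_ms
    free_ratio = step.free_ms / total if total > 0 else 0.0

    # Computation issue: compute time far above the expected baseline.
    slow_compute = step.compute_ms > slow_factor * baseline_compute_ms
    # Scheduling issue: a large share of the step is spent idle.
    high_free = free_ratio > free_ratio_limit

    if slow_compute and high_free:
        return "both"
    if slow_compute:
        return "computation"
    if high_free:
        return "scheduling"
    return "normal"


# Made-up sample timings for three cards.
steps = [
    DeviceStep(0, compute_ms=100.0, free_ms=5.0),
    DeviceStep(1, compute_ms=240.0, free_ms=6.0),
    DeviceStep(2, compute_ms=98.0, free_ms=60.0),
]
baseline = 100.0  # assumed normal compute time per step, in ms

results = {s.device_id: classify(s, baseline) for s in steps}
for dev, label in results.items():
    print(f"device {dev}: {label}")
```

With these sample numbers, card 1 is flagged for computation and card 2 for scheduling. In practice the baseline and thresholds would need to be derived from your own measurements of a healthy run.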