Execution Timeout Error of the AI CPU Operator
Symptom
One of the following timeout errors is reported during operator execution.
- Symptom 1
- The error code E39999 is reported during Runtime execution. The Runtime error messages "PrintAicpuErrorInfo" and "ErrCode=507018, desc=[aicpu exception]" are printed in the plog file on the host.
- In addition, the device log of the AI CPU contains the error message "HandleTaskTimeout".
This symptom matches the error message described in Possible Cause > Example 3 of Kernel Execution Error of the AI CPU Operator.
- Symptom 2
An error is reported during Runtime execution. The Runtime error messages "PrintAicpuErrorInfo" and "ErrCode=507017, desc=[aicpu timeout]" are printed in the plog file.
The plog file is stored in $HOME/ascend/log/[run|debug]/plog by default, in the format of plog-pid_yyymmddhhmmss.log.
[ERROR] RUNTIME(16243,msame):2022-09-22-11:27:01.794.510 [api_c.cc:661]16243 rtStreamSynchronize:[EXEC][DEFAULT]ErrCode=507017, desc=[aicpu timeout], InnerCode=0x715002a
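When scanning plog files, the two symptoms can be told apart by the value of the ErrCode field. The following is a minimal C++ sketch of such a check, not part of the Ascend toolkit. The names AicpuErrorKind and ClassifyPlogLine are hypothetical; only the error codes 507017 and 507018 come from the messages above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical names for illustration; only the two error codes come from the plog messages.
enum class AicpuErrorKind { Other, Exception, Timeout };

// Extracts the digits after "ErrCode=" and maps them to the symptom they indicate.
AicpuErrorKind ClassifyPlogLine(const std::string& line) {
    const std::string key = "ErrCode=";
    const std::size_t pos = line.find(key);
    if (pos == std::string::npos) {
        return AicpuErrorKind::Other;  // no error code on this line
    }
    const std::size_t begin = pos + key.size();
    std::size_t end = begin;
    while (end < line.size() && std::isdigit(static_cast<unsigned char>(line[end]))) {
        ++end;
    }
    const std::string code = line.substr(begin, end - begin);
    if (code == "507018") return AicpuErrorKind::Exception;  // Symptom 1: aicpu exception
    if (code == "507017") return AicpuErrorKind::Timeout;    // Symptom 2: aicpu timeout
    return AicpuErrorKind::Other;
}
```

Passing the sample log line above to ClassifyPlogLine yields AicpuErrorKind::Timeout, which corresponds to Symptom 2.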
Possible Cause
- The operator input/output shape is too large, resulting in slow operator execution.
- The hardware performance is insufficient for the complex computation of a large number of operators.
Solution
- Call the aclrtSetOpExecuteTimeOut API to increase the operator execution timeout interval.
The API prototype is defined as follows:
aclError aclrtSetOpExecuteTimeOut(uint32_t timeout); // timeout, in seconds.
- If the error persists, collect the logs and contact technical support for troubleshooting.
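The call to aclrtSetOpExecuteTimeOut can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the wrapper name RaiseOpTimeout is hypothetical, a return value of 0 is assumed to indicate success, and the stand-in definition exists only so that the sketch compiles where the acl/acl.h header is not installed. The call is presumably made after AscendCL initialization and before the affected operators are executed; check the API reference for the exact constraints.

```cpp
#include <cstdint>
#include <cstdio>

// Use the real AscendCL header when it is available on the include path.
#if defined(__has_include)
#  if __has_include("acl/acl.h")
#    include "acl/acl.h"
#    define HAVE_ASCEND_ACL 1
#  endif
#endif

#ifndef HAVE_ASCEND_ACL
// Stand-in for illustration only, NOT the real implementation.
// It mirrors the prototype shown above so the sketch compiles without the toolkit.
typedef int aclError;
static aclError aclrtSetOpExecuteTimeOut(uint32_t timeout) {
    (void)timeout;
    return 0;  // assumption: 0 denotes success
}
#endif

// Hypothetical wrapper: sets the operator execution timeout, in seconds.
// Returns true on success and logs the error code otherwise.
bool RaiseOpTimeout(uint32_t seconds) {
    const aclError ret = aclrtSetOpExecuteTimeOut(seconds);
    if (ret != 0) {
        std::fprintf(stderr, "aclrtSetOpExecuteTimeOut(%u) failed, ret=%d\n",
                     static_cast<unsigned>(seconds), static_cast<int>(ret));
        return false;
    }
    return true;
}
```

For example, RaiseOpTimeout(120) requests a 120-second timeout. The value is illustrative; choose one that fits the shapes and hardware involved.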