Failure to Restart a User Process After Unexpected Exit
Symptom
A user process cannot be restarted after it exits unexpectedly. Log messages similar to the following are displayed.
AscendCL log message: aclrtProcessReport failed
aclrtProcessReport failed, ret = 107012
aclrtProcessReport failed, ret = 107012
Runtime log message: halResourceIdAlloc xxx failed
[ERROR] DRV(2086,rtstest_host):2021-06-09-02:14:46.034.368 [ascend][curpid: 2086, 2086][drv][tsdrv][halResourceIdAlloc 477]id is exhausted, type(0 stream), range[0, 1024), dev_id(0), tsid(0).
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.380 [npu_driver.cc:285]2086 StreamIdAlloc:[driver interface] halResourceIdAlloc streamid failed: device_id=0, tsId=0, drvRetCode=48!
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.401 [stream.cc:448]2086 Setup:Failed to alloc stream id, retCode=0x702001a.
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.416 [context.cc:1251]2086 StreamCreate:Setup stream failed, retCode=0x702001a.
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.440 [logger.cc:211]2086 StreamCreate:Create stream failed, priority=7 ,flags=0.
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.458 [api_c.cc:461]2086 rtStreamCreateWithFlags:ErrCode=207008, desc=[driver error:no stream resource], InnerCode=0x702001a
[ERROR] RUNTIME(2086,rtstest_host):2021-06-09-02:14:46.034.469 [error_message_manage.cc:26]2086 ReportFuncErrorReason:rtStreamCreateWithFlags execute failed, reason=[driver error:no stream resource]
Possible Cause
According to the log, the allocation of resources such as public task IDs, stream IDs, and event IDs fails. The possible causes are as follows:
- Resources are used up by other processes.
- Resources are not destroyed when the previous process exits.
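To confirm which resource type ran out, the driver's "id is exhausted" message in the Symptom log can be parsed mechanically. The following is a minimal sketch, not an official tool: the regular expression is derived only from the single sample line shown above, so the exact message layout on other driver versions is an assumption, and the function name find_exhausted_resources is hypothetical.

```python
import re

# Pattern inferred from the sample DRV log line in this topic (assumption:
# other driver versions may format the message differently).
EXHAUSTED_RE = re.compile(
    r"id is exhausted, type\((?P<type_id>\d+) (?P<type_name>\w+)\), "
    r"range\[(?P<low>\d+), (?P<high>\d+)\), "
    r"dev_id\((?P<dev_id>\d+)\), tsid\((?P<tsid>\d+)\)"
)


def find_exhausted_resources(log_text):
    """Return one dict per 'id is exhausted' message found in log_text."""
    results = []
    for match in EXHAUSTED_RE.finditer(log_text):
        results.append({
            "type_name": match.group("type_name"),  # e.g. "stream"
            "type_id": int(match.group("type_id")),
            "low": int(match.group("low")),          # inclusive lower bound
            "high": int(match.group("high")),        # exclusive upper bound
            "dev_id": int(match.group("dev_id")),
            "tsid": int(match.group("tsid")),
        })
    return results


# The first line of the sample runtime log from the Symptom section.
sample = (
    "[ERROR] DRV(2086,rtstest_host):2021-06-09-02:14:46.034.368 "
    "[ascend][curpid: 2086, 2086][drv][tsdrv][halResourceIdAlloc 477]"
    "id is exhausted, type(0 stream), range[0, 1024), dev_id(0), tsid(0)."
)
hits = find_exhausted_resources(sample)
```

For the sample line, the parsed record reports the stream resource type on device 0 with the ID range [0, 1024), which matches the "no stream resource" reason reported by the runtime.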
Solution
To rectify the fault, perform the following steps:
- Wait one minute before restarting the process so that the resources of the previous process have time to be destroyed.
- Stop the other processes that are using the resources, or restart the process after those processes are complete.
- If the resource allocation failure persists, check whether the number of available resources exceeds the upper limit. If it does not, restart the environment to forcibly destroy the resources and restore the environment.
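The wait-and-restart step can be automated with a small launcher. The following is a minimal sketch under stated assumptions: it treats any nonzero exit code as a failed start (the launcher cannot tell resource exhaustion apart from other failures), the function name restart_with_wait and the command ./my_acl_app are hypothetical, and the 60-second default mirrors the one-minute wait recommended above.

```python
import subprocess
import sys
import time


def restart_with_wait(cmd, max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Run cmd; after a nonzero exit, wait and try again.

    Returns (exit_code, attempts_used). The sleep parameter is injectable
    so the waiting behavior can be tested without real delays.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    exit_code = None
    for attempt in range(1, max_attempts + 1):
        exit_code = subprocess.run(cmd).returncode
        if exit_code == 0:
            return exit_code, attempt
        if attempt < max_attempts:
            # Give the driver time to destroy the previous process's resources.
            sleep(wait_seconds)
    return exit_code, max_attempts


# Example call (hypothetical application path):
#   exit_code, attempts = restart_with_wait(["./my_acl_app"])
```

If every attempt fails, the launcher returns the last exit code so that the caller can move on to the remaining steps, such as stopping other processes or restarting the environment.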