FAQ
Symptom
The following error information is displayed when the npu-smi info command is executed or a service problem occurs:

Possible Causes
- NPU card being occupied
- Incorrect ownership configuration
Method 1: Restart the container with an idle NPU card.
- Query the NPU cards that are already mapped into running containers.
docker inspect $(docker ps -q) | grep davinci
The command output is:
root@ubuntu:~# docker inspect $(docker ps -q) | grep davinci
"PathOnHost": "/dev/davinci4",
"PathInContainer": "/dev/davinci4",
"PathOnHost": "/dev/davinci5",
"PathInContainer": "/dev/davinci5",
"PathOnHost": "/dev/davinci6",
"PathInContainer": "/dev/davinci6",
"PathOnHost": "/dev/davinci7",
"PathInContainer": "/dev/davinci7",
"PathOnHost": "/dev/davinci_manager",
"PathInContainer": "/dev/davinci_manager",
- Modify the container startup script.
--device=/dev/davinci0 \
--device=/dev/davinci1 \
--device=/dev/davinci2 \
--device=/dev/davinci3 \
...
- Start the container with NPU cards that do not appear in the query output (davinci0 to davinci3 in this example).
bash docker_start_infer.sh ascendhub.huawei.com/public-ascendhub/XXX:XXX /data/ /home/HwHiAiUser/XXX/
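The query step in Method 1 can be sketched as a small shell helper that lists the cards not yet mapped into any running container. This is only an illustrative sketch: the function names used_npus and idle_npus are hypothetical, and it assumes that the cards appear on the host as /dev/davinciN and that grep supports the -o option.

```shell
# Hypothetical helper: read `docker inspect` output on stdin and print
# the /dev/davinciN host paths that are mapped into containers.
used_npus() {
    grep -oE '"PathOnHost": *"/dev/davinci[0-9]+"' \
        | grep -oE '/dev/davinci[0-9]+' \
        | sort -u
}

# Hypothetical helper: read the list of used devices on stdin and print
# every device passed as an argument that is not in that list.
idle_npus() {
    used_list="$(cat)"
    for dev in "$@"; do
        if ! printf '%s\n' "$used_list" | grep -qxF -- "$dev"; then
            printf '%s\n' "$dev"
        fi
    done
}

# Usage on a host with Docker and at least one running container:
#   docker inspect $(docker ps -q) | used_npus | idle_npus /dev/davinci[0-9]*
```

The glob /dev/davinci[0-9]* deliberately skips /dev/davinci_manager, which is shared by all containers and is not an NPU card.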
Method 2: Delete the privileged container parameter from the container startup script.
- Delete the --privileged parameter from the container startup script.
docker run -it \
--device=/dev/davinci0 \
--device=/dev/davinci_manager \
--device=/dev/devmm_svm \
--device=/dev/hisi_hdc \
--privileged \ # Needs to be deleted.
-v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
-v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
- Start the container.
bash docker_start_infer.sh ascendhub.huawei.com/public-ascendhub/XXX:XXX /data/ /home/HwHiAiUser/XXX/
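As a rough sketch of the deletion in Method 2, the following filter removes the --privileged line from a script read on standard input. The function name strip_privileged is hypothetical, and the sketch assumes that --privileged sits on a line of its own (optionally followed by a line-continuation backslash), as in the excerpt above. Review the result before replacing the real startup script.

```shell
# Hypothetical helper: copy stdin to stdout, dropping any line that
# contains only "--privileged" with an optional trailing backslash.
strip_privileged() {
    sed -E '/^[[:space:]]*--privileged[[:space:]]*\\?[[:space:]]*$/d'
}

# Usage:
#   strip_privileged < docker_start_infer.sh > docker_start_infer.sh.new
#
# After restarting, the privileged flag of a container can be checked with:
#   docker inspect --format '{{.HostConfig.Privileged}}' CONTAINER
# where CONTAINER is a placeholder for the container name or ID.
```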
Method 3: Modify the ownership configuration.
- Run the id HwHiAiUser command on the physical machine and check whether the user ID (UID) and group ID (GID) it reports match those of the container user. If they differ, change the user ID on the physical machine.
- After changing the user ID on the physical machine, uninstall the driver and firmware, restart the physical machine, and then reinstall the driver and firmware. Then configure the owner and owner group IDs for the HwHiAiUser and hwMindX users.
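The ID comparison in Method 3 could be scripted along the following lines. This is a sketch under assumptions: the function names uid_gid and compare_ids are hypothetical, CONTAINER is a placeholder for the container name or ID, and the container image is assumed to provide the id command.

```shell
# Hypothetical helper: print "UID:GID" for the given user name.
uid_gid() {
    printf '%s:%s\n' "$(id -u "$1")" "$(id -g "$1")"
}

# Hypothetical helper: compare two "UID:GID" strings (host first,
# container second) and report whether they match.
compare_ids() {
    if [ "$1" = "$2" ]; then
        echo "match"
    else
        echo "mismatch: host=$1 container=$2"
        return 1
    fi
}

# Usage on the physical machine:
#   host_ids="$(uid_gid HwHiAiUser)"
#   container_ids="$(docker exec CONTAINER sh -c \
#       'printf "%s:%s\n" "$(id -u HwHiAiUser)" "$(id -g HwHiAiUser)"')"
#   compare_ids "$host_ids" "$container_ids"
```

A mismatch only indicates that the IDs differ; the ID change and the driver and firmware reinstallation described above still have to be carried out manually.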