No NPU Is Mounted to a Job Container

Symptom

Running the following command in the job container lists no device files, which indicates that no NPU device is mounted to the container.

ls /dev/davinci*
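
As a minimal sketch (not part of the official tooling), the same check can be wrapped in a small shell function that counts the matching device nodes instead of relying on the output of ls. The function name count_davinci and its optional directory argument are hypothetical; the argument exists only so that the function can be tried outside a real container.

```shell
# Hypothetical helper: count entries matching davinci* under a device
# directory (default: /dev). Prints 0 when nothing matches.
count_davinci() {
    dev_dir="${1:-/dev}"
    count=0
    for node in "$dev_dir"/davinci*; do
        # An unmatched glob stays literal, so confirm the entry exists.
        if [ -e "$node" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

Calling count_davinci with no argument inside the job container and getting 0 corresponds to the symptom described here.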

Causes

The startup parameter useAscendDocker of Ascend Device Plugin defaults to true. This setting requires Ascend Docker Runtime, so NPUs are not mounted in either of the following cases:

  • Ascend Docker Runtime is not installed in the environment.
  • Ascend Docker Runtime is installed, but the Docker service has not been restarted since the installation.

Solution

For cause 1 (Ascend Docker Runtime is not installed): Install Ascend Docker Runtime as described in "Installing the Ascend Docker Runtime" under "Installing Cluster Scheduling Components", restart the Docker service, delete the old job, and deliver the job again.

For cause 2 (the Docker service was not restarted after installation): Restart the Docker service, delete the old job, and deliver the job again.
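
The restart and redelivery steps can be sketched as follows. This is an illustration under several assumptions, not a documented procedure: the host manages Docker with systemd, kubectl is usable on the same host (otherwise run the kubectl steps on the management node), and the job was delivered from a YAML manifest. The function name restart_and_redeliver and the manifest path are hypothetical.

```shell
# Hypothetical helper, defined here but not invoked.
# $1: path to the job's YAML manifest (an assumption; use your own file).
restart_and_redeliver() {
    job_yaml="$1"
    # Restart Docker so that it picks up the newly installed runtime.
    systemctl restart docker || return 1
    # Delete the old job, then deliver it again from the same manifest.
    kubectl delete -f "$job_yaml"
    kubectl apply -f "$job_yaml"
}
```

For example, restart_and_redeliver job.yaml, where job.yaml stands for the manifest of the affected job.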

Ascend Docker Runtime automatically mounts the specified NPUs to a container. To check whether Docker uses it as the default runtime, run the following command:

docker info 2>&1 | grep "Default Runtime"

If the command output contains ascend, Docker uses Ascend Docker Runtime as its default runtime. Example:

Default Runtime: ascend
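
For scripted checks, the same test can be expressed as a small filter over the docker info output. This is a minimal sketch; the function name is_ascend_default is hypothetical, and it assumes the output contains a "Default Runtime:" line as in the example above.

```shell
# Hypothetical helper: reads `docker info` output on stdin and succeeds
# only if the default runtime is reported as "ascend".
is_ascend_default() {
    grep -Eq 'Default Runtime:[[:space:]]*ascend[[:space:]]*$'
}

# Example (requires Docker on the host):
#   docker info 2>&1 | is_ascend_default && echo "ascend is the default runtime"
```

A non-zero exit status means the default runtime is something else, in which case the causes listed earlier apply.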