0x00131003 NPU Resource Allocation Failure on a Container (Major Alarm)

Alarm Description

Alarm message: The NPU resources allocated for the container are abnormal.

This alarm is generated when NPU resources fail to be allocated to a container. It is cleared when the NPU resources are successfully allocated or the container is deleted.

Module that generates this alarm: AtlasEdge

Attribute

Table 1 Alarm information

Alarm ID      Alarm Severity    Auto Clear
0x00131003    Major             Yes

Impact on the System

Containers cannot use NPU resources.

Possible Cause

The NPU is abnormal, or the number of NPU resources requested exceeds the upper limit.

Procedure

  1. Log in to the CLI of the device and run the docker inspect $(docker ps -q) | grep Devices -n3 command to view the Devices information of each running container. Check whether the number of containers whose Devices information is not empty exceeds the number of NPUs.
    • If yes, reallocate NPU resources so that the limit is not exceeded and delete unnecessary containers that use NPUs. Then check whether the alarm is cleared. If the alarm persists, go to 2.
    • If no, go to 2.
  2. Contact Huawei technical support.
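
The check in step 1 can also be scripted. The following is a minimal sketch, not an official tool. It assumes that Python 3 is available on the device, that the Devices information shown by docker inspect is the HostConfig.Devices field of each container, and that you already know the number of NPUs on the device. The function names and the npu_count parameter are hypothetical and chosen only for illustration.

```python
import json
import subprocess


def containers_with_devices(inspect_data):
    """Return the names of containers whose HostConfig.Devices list is not empty.

    inspect_data is the parsed JSON array printed by `docker inspect`.
    """
    names = []
    for container in inspect_data:
        host_config = container.get("HostConfig") or {}
        # Docker reports null or an empty list when no device is mapped.
        devices = host_config.get("Devices") or []
        if devices:
            names.append(container.get("Name") or container.get("Id", "<unknown>"))
    return names


def exceeds_npu_count(inspect_data, npu_count):
    """Return True if more containers have devices mapped than there are NPUs.

    npu_count is supplied by the caller (assumption: known for the device).
    """
    return len(containers_with_devices(inspect_data)) > npu_count


def inspect_running_containers():
    """Run `docker ps -q` and `docker inspect` and return the parsed JSON."""
    ids = subprocess.run(
        ["docker", "ps", "-q"], capture_output=True, text=True, check=True
    ).stdout.split()
    if not ids:
        return []  # no running containers, nothing to inspect
    output = subprocess.run(
        ["docker", "inspect", *ids], capture_output=True, text=True, check=True
    ).stdout
    return json.loads(output)
```

For example, on a device assumed to have two NPUs, exceeds_npu_count(inspect_running_containers(), npu_count=2) returns True when more than two running containers have a non-empty Devices list, which corresponds to the "If yes" branch of step 1. Note that the sketch counts every mapped device, not only NPU devices, in the same way as the grep command in step 1.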