Real-Time Device Status
Function
Queries the real-time status of a running device.
Parameters
Run either of the following commands to view the parameters available for querying the real-time device status:
ascend-dmi -i -h
ascend-dmi -i --help
Table 1 lists only test-specific parameters. For details about other common parameters, see Common Parameters.
| Parameter | Description | Mandatory |
|---|---|---|
| [-i, --info] | Displays the real-time status of a device. | Yes |
| [-b, --brief] | Displays basic information about a chip. | No |
| [-dt, --dt, --detail] | Displays detailed information about a chip. | No |
| Leave --dt and -b unspecified | Displays basic information about a chip by default. | No |
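The way the options in Table 1 combine can be sketched as a small shell helper that prints the command line for each display mode. The function name `status_cmd` and the mode names are hypothetical and exist only for this illustration; the flags themselves are the ones listed in the table.

```shell
#!/bin/sh
# Hypothetical helper: print the ascend-dmi command line for a display mode.
# Only the flags (-i, -b, --dt) come from Table 1; the mode names
# ("default", "brief", "detail") are assumptions made for this sketch.
status_cmd() {
    case "${1:-default}" in
        default) echo "ascend-dmi -i" ;;        # basic information by default
        brief)   echo "ascend-dmi -i -b" ;;     # basic information, requested explicitly
        detail)  echo "ascend-dmi -i --dt" ;;   # detailed information
        *)       echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Print the command for each mode without running it.
for mode in default brief detail; do
    status_cmd "$mode"
done
```

Printing the command instead of running it keeps the sketch usable on a host where the tool is not installed.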
Example
- Query detailed information about a chip.
The following are examples of the queried chip details returned by each type of server. If the corresponding information is returned, the tool is running properly.
- Inference server
Figure 1 Example of querying real-time device status (inference server)

- When you run the ascend-dmi -i -dt command to query the real-time device status, the Memory Information field shows DDR or on-chip memory information.
- When you run the ascend-dmi -i command to query the real-time device status, the Used Memory field shows DDR or on-chip memory information.
- Training server
Figure 2 Example of querying real-time device status (training server)

- Training card
Figure 3 Example of querying real-time device status (Atlas 300T training card (model 9000))

- Atlas 200I A2 accelerator module
Figure 4 Example of querying real-time device status (Atlas 200I DK A2 developer kit)

- Atlas 200 AI accelerator module
Figure 5 Example of querying real-time device status (Atlas 200 AI accelerator module (RC))
Figure 6 Example of querying real-time device status (Atlas 200 AI accelerator module (EP))
Table 2 describes the server parameters in the preceding figures.
Table 2 Parameters

| Parameter | Description | Product |
|---|---|---|
| Type | Indicates the chip model. | Training server |
| NPU Count | Indicates the number of NPUs. | Training server |
| Card Quantity | Indicates the number of cards. | Standard card |
| Type | Indicates the standard card model. | Standard card |
| Card Manufacturer | Indicates the card manufacturer. | Standard card |
| Card Serial Number | Indicates the serial number of the card. | Standard card |
| Card ID | Indicates the ID of the card. | Standard card |
| Real-time Card Power (W) | Indicates the real-time power consumption of the card in W. | Standard card |
| Device Count | Indicates the number of devices (NPUs). | Standard card |
| Chip Name | Indicates the chip name. | Standard card and training server |
| Device ID | Indicates the logic chip ID. | Standard card and training server |
| Chip ID | Indicates the chip ID. | Standard card and training server |
| DIE ID | Indicates the DIE ID of a chip. | Standard card and training server |
| AI Core Information | Indicates the AI Core information, including: AI Core Count: number of AI Cores; AI Core Usage (%): AI Core usage; Cube Count: number of Cube units; Vector Count: number of Vector units. | Standard card and training server |
| CPU Information | Indicates the CPU information, including: AI CPU Count: number of AI CPUs; AI CPU Usage (%): AI CPU usage; Control CPU Count: number of control CPUs; Control CPU Usage (%): control CPU usage; Control CPU Frequency (MHz): frequency of the control CPU in MHz. | Standard card and training server |
| Memory Information | Indicates the memory information, including: Total (MB): total memory capacity in MB; Used (MB): used memory in MB; Bandwidth Usage (%): memory bandwidth usage; Frequency (MHz): memory frequency in MHz. | Standard card and training server |
| Power Information | Indicates the power consumption information, including: Real-time Power (W): real-time power consumption (available only when the command is executed on a training server); Rated Power (W): rated power of the processor. NOTE: Atlas A3 training products and Atlas A3 inference products contain multiple NPUs. The power consumption shown in the JSON file is reported at the device level but actually represents the power consumption of the entire NPU. | Standard card and training server |
| Temperature (°C) | Indicates the chip temperature. | Standard card and training server |
| voltage(V) | Indicates the voltage in volts. | Standard card and training server |
| health | Displays the health information. | Standard card and training server |
| PCIe Information | Indicates the PCIe information, including: Domain: PCIe domain; Bus: PCIe bus number; Device: PCIe device ID; Bus ID: PCIe bus address; Subvendor ID: subsystem vendor ID; Subdevice ID: subsystem device ID; LnkCap Speed: maximum link speed; LnkCap Width: maximum link width; LnkSta Speed: current link speed; LnkSta Width: current link width; CPU Affinity: CPU affinity. | Standard card and training server |
| Error Information | Displays error information. | Standard card and training server |
| Error Count | Indicates the number of errors. | Standard card and training server |
| ECC Information | Displays ECC information. | Standard card and training server |
| DDR/SRAM/HBM/NPU | Indicates the memory type of the card: DDR, SRAM, HBM, or NPU. The following information is also contained: Single-Bit Error Count: number of single-bit errors; Double-Bit Error Count: number of double-bit errors. | Standard card and training server |

When you run the ascend-dmi -i --dt command, the following situations may occur:
- If you run this command as a non-root user, "<Access denied. Please switch to root and try again.>" is displayed for some check items.
- If you run this command in a container, "Unknown" is displayed for some check items. To obtain the information, exit the container and run the command again.
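The two situations above can be checked from a script. The following is a minimal sketch, assuming the report is read as plain text on standard input; the function names `count_unavailable` and `is_root` are hypothetical helpers, and only the two marker strings are taken from the notes above. Matching the word "Unknown" anywhere on a line is a rough heuristic and may count lines that contain the word for other reasons.

```shell
#!/bin/sh
# Count report lines whose value could not be read. The marker strings are the
# ones quoted in the notes above; reading the report from stdin is an
# assumption of this sketch, not a feature of the tool.
count_unavailable() {
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    grep -c -E 'Access denied\. Please switch to root|Unknown' || true
}

# Succeed when the given user ID (default: the current user) is root.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Usage sketch: runs only where ascend-dmi is installed.
if command -v ascend-dmi >/dev/null 2>&1; then
    is_root || echo "Not root: some check items may show 'Access denied'." >&2
    missing=$(ascend-dmi -i --dt | count_unavailable)
    echo "Check items without a value: $missing"
fi
```

A non-zero count suggests rerunning the command as root or outside the container, as described above.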
- Query basic information about a chip.
The following are examples of basic information about the queried chip returned by each type of server. If the corresponding information is returned, the tool is running properly.
- Inference server
Figure 7 Example of querying real-time device status (inference server)

- Training server
Figure 8 Example of querying real-time device status (training server)

- Atlas 300T training card
Figure 9 Example of querying real-time device status (Atlas 300T Pro training card (model 9000))

- Atlas 200I A2 accelerator module
Figure 10 Example of querying real-time device status (Atlas 200I DK A2 developer kit)

- Atlas 200 AI accelerator module
Figure 11 Example of querying real-time device status (Atlas 200 AI accelerator module)

Table 3 describes the server parameters in the preceding figures.
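Table 3 Parameter description

| Parameter | Description | Product |
|---|---|---|
| Type | Indicates the standard card model. | Standard card |
| Card | Indicates the card ID. | Standard card |
| NPU Count | Indicates the number of NPUs. | Standard card |
| Real-time Card Power | Indicates the actual power consumption of the card. | Standard card |
| Chip | Indicates the chip number. | Standard card |
| Name | Indicates the chip name. | Standard card |
| Type | Indicates the chip model. | Training server |
| NPU Count | Indicates the number of NPUs. | Training server |
| Chip Name | Indicates the chip name. | Training server |
| Power | Indicates the power consumption. | Training server |
| Health | Indicates the chip health status. | Standard card and training server |
| Used Memory | Indicates the amount of memory used. | Standard card and training server |
| Temperature | Indicates the current temperature of the chip. | Standard card and training server |
| Voltage | Indicates the current voltage of the chip. | Standard card and training server |
| Device ID | Indicates the logic chip ID. | Standard card and training server |
| Bus ID | Indicates the PCIe bus address. | Standard card and training server |
| AI Core Usage | Indicates the AI Core usage of the chip. | Standard card and training server |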