Performing a Computing Power Test
Function
This function constructs a matrix multiplication A(m,k)*B(k,n) and executes it multiple times. It then calculates the computing power of the AI Cores in the entire card or processor, and the real-time power of the processor at full computing power, based on the computation amount and the time taken to perform the matrix multiplications.
| Operator Operation Type | Parameter | Description | Value |
|---|---|---|---|
| fp16 (inference and training servers), int8 (inference servers) | m | Rows in matrix A | 256 |
| fp16 (inference and training servers), int8 (inference servers) | k | Columns in matrix A and rows in matrix B | 32 |
| fp16 (inference and training servers), int8 (inference servers) | n | Columns in matrix B | 128 |
| int8 (training servers) | m | Rows in matrix A | 256 |
| int8 (training servers) | k | Columns in matrix A and rows in matrix B | 64 |
| int8 (training servers) | n | Columns in matrix B | 128 |
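As a rough illustration of the calculation described above, the following sketch counts the operations in one A(m,k)*B(k,n) multiplication and converts a total operation count and a duration into tera-operations per second. It assumes the conventional count of 2*m*k*n operations per multiplication (one multiply and one add per inner-product term). The exact formula used by ascend-dmi, including how the number of AI Cores is factored in, is not given in this section, and the function names are hypothetical.

```python
def ops_per_matmul(m: int, k: int, n: int) -> int:
    """Operations in one A(m,k)*B(k,n) product.

    Assumes one multiply and one add per inner-product term. This is an
    assumption for illustration, not the tool's documented formula.
    """
    return 2 * m * k * n


def tera_ops_per_second(total_ops: int, duration_ms: float) -> float:
    """Convert an operation count and a duration in ms to tera-ops/s."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return total_ops / (duration_ms / 1000.0) / 1e12


# Matrix shapes (m, k, n) taken from the table above.
FP16_SHAPE = (256, 32, 128)           # fp16, and int8 on inference servers
INT8_TRAINING_SHAPE = (256, 64, 128)  # int8 on training servers

fp16_ops = ops_per_matmul(*FP16_SHAPE)
int8_training_ops = ops_per_matmul(*INT8_TRAINING_SHAPE)
print(fp16_ops, int8_training_ops, int8_training_ops // fp16_ops)
# prints: 2097152 4194304 2
```

Under this assumption, one int8 multiplication on a training server involves twice as many operations as one fp16 multiplication, because only the k dimension doubles (64 instead of 32).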
Precautions
To prevent frequent log output from affecting the test result, ensure that log levels on the host and device are set to ERROR before the test. The method is as follows:
- Check the log level.
  - Host: Run the echo $GLOBAL_LOG_LEVEL command. If the query result is invalid or empty, the log level is ERROR (corresponding to the value 3).
  - Device: Check the global log level, the module log level, and whether the event log function is enabled by referring to "Appendixes > msnpureport Instructions" in the CANN Log Reference.
- If the log level is not ERROR, set the log level on the host and device by referring to "Setting Log Levels" in the CANN Log Reference.
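The host-side check above can be sketched in a few lines. This is a minimal sketch, not part of the tool: the function name is hypothetical, and interpreting an "invalid" query result as "not an integer" is an assumption. The device-side check still has to follow the CANN Log Reference.

```python
import os
from typing import Mapping, Optional

ERROR_LEVEL = 3  # ERROR corresponds to the value 3, as stated above


def host_log_level_is_error(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the host log level resolves to ERROR.

    An unset, empty, or non-integer GLOBAL_LOG_LEVEL is treated as ERROR.
    Reading "invalid" as "not an integer" is an assumption.
    """
    if env is None:
        env = os.environ
    raw = env.get("GLOBAL_LOG_LEVEL", "").strip()
    if not raw:
        return True  # empty or unset: the log level is ERROR
    try:
        level = int(raw)
    except ValueError:
        return True  # assumed meaning of an "invalid" query result
    return level == ERROR_LEVEL


if __name__ == "__main__":
    print("Host log level is ERROR:", host_log_level_is_error())
```

Passing a dictionary instead of the real environment, for example host_log_level_is_error({"GLOBAL_LOG_LEVEL": "1"}), makes the check easy to try without changing the shell.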
Commands for Querying Test Parameters
You can run either of the following commands to list the parameters of the computing power test command:
ascend-dmi -f -h
ascend-dmi -f --help
Table 2 describes the parameters.
| Parameter | Description | Mandatory |
|---|---|---|
| [-f, --flops] | Measures the computing power of the entire card or processor. | Yes |
| [-t, --type] | Specifies the operator operation type, which can be fp16 or int8. If this parameter is not specified, fp16 is used by default. | No |
| [-d, --device] | Specifies the ID of the device whose computing power is to be tested. The device ID is the ID of the Ascend AI Processor. You can run the ascend-dmi --info command to obtain the number of processors from the Chip parameter displayed. For example, if an Atlas 300I inference card is configured with four Ascend AI Processors, the value of Device ID ranges from 0 to 3. If the device ID is not specified, the computing power information of device 0 is returned by default. | No |
| [-et, --et, --execute-times] | Specifies the number of times that matrix multiplication is performed on a single AI Core on a specified processor. | No |
| [-fmt, --fmt, --format] | Specifies the output format. The value can be normal or json. If this parameter is not specified, the default value normal is used. | No |
- When the same number of matrix multiplications is performed: on an inference card, the int8 mode doubles the computing power and halves the execution time compared with the fp16 mode. On a training card, however, the int8 mode doubles the size of a single matrix multiplication to fill up the processor data. As a result, the execution time is the same as that of the fp16 mode, but the computing power is still doubled.
- If you need to perform the computing power test for a long time, see Computing Power Test Script for Cyclical Calling. If you also need to collect the output of the ascend-dmi -i command when the AI Core usage is 100% during the computing power test, see Script for Querying Real-time Device Status.
- If multiple level-2 parameters such as -d and --et follow ascend-dmi -f, they can be specified in any order without affecting the command output. For example, the output of ascend-dmi -f -d 2 --et 60 is the same as that of ascend-dmi -f --et 60 -d 2.
- The int8 mode uses integer arithmetic, which involves fewer operation units than the floating-point arithmetic of the fp16 mode. Therefore, the final power consumption value is relatively low.
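When the test is launched from a script, the parameters above can be assembled programmatically. The sketch below is a hypothetical helper that builds the argument list using only the options listed in Table 2. The helper name and its validation rules (non-negative device ID, positive execute-times value) are assumptions, since the valid ranges are not stated here.

```python
from typing import List, Optional

VALID_TYPES = ("fp16", "int8")
VALID_FORMATS = ("normal", "json")


def build_flops_command(
    op_type: Optional[str] = None,
    device: Optional[int] = None,
    execute_times: Optional[int] = None,
    fmt: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the computing power test.

    Options left as None are omitted so that the tool's defaults apply.
    """
    cmd = ["ascend-dmi", "-f"]
    if op_type is not None:
        if op_type not in VALID_TYPES:
            raise ValueError(f"type must be one of {VALID_TYPES}")
        cmd += ["-t", op_type]
    if device is not None:
        if device < 0:  # assumption: device IDs start at 0
            raise ValueError("device must be non-negative")
        cmd += ["-d", str(device)]
    if execute_times is not None:
        if execute_times <= 0:  # assumption: the value must be positive
            raise ValueError("execute_times must be positive")
        cmd += ["--et", str(execute_times)]
    if fmt is not None:
        if fmt not in VALID_FORMATS:
            raise ValueError(f"format must be one of {VALID_FORMATS}")
        cmd += ["--fmt", fmt]
    return cmd


print(" ".join(build_flops_command(op_type="int8", device=2, execute_times=60)))
# prints: ascend-dmi -f -t int8 -d 2 --et 60
```

On a host where ascend-dmi is installed, the returned list can be passed to subprocess.run. Building a list rather than a single string avoids shell quoting issues.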
Example
- Perform a computing power test on device 2 (an inference server) by executing a matrix multiplication 60 million times. The default operator operation type is fp16.
ascend-dmi -f -d 2 --et 60
If information shown in Figure 1 is displayed, the tool is running properly.
- Perform a computing power test on device 2 (an inference server) by executing a matrix multiplication 60 million times. The operator operation type is int8.
ascend-dmi -f -t int8 -d 2 --et 60
If information shown in Figure 2 is displayed, the tool is running properly.
- Perform a computing power test on device 3 (a training server) by executing a matrix multiplication 8 million times.
If information shown in Figure 3 is displayed, the tool is running properly.
Table 3 describes the server parameters in the preceding figures.
| Parameter | Description |
|---|---|
| Device | Indicates the device ID. |
| Execute Times | Indicates the number of times that matrix multiplication is performed in the actual operation. |
| Duration(ms) | Indicates the time used to complete the matrix multiplication computation. |
| TFLOPS@FP16 | Indicates the computing power of the processor when tested using the FP16 data. |
| Power(W) | Indicates the real-time power of the processor at full computing power. NOTE: You do not need to pay attention to the processor power during the computing power test. Power consumption data is collected periodically, with an interval between two collections, so the data fluctuates when the computing power test period is too short. To test the power consumption, use a more specific power consumption test option. |
- To ensure the correctness and accuracy of the test result, perform the computing power test separately.


