Performing a Bandwidth Test
Function
Measure the bus bandwidth, memory bandwidth, and latency.
Precautions
- To prevent frequent log output from affecting the test result, ensure that log levels on the host and device are set to ERROR before the test. The method is as follows:
  - Check the log level.
    - Host: Run the echo $GLOBAL_LOG_LEVEL command. If the result is empty or invalid, the log level is ERROR (value 3).
    - Device: Query the global log level, the module log level, and whether event logging is enabled, as described in "Appendixes > msnpureport Instructions" in the CANN Log Reference.
  - If the log level is not ERROR, set the log level on the host and device by referring to "Setting Log Levels" in the CANN Log Reference.
- The d2d bandwidth test result is the total amount of data read and written divided by the elapsed time. As in actual training or inference, the d2d test benefits from internal optimizations such as caching and prefetching, so the calculated bandwidth may exceed the nominal bandwidth.
- On the Atlas 200I SoC A1 core board, test data in the h2d and d2h directions is copied directly by the CPU because of the architecture of this product. The test result therefore differs from that of other product models, which is normal.
- For the best bandwidth test result, perform the test on a bare metal server.
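The host-side check in the first precaution can be scripted. The sketch below is illustrative only and is not part of ascend-dmi or CANN: the function name host_log_level_is_error is hypothetical, and it assumes that "invalid" means any value other than a single digit from 0 to 4, which this section does not define.

```shell
# Sketch: decide whether the host log level is effectively ERROR.
# Per the precaution above, an empty or invalid GLOBAL_LOG_LEVEL means ERROR (3).
# Assumption: "invalid" means anything that is not a single digit 0-4.
# Usage: host_log_level_is_error [VALUE]   (defaults to $GLOBAL_LOG_LEVEL)
host_log_level_is_error() {
    hll_level="${1-${GLOBAL_LOG_LEVEL-}}"
    case "$hll_level" in
        "")    return 0 ;;  # unset or empty: treated as ERROR
        3)     return 0 ;;  # explicitly ERROR
        [0-4]) return 1 ;;  # another level under the assumption above
        *)     return 0 ;;  # invalid value: treated as ERROR
    esac
}

if host_log_level_is_error; then
    echo "Host log level is ERROR."
else
    echo "Host log level is not ERROR; see \"Setting Log Levels\" in the CANN Log Reference."
fi
```

The device-side check is not covered here, because it relies on the msnpureport tool described in the CANN Log Reference.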
Commands for Querying Test Parameters
You can run either of the following commands to list the parameters of the bandwidth test command:
ascend-dmi --bw -h
ascend-dmi --bw --help
Table 1 describes the parameters.
| Parameter | Description | Mandatory |
|---|---|---|
| [-bw, --bw, --bandwidth] | Measures the processor bandwidth. -bw is supported, but --bw or --bandwidth is recommended. | Yes |
| [-t, --type] | Specifies the data flow direction to be tested. The directions named in this section are h2d, d2h, d2d, and p2p. If this parameter is not specified, the bandwidth and latency in the h2d, d2h, and d2d directions are returned by default. | No |
| [-s, --size] | Specifies the size of the data to be transmitted, which also determines the display mode (fixed-length mode or step mode). The size is given in bytes and ranges from 1 byte to 512 MB. If -s is not specified, step mode is used, and results are output for transmitted data sizes from 2 bytes to 32 MB. If -s is specified, fixed-length mode is used; -s must be followed by a number giving the data size, otherwise the format is incorrect. For p2p, if -s is not specified, the default data size is 128 MB. | No |
| [-et, --et, --execute-times] | Number of iterations, that is, the number of memory copies. The value ranges from 1 to 1000. If this parameter is not specified, the default value 5 is used. | No |
| [-d, --device] | Specifies the ID of the device whose bandwidth is to be tested. The device ID is the ID of the Ascend AI Processor. You can run the ascend-dmi --info command and obtain the number of processors from the Chip parameter displayed. For example, if an Atlas 300I inference card is configured with four Ascend AI Processors, the device ID ranges from 0 to 3. If the device ID is not specified, the bandwidth information of device 0 is returned by default. | No |
| [-ds, --ds, --device-src] | Specifies the ID of the source device for a P2P test. This parameter must be specified together with the [-dd, --dd, --device-dst] parameter. | No |
| [-dd, --dd, --device-dst] | Specifies the ID of the destination device for a P2P test. This parameter must be specified together with the [-ds, --ds, --device-src] parameter. | No |
| [-fmt, --fmt, --format] | Specifies the output format. The value can be normal or json. If this parameter is not specified, the default value normal is used. | No |
- The --ds and --dd parameters must be used together, and the values of the two parameters cannot be the same.
- If multiple second-level parameters such as -t and -s follow ascend-dmi --bw, their order does not affect the command output. For example, ascend-dmi --bw -t h2d -d 0 --et 100 produces the same output as ascend-dmi --bw -t h2d --et 100 -d 0.
- How the P2P bandwidth is calculated depends on the NPU working mode. If the P2P test result differs greatly from the nominal bandwidth, you are advised to use SMP mode. To do so, log in to the iBMC and run the following command, where the value 1 selects SMP mode and the value 0 selects AMP mode:
ipmcset -d npuworkmode -v 1
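The value ranges in Table 1 and the rule that --ds and --dd must differ can be checked before a test is started. The following is a minimal sketch rather than behavior of the tool itself: the function names are hypothetical, and "512 MB" is assumed to mean 512 x 1024 x 1024 bytes.

```shell
# Sketch: validate bandwidth-test arguments against the ranges in Table 1.
# All function names here are hypothetical helpers, not ascend-dmi features.
# Assumption: "512 MB" means 512 * 1024 * 1024 bytes.
BW_MAX_SIZE=$((512 * 1024 * 1024))

# Succeeds only for a non-empty string of decimal digits.
is_uint() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# check_bw_size SIZE: -s accepts 1 byte to 512 MB.
check_bw_size() {
    is_uint "$1" && [ "$1" -ge 1 ] && [ "$1" -le "$BW_MAX_SIZE" ]
}

# check_bw_times N: --et accepts 1 to 1000.
check_bw_times() {
    is_uint "$1" && [ "$1" -ge 1 ] && [ "$1" -le 1000 ]
}

# check_p2p_pair SRC DST: --ds and --dd must both be given and must differ.
check_p2p_pair() {
    is_uint "$1" && is_uint "$2" && [ "$1" -ne "$2" ]
}

# Usage: print the command only if the arguments pass the checks.
size=8388608
times=100
if check_bw_size "$size" && check_bw_times "$times"; then
    echo "ascend-dmi --bw -t h2d -d 0 -s $size --et $times"
fi
```

Checking the arguments first gives a clearer message than a format error from the tool, but the tool remains the authority on what it accepts.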
Example
The command output on an inference server is similar to that on a training server. The following examples use screenshots from an inference server with an Ascend 310 AI Processor. The P2P screenshot is from a training server, because only Ascend 910 processors, which are used only in training servers, support the P2P test.
- Measure the bandwidth and latency without specifying parameters. In this case, information about device 0 in the h2d, d2h, and d2d directions is returned and displayed in step mode.
ascend-dmi --bw
- Measure the bandwidth and latency of data transmitted from the host to device 0, with 100 copy iterations.
  - Fixed-length mode
    ascend-dmi --bw -t h2d -d 0 -s 8388608 --et 100
    If the information shown in Figure 1 is displayed, the tool is running properly. For details about the parameters, see Table 2.
  - Step mode
    ascend-dmi --bw -t h2d -d 0 --et 100
    If the information shown in Figure 2 is displayed, the tool is running properly. For details about the parameters, see Table 2.
- Measure the transmission rate and latency from a source device to a destination device (P2P test).
  - Perform a P2P test in which data is transmitted from device 1 (source) to device 2 (destination).
    ascend-dmi --bw -t p2p --ds 1 --dd 2
    If the information shown in Figure 3 is displayed, the tool is running properly. For details about the parameters, see Table 2.
Table 2 Parameter description
| Parameter | Description |
|---|---|
| Host to Device Test | Indicates the data flow direction. Value: Host to Device Test, Device to Host Test, Device to Device Test, Unidirectional Peer to Peer Test, or Bidirectional Peer to Peer Test. |
| Device X: Ascend XXX | Device X indicates the ID of the source device, and Ascend XXX indicates the processor type. |
| ID | Device ID. |
| Size(Bytes) | Indicates the size of the data transmitted. |
| Execute Times | Indicates the number of iterations. |
| Bandwidth(MB/s) | Indicates the processor bandwidth. |
| Elapsed Time(us) | Indicates the execution duration. |
  - Perform a P2P test without specifying the source and destination devices.
    ascend-dmi --bw -t p2p
    If the information shown in Figure 4 is displayed, the tool is running properly.
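To collect fixed-length results for several directions in one run, the example commands above can be wrapped in a loop. This is a sketch under assumptions: run_bw_sweep and DRY_RUN are hypothetical names, and d2h and d2d are assumed to be accepted as -t values in the same way as h2d. By default the function only prints the commands; set DRY_RUN=0 on a machine where ascend-dmi is installed to run them.

```shell
# Sketch: one fixed-length test per direction on device 0.
# Only flags listed in Table 1 are used.
# Usage: run_bw_sweep SIZE_IN_BYTES EXECUTE_TIMES
run_bw_sweep() {
    for sweep_dir in h2d d2h d2d; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (default): print the command instead of running it.
            echo "ascend-dmi --bw -t $sweep_dir -d 0 -s $1 --et $2"
        else
            ascend-dmi --bw -t "$sweep_dir" -d 0 -s "$1" --et "$2"
        fi
    done
}

# Example: 8 MB transfers (8388608 bytes), 100 iterations each.
run_bw_sweep 8388608 100
```

The P2P direction is left out of the loop because it takes --ds and --dd instead of -d.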



