Low Bandwidth of a Single Server
Symptom
When the HCCL Test tool is used to test the bandwidth of a single server, the peak bandwidth of the single server is lower than the expected value.
Possible Cause
- The log level is not ERROR.
If the log level of any device is not set to ERROR, the bandwidth of that device may decrease (in some scenarios, by about 2 GB/s).
- The bandwidth of a certain link is low.
The bandwidth measured for the server is limited by its slowest link, so if one link has low bandwidth, the bandwidth of the entire server is low.
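The effect of a single slow link can be sketched as taking the minimum over per-link bandwidths. The snippet below is only an illustration: the helper name find_slowest_link, the "<link-name> <bandwidth>" input format, and the figures in the usage note are hypothetical and are not HCCL Test output.

```shell
#!/usr/bin/env bash
# Illustrative sketch: report the link with the lowest measured bandwidth.
# Input (hypothetical format): one "<link-name> <bandwidth>" pair per line on stdin.
find_slowest_link() {
    awk 'NF >= 2 {
             # Track the smallest bandwidth seen so far and the link it belongs to.
             if (!seen || $2 + 0 < min) { min = $2 + 0; name = $1; seen = 1 }
         }
         END { if (seen) print name, min }'
}
```

For example, with made-up values, `printf 'link0 28.1\nlink1 27.9\nlink2 19.4\n' | find_slowest_link` prints `link2 19.4`, which is the link to investigate first.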
Solution
- If the bandwidth decrease is caused by an improper log level setting, set the log level on both the host and the devices to ERROR.
The following is an example of setting the log level on the host:
export ASCEND_GLOBAL_LOG_LEVEL=3  # 0: debug, 1: info, 2: warning, 3: error
The following is an example of setting the log level on the device:
# Query the log level of each device
for i in {0..7}; do /usr/local/Ascend/driver/tools/msnpureport -r -d $i; done
# Set the log level of each device to ERROR
for i in {0..7}; do msnpureport -g error -d $i; done
- If the bandwidth is still low after the log level is properly set, check whether the HCCS bandwidth or PCIe bandwidth of a certain link is low.
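Before rerunning the bandwidth test, it can help to confirm that the host log level has actually taken effect in the current shell. The following is a minimal sketch, assuming the level mapping given above (0: debug, 1: info, 2: warning, 3: error). The helper names describe_host_log_level and host_log_level_is_error are hypothetical and are not part of any Ascend tool.

```shell
#!/usr/bin/env bash
# Sketch: check that ASCEND_GLOBAL_LOG_LEVEL is set to ERROR (3) on the host.
# Both function names are hypothetical helpers for illustration only.

# Translate the numeric level into its name; anything else is reported as unknown.
describe_host_log_level() {
    case "${ASCEND_GLOBAL_LOG_LEVEL:-unset}" in
        0) echo "debug" ;;
        1) echo "info" ;;
        2) echo "warning" ;;
        3) echo "error" ;;
        *) echo "unset-or-unknown" ;;
    esac
}

# Succeed (exit status 0) only when the host log level is ERROR.
host_log_level_is_error() {
    [ "$(describe_host_log_level)" = "error" ]
}
```

A possible use is `host_log_level_is_error || echo "Host log level is $(describe_host_log_level); run: export ASCEND_GLOBAL_LOG_LEVEL=3"`. The sketch covers only the host variable; device log levels still need to be queried with msnpureport as shown above.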