Analysis of Exposed Communication Performance Degradation
- Check whether the performance of any communication operator has degraded significantly.
Open the CommunicationCompare sheet in performance_comparison_result_.xlsx and compare the following items for each communication operator, as shown in Figure 1.
- Operator type (such as Broadcast and AllReduce)
- Time consumption metrics (average time consumption, maximum/minimum time consumption) and call frequency statistics
- Information about associated subtasks (such as Reduce_Inline, Notify_Record, Notify_Wait, and Memcpy)
- On the OverallMetrics sheet, compare the results in depth, one communication domain at a time.
Within each communication domain, pay attention to the difference between transit_time and wait time, as shown in Figure 2.
- Check whether any communication operator shows degraded performance. If none does, the exposed communication time is caused by poor overlap between communication and computation rather than by slower operators. Continue to analyze the cluster performance of the NPU.
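The checks above can also be scripted once the relevant rows have been copied out of the workbook. The following is a minimal sketch, not part of the comparison tool: the CommStat record, the find_degraded and wait_share helpers, the 20% threshold, and the sample numbers are all hypothetical and chosen only for illustration. It flags operator types whose average time grew beyond the threshold, and computes how much of a communication domain's time is wait time rather than transit time.

```python
from dataclasses import dataclass


@dataclass
class CommStat:
    """One row of per-operator statistics (hypothetical layout)."""
    op_type: str   # operator type, e.g. "AllReduce" or "Broadcast"
    calls: int     # number of calls
    avg_ms: float  # average time per call, in milliseconds


def find_degraded(base, comparison, threshold=0.2):
    """Return (op_type, relative_increase) pairs, worst first, for operators
    whose average time grew by more than `threshold` (0.2 means 20%)."""
    base_by_type = {stat.op_type: stat for stat in base}
    degraded = []
    for stat in comparison:
        ref = base_by_type.get(stat.op_type)
        if ref is None or ref.avg_ms <= 0:
            continue  # no usable baseline for this operator type
        increase = (stat.avg_ms - ref.avg_ms) / ref.avg_ms
        if increase > threshold:
            degraded.append((stat.op_type, round(increase, 3)))
    return sorted(degraded, key=lambda item: item[1], reverse=True)


def wait_share(transit_ms, wait_ms):
    """Fraction of (transit + wait) time that is spent waiting."""
    total = transit_ms + wait_ms
    return wait_ms / total if total > 0 else 0.0


# Made-up sample values, for illustration only.
base = [CommStat("AllReduce", 100, 2.0), CommStat("Broadcast", 10, 1.0)]
comparison = [CommStat("AllReduce", 100, 3.0), CommStat("Broadcast", 10, 1.05)]

degraded = find_degraded(base, comparison)
print(degraded)
print(wait_share(transit_ms=4.0, wait_ms=12.0))
```

An empty result from find_degraded corresponds to the "no degraded operator" branch of the last step. As a general rule of thumb, a wait share that is much higher in the comparison run than in the baseline for the same communication domain suggests that ranks are waiting on each other rather than spending longer on data transfer, which is worth confirming in the cluster analysis.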

