Byte Misalignment Between the Source and Destination Addresses
Byte misalignment occurs when the source and destination addresses of a transfer are not aligned to a required byte boundary (for example, 512 bytes) during cluster communication. It significantly reduces transmission bandwidth and degrades communication performance. The issue commonly arises in SDMA transmissions (intra-node communication) and often appears with the ZeRO algorithm. Padding the data so that the addresses are aligned resolves the issue and improves communication performance.
Symptom
Analysis
Analysis by HCCL experts shows that the source and destination addresses of the allGather communication operator in the DP communication domain are not aligned. As a result, communication performance deteriorates severely.
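The effect can be illustrated with a simplified model. The sketch below assumes that the allGather buffer starts at an aligned base address and that each rank's shard is laid out contiguously after the previous one; the function name misaligned_ranks and its parameters are hypothetical and are not part of HCCL, DeepSpeed, or AscendSpeed. It only shows how an unpadded shard size leaves most shard start addresses off the 512-byte boundary.

```python
from typing import List


def misaligned_ranks(shard_numel: int,
                     element_size: int,
                     world_size: int,
                     alignment_bytes: int = 512,
                     base_address: int = 0) -> List[int]:
    """Return the ranks whose shard start address is not a multiple of alignment_bytes.

    Simplified model (assumption): rank r's shard starts at
    base_address + r * shard_numel * element_size.
    """
    shard_bytes = shard_numel * element_size
    return [
        rank
        for rank in range(world_size)
        if (base_address + rank * shard_bytes) % alignment_bytes != 0
    ]


# 125 fp16 elements per rank -> 250-byte shards, so only rank 0 starts on a boundary.
print(misaligned_ranks(shard_numel=125, element_size=2, world_size=8))
# 256 fp16 elements per rank -> 512-byte shards, so every shard starts on a boundary.
print(misaligned_ranks(shard_numel=256, element_size=2, world_size=8))
```

In this model, the first call reports ranks 1 through 7 as misaligned, and the second call reports none.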
Troubleshooting
Apply byte-aligned padding to the allGather bucket in the DP communication domain. AscendSpeed already applies this padding. DeepSpeed and Megatron have not been modified. For DeepSpeed, modify nccl_start_alignment_factor in the DeepSpeed source code, as shown in Figure 3. After the modification, the allGather duration drops from 350 ms to 50 ms.
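The padding arithmetic can be sketched as follows. This is a minimal illustration, not the AscendSpeed or DeepSpeed implementation: the helper names padded_numel and padded_bucket_numel are hypothetical, and the 512-byte boundary is taken from the example above. The idea is to round each rank's shard up to a whole number of alignment units so that every shard begins on an aligned offset within the bucket.

```python
import math


def padded_numel(numel: int, element_size: int, alignment_bytes: int = 512) -> int:
    """Smallest element count >= numel whose byte size is a multiple of alignment_bytes."""
    # Number of elements that make up one alignment unit.
    unit = alignment_bytes // math.gcd(alignment_bytes, element_size)
    # Ceiling division, then scale back up to a whole number of units.
    return -(-numel // unit) * unit


def padded_bucket_numel(numel: int, element_size: int, world_size: int,
                        alignment_bytes: int = 512) -> int:
    """Total bucket size (in elements) such that each rank's shard is byte-aligned."""
    shard = -(-numel // world_size)  # elements per rank before padding
    return padded_numel(shard, element_size, alignment_bytes) * world_size


# 1000 fp16 elements over 8 ranks: each 125-element shard is padded to 256 elements.
total = padded_bucket_numel(numel=1000, element_size=2, world_size=8)
shard_bytes = total // 8 * 2
print(total, shard_bytes)
```

With these inputs the bucket grows to 2048 elements and each shard occupies 512 bytes. The cost is extra memory and some padded data on the wire, which is usually small compared with the bandwidth lost to misaligned transfers.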


