HcclBatchSendRecv

Description

Executes a batch of send and receive tasks on the current rank. The send and receive tasks issued by the current rank are asynchronous with respect to each other and do not block one another.

Prototype

HcclResult HcclBatchSendRecv(HcclSendRecvItem* sendRecvInfo, uint32_t itemNum, HcclComm comm, aclrtStream stream);

Parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| sendRecvInfo | Input | Start address of the array of send and receive tasks issued by the current rank. The elements are of the HcclSendRecvItem type. For details, see HcclSendRecvItem. |
| itemNum | Input | Number of send and receive tasks in the array. |
| comm | Input | Communicator on which the operation is performed. |
| stream | Input | Stream used by the current rank. |
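
The relationship between sendRecvInfo and itemNum can be illustrated with a minimal sketch that fills a two-entry task list for a ring exchange (send to the next rank, receive from the previous one). The DemoSendRecvItem type, its field names, and demo_build_ring_tasks are hypothetical stand-ins introduced only for this illustration; they are not the HCCL definitions. Real code would use HcclSendRecvItem as described on its own reference page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only (assumption, not the HCCL headers). */
typedef enum { DEMO_SEND = 0, DEMO_RECV = 1 } DemoSendRecvType;

typedef struct {
    DemoSendRecvType type;  /* direction of the task */
    void *buf;              /* buffer to send from or receive into */
    uint64_t count;         /* number of elements to transfer */
    uint32_t remoteRank;    /* peer rank of the task */
} DemoSendRecvItem;

/*
 * Fill a two-entry task list for a ring pattern:
 *   items[0]: send to (rank + 1) % rankSize
 *   items[1]: receive from (rank - 1 + rankSize) % rankSize
 * Returns the number of tasks written (the value to pass as itemNum).
 */
uint32_t demo_build_ring_tasks(uint32_t rank, uint32_t rankSize,
                               void *sendBuf, void *recvBuf, uint64_t count,
                               DemoSendRecvItem items[2])
{
    assert(items != NULL && rankSize > 1 && rank < rankSize);

    items[0].type = DEMO_SEND;
    items[0].buf = sendBuf;
    items[0].count = count;
    items[0].remoteRank = (rank + 1) % rankSize;

    items[1].type = DEMO_RECV;
    items[1].buf = recvBuf;
    items[1].count = count;
    items[1].remoteRank = (rank + rankSize - 1) % rankSize;

    return 2;
}
```

In an actual program, the equivalent array of HcclSendRecvItem entries and the returned count would be passed as sendRecvInfo and itemNum, together with the communicator and stream of the current rank.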

Returns

HcclResult: HCCL_SUCCESS on success; any other value indicates failure.

Constraints

  • "Asynchronous" means that the send and receive tasks on the same device do not block each other. Between devices, however, sending and receiving are still synchronous: each send task must be matched one-to-one by a receive task on the peer device, as with HcclSend and HcclRecv.
  • The task list must not contain duplicate send tasks or duplicate receive tasks that target the same rank.
  • In the current version, this API does not support the scenario where Virtual Pipeline (VPP) is enabled.
  • On the Atlas 200T A2 Box16 heterogeneous subrack, if link setup between devices in the same server fails (error code: EI0010), set HCCL_INTRA_ROCE_ENABLE to 1 and HCCL_INTRA_PCIE_ENABLE to 0 so that the devices in the server communicate through the RoCE loop. Ensure that the server has RoCE NICs and that RDMA links are connected between the devices that exchange data. The following is an example of configuring the environment variables:
    export HCCL_INTRA_ROCE_ENABLE=1
    export HCCL_INTRA_PCIE_ENABLE=0
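
The first two constraints (one-to-one pairing between devices, and no duplicate send or receive tasks for the same rank) can be checked on the host before the task list is submitted. The following is a minimal sketch of such a check, not part of HCCL: DemoTask, demo_has_duplicate, and demo_is_paired are hypothetical names, and the struct carries only the two fields the check needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types for illustration only (assumption, not the HCCL headers). */
typedef enum { DEMO_SEND = 0, DEMO_RECV = 1 } DemoSendRecvType;

typedef struct {
    DemoSendRecvType type;  /* direction of the task */
    uint32_t remoteRank;    /* peer rank of the task */
} DemoTask;

/* Returns true if two tasks share both direction and remote rank. */
bool demo_has_duplicate(const DemoTask *tasks, size_t n)
{
    assert(tasks != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = i + 1; j < n; ++j) {
            if (tasks[i].type == tasks[j].type &&
                tasks[i].remoteRank == tasks[j].remoteRank) {
                return true;
            }
        }
    }
    return false;
}

/* Returns true if the list holds a task of the given direction for peer. */
static bool demo_contains(const DemoTask *tasks, size_t n,
                          DemoSendRecvType type, uint32_t peer)
{
    for (size_t i = 0; i < n; ++i) {
        if (tasks[i].type == type && tasks[i].remoteRank == peer) {
            return true;
        }
    }
    return false;
}

/* Returns true if every task of `from` that targets `to` has an
 * opposite-direction counterpart in the task list of `to`. */
static bool demo_one_way_paired(uint32_t from, const DemoTask *tf, size_t nf,
                                uint32_t to, const DemoTask *tt, size_t nt)
{
    for (size_t i = 0; i < nf; ++i) {
        if (tf[i].remoteRank != to) {
            continue;
        }
        DemoSendRecvType opposite =
            (tf[i].type == DEMO_SEND) ? DEMO_RECV : DEMO_SEND;
        if (!demo_contains(tt, nt, opposite, from)) {
            return false;
        }
    }
    return true;
}

/* Checks one-to-one pairing between the task lists of ranks a and b. */
bool demo_is_paired(uint32_t a, const DemoTask *ta, size_t na,
                    uint32_t b, const DemoTask *tb, size_t nb)
{
    return demo_one_way_paired(a, ta, na, b, tb, nb) &&
           demo_one_way_paired(b, tb, nb, a, ta, na);
}
```

A check of this kind needs the task lists of both ranks, so it is mainly useful in tests or in code that generates the communication pattern for all ranks from one description.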

Applicability

Atlas Training Series Product