HcclCommInitRootInfo

Applicability

Product

Supported

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference products

Atlas inference products

Atlas training products

For Atlas A2 training products/Atlas A2 inference products, only the Atlas 800T A2 training server, Atlas 900 A2 PoD cluster basic unit, and Atlas 200T A2 Box16 heterogeneous subrack are supported.

For the Atlas inference products, only the Atlas 300I Duo inference card is supported.

Description

Initializes HCCL based on rootInfo and creates an HCCL communicator.

This API can be called concurrently by multiple threads within the same process, provided that each device is driven by a single thread. Concurrent calls from multiple threads on the same device are not supported.

As shown in the following figure, step 0 and step 1 cannot be called concurrently. Step 1 must be executed serially after step 0.
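The threading rule can be sketched as follows: calls for different devices may run in parallel threads, while calls that target the same device are kept from overlapping. This is a minimal illustration under stated assumptions, not HCCL code. InitCommStub is a hypothetical placeholder for the real HcclCommInitRootInfo call, and DeviceLocks is an assumed application-side helper that holds one mutex per device ID. The mutex only prevents overlap; it does not decide which of two same-device calls runs first.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical placeholder for HcclCommInitRootInfo. It only records that an
// initialization ran for the given device.
static std::mutex g_logMutex;
static std::vector<uint32_t> g_initLog;

void InitCommStub(uint32_t deviceId) {
    std::lock_guard<std::mutex> guard(g_logMutex);
    g_initLog.push_back(deviceId);
}

// Assumed application-side helper (not part of HCCL): one mutex per device ID.
class DeviceLocks {
public:
    std::mutex &For(uint32_t deviceId) {
        std::lock_guard<std::mutex> guard(tableMutex_);
        return locks_[deviceId];  // std::map keeps element references stable.
    }

private:
    std::mutex tableMutex_;
    std::map<uint32_t, std::mutex> locks_;
};

// Starts one thread per entry in deviceIds. Threads that target the same
// device take that device's mutex, so their initializations never overlap.
// Threads for different devices proceed in parallel.
// Returns the number of initializations that ran.
std::size_t InitAll(const std::vector<uint32_t> &deviceIds) {
    {
        std::lock_guard<std::mutex> guard(g_logMutex);
        g_initLog.clear();
    }
    DeviceLocks locks;
    std::vector<std::thread> workers;
    for (uint32_t id : deviceIds) {
        workers.emplace_back([&locks, id] {
            std::lock_guard<std::mutex> guard(locks.For(id));
            InitCommStub(id);
        });
    }
    for (auto &worker : workers) {
        worker.join();
    }
    std::lock_guard<std::mutex> guard(g_logMutex);
    return g_initLog.size();
}
```

Calling InitAll({0, 0, 1, 2}) runs four initializations; the two that target device 0 are serialized, while devices 1 and 2 are free to run alongside them.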

Prototype

HcclResult HcclCommInitRootInfo(uint32_t nRanks, const HcclRootInfo *rootInfo, uint32_t rank, HcclComm *comm)

Parameters

Parameter | Input/Output | Description
nRanks | Input | Number of ranks in a cluster.
rootInfo | Input | Root rank information generated by HcclGetRootInfo, including the IP address and ID of the root rank.
rank | Input | ID of the current rank.
comm | Output | Pointer to the initialized communicator. For details about the definition of the HcclComm type, see HcclComm.

Returns

HcclResult: HCCL_SUCCESS on success; any other value indicates failure.
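A caller would normally test the returned HcclResult before using comm. The sketch below shows one possible checking helper. So that it builds without the HCCL headers, it declares a local stand-in enum, ResultStub, and a hypothetical InitStub function in place of the real call. The failure value STUB_FAILURE and the rule that InitStub rejects a rank ID outside [0, nRanks) are assumptions made for illustration only.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Local stand-in for HcclResult so the sketch builds without HCCL headers.
// STUB_FAILURE is hypothetical; the real enum defines its own error codes.
enum ResultStub { STUB_SUCCESS = 0, STUB_FAILURE = 1 };

// Hypothetical placeholder for HcclCommInitRootInfo. As an assumption of this
// sketch, it fails when the rank ID is outside [0, nRanks).
ResultStub InitStub(uint32_t nRanks, uint32_t rank) {
    return (rank < nRanks) ? STUB_SUCCESS : STUB_FAILURE;
}

// Throws std::runtime_error when ret is anything other than success.
void CheckResult(ResultStub ret, const std::string &what) {
    if (ret != STUB_SUCCESS) {
        throw std::runtime_error(what + " failed with code " +
                                 std::to_string(static_cast<int>(ret)));
    }
}

// Returns true when initialization succeeded, false when it reported an error.
bool TryInit(uint32_t nRanks, uint32_t rank) {
    try {
        CheckResult(InitStub(nRanks, rank), "communicator init");
        return true;
    } catch (const std::runtime_error &) {
        return false;
    }
}
```

With the real API, the same pattern would compare the return value against HCCL_SUCCESS and skip any use of comm when the comparison fails.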

Constraints

The values of nRanks and rootInfo of all ranks in the same communicator must be the same.
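The constraint can be illustrated with a small consistency check over the arguments that each rank intends to pass. This is a sketch under stated assumptions: RootInfoStub is a hypothetical stand-in for HcclRootInfo (treated here as opaque bytes), and RankArgs and ArgsAreConsistent are illustrative names that do not exist in HCCL. In a real job the ranks usually run in separate processes, so the root rank's rootInfo has to be distributed to the other ranks before they call the API.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for HcclRootInfo, treated as opaque bytes.
struct RootInfoStub {
    char bytes[16];
};

// Arguments one rank would pass to HcclCommInitRootInfo (illustrative type).
struct RankArgs {
    uint32_t nRanks;
    RootInfoStub rootInfo;
    uint32_t rank;
};

// Returns true when every entry carries the same nRanks and a byte-identical
// rootInfo, which is what the constraint requires within one communicator.
bool ArgsAreConsistent(const std::vector<RankArgs> &all) {
    if (all.empty()) {
        return false;
    }
    const RankArgs &first = all.front();
    for (const RankArgs &args : all) {
        if (args.nRanks != first.nRanks) {
            return false;
        }
        if (std::memcmp(args.rootInfo.bytes, first.rootInfo.bytes,
                        sizeof(first.rootInfo.bytes)) != 0) {
            return false;
        }
    }
    return true;
}
```

Two ranks that share the same nRanks and the same copy of rootInfo pass the check; changing either value on one rank makes it fail.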

Example

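uint32_t rankSize = 8;  // Number of ranks in the communicator.
uint32_t deviceId = 0;  // Passed below as the ID of the current rank.
// Generate the identification information of the root rank.
HcclRootInfo rootInfo;
HcclResult ret = HcclGetRootInfo(&rootInfo);
// Initialize the communicator only if the root information was generated.
HcclComm hcclComm;
if (ret == HCCL_SUCCESS) {
    ret = HcclCommInitRootInfo(rankSize, &rootInfo, deviceId, &hcclComm);
}
// Destroy the communicator only if it was created.
if (ret == HCCL_SUCCESS) {
    HcclCommDestroy(hcclComm);
}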