TilingData of v2 (Deprecated)

This structure is deprecated and will be removed in a later version. Do not use it, and do not set its members directly. Instead, use the configuration APIs provided by HCCL Tiling.

Function

Provides the fixed communication configuration listed in Table 1, which must be in place before the AI CPU starts to deliver a communication task. In the operator implementation, the tiling function assembles the communication configuration items. Once the fixed parameters are defined in the tiling data in the fixed order, the communication configuration is passed to the AI CPU when the AI CPU communication API is called.

Parameters

Table 1 HCCL TilingData V2 parameters

Parameter | Description
--------- | -----------
version | Version of TilingData. The parameter is of the uint32_t type. In the V2 TilingData struct, version can only be set to 2. Note: This field in V2 TilingData corresponds to preparePosition in V1 TilingData. A field value of 2 indicates the V2 struct. A field value of 1 indicates the V1 struct; in this case, use the Mc2Msg struct.
mc2HcommCnt | Total number of communication tasks in all communicators. The parameter is of the uint32_t type. The maximum value is 3.
serverCfg | Common parameter configuration of the collective communication server. The parameter is of the Mc2ServerCfg type.
hcom | Parameter configuration of each communication task in the communicators. The parameter is of the Mc2HcommCfg type. The TilingData definition of the communication operator must define mc2HcommCnt Mc2HcommCfg structs in total. For example, if mc2HcommCnt is set to 2, define two Mc2HcommCfg parameters in sequence, such as hcom1 and hcom2.
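
Because version sits where preparePosition sits in the V1 struct, the first uint32_t of the tiling data is enough to tell the two layouts apart. The following is a minimal sketch of that rule, assuming a raw byte buffer as input. TilingLayout and DetectTilingLayout are hypothetical names used for illustration only and are not part of HCCL.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type for illustration only; not part of the HCCL API.
enum class TilingLayout { kV1, kV2, kUnknown };

// Reads the first uint32_t of a raw tiling buffer and maps it to a layout,
// following the rule in Table 1: 1 selects the V1 struct (Mc2Msg), 2 selects V2.
inline TilingLayout DetectTilingLayout(const void *buf, std::size_t size)
{
    if (buf == nullptr || size < sizeof(std::uint32_t)) {
        return TilingLayout::kUnknown;
    }
    std::uint32_t first = 0;
    std::memcpy(&first, buf, sizeof(first)); // memcpy avoids unaligned reads
    switch (first) {
        case 1:
            return TilingLayout::kV1;
        case 2:
            return TilingLayout::kV2;
        default:
            return TilingLayout::kUnknown;
    }
}
```

Any value other than 1 or 2 is reported as unknown here; how the real framework treats such values is not covered by this section.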

Table 2 Mc2ServerCfg struct description

Parameter | Description
--------- | -----------
version | Reserved field, which does not need to be configured.
debugMode | Reserved field, which does not need to be configured.
sendArgIndex | Reserved field, which does not need to be configured.
recvArgIndex | Reserved field, which does not need to be configured.
commOutArgIndex | Reserved field, which does not need to be configured.
reserved | Reserved field, which does not need to be configured.

Table 3 Mc2HcommCfg struct description

Parameter | Description
--------- | -----------
skipLocalRankCopy | Reserved field, which does not need to be configured.
skipBufferWindowCopy | Reserved field, which does not need to be configured.
stepSize | Reserved field, which does not need to be configured.
reserved | Reserved field, which does not need to be configured.
groupName | Communicator where the current communication task is located. The value is of the char * type and can contain a maximum of 128 bytes.
algConfig | Communication algorithm configuration. The value is of the char * type and can contain a maximum of 128 bytes. Currently, the following values are supported: "AllGather=level0:doublering" (AllGather communication task), "ReduceScatter=level0:doublering" (ReduceScatter communication task), and "AlltoAll=level0:fullmesh;level1:pairwise" (AlltoAllV communication task).
opType | Type of a communication task. The parameter is of the uint32_t type. The value is one of those defined in HcclCMDType.
reduceType | Reduction operation type. This parameter is valid only for communication tasks that have reduction operations. The parameter is of the uint32_t type. The value is one of those defined in HcclReduceOp.
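
The two string fields in Table 3 have a size limit and, for algConfig, a closed set of supported values. The following sketch shows one way to check both before writing them into the tiling data. The helper names are hypothetical, and the sketch assumes that the 128-byte limit includes the terminating null character, which this section does not state. The AlltoAll value is written without spaces, as in the tiling function example.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: the 128-byte limit covers the string plus its terminating '\0'.
constexpr std::size_t kMaxCfgStringBytes = 128;

// The algConfig values listed as supported in Table 3.
static const char *const kSupportedAlgConfigs[] = {
    "AllGather=level0:doublering",
    "ReduceScatter=level0:doublering",
    "AlltoAll=level0:fullmesh;level1:pairwise",
};

// Returns true if s (for example, a groupName or algConfig value) fits the limit.
inline bool FitsCfgString(const char *s)
{
    return s != nullptr && std::strlen(s) + 1 <= kMaxCfgStringBytes;
}

// Returns true if s is exactly one of the supported algConfig strings.
inline bool IsSupportedAlgConfig(const char *s)
{
    if (s == nullptr) {
        return false;
    }
    for (const char *candidate : kSupportedAlgConfigs) {
        if (std::strcmp(s, candidate) == 0) {
            return true;
        }
    }
    return false;
}
```

The comparison is exact on purpose: a string with extra spaces or different casing is rejected rather than normalized, because this section does not say whether such variants are accepted.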

Restrictions

  • To use the V2 tiling struct, you must set version, the first parameter of the struct, to 2.
  • The Tiling Data struct of the operator must contain all parameters listed in Table 1 (HCCL TilingData V2 parameters), in the order shown. Each parameter must be defined in strict accordance with the corresponding parameter structure.
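
The first restriction and the limit on mc2HcommCnt can be checked mechanically. The sketch below uses TilingHeaderV2, a hypothetical plain struct that mirrors only the two leading fields of Table 1. The real tiling struct is generated by the tiling data macros, so this mirror is an illustration of the rules and not the actual type.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the two leading fields of the V2 tiling struct.
struct TilingHeaderV2 {
    std::uint32_t version;      // Must be the first field and must be 2.
    std::uint32_t mc2HcommCnt;  // Number of Mc2HcommCfg structs that follow serverCfg.
};

// The fixed field order can be pinned at compile time.
static_assert(offsetof(TilingHeaderV2, version) == 0,
              "version must be the first field");
static_assert(offsetof(TilingHeaderV2, mc2HcommCnt) == sizeof(std::uint32_t),
              "mc2HcommCnt must directly follow version");

constexpr std::uint32_t kTilingVersionV2 = 2;
constexpr std::uint32_t kMaxHcommCnt = 3;

// Checks only the limits stated in this section. No lower bound for
// mc2HcommCnt is documented, so none is enforced here.
inline bool IsValidV2Header(const TilingHeaderV2 &header)
{
    return header.version == kTilingVersionV2 && header.mc2HcommCnt <= kMaxHcommCnt;
}
```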

Example

The following is the prototype of the custom operator AlltoallvDoubleCommCustom. This operator has two input-output pairs. x1 and y1 are the input and output of the AlltoAllV task in the EP communicator, and x2 and y2 are the input and output of the AlltoAllV task in the TP communicator.

namespace ops {
class AlltoallvDoubleCommCustom : public OpDef {
public:
    explicit AlltoallvDoubleCommCustom(const char *name) : OpDef(name)
    {
        this->Input("x1")
            .ParamType(REQUIRED)
            .DataType({ge::DT_FLOAT16, ge::DT_BF16})
            .Format({ge::FORMAT_ND, ge::FORMAT_ND})
            .UnknownShapeFormat({ge::FORMAT_ND, ge::FORMAT_ND});
        this->Input("x2")
            .ParamType(REQUIRED)
            .DataType({ge::DT_FLOAT16, ge::DT_BF16})
            .Format({ge::FORMAT_ND, ge::FORMAT_ND})
            .UnknownShapeFormat({ge::FORMAT_ND, ge::FORMAT_ND})
            .IgnoreContiguous();
        this->Output("y1")
            .ParamType(REQUIRED)
            .DataType({ge::DT_FLOAT16, ge::DT_BF16})
            .Format({ge::FORMAT_ND, ge::FORMAT_ND})
            .UnknownShapeFormat({ge::FORMAT_ND, ge::FORMAT_ND});
        this->Output("y2")
            .ParamType(REQUIRED)
            .DataType({ge::DT_FLOAT16, ge::DT_BF16})
            .Format({ge::FORMAT_ND, ge::FORMAT_ND})
            .UnknownShapeFormat({ge::FORMAT_ND, ge::FORMAT_ND});
        this->Attr("group_ep").AttrType(REQUIRED).String();
        this->Attr("group_tp").AttrType(REQUIRED).String();
        this->Attr("ep_world_size").AttrType(REQUIRED).Int();
        this->Attr("tp_world_size").AttrType(REQUIRED).Int();
        this->AICore().SetTiling(optiling::AlltoAllVDoubleCommCustomTilingFunc);
        this->AICore().AddConfig("ascendxxx"); // Replace ascendxxx with the actual Ascend AI Processor model.
        this->MC2().HcclGroup({"group_ep", "group_tp"});
    }
};
OP_ADD(AlltoallvDoubleCommCustom);
}

The following describes the declaration and implementation of the custom operator Tiling Data.

The declaration lists the fields in the required order. The version field, which the tiling function sets to 2, identifies the V2 operator tiling struct. Because the kernel implementation of the AlltoallvDoubleCommCustom operator performs two AlltoAllV communication tasks, mc2HcommCnt is set to 2. Mc2ServerCfg, the common parameter configuration of the server, comes next. Finally, two Mc2HcommCfg structs are defined, one for each communication task in the communicators.

// HCCL TilingData declaration
BEGIN_TILING_DATA_DEF(AlltoallvDoubleCommCustomTilingData)
    TILING_DATA_FIELD_DEF(uint32_t, version);                 // Version of the HCCL tiling struct. Set it to 2.
    TILING_DATA_FIELD_DEF(uint32_t, mc2HcommCnt);             // Total number of communication tasks in all communicators. The maximum value is 3. The kernel of AlltoallvDoubleCommCustom runs one AlltoAllV task in each of the two communicators, so this field is set to 2.
    TILING_DATA_FIELD_DEF_STRUCT(Mc2ServerCfg, serverCfg);    // Common parameter configuration of the server, at the fused-operator level.
    TILING_DATA_FIELD_DEF_STRUCT(Mc2HcommCfg, hcom1);         // Parameter configuration of each communication task in the communicators, at the operator level. The number of Mc2HcommCfg structs must equal mc2HcommCnt.
    TILING_DATA_FIELD_DEF_STRUCT(Mc2HcommCfg, hcom2);
END_TILING_DATA_DEF;

REGISTER_TILING_DATA_CLASS(AlltoallvDoubleCommCustom, AlltoallvDoubleCommCustomTilingData);
// HCCL TilingData configuration snippet.
static ge::graphStatus AlltoAllVDoubleCommCustomTilingFunc(gert::TilingContext *context)
{
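    // Attributes 0 and 1 are group_ep and group_tp, in the order declared in the operator prototype.
    char *group1 = const_cast<char *>(context->GetAttrs()->GetAttrPointer<char>(0));
    char *group2 = const_cast<char *>(context->GetAttrs()->GetAttrPointer<char>(1));

    AlltoallvDoubleCommCustomTilingData tiling;
    tiling.set_version(2);        // Selects the V2 tiling struct.
    tiling.set_mc2HcommCnt(2);    // One AlltoAllV task in each of the two communicators.
    tiling.serverCfg.set_debugMode(0);

    // AlltoAllV task in the EP communicator (group_ep).
    tiling.hcom1.set_opType(8);        // AlltoAllV task; the value comes from HcclCMDType.
    tiling.hcom1.set_reduceType(4);    // Value from HcclReduceOp; takes effect only for tasks with a reduction.
    tiling.hcom1.set_groupName(group1);
    tiling.hcom1.set_algConfig("AlltoAll=level0:fullmesh;level1:pairwise");

    // AlltoAllV task in the TP communicator (group_tp).
    tiling.hcom2.set_opType(8);
    tiling.hcom2.set_reduceType(4);
    tiling.hcom2.set_groupName(group2);
    tiling.hcom2.set_algConfig("AlltoAll=level0:fullmesh;level1:pairwise");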
    
    tiling.SaveToBuffer(context->GetRawTilingData()->GetData(), context->GetRawTilingData()->GetCapacity());
    context->GetRawTilingData()->SetDataSize(tiling.GetDataSize());
    return ge::GRAPH_SUCCESS;
}
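
The two hcom blocks in the tiling function differ only in the group name, so the shared settings can be factored into a helper. The following is a minimal sketch under stated assumptions: ConfigureAlltoAllV and FakeHcommCfg are hypothetical names, and the helper relies only on the four setters that appear in the example above. FakeHcommCfg is a stand-in so that the sketch compiles without the CANN headers; it is not the generated Mc2HcommCfg type. The helper also adds a null check on the group name, which the example does not perform.

```cpp
#include <cstdint>
#include <string>

// Values copied from the example above. Their meaning is defined by
// HcclCMDType and HcclReduceOp, which this sketch does not redefine.
constexpr std::uint32_t kExampleOpType = 8;      // Used for the AlltoAllV task.
constexpr std::uint32_t kExampleReduceType = 4;  // Matters only for reduction tasks.
constexpr const char *kAlltoAllAlgConfig = "AlltoAll=level0:fullmesh;level1:pairwise";

// Applies the settings shared by both AlltoAllV tasks to one per-task config.
// Works with any type that exposes the four setters used in the example.
template <typename HcommCfg>
bool ConfigureAlltoAllV(HcommCfg &cfg, char *groupName)
{
    if (groupName == nullptr) {
        return false;  // Attribute missing; leave cfg untouched.
    }
    cfg.set_opType(kExampleOpType);
    cfg.set_reduceType(kExampleReduceType);
    cfg.set_groupName(groupName);
    cfg.set_algConfig(kAlltoAllAlgConfig);
    return true;
}

// Hypothetical stand-in for the generated per-task config type.
struct FakeHcommCfg {
    std::uint32_t opType = 0;
    std::uint32_t reduceType = 0;
    std::string groupName;
    std::string algConfig;

    void set_opType(std::uint32_t value) { opType = value; }
    void set_reduceType(std::uint32_t value) { reduceType = value; }
    void set_groupName(const char *value) { groupName = value; }
    void set_algConfig(const char *value) { algConfig = value; }
};
```

In the tiling function, a call such as ConfigureAlltoAllV(tiling.hcom1, group1) would stand in for the four setter calls on hcom1, with a failure result returned to the caller when the helper reports a missing group name.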