TilingData of v1 (Deprecated)
This structure is deprecated and will be removed in a later version. Do not use it. You do not need to set its members directly; instead, use the API settings provided by HCCL Tiling.
In the TilingData structure of an MC2 operator, the computation tiling structure must come after the communication tiling structure.
TilingData of v1 and v2 can be distinguished using the first uint32_t field of the tiling structure, that is, the preparePosition field of v1 and the version field of v2. If the tiling structure of v2 is used, set version to 2. If the tiling structure of v1 is used, set preparePosition to 1. No matter which TilingData is used, you must strictly follow the tiling structure of the corresponding version and use it as a part of the TilingData structure of the operator.
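The version check described above can be sketched in plain C++. This is a minimal illustration, not an HCCL API: `TilingVersion`, `DetectTilingVersion`, and `CheckDetectTilingVersion` are hypothetical names, and the sketch assumes the buffer passed in starts at the communication tiling structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper types (not part of HCCL).
enum class TilingVersion { kV1, kV2, kUnknown };

// Classifies a raw tiling buffer by its first uint32_t field:
// preparePosition == 1 for v1, version == 2 for v2.
inline TilingVersion DetectTilingVersion(const void* tiling, std::size_t size) {
    if (tiling == nullptr || size < sizeof(std::uint32_t)) {
        return TilingVersion::kUnknown;
    }
    std::uint32_t first = 0;
    // memcpy avoids alignment and strict-aliasing problems on raw buffers.
    std::memcpy(&first, tiling, sizeof(first));
    if (first == 1U) {
        return TilingVersion::kV1;
    }
    if (first == 2U) {
        return TilingVersion::kV2;
    }
    return TilingVersion::kUnknown;
}

// Example checks; call from a test program.
inline void CheckDetectTilingVersion() {
    const std::uint32_t v1Buffer[2] = {1U, 0U};
    const std::uint32_t v2Buffer[2] = {2U, 0U};
    assert(DetectTilingVersion(v1Buffer, sizeof(v1Buffer)) == TilingVersion::kV1);
    assert(DetectTilingVersion(v2Buffer, sizeof(v2Buffer)) == TilingVersion::kV2);
}
```

Any other value in the first field is reported as unknown rather than guessed, since only 1 (v1) and 2 (v2) are defined here.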
Function
Provides the fixed communication configuration, Mc2Msg, that the AI CPU obtains before it starts to deliver a communication task. In the operator implementation, the tiling method assembles the communication configuration items. After the fixed parameters are set in the fixed order required for tiling data, the communication configuration is passed to the AI CPU when the AI CPU communication API is called.
Parameters
| Parameter | Description |
|---|---|
| preparePosition | Mode of task assembly on the server. You must explicitly assign a value of the uint32_t type in tiling. Only the value 1 is supported: the AI CPU and AI Core use the communication task mechanism for message transfer and task delivery. Set this parameter to 1 when the AI Core uses message notification, that is, when HCCL is used in the operator. |
| sendOff | Reserved parameter, which cannot be configured. |
| recvOff | Reserved parameter, which cannot be configured. |
| tailSendOff | Reserved parameter, which cannot be configured. |
| tailRecvOff | Reserved parameter, which cannot be configured. |
| sendCnt | Reserved parameter, which cannot be configured. |
| recvCnt | Reserved parameter, which cannot be configured. |
| tailSendCnt | Reserved parameter, which cannot be configured. |
| tailRecvCnt | Reserved parameter, which cannot be configured. |
| totalCnt | Reserved parameter, which cannot be configured. |
| turnNum | Reserved parameter, which cannot be configured. |
| tailNum | Reserved parameter, which cannot be configured. |
| stride | Reserved parameter, which cannot be configured. |
| workspaceOff | Reserved parameter, which cannot be configured. |
| notifyOff | Reserved parameter, which cannot be configured. |
| notifyBeginCnt | Reserved parameter, which cannot be configured. |
| notifyEndCnt | Reserved parameter, which cannot be configured. |
| useBufferType | Location from which the communication algorithm obtains its input data. The value is of the uint8_t type. The options are as follows: |
| funID | Reserved parameter, which cannot be configured. |
| dataType | Reserved parameter, which cannot be configured. |
| groupNum | Reserved parameter, which cannot be configured. |
| reuseMode | Reserved parameter, which cannot be configured. |
| commType | Reserved parameter, which cannot be configured. |
| reduceOp | Reserved parameter, which cannot be configured. |
| commOrder | Reserved parameter, which cannot be configured. |
| waitPolicy | Reserved parameter, which cannot be configured. |
| rspPolicy | Reserved parameter, which cannot be configured. |
| exitPolicy | Reserved parameter, which cannot be configured. |
| commAlg | Communication algorithm setting. You must explicitly assign a value in tiling. The value is of the uint8_t type. Only the value 1 is supported: full-mesh algorithm. Full-mesh connections are established between NPUs, that is, data can be transmitted directly between any two NPUs. For details about the algorithm, see "Collective Communication Algorithm Introduction". |
| taskType | Reserved parameter, which cannot be configured. |
| debugMode | Reserved parameter, which cannot be configured. |
| stepSize | Reserved parameter, which cannot be configured. |
| sendArgIndex | Reserved parameter, which cannot be configured. |
| recvArgIndex | Reserved parameter, which cannot be configured. |
| commOutArgIndex | Reserved parameter, which cannot be configured. |
| hasCommOut | Whether the result computed by the communication algorithm on the current device is output to recvBuf (the address of the destination data buffer). This parameter is configured only for the AllGather and AlltoAll algorithms. The value is of the uint8_t type. The options are as follows: |
| reserve | Reserved parameter. |
| reserve2 | Reserved parameter. |
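As a rough illustration of the table above, the following sketch checks the two fields whose supported values are stated explicitly (preparePosition and commAlg). It is not part of HCCL: `Mc2MsgUserFields`, `CheckMc2MsgUserFields`, and `RunUserFieldChecks` are hypothetical names, and the struct mirrors only the four user-facing fields, not the real Mc2Msg layout. The values of useBufferType and hasCommOut are deliberately left unchecked because their options are not listed here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mirror of the four user-facing v1 fields. The real Mc2Msg
// struct is declared with the tiling macros and has many reserved members.
struct Mc2MsgUserFields {
    std::uint32_t preparePosition;
    std::uint8_t commAlg;
    std::uint8_t useBufferType;
    std::uint8_t hasCommOut;
};

// Returns an empty string if the checked fields hold supported values,
// otherwise a short description of the first problem found.
inline std::string CheckMc2MsgUserFields(const Mc2MsgUserFields& fields) {
    if (fields.preparePosition != 1U) {
        return "preparePosition must be 1";
    }
    if (fields.commAlg != 1U) {
        return "commAlg must be 1 (full-mesh)";
    }
    return "";
}

// Example checks; call from a test program.
inline void RunUserFieldChecks() {
    const Mc2MsgUserFields good{1, 1, 1, 1};
    const Mc2MsgUserFields wrongAlg{1, 2, 1, 1};
    assert(CheckMc2MsgUserFields(good).empty());
    assert(!CheckMc2MsgUserFields(wrongAlg).empty());
}
```

A check like this would sit in host-side tiling code before the tiling data is saved, so that an unsupported value is caught early.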
Restrictions
- The Tiling Data struct of the operator must contain all Mc2Msg parameters in sequence.
- The AI CPU reads the communication configuration from a fixed data structure, so the structure must stay consistent with that layout when Tiling Data is registered.
Atlas A3 training products and Atlas A3 inference products do not currently support this version of TilingData.
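The ordering rule for MC2 operators (communication tiling before computation tiling) can be illustrated with plain C++ structs. This is only a sketch under stated assumptions: it does not use the tiling definition macros, and `CommTilingSketch`, `ComputeTilingSketch`, `OperatorTilingSketch`, and `CommPrecedesCompute` are hypothetical names with placeholder members.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the communication tiling (Mc2Msg). Only the first field is
// real; the array is a placeholder for the remaining members.
struct CommTilingSketch {
    std::uint32_t preparePosition;   // first uint32_t field; 1 for v1
    std::uint32_t placeholder[4];
};

// Hypothetical computation tiling of the operator.
struct ComputeTilingSketch {
    std::uint32_t m;
    std::uint32_t n;
    std::uint32_t k;
};

// Operator tiling: the communication part is declared first and the
// computation part after it.
struct OperatorTilingSketch {
    CommTilingSketch comm;
    ComputeTilingSketch compute;
};

// True when the computation tiling is laid out after the communication tiling.
constexpr bool CommPrecedesCompute() {
    return offsetof(OperatorTilingSketch, comm) <
           offsetof(OperatorTilingSketch, compute);
}

// Fails to compile if someone reorders the members.
static_assert(CommPrecedesCompute(),
              "computation tiling must come after communication tiling");
```

A compile-time check of this kind costs nothing at run time and makes an accidental reordering of the members visible immediately.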
Example
The following uses the custom operator AllGatherMatmulCustom as an example. The operator prototype is as follows. gather_out indicates the output of the AllGather communication task.

    [
      {
        "op": "AllGatherMatmulCustom",
        "input_desc": [
          {
            "name": "x1",
            "param_type": "required",
            "format": ["ND", "ND"],
            "type": ["float16", "bfloat16"]
          },
          {
            "name": "x2",
            "param_type": "required",
            "format": ["ND", "ND"],
            "type": ["float16", "bfloat16"]
          },
          {
            "name": "bias",
            "param_type": "optional",
            "format": ["ND", "ND"],
            "type": ["float16", "bfloat16"]
          }
        ],
        "output_desc": [
          {
            "name": "y",
            "param_type": "required",
            "format": ["ND", "ND"],
            "type": ["float16", "bfloat16"]
          },
          {
            "name": "gather_out",
            "param_type": "required",
            "format": ["ND", "ND"],
            "type": ["float16", "bfloat16"]
          }
        ],
        "attr": [
          {
            "name": "group",
            "type": "string",
            "default_value": "",
            "param_type": "required"
          },
          {
            "name": "rank_size",
            "type": "int",
            "default_value": 0,
            "param_type": "optional"
          },
          {
            "name": "is_gather_out",
            "type": "bool",
            "default_value": true,
            "param_type": "optional"
          }
        ]
      }
    ]
The Tiling Data struct of the operator must contain all Mc2Msg parameters in sequence. The following is an example of the Tiling Data code:

    // Declare the Mc2Msg struct.
    BEGIN_TILING_DATA_DEF(Mc2Msg)
        TILING_DATA_FIELD_DEF(uint32_t, preparePosition);
        TILING_DATA_FIELD_DEF(uint32_t, sendOff);
        TILING_DATA_FIELD_DEF(uint32_t, recvOff);
        TILING_DATA_FIELD_DEF(uint32_t, tailSendOff);
        TILING_DATA_FIELD_DEF(uint32_t, tailRecvOff);
        TILING_DATA_FIELD_DEF(uint64_t, sendCnt);
        TILING_DATA_FIELD_DEF(uint32_t, recvCnt);
        TILING_DATA_FIELD_DEF(uint32_t, tailSendCnt);
        TILING_DATA_FIELD_DEF(uint32_t, tailRecvCnt);
        TILING_DATA_FIELD_DEF(uint32_t, totalCnt);
        TILING_DATA_FIELD_DEF(uint32_t, turnNum);
        TILING_DATA_FIELD_DEF(uint32_t, tailNum);
        TILING_DATA_FIELD_DEF(uint32_t, stride);
        TILING_DATA_FIELD_DEF(uint32_t, workspaceOff);
        TILING_DATA_FIELD_DEF(uint32_t, notifyOff);
        TILING_DATA_FIELD_DEF(uint16_t, notifyBeginCnt);
        TILING_DATA_FIELD_DEF(uint16_t, notifyEndCnt);
        TILING_DATA_FIELD_DEF(uint8_t, useBufferType);
        TILING_DATA_FIELD_DEF(uint8_t, funID);
        TILING_DATA_FIELD_DEF(uint8_t, dataType);
        TILING_DATA_FIELD_DEF(uint8_t, groupNum);
        TILING_DATA_FIELD_DEF(uint8_t, reuseMode);
        TILING_DATA_FIELD_DEF(uint8_t, commType);
        TILING_DATA_FIELD_DEF(uint8_t, reduceOp);
        TILING_DATA_FIELD_DEF(uint8_t, commOrder);
        TILING_DATA_FIELD_DEF(uint8_t, waitPolicy);
        TILING_DATA_FIELD_DEF(uint8_t, rspPolicy);
        TILING_DATA_FIELD_DEF(uint8_t, exitPolicy);
        TILING_DATA_FIELD_DEF(uint8_t, commAlg);
        TILING_DATA_FIELD_DEF(uint8_t, taskType);
        TILING_DATA_FIELD_DEF(uint8_t, debugMode);
        TILING_DATA_FIELD_DEF(uint8_t, stepSize);
        TILING_DATA_FIELD_DEF(uint8_t, sendArgIndex);
        TILING_DATA_FIELD_DEF(uint8_t, recvArgIndex);
        TILING_DATA_FIELD_DEF(uint8_t, commOutArgIndex);
        TILING_DATA_FIELD_DEF(uint8_t, hasCommOut);
        TILING_DATA_FIELD_DEF(uint8_t, reserve);
        TILING_DATA_FIELD_DEF(uint32_t, reserve2);
    END_TILING_DATA_DEF;
    REGISTER_TILING_DATA_CLASS(Mc2MsgOp, Mc2Msg)

    BEGIN_TILING_DATA_DEF(AllGatherMatmulCustomTilingData)
        TILING_DATA_FIELD_DEF_STRUCT(Mc2Msg, msg);
    END_TILING_DATA_DEF;
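
    // Configure Mc2Msg.
    AllGatherMatmulCustomTilingData tiling;
    tiling.msg.set_preparePosition(1);  // 1: v1 tiling; AI CPU and AI Core use the communication task mechanism
    tiling.msg.set_commAlg(1);          // 1: full-mesh algorithm
    tiling.msg.set_useBufferType(1);    // where the communication algorithm reads its input data
    tiling.msg.set_hasCommOut(1);       // whether the communication result is output to recvBuf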