REGISTER_NONE_TILING
Supported Products
| Product | Supported/Unsupported |
|---|---|
| | √ |
| | √ |
| | x |
| | x |
| | x |
| | x |
Functions
When a kernel uses a TilingData structure defined with standard C++ syntax and you cannot determine in advance which structures need to be registered, use this API to notify the framework that TilingData is defined by unregistered standard C++ structures. After calling it, use GET_TILING_DATA_WITH_STRUCT, GET_TILING_DATA_MEMBER, or GET_TILING_DATA_PTR_WITH_STRUCT to obtain the corresponding TilingData.
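The idea of reading an unregistered, template-dependent structure out of a raw tiling buffer can be illustrated in plain host-side C++. The following is a minimal sketch, not the Ascend C implementation: `DemoDag`, `DemoTilingData`, `PackTiling`, `ReadTiling`, and `ReadMember` are hypothetical names introduced only for illustration, and the sketch assumes the tiling buffer is a byte-for-byte image of the structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a DAG description; only the array extent matters here.
struct DemoDag {
    static constexpr int InputSize = 2;
};

// Plain C++ tiling structure whose layout depends on the template argument,
// so a library author cannot enumerate every instantiation in advance.
template <class Dag>
struct DemoTilingData {
    int32_t scheMode;
    int32_t shapeLen;
    int64_t blockNum;
    int64_t inputDims[Dag::InputSize][2];
};

// Host-side helper (assumption): serialize the structure as raw bytes.
template <class T>
std::vector<uint8_t> PackTiling(const T& tiling)
{
    static_assert(std::is_trivially_copyable<T>::value, "tiling struct must be trivially copyable");
    std::vector<uint8_t> buf(sizeof(T));
    std::memcpy(buf.data(), &tiling, sizeof(T));
    return buf;
}

// Whole-structure read: the caller names the type at the point of use,
// which is conceptually what GET_TILING_DATA_WITH_STRUCT allows.
template <class T>
T ReadTiling(const uint8_t* buf, std::size_t size)
{
    static_assert(std::is_trivially_copyable<T>::value, "tiling struct must be trivially copyable");
    assert(size >= sizeof(T));
    T out;
    std::memcpy(&out, buf, sizeof(T));
    return out;
}

// Single-member read at a known byte offset, conceptually similar to
// GET_TILING_DATA_MEMBER: only the requested field is copied.
template <class M>
M ReadMember(const uint8_t* buf, std::size_t offset)
{
    M out;
    std::memcpy(&out, buf + offset, sizeof(M));
    return out;
}
```

In this sketch, a caller would pack a `DemoTilingData<DemoDag>` with `PackTiling`, then recover it with `ReadTiling<DemoTilingData<DemoDag>>(buf.data(), buf.size())`, or fetch a single field with `ReadMember<int64_t>(buf.data(), offsetof(DemoTilingData<DemoDag>, blockNum))`. The structure type is supplied where the data is read rather than registered beforehand, which mirrors the situation this API is intended for.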
Prototype
```cpp
REGISTER_NONE_TILING
```
Parameters
None
Restrictions
- Kernel launch projects are currently not supported.
- GET_TILING_DATA requires a default registered TilingData structure, whereas this API registers none. Therefore, this API cannot be used together with GET_TILING_DATA.
- This API cannot be used together with REGISTER_TILING_DEFAULT or REGISTER_TILING_FOR_TILINGKEY. In other words, registered and unregistered TilingData structures cannot be mixed in the same kernel.
Examples
```cpp
// The provider of the tiling template library cannot predict which TilingData structure will be instantiated by the user.
template <class BrcDag>
struct BroadcastBaseTilingData {
    int32_t scheMode;
    int32_t shapeLen;
    int32_t ubSplitAxis;
    int32_t ubFormer;
    int32_t ubTail;
    int64_t ubOuter;
    int64_t blockFormer;
    int64_t blockTail;
    int64_t dimProductBeforeUbInner;
    int64_t elemNum;
    int64_t blockNum;
    int64_t outputDims[BROADCAST_MAX_DIMS_NUM];
    int64_t outputStrides[BROADCAST_MAX_DIMS_NUM];
    int64_t inputDims[BrcDag::InputSize][2]; // Entire block + tail block.
    int64_t inputBrcDims[BrcDag::CopyBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputVecBrcDims[BrcDag::VecBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputStrides[BrcDag::InputSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputBrcStrides[BrcDag::CopyBrcSize][BROADCAST_MAX_DIMS_NUM];
    int64_t inputVecBrcStrides[BrcDag::VecBrcSize];
    char scalarData[BROADCAST_MAX_SCALAR_BYTES];
};

template <uint64_t schMode, class BrcDag>
class BroadcastSch {
public:
    __aicore__ inline explicit BroadcastSch(GM_ADDR& tmpTiling) : tiling(tmpTiling) {}
    template <class... Args>
    __aicore__ inline void Process(Args... args)
    {
        REGISTER_NONE_TILING; // Informs the framework to use the unregistered TilingData structure.
        if constexpr (schMode == 1) {
            GET_TILING_DATA_WITH_STRUCT(BroadcastBaseTilingData<BrcDag>, tilingData, tiling);
            GET_TILING_DATA_MEMBER(BroadcastBaseTilingData<BrcDag>, blockNum, blockNumVar, tiling);
            TPipe pipe;
            BroadcastNddmaSch<BrcDag, false> sch(&tilingData); // Obtain the schedule.
            sch.Init(&pipe, args...);
            sch.Process();
        } else if constexpr (schMode == 202) {
            GET_TILING_DATA_PTR_WITH_STRUCT(BroadcastOneDimTilingDataAdvance, tilingDataPtr, tiling);
            BroadcastOneDimAdvanceSch<BrcDag> sch(tilingDataPtr); // Obtain the schedule.
            sch.Init(args...);
            sch.Process();
        }
    }
public:
    GM_ADDR tiling;
};
```
```cpp
// The user instantiates the template library by passing in the schMode and OpDag template parameters.
using namespace AscendC;
template <uint64_t schMode>
__global__ __aicore__ void mul(GM_ADDR x1, GM_ADDR x2, GM_ADDR y, GM_ADDR workspace, GM_ADDR tiling)
{
    if constexpr (std::is_same<DTYPE_X1, int8_t>::value) {
        // int8
        using OpDag = MulDag::MulInt8Op::OpDag;
        BroadcastSch<schMode, OpDag> sch(tiling);
        sch.Process(x1, x2, y);
    } else if constexpr (std::is_same<DTYPE_X1, uint8_t>::value) {
        // uint8
        using OpDag = MulDag::MulUint8Op::OpDag;
        BroadcastSch<schMode, OpDag> sch(tiling);
        sch.Process(x1, x2, y);
    }
}
```