Conv3D Instructions

Ascend C provides a set of high-level Conv3D APIs that let users quickly implement the forward cube computation of three-dimensional convolution. Figure 1 shows a 3D forward convolution. The calculation formula is as follows:

$Y = X \ast W + B$

X is the feature matrix input of Conv3D convolution.
W is the weight matrix of the Conv3D convolution.
B is the bias matrix of the Conv3D convolution.
Y is the result matrix output after completing the convolution and bias operations.

Figure 1 3D forward convolution diagram

Cin indicates the size of the input channel dimension.
Din indicates the size of the input depth dimension.
Hin indicates the size of the input height dimension.
Win indicates the size of the input width dimension.
Cout indicates the size of the output channel dimension of Weight and Output.
Dout indicates the size of the depth dimension of Output.
Hout indicates the size of the height dimension of Output.
Wout indicates the size of the width dimension of Output.

The M dimension mentioned in the following sections is the vertical axis of the input after img2col expansion during forward convolution. Its value equals Hout × Wout.

Channel, Depth, Height, and Width are abbreviated as C, D, H, and W, respectively.
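As a reading aid for the formula and the dimension names above, the following host-side sketch computes the basic operation (stride 1, no padding, no dilation) with plain loops. It is not the Ascend C implementation and does not use the NDC1HWC0 or FRACTAL_Z_3D formats. The names ReferenceConv3D and Shape3D and the contiguous [C][D][H][W] layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Extent of a tensor or kernel along depth, height, and width.
struct Shape3D {
    std::size_t d, h, w;
};

// Host-side reference for Y = X * W + B with stride 1, no padding, no dilation.
// Assumed layouts (illustration only, row-major and contiguous):
//   x: [Cin][Din][Hin][Win]       w: [Cout][Cin][Kd][Kh][Kw]
//   b: [Cout] or empty (no bias)  y: [Cout][Dout][Hout][Wout]
std::vector<float> ReferenceConv3D(const std::vector<float>& x, const std::vector<float>& w,
                                   const std::vector<float>& b, std::size_t cin, std::size_t cout,
                                   Shape3D in, Shape3D k)
{
    assert(in.d >= k.d && in.h >= k.h && in.w >= k.w);
    assert(x.size() == cin * in.d * in.h * in.w);
    assert(w.size() == cout * cin * k.d * k.h * k.w);
    assert(b.empty() || b.size() == cout);

    const Shape3D out{in.d - k.d + 1, in.h - k.h + 1, in.w - k.w + 1};
    std::vector<float> y(cout * out.d * out.h * out.w);
    for (std::size_t co = 0; co < cout; ++co) {
        for (std::size_t od = 0; od < out.d; ++od) {
            for (std::size_t oh = 0; oh < out.h; ++oh) {
                for (std::size_t ow = 0; ow < out.w; ++ow) {
                    float acc = b.empty() ? 0.0f : b[co];  // start from the bias, if any
                    for (std::size_t ci = 0; ci < cin; ++ci) {
                        for (std::size_t kd = 0; kd < k.d; ++kd) {
                            for (std::size_t kh = 0; kh < k.h; ++kh) {
                                for (std::size_t kw = 0; kw < k.w; ++kw) {
                                    const std::size_t xi =
                                        ((ci * in.d + od + kd) * in.h + oh + kh) * in.w + ow + kw;
                                    const std::size_t wi =
                                        (((co * cin + ci) * k.d + kd) * k.h + kh) * k.w + kw;
                                    acc += x[xi] * w[wi];
                                }
                            }
                        }
                    }
                    y[((co * out.d + od) * out.h + oh) * out.w + ow] = acc;
                }
            }
        }
    }
    return y;
}
```

A reference like this is mainly useful for checking operator results on small shapes; it makes no attempt at performance.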

In addition to the basic operation above, Conv3D computation supports the Padding, Stride, and Dilation parameters. Their meanings are as follows:

Padding pads zeros along the three dimensions of the input matrix. See Figure 2.
Stride is the distance the convolution kernel slides in each of the three dimensions. See Figure 3.
Dilation is the spacing between elements of the convolution kernel in each of the three dimensions. See Figure 4.

Figure 2 3D convolution forward Padding diagram

Figure 3 3D convolution forward Stride diagram

Figure 4 3D convolution forward Dilation diagram
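The effect of Padding, Stride, and Dilation on the output size can be sketched with the standard convolution size rule. The helper names below (AxisParams, ConvOutSize, ComputeOutShape) are hypothetical, and the sketch assumes the same padding on both sides of each axis; it is a host-side illustration, not part of the Conv3D API.

```cpp
#include <cassert>
#include <cstdint>

// Parameters of one spatial axis (depth, height, or width).
struct AxisParams {
    std::int64_t in;        // input size along the axis
    std::int64_t kernel;    // kernel size along the axis
    std::int64_t pad;       // zeros added on EACH side (symmetric padding assumed)
    std::int64_t stride;    // sliding distance of the kernel
    std::int64_t dilation;  // spacing between kernel elements
};

// Output size along one axis: floor((in + 2*pad - dilation*(kernel-1) - 1) / stride) + 1.
inline std::int64_t ConvOutSize(const AxisParams& p)
{
    assert(p.kernel > 0 && p.stride > 0 && p.dilation > 0 && p.pad >= 0);
    const std::int64_t effKernel = p.dilation * (p.kernel - 1) + 1;  // kernel extent after dilation
    assert(p.in + 2 * p.pad >= effKernel);
    return (p.in + 2 * p.pad - effKernel) / p.stride + 1;
}

struct Conv3DOutShape {
    std::int64_t dout, hout, wout;
    std::int64_t m;  // M dimension as defined in this document: Hout x Wout
};

inline Conv3DOutShape ComputeOutShape(const AxisParams& d, const AxisParams& h, const AxisParams& w)
{
    Conv3DOutShape s{ConvOutSize(d), ConvOutSize(h), ConvOutSize(w), 0};
    s.m = s.hout * s.wout;
    return s;
}
```

For example, an 8 × 16 × 16 input with a 3 × 3 × 3 kernel, padding 1, stride 1, and dilation 1 keeps its size, so Dout = 8, Hout = Wout = 16, and M = 256.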

The procedure for implementing Conv3D computation on the kernel side is as follows:

Create a Conv3D object.
Perform the initialization operation.
Set the 3D convolution Input, Weight, Bias, and Output.
Perform the 3D convolution operation.
Complete the 3D convolution operation.

To use the high-level Conv3D API to implement forward convolution, perform the following steps:

Create a Conv3D object.

#include "lib/conv/conv3d/conv3d_api.h"

using inputType = ConvApi::ConvType<AscendC::TPosition::GM, ConvFormat::NDC1HWC0, bfloat16_t>;
using weightType = ConvApi::ConvType<AscendC::TPosition::GM, ConvFormat::FRACTAL_Z_3D, bfloat16_t>;
using outputType = ConvApi::ConvType<AscendC::TPosition::GM, ConvFormat::NDC1HWC0, bfloat16_t>;
using biasType = ConvApi::ConvType<AscendC::TPosition::GM, ConvFormat::ND, float>; // Optional parameter

Conv3dApi::Conv3D<inputType, weightType, outputType, biasType> conv3dApi;

When creating the object, pass the types of the Input, Weight, and Output parameters. The Bias type is optional: omit it if the convolution computation does not take a Bias input. Each type is described by ConvType, which bundles the logical memory location, data format, and data type.

template <TPosition POSITION, ConvFormat FORMAT, typename TYPE>
struct ConvType {
    constexpr static TPosition pos = POSITION;    // Position of the Conv3d input or output in memory
    constexpr static ConvFormat format = FORMAT;  // Conv3d input or output data format
    using T = TYPE;                               // Conv3d input or output data type
};
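To show how the three pieces of information in ConvType can be read back at compile time, the sketch below repeats the template on a host compiler. The TPosition and ConvFormat enums here are minimal stand-ins written for this example (the real ones come from the Ascend C headers and contain more values), and BiasTypeSketch, IsGlobalMemory, and ElementBytes are hypothetical names.

```cpp
#include <cstddef>
#include <type_traits>

// Stand-in enums so the sketch compiles without the Ascend C headers (assumption).
enum class TPosition { GM };
enum class ConvFormat { NDC1HWC0, FRACTAL_Z_3D, ND };

// Same shape as the ConvType template shown above.
template <TPosition POSITION, ConvFormat FORMAT, typename TYPE>
struct ConvType {
    constexpr static TPosition pos = POSITION;
    constexpr static ConvFormat format = FORMAT;
    using T = TYPE;
};

// A bias descriptor: global memory, ND format, float elements.
using BiasTypeSketch = ConvType<TPosition::GM, ConvFormat::ND, float>;

// Each field is available as a compile-time constant or type alias.
static_assert(BiasTypeSketch::pos == TPosition::GM, "bias lives in global memory");
static_assert(BiasTypeSketch::format == ConvFormat::ND, "bias uses the ND format");
static_assert(std::is_same<BiasTypeSketch::T, float>::value, "bias elements are float");

// Generic code can query any descriptor in the same way.
template <class CONV_TYPE>
constexpr bool IsGlobalMemory()
{
    return CONV_TYPE::pos == TPosition::GM;
}

template <class CONV_TYPE>
constexpr std::size_t ElementBytes()
{
    return sizeof(typename CONV_TYPE::T);
}
```

Because the descriptor is a type rather than a runtime value, mismatches of this kind can be caught at compile time.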

The following briefly describes the data structures used for object creation. Developers can read this material as needed. The data structure used to create a Conv3D object is defined as follows:

template <class INPUT_TYPE, class WEIGHT_TYPE, class OUTPUT_TYPE, class BIAS_TYPE = biasType, class CONV_CFG = Conv3dParam>
using Conv3D = Conv3dIntfExt<Config<ConvApi::ConvDataType<INPUT_TYPE, WEIGHT_TYPE, OUTPUT_TYPE, BIAS_TYPE, CONV_CFG>>, Impl, Intf>;

The Conv3dIntfExt and Conv3dParam data structures are defined as follows:

template <class Conv3dCfg, template <typename, class, bool> class Impl = Conv3dApiImpl,
    template <class, template <typename, class, bool> class> class Intf = Conv3dIntf>
struct Conv3dIntfExt : public Intf<Conv3dCfg, Impl> {
    __aicore__ inline Conv3dIntfExt()
    {}
};
struct Conv3dParam : public ConvApi::ConvParam {
    __aicore__ inline Conv3dParam(){};
};

Conv3dIntf is the base class of Conv3dIntfExt, and Conv3dCfg is the template argument passed to Conv3dIntf. The data structures are defined as follows:

template <class Config, template <typename, class, bool> class Impl>
struct Conv3dIntf {
    using InputT = typename Config::SrcAT;
    using WeightT = typename Config::SrcBT;
    using OutputT = typename Config::DstT;
    using BiasT = typename Config::BiasT;
    using L0cT = typename Config::L0cT;
    using ConvParam = typename Config::ConvParam;
    __aicore__ inline Conv3dIntf()
    {}
};
template <class ConvDataType>
struct Conv3dCfg : public ConvApi::ConvConfig<ConvDataType> {
public:
    __aicore__ inline Conv3dCfg()
    {}
    using ContextData = struct _ : public ConvApi::ConvConfig<ConvDataType>::ContextData {
        __aicore__ inline _()
        {}
    };
};

**Table 1** ConvType parameters
Parameter	Description
TPosition	Logical memory location. The input, weight, bias, and output matrices can each be set to TPosition::GM.
ConvFormat	Data format. The input matrix can be set to ConvFormat::NDC1HWC0, the weight matrix to ConvFormat::FRACTAL_Z_3D, the bias to ConvFormat::ND, and the output matrix to ConvFormat::NDC1HWC0.
TYPE	Data type. The input, weight, and output matrices can each be set to half or bfloat16_t. The bias can be set to half or float. Note: The input and output matrices must use the same data type. For the supported data type combinations, see Table 2.

**Table 2** Combinations of **Conv3D** input and output data types
Input Matrix	Weight Matrix	Bias	Output Matrix	Supported Platform
half	half	half	half	Atlas A3 training products / Atlas A3 inference products; Atlas A2 training products / Atlas A2 inference products
bfloat16_t	bfloat16_t	float	bfloat16_t	Atlas A3 training products / Atlas A3 inference products; Atlas A2 training products / Atlas A2 inference products
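The two rows of Table 2 can be encoded as a compile-time check so that an unsupported combination fails early. The sketch below is a host-side illustration only: half and bfloat16_t are placeholder structs standing in for the device types, and the trait name IsSupportedConv3DCombo is hypothetical, not part of the Conv3D API.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Placeholder 16-bit types so the example compiles on a host (assumption).
struct half { std::uint16_t bits; };
struct bfloat16_t { std::uint16_t bits; };

// Primary template: any combination not listed in Table 2 is rejected.
template <class Input, class Weight, class Bias, class Output>
struct IsSupportedConv3DCombo : std::false_type {};

// Table 2, row 1: half / half / half / half.
template <>
struct IsSupportedConv3DCombo<half, half, half, half> : std::true_type {};

// Table 2, row 2: bfloat16_t / bfloat16_t / float / bfloat16_t.
template <>
struct IsSupportedConv3DCombo<bfloat16_t, bfloat16_t, float, bfloat16_t> : std::true_type {};

}  // namespace sketch

// Example: a bfloat16_t convolution must pair with a float bias.
static_assert(sketch::IsSupportedConv3DCombo<sketch::bfloat16_t, sketch::bfloat16_t, float,
                                             sketch::bfloat16_t>::value,
              "bfloat16_t with float bias is listed in Table 2");
```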

Perform the initialization operation.

Conv3dApi::Conv3D<inputType, weightType, outputType, biasType> conv3dApi;
TPipe pipe;                 // Initialize TPipe.
conv3dApi.Init(&tiling);    // Initialize conv3dApi.

Set the 3D convolution Input, Weight, Bias, and Output.

conv3dApi.SetWeight(weightGm);    // Set the GM address of the weight for the current core.
if (biasFlag) {
    conv3dApi.SetBias(biasGm);    // Set the GM address of the bias for the current core.
}
// Set the start offset of the input in each dimension for the current core.
conv3dApi.SetInputStartPosition(diStartPos, mStartPos);
// Set the Cout, Dout, and M sizes for the current core.
conv3dApi.SetSingleOutputShape(singleCoreCout, singleCoreDout, singleCoreM);

// Conv3D currently supports only single-batch convolution. For multiple batches, loop over the batches and compute the address offset of the current batch in each iteration.
for (uint64_t batchIter = 0; batchIter < singleCoreBatch; ++batchIter) {
    conv3dApi.SetInput(inputGm[batchIter * inputOneBatchSize]);    // Set the GM address of the input for the current core.
}
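The values passed as mStartPos and singleCoreM depend on how the work is divided among cores. One simple possibility, shown here only as a sketch, is to split the M dimension evenly and let the last cores take a shorter or empty tail. MSlice and SplitM are hypothetical helper names; in a real operator these values normally come from the tiling data.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// The slice of the M dimension handled by one core.
struct MSlice {
    std::uint64_t mStartPos;    // first M index owned by this core
    std::uint64_t singleCoreM;  // number of M rows owned by this core (may be 0)
};

// Even split of totalM over coreNum cores using ceiling division.
inline MSlice SplitM(std::uint64_t totalM, std::uint64_t coreNum, std::uint64_t coreIdx)
{
    assert(coreNum > 0 && coreIdx < coreNum);
    const std::uint64_t perCore = (totalM + coreNum - 1) / coreNum;      // rows per full core
    const std::uint64_t start = std::min(coreIdx * perCore, totalM);     // clamp for idle cores
    const std::uint64_t length = std::min(perCore, totalM - start);      // tail core gets the rest
    return {start, length};
}
```

With M = 256 and 3 cores, the slices are [0, 86), [86, 172), and [172, 256), so the last core processes 84 rows.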

Perform the 3D convolution operation.

Call IterateAll to compute all data on a single core.

for (uint64_t batchIter = 0; batchIter < singleCoreBatch; ++batchIter) {
    ...
    conv3dApi.IterateAll(outputGm[batchIter * outputOneBatchSize]);    // Call IterateAll to complete the Conv3D computation.
    ...
}
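The batch loops index inputGm and outputGm with batchIter * inputOneBatchSize and batchIter * outputOneBatchSize. A plausible way to derive these per-batch sizes, assuming they are element counts of one N slice of an NDC1HWC0 tensor, is sketched below. The helper names are hypothetical, and the C0 value of 16 in the usage note is an assumption for 16-bit data types; the authoritative values come from the tiling data.

```cpp
#include <cassert>
#include <cstdint>

// Ceiling division for positive integers.
inline std::uint64_t CeilDiv(std::uint64_t a, std::uint64_t b)
{
    assert(b > 0);
    return (a + b - 1) / b;
}

// Elements in one batch of an NDC1HWC0 tensor: D * C1 * H * W * C0,
// where C1 = ceil(C / C0) because the channel axis is split into blocks of C0.
inline std::uint64_t OneBatchElems(std::uint64_t c, std::uint64_t d, std::uint64_t h,
                                   std::uint64_t w, std::uint64_t c0)
{
    return d * CeilDiv(c, c0) * h * w * c0;
}

// Element offset of batch batchIter from the start of the tensor.
inline std::uint64_t BatchOffset(std::uint64_t batchIter, std::uint64_t oneBatchElems)
{
    return batchIter * oneBatchElems;
}
```

For Cin = 20, Din = 4, Hin = Win = 8, and C0 = 16, C1 is 2 and one batch holds 4 × 2 × 8 × 8 × 16 = 8192 elements, so the fourth batch (batchIter = 3) starts at element 24576.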

Complete the 3D convolution operation.

for (uint64_t batchIter = 0; batchIter < singleCoreBatch; ++batchIter) {
    ...
    conv3dApi.End();    // Clear the EventID and release the temporarily allocated internal memory.
}

Header File to Be Included

#include "lib/conv/conv3d/conv3d_api.h"
