SetAippFunctions

Product Support

Product	Supported
Atlas A3 training products / Atlas A3 inference products	√
Atlas A2 training products / Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	√
Atlas inference product's AI Core	√
Atlas inference product's Vector Core	x
Atlas training products	x

Function

Sets the Artificial Intelligence Preprocessing (AIPP) parameters for images. This API is used together with the LoadImageToLocal API: after the parameters are set, a call to LoadImageToLocal preprocesses the image during data movement. The supported preprocessing functions are data padding, channel swapping, single-line read, data type conversion, channel padding, and color space conversion (CSC). When calling SetAippFunctions, pass the matrix of the source image in global memory and the image format.

Data padding: pads the image in the height and width (HW) directions. The following modes are available:
- Mode 0: constant padding mode. Each position in the padding area is padded with a constant. The constant padded in each channel can be set. In this mode, only left and right padding is supported.
  Figure 1 Constant padding mode (original data in the middle green region and padding data in other regions)
- Mode 1: row/column padding mode. Each position in the padding area is padded with the data closest to the source image position in the row or column.
  Figure 2 Row/column padding mode (original data in the middle green region and padding data in other regions)
- Mode 2: block padding mode. Data blocks are copied from the source image for padding based on the padding width and height.
  Figure 3 Block padding mode (original data in the middle green region and padding data in other regions)
- Mode 3: mirrored block padding mode. The mirrored data blocks are copied from the source image for padding based on the padding width and height.
  Figure 4 Mirrored block padding mode (original data in the middle green region and padding data in other regions)
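
As a rough host-side illustration of modes 0 and 1 (plain C++, not Ascend C kernel code), the sketch below pads one row of single-channel pixels on the left and right. The function name `PadRow` and its parameters are hypothetical. On the device the padding is done by hardware during LoadImageToLocal, so this only models the described result for a single row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of left/right padding for one image row.
// mode 0: fill the padding area with a constant.
// mode 1: fill the padding area with the nearest source pixel in the row.
std::vector<uint8_t> PadRow(const std::vector<uint8_t>& row, std::size_t left,
                            std::size_t right, int mode, uint8_t constant = 0)
{
    assert(!row.empty());
    assert(mode == 0 || mode == 1);
    std::vector<uint8_t> out;
    out.reserve(left + row.size() + right);
    const uint8_t leftFill = (mode == 0) ? constant : row.front();
    const uint8_t rightFill = (mode == 0) ? constant : row.back();
    out.insert(out.end(), left, leftFill);          // left padding area
    out.insert(out.end(), row.begin(), row.end());  // original data
    out.insert(out.end(), right, rightFill);        // right padding area
    return out;
}
```

For the row {10, 20, 30} with two pixels of left padding and one of right padding, mode 0 with constant 7 yields {7, 7, 10, 20, 30, 7}, and mode 1 yields {10, 10, 10, 20, 30, 30}.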

Channel swapping: swaps image channels.
- For the RGB888 format, the R and B channels can be swapped.
- For the YUV420SP format, the U and V channels can be swapped.
- For the XRGB8888 format, the X channel can be moved backward (XRGB -> RGBX), and the R and B channels can be swapped.
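
The channel reordering described above can be modeled on the host with a short C++ sketch. The helper names `SwapRB` and `MoveXBackward` are hypothetical, and the buffers are assumed to hold interleaved pixels. This is an illustration of the data layout only, not the device implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical host-side model: swap R and B in an interleaved RGB888 buffer.
void SwapRB(std::vector<uint8_t>& rgb)
{
    assert(rgb.size() % 3 == 0);
    for (std::size_t i = 0; i < rgb.size(); i += 3) {
        std::swap(rgb[i], rgb[i + 2]);
    }
}

// Hypothetical host-side model: XRGB -> RGBX for an interleaved XRGB8888 buffer.
void MoveXBackward(std::vector<uint8_t>& xrgb)
{
    assert(xrgb.size() % 4 == 0);
    for (std::size_t i = 0; i < xrgb.size(); i += 4) {
        // Rotate left by one so that X moves from position 0 to position 3.
        std::rotate(xrgb.begin() + i, xrgb.begin() + i + 1, xrgb.begin() + i + 4);
    }
}
```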

Single-line read: reads only one line from the source image.

After single-line read is enabled, the destination image height configured in the data movement API (for example, loadImageToLocalParams.vertSize of the LoadImageToLocal API) no longer takes effect.

Data type conversion: converts the pixel data type from uint8_t to int8_t or half. Conversion from uint8_t to int8_t limits output values to the range [-128, 127].

// Example 1: Implement uint8_t-to-int8_t type conversion and mean subtraction. Set the mean value of each channel to the average value of all data within the channel. (The min and var values are not used and require no configuration.)
uint8_t -> int8_t:  output[i][j][k] = input[i][j][k] - mean[k]
// Example 2: Implement uint8_t-to-fp16 type conversion and normalization. Set the mean value of each channel to the average value of all data within the channel, the min value to the minimum value after mean subtraction of all data within the channel, and the var value to the reciprocal of the difference between the maximum and minimum values of all data within the channel.
uint8_t -> fp16:  output[i][j][k] = (input[i][j][k] - mean[k] - min[k]) * var[k]

The data type after conversion is configured by the template parameter U. When U is uint8_t, the data type conversion function is disabled.

When the data movement API is called, the data type of its destination tensor (for example, dstLocal in LoadImageToLocal) must be the same as the output data type of this API.
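
The two conversion formulas can be checked on the host with a small C++ model. This is a sketch under stated assumptions: the function names `ToInt8` and `ToFp16` are hypothetical, pixels are assumed to be interleaved with three channels, the int8_t result is assumed to saturate at the range limits, and `float` stands in for half because standard C++ has no half type.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Example 1 model: uint8_t -> int8_t with per-channel mean subtraction.
// The result is clamped to [-128, 127] (assumed saturating behavior).
std::vector<int8_t> ToInt8(const std::vector<uint8_t>& in, const std::array<uint8_t, 3>& mean)
{
    assert(in.size() % 3 == 0);
    std::vector<int8_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const int diff = static_cast<int>(in[i]) - static_cast<int>(mean[i % 3]);
        out[i] = static_cast<int8_t>(std::min(127, std::max(-128, diff)));
    }
    return out;
}

// Example 2 model: uint8_t -> fp16 with normalization.
// float stands in for half, which standard C++ does not provide.
std::vector<float> ToFp16(const std::vector<uint8_t>& in, const std::array<uint8_t, 3>& mean,
                          const std::array<float, 3>& minVal, const std::array<float, 3>& var)
{
    assert(in.size() % 3 == 0);
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const std::size_t k = i % 3;  // channel index
        out[i] = (static_cast<float>(in[i]) - static_cast<float>(mean[k]) - minVal[k]) * var[k];
    }
    return out;
}
```

In the second model, a channel with mean 100, min -100, and var 0.5 maps the input 200 to (200 - 100 + 100) * 0.5 = 100.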

Channel padding: pads the image in the channel dimension. The default padding value is 0.
- Mode 0: pads the channels to 32 bytes. For uint8_t or int8_t outputs, the channels are padded to a multiple of 32; for fp16 outputs, the channels are padded to a multiple of 16.
- Mode 1: pads the channels to 4 bytes.
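
To make the mode 0 arithmetic concrete, the following host-side C++ sketch computes the channel count after padding and the resulting number of output elements. The function names are hypothetical, and the sketch assumes the source image has few enough channels that one 32-byte block per pixel is sufficient.

```cpp
#include <cassert>
#include <cstddef>

// Channel count after mode 0 padding: the channels of one pixel fill a 32-byte block.
// elemSize is the size in bytes of the output type U (1 for uint8_t/int8_t, 2 for half).
constexpr std::size_t PaddedChannelsMode0(std::size_t elemSize)
{
    return 32 / elemSize;
}

// Number of output elements for a width x height image after mode 0 channel padding.
std::size_t OutputElements(std::size_t width, std::size_t height, std::size_t elemSize)
{
    assert(elemSize == 1 || elemSize == 2);
    return width * height * PaddedChannelsMode0(elemSize);
}
```

For a 32 x 32 image with int8_t output, this gives 32 * 32 * 32 = 32768 elements.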

CSC: converts the RGB format to the YUV format, or vice versa.
$\text{[math]}$

$\text{[math]}$

Prototype

For input images in YUV400, RGB888, or XRGB8888 format:

template<typename T, typename U>
__aicore__ inline void SetAippFunctions(const GlobalTensor<T>& src0, AippInputFormat format, AippParams<U> config)

For input images in YUV420 semi-planar (YUV420SP) format:

template<typename T, typename U>
__aicore__ inline void SetAippFunctions(const GlobalTensor<T>& src0, const GlobalTensor<T>& src1, AippInputFormat format, AippParams<U> config)

Parameters

**Table 1** Template parameters
Parameter	Meaning
T	Input data type, which must match the data type specified by format (uint8_t for all formats listed in AippInputFormat).
U	Output data type. The same data type must be configured for the data movement API, for example, the data type of dstLocal in LoadImageToLocal. If the data type conversion function is disabled, the data type must be the same as the input data type. If the data type conversion function is enabled, the data type must be the same as the expected data type after conversion.

**Table 2** Parameters
Parameter	Input/Output	Meaning
src0	Input	Matrix of the source image in global memory. If the source image format is YUV420SP, this is the matrix of the Y plane in global memory.
src1	Input	Matrix of the UV plane in global memory when the source image format is YUV420SP. This parameter is not used for other formats.
format	Input	Format of the source image, of the AippInputFormat type. Enumerated values: AippInputFormat::YUV420SP_U8 (YUV420 semi-planar, uint8_t data), AippInputFormat::XRGB8888_U8 (XRGB8888, uint8_t data), AippInputFormat::RGB888_U8 (RGB888, uint8_t data), and AippInputFormat::YUV400_U8 (YUV400, uint8_t data).
config	Input	Image preprocessing parameters, of the AippParams type.

The AippInputFormat enumeration is defined as follows:

enum class AippInputFormat : uint8_t {
    YUV420SP_U8 = 0,
    XRGB8888_U8 = 1,
    RGB888_U8 = 4,
    YUV400_U8 = 9,
};

The AippParams structure is defined as follows:

template <typename T>
struct AippParams {
    AippPaddingParams<T> paddingParams;
    AippSwapParams swapParams;
    AippSingleLineParams singleLineParams;
    AippDataTypeConvParams dtcParams;
    AippChannelPaddingParams<T> cPaddingParams;
    AippColorSpaceConvParams cscParams;
};

The sub-structures of AippParams are defined as follows:

Data padding parameters. For details, see Table 3.

template <typename T>
struct AippPaddingParams {
    uint32_t paddingMode;
    T paddingValueCh0;
    T paddingValueCh1;
    T paddingValueCh2;
    T paddingValueCh3;
};

Channel swapping parameters. For details, see Table 4.

struct AippSwapParams {
    bool isSwapRB;
    bool isSwapUV;
    bool isSwapAX;
};

Single-line read parameter. For details, see Table 5.

struct AippSingleLineParams {
    bool isSingleLineCopy;
};

Data type conversion parameters. For details, see Table 6.

struct AippDataTypeConvParams {
    uint8_t dtcMeanCh0{ 0 };
    uint8_t dtcMeanCh1{ 0 };
    uint8_t dtcMeanCh2{ 0 };
    half dtcMinCh0{ 0 };
    half dtcMinCh1{ 0 };
    half dtcMinCh2{ 0 };
    half dtcVarCh0{ 1.0 };
    half dtcVarCh1{ 1.0 };
    half dtcVarCh2{ 1.0 };
    uint32_t dtcRoundMode{ 0 };
};

Channel padding parameters. For details, see Table 7.

template <typename T>
struct AippChannelPaddingParams {
    uint32_t cPaddingMode;
    T cPaddingValue;
};

CSC parameters. For details, see Table 8.

struct AippColorSpaceConvParams {
    bool isEnableCsc;
    int16_t cscMatrixR0C0;
    int16_t cscMatrixR0C1;
    int16_t cscMatrixR0C2;
    int16_t cscMatrixR1C0;
    int16_t cscMatrixR1C1;
    int16_t cscMatrixR1C2;
    int16_t cscMatrixR2C0;
    int16_t cscMatrixR2C1;
    int16_t cscMatrixR2C2;
    uint8_t cscBiasIn0;
    uint8_t cscBiasIn1;
    uint8_t cscBiasIn2;
    uint8_t cscBiasOut0;
    uint8_t cscBiasOut1;
    uint8_t cscBiasOut2;
};

**Table 3** Parameters in the AippPaddingParams structure
Parameter	Input/Output	Meaning
paddingMode	Input	Padding mode. The value range is [0, 3], and the default value is 0. 0: constant padding mode, which supports only left and right padding. 1: row/column padding mode. 2: block padding mode. 3: mirrored block padding mode.
paddingValueCh0	Input	Padding data of channel 0 in the padding area, which is valid only in constant padding mode. The data type is T, and the default value is 0.
paddingValueCh1	Input	Padding data of channel 1 in the padding area, which is valid only in constant padding mode. The data type is T, and the default value is 0.
paddingValueCh2	Input	Padding data of channel 2 in the padding area, which is valid only in constant padding mode. The data type is T, and the default value is 0.
paddingValueCh3	Input	Padding data of channel 3 in the padding area, which is valid only in constant padding mode. The data type is T, and the default value is 0.

**Table 4** Parameters in the AippSwapParams structure
Parameter	Input/Output	Meaning
isSwapRB	Input	Whether to swap the R and B channels for the RGB888 and XRGB8888 formats. The default value is false.
isSwapUV	Input	Whether to swap the U and V channels for the YUV420SP format. The default value is false.
isSwapAX	Input	Whether to move the X channel backward for the XRGB8888 format, that is, XRGB -> RGBX. The default value is false.

**Table 5** Parameters in the AippSingleLineParams structure
Parameter	Input/Output	Meaning
isSingleLineCopy	Input	Whether to enable the single-line read mode. After this function is enabled, only one line is read from the source image. The default value is false.

**Table 6** Parameters in the AippDataTypeConvParams structure
Parameter	Input/Output	Meaning
dtcMeanCh0	Input	Mean value for channel 0 in the formula. The data type is uint8_t, and the default value is 0.
dtcMeanCh1	Input	Mean value for channel 1 in the formula. The data type is uint8_t, and the default value is 0.
dtcMeanCh2	Input	Mean value for channel 2 in the formula. The data type is uint8_t, and the default value is 0.
dtcMinCh0	Input	Min value for channel 0 in the formula. The data type is half, and the default value is 0. This parameter cannot be set for the Atlas 200I/500 A2 inference products.
dtcMinCh1	Input	Min value for channel 1 in the formula. The data type is half, and the default value is 0. This parameter cannot be set for the Atlas 200I/500 A2 inference products.
dtcMinCh2	Input	Min value for channel 2 in the formula. The data type is half, and the default value is 0. This parameter cannot be set for the Atlas 200I/500 A2 inference products.
dtcVarCh0	Input	Var value for channel 0 in the formula. The data type is half, and the default value is 1.0.
dtcVarCh1	Input	Var value for channel 1 in the formula. The data type is half, and the default value is 1.0.
dtcVarCh2	Input	Var value for channel 2 in the formula. The data type is half, and the default value is 1.0.
dtcRoundMode	Input	Rounding mode used during data type conversion. The data type is uint32_t, and the default value is 0. 0: rounds to the nearest integer, with halfway cases rounded away from zero (round in C). 1: rounds to the nearest integer, with halfway cases rounded to the nearest even integer (rint in C). Only the Atlas 200I/500 A2 inference products support this parameter.
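
The two rounding modes differ only in how they treat values exactly halfway between two integers. The host-side C++ sketch below shows this using the standard library functions that the table names. The wrapper `RoundByMode` is hypothetical, and the rint behavior shown assumes the default floating-point rounding mode (round to nearest).

```cpp
#include <cassert>
#include <cmath>

// Mode 0 is described as C round: halfway cases round away from zero.
// Mode 1 is described as C rint: halfway cases round to even under the
// default floating-point rounding mode (FE_TONEAREST).
double RoundByMode(double value, unsigned mode)
{
    assert(mode == 0 || mode == 1);
    return (mode == 0) ? std::round(value) : std::rint(value);
}
```

For example, 2.5 becomes 3 in mode 0 and 2 in mode 1, while 3.5 becomes 4 in both modes.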

**Table 7** Parameters in the AippChannelPaddingParams structure
Parameter	Input/Output	Meaning
cPaddingMode	Input	Channel padding mode. The value range is [0, 1], and the default value is 0. 0: pads the channels to 32 bytes. That is, when the output data type U is uint8_t or int8_t, channels are padded to 32. When U is half, channels are padded to 16. 1: pads the channels to 4 bytes.
cPaddingValue	Input	Channel padding value. The data type is T, and the default value is 0.

**Table 8** Parameters in the AippColorSpaceConvParams structure
Parameter	Input/Output	Meaning
isEnableCsc	Input	Whether to enable the CSC function. The default value is false.
cscMatrixR0C0	Input	CSC matrix cscMatrix[0][0].
cscMatrixR0C1	Input	CSC matrix cscMatrix[0][1].
cscMatrixR0C2	Input	CSC matrix cscMatrix[0][2].
cscMatrixR1C0	Input	CSC matrix cscMatrix[1][0].
cscMatrixR1C1	Input	CSC matrix cscMatrix[1][1].
cscMatrixR1C2	Input	CSC matrix cscMatrix[1][2].
cscMatrixR2C0	Input	CSC matrix cscMatrix[2][0].
cscMatrixR2C1	Input	CSC matrix cscMatrix[2][1].
cscMatrixR2C2	Input	CSC matrix cscMatrix[2][2].
cscBiasIn0	Input	RGB-to-YUV conversion bias cscBiasIn[0]. This parameter is invalid during YUV-to-RGB conversion.
cscBiasIn1	Input	RGB-to-YUV conversion bias cscBiasIn[1]. This parameter is invalid during YUV-to-RGB conversion.
cscBiasIn2	Input	RGB-to-YUV conversion bias cscBiasIn[2]. This parameter is invalid during YUV-to-RGB conversion.
cscBiasOut0	Input	YUV-to-RGB conversion bias cscBiasOut[0]. This parameter is invalid during RGB-to-YUV conversion.
cscBiasOut1	Input	YUV-to-RGB conversion bias cscBiasOut[1]. This parameter is invalid during RGB-to-YUV conversion.
cscBiasOut2	Input	YUV-to-RGB conversion bias cscBiasOut[2]. This parameter is invalid during RGB-to-YUV conversion.

Restrictions

The address alignment requirements of src0 and src1 in global memory are as follows.

Image Format	src0	src1
YUV420SP	Must be 2-byte aligned.	Must be 2-byte aligned.
XRGB8888	Must be 4-byte aligned.	-
RGB888	No alignment requirement.	-
YUV400	No alignment requirement.	-
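
The alignment table can be expressed as a small host-side check, for example to validate buffer addresses before launching a kernel. This is an illustrative sketch: the `Format` enumeration is a local copy of the AippInputFormat values listed in this document, and the helper `AddressesAligned` is hypothetical, not part of the Ascend C API.

```cpp
#include <cstdint>

// Local copy of the AippInputFormat enumerator values listed in this document.
enum class Format : uint8_t {
    YUV420SP_U8 = 0,
    XRGB8888_U8 = 1,
    RGB888_U8 = 4,
    YUV400_U8 = 9,
};

// Hypothetical helper that applies the alignment table to src0 and src1.
// src1Addr is only checked for YUV420SP, the only format that uses src1.
bool AddressesAligned(Format format, std::uintptr_t src0Addr, std::uintptr_t src1Addr = 0)
{
    switch (format) {
        case Format::YUV420SP_U8:
            return src0Addr % 2 == 0 && src1Addr % 2 == 0;
        case Format::XRGB8888_U8:
            return src0Addr % 4 == 0;
        case Format::RGB888_U8:
        case Format::YUV400_U8:
            return true;  // no alignment requirement
    }
    return false;  // unknown format value
}
```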

For data in XRGB input format, the chip discards the data of the fourth channel by default and outputs data in RGB format. Therefore, if X is in channel 0, the function of moving the X channel backward must be enabled to convert the input into RGBX. If X is in channel 3, this function must be disabled to output data in RGB format.

Returns

None

Example

This example targets the AI Core of the Atlas inference product. The example image format is YUV420SP.

#include "kernel_operator.h"

class KernelLoadImage {
public:
    __aicore__ inline KernelLoadImage()
    {
        // Size of the Y dimension of the image in YUV420SP format.
        gmSrc0Size = srcHorizSize * srcVertSize;
        // Size of the UV dimension of the image in YUV420SP format.
        gmSrc1Size = (srcHorizSize / 2) * (srcVertSize / 2) * 2;
        dstSize = dstHorizSize * dstVertSize * cSize;
    }
    __aicore__ inline void Init(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
    {
        fmGlobal.SetGlobalBuffer((__gm__ uint8_t *)fmGm);
        dstGlobal.SetGlobalBuffer((__gm__ int8_t *)dstGm);
        pipe.InitBuffer(inQueueA1, 1, (gmSrc0Size + gmSrc1Size) * sizeof(int8_t));
        pipe.InitBuffer(outQueueUB, 1, dstSize * sizeof(int8_t));
    }
    __aicore__ inline void Process()
    {
        CopyIn();
        CopyToUB();
        CopyOut();
    }
private:
    __aicore__ inline void CopyIn()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.AllocTensor<int8_t>();
        uint64_t fm_addr = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(fmGlobal.GetPhyAddr()));
        // aipp config
        AscendC::AippParams<int8_t> aippConfig;
        aippConfig.cPaddingParams.cPaddingMode = cPadMode;
        aippConfig.cPaddingParams.cPaddingValue = cPaddingValue;
        // fmGlobal is the entire input image. Set src1 to the start address of the UV dimension of the image.
        AscendC::SetAippFunctions(fmGlobal, fmGlobal[gmSrc0Size], inputFormat, aippConfig);
        AscendC::LoadImageToLocal(featureMapA1, { horizSize, vertSize, horizStartPos, vertStartPos, srcHorizSize, topPadSize, botPadSize, leftPadSize, rightPadSize });
        inQueueA1.EnQue(featureMapA1);
    }
    __aicore__ inline void CopyToUB()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.DeQue<int8_t>();
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.AllocTensor<int8_t>();
        AscendC::DataCopy(featureMapUB, featureMapA1, dstSize);
        event_t eventIdMTE1ToMTE3 = static_cast<event_t>(GetTPipePtr()->FetchEventID(AscendC::HardEvent::MTE1_MTE3));
        AscendC::SetFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        AscendC::WaitFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        outQueueUB.EnQue<int8_t>(featureMapUB);
        inQueueA1.FreeTensor(featureMapA1);
    }
    __aicore__ inline void CopyOut()
    {
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.DeQue<int8_t>();
        AscendC::DataCopy(dstGlobal, featureMapUB, dstSize);
        outQueueUB.FreeTensor(featureMapUB);
    }
private:
    AscendC::TPipe pipe;
    AscendC::TQue<AscendC::TPosition::A1, 1> inQueueA1;
    AscendC::TQue<AscendC::TPosition::VECOUT, 1> outQueueUB;

    AscendC::GlobalTensor<uint8_t> fmGlobal;
    AscendC::GlobalTensor<int8_t> dstGlobal;

    uint16_t horizSize = 32, vertSize = 32, horizStartPos = 0, vertStartPos = 0, srcHorizSize = 32, srcVertSize = 32, leftPadSize = 0, rightPadSize = 0;
    uint32_t dstHorizSize = 32, dstVertSize = 32, cSize = 32;
    uint8_t topPadSize = 0, botPadSize = 0;
    uint32_t gmSrc0Size = 0, gmSrc1Size = 0, dstSize = 0;
    AscendC::AippInputFormat inputFormat = AscendC::AippInputFormat::YUV420SP_U8;
    uint32_t cPadMode = 0;
    int8_t cPaddingValue = 0;
};

extern "C" __global__ __aicore__ void load_image_simple_kernel(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
{
    KernelLoadImage op;
    op.Init(fmGm, dstGm);
    op.Process();
}
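
The plane-size arithmetic used in the constructor of the example can be reproduced on the host. The C++ sketch below assumes, as the example does, that the Y plane and the interleaved UV plane are stored back to back in one buffer, so the Y plane size is also the offset of the UV plane. The struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct Yuv420spLayout {
    uint32_t ySize;   // bytes in the Y plane, also the offset of the UV plane
    uint32_t uvSize;  // bytes in the interleaved UV plane
};

// Plane sizes for a width x height YUV420SP image stored as one contiguous buffer.
Yuv420spLayout ComputeLayout(uint32_t width, uint32_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return {width * height, (width / 2) * (height / 2) * 2};
}
```

For the 32 x 32 source image in the example, the Y plane holds 1024 bytes and the UV plane holds 512 bytes, so src1 starts at element offset 1024 of the input buffer.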
