LoadImageToLocal

Supported Products

Product

Supported/Unsupported

Atlas A3 training products/Atlas A3 inference products

√

Atlas A2 training products/Atlas A2 inference products

√

Atlas 200I/500 A2 inference products

√

Atlas inference product's AI Core

√

Atlas inference product's Vector Core

x

Atlas training products

x

Function Usage

Transfers image data from global memory to local memory. During the transfer, the image can be preprocessed: flipping, resizing (clipping, cropping, scaling, and stretching), color space conversion (CSC), and data type conversion. The preprocessing parameters are configured through SetAippFunctions.

Prototype

template <typename T>
__aicore__ inline void LoadImageToLocal(const LocalTensor<T>& dst, const LoadImageToLocalParams& loadDataParams)

Parameters

Table 1 Parameters

Parameter

Input/Output

Meaning

dst

Output

Destination operand of the LocalTensor type.

The start address of LocalTensor must be 32-byte aligned.

Atlas A3 training products/Atlas A3 inference products: The supported data types are int8_t and half. The supported TPosition values are A1 and B1.

Atlas A2 training products/Atlas A2 inference products: The supported data types are int8_t and half. The supported TPosition values are A1 and B1.

Atlas 200I/500 A2 inference products: The supported data types are uint8_t, int8_t, and half. The supported TPosition values are A1 and B1.

Atlas inference product's AI Core: The supported data types are uint8_t, int8_t, and half. The supported TPosition values are A1 and B1.

loadDataParams

Input

Parameter structure of the LoadImageToLocalParams type that describes the image region to load and its padding.

For the structure definition, see ${INSTALL_DIR}/include/ascendc/basic_api/interface/kernel_struct_mm.h. Replace ${INSTALL_DIR} with the actual CANN component directory.

For a description of each field, see Table 2.

Table 2 Parameters in the LoadImageToLocalParams structure

Parameter

Input/Output

Meaning

horizSize

Input

Width of the region loaded from the source image, in pixels. Value range: horizSize ∈ [2, 4095].

vertSize

Input

Height of the region loaded from the source image, in pixels. Value range: vertSize ∈ [2, 4095].

horizStartPos

Input

Horizontal start position of the loaded region within the source image, in pixels. Value range: horizStartPos ∈ [0, 4095]. The default value is 0.

Note: When the input image format is YUV420SP, XRGB8888, RGB888, or YUV400, the value of this parameter must be an even number.

vertStartPos

Input

Vertical start position of the loaded region within the source image, in pixels. Value range: vertStartPos ∈ [0, 4095]. The default value is 0.

Note: If the input image is in YUV420SP format, the value of this parameter must be an even number.

srcHorizSize

Input

Width of the source image, in pixels. Value range: srcHorizSize ∈ [2, 4095].

Note: If the input image is in YUV420SP format, the value of this parameter must be an even number.

topPadSize

Input

Number of pixels padded at the top of the destination image. Value range: topPadSize ∈ [0, 32]. The default value is 0. To pad data, first call SetAippFunctions with AippPaddingParams to set the padding value, then set topPadSize, botPadSize, leftPadSize, and rightPadSize to define the padding size on each side.

botPadSize

Input

Number of pixels padded at the bottom of the destination image. Value range: botPadSize ∈ [0, 32]. The default value is 0.

leftPadSize

Input

Number of pixels padded on the left of the destination image. Value range: leftPadSize ∈ [0, 32]. The default value is 0.

rightPadSize

Input

Number of pixels padded on the right of the destination image. Value range: rightPadSize ∈ [0, 32]. The default value is 0.

sid

Input

Reserved for future use. Keep the default value.
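The value ranges and even-number notes in Table 2 can be checked on the host before a kernel is launched. The following is a minimal sketch in standard C++, assuming YUV420SP input. ImageLoadArgs, CheckRange, CheckEven, and CheckYuv420spArgs are hypothetical names introduced here for illustration. They are not part of the Ascend C API, and ImageLoadArgs is not the real LoadImageToLocalParams structure.

```cpp
#include <string>
#include <vector>

// Hypothetical host-side mirror of the Table 2 fields. This is not the
// real LoadImageToLocalParams from kernel_struct_mm.h; plain int fields
// are used so that out-of-range values can be represented and reported.
struct ImageLoadArgs {
    int horizSize;
    int vertSize;
    int horizStartPos;
    int vertStartPos;
    int srcHorizSize;
    int topPadSize;
    int botPadSize;
    int leftPadSize;
    int rightPadSize;
};

// Appends a message to errors when value is outside [lo, hi].
static void CheckRange(const char* name, int value, int lo, int hi,
                       std::vector<std::string>& errors)
{
    if (value < lo || value > hi) {
        errors.push_back(std::string(name) + " is out of range");
    }
}

// Appends a message to errors when value is odd.
static void CheckEven(const char* name, int value,
                      std::vector<std::string>& errors)
{
    if (value % 2 != 0) {
        errors.push_back(std::string(name) + " must be even for YUV420SP");
    }
}

// Checks the Table 2 value ranges plus the even-number notes that apply
// to YUV420SP input. Returns one message per violated rule.
std::vector<std::string> CheckYuv420spArgs(const ImageLoadArgs& a)
{
    std::vector<std::string> errors;
    CheckRange("horizSize", a.horizSize, 2, 4095, errors);
    CheckRange("vertSize", a.vertSize, 2, 4095, errors);
    CheckRange("horizStartPos", a.horizStartPos, 0, 4095, errors);
    CheckRange("vertStartPos", a.vertStartPos, 0, 4095, errors);
    CheckRange("srcHorizSize", a.srcHorizSize, 2, 4095, errors);
    CheckRange("topPadSize", a.topPadSize, 0, 32, errors);
    CheckRange("botPadSize", a.botPadSize, 0, 32, errors);
    CheckRange("leftPadSize", a.leftPadSize, 0, 32, errors);
    CheckRange("rightPadSize", a.rightPadSize, 0, 32, errors);
    CheckEven("horizStartPos", a.horizStartPos, errors);
    CheckEven("vertStartPos", a.vertStartPos, errors);
    CheckEven("srcHorizSize", a.srcHorizSize, errors);
    return errors;
}
```

An empty result means that every rule covered by the sketch passed. Other input formats only need the horizStartPos check where Table 2 requires it, so a real validator would take the image format as an extra argument.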

Restrictions

  • For details about the operand address alignment requirements, see General Address Alignment Restrictions.
  • The size of the loaded image plus the padding must not exceed the size of the destination storage space.
  • When SetAippFunctions sets the padding mode to block padding or mirrored block padding, the padded data is taken from the cropped image. Therefore, leftPadSize and rightPadSize must not exceed horizSize (the horizontal size of the cropped image), and topPadSize and botPadSize must not exceed vertSize (the vertical size of the cropped image).
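The size and padding restrictions above can be expressed as simple host-side checks. This is a sketch under stated assumptions, not the library's own validation: it assumes that the padded width is horizSize + leftPadSize + rightPadSize, that the padded height is vertSize + topPadSize + botPadSize, and that each pixel occupies a caller-supplied number of elements. PaddedElementCount, FitsInBuffer, and BlockPaddingAllowed are hypothetical helper names.

```cpp
#include <cstdint>

// Element count of the destination image after padding.
// Assumption: padding is added on each side of the cropped region, and
// each pixel occupies channelsPerPixel elements in the destination.
uint64_t PaddedElementCount(uint32_t horizSize, uint32_t vertSize,
                            uint32_t topPad, uint32_t botPad,
                            uint32_t leftPad, uint32_t rightPad,
                            uint32_t channelsPerPixel)
{
    uint64_t width = static_cast<uint64_t>(horizSize) + leftPad + rightPad;
    uint64_t height = static_cast<uint64_t>(vertSize) + topPad + botPad;
    return width * height * channelsPerPixel;
}

// True when the padded image fits in a destination buffer of bufferBytes.
bool FitsInBuffer(uint64_t elementCount, uint32_t bytesPerElement,
                  uint64_t bufferBytes)
{
    return elementCount * bytesPerElement <= bufferBytes;
}

// True when the pad sizes satisfy the block padding and mirrored block
// padding rule: no pad may exceed the cropped size along its own axis.
bool BlockPaddingAllowed(uint32_t horizSize, uint32_t vertSize,
                         uint32_t topPad, uint32_t botPad,
                         uint32_t leftPad, uint32_t rightPad)
{
    return leftPad <= horizSize && rightPad <= horizSize &&
           topPad <= vertSize && botPad <= vertSize;
}
```

For a 32 x 32 region with no padding and 32 elements per pixel, PaddedElementCount returns 32768, which matches the product of 32, 32, and 32.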

Returns

None

Example

This example applies to the AI Core of Atlas inference products. The input image format is YUV420SP.
#include "kernel_operator.h"

class KernelLoadImage {
public:
    __aicore__ inline KernelLoadImage()
    {
        // Size of the Y plane of the YUV420SP image (one sample per pixel).
        gmSrc0Size = srcHorizSize * srcVertSize;
        // Size of the interleaved UV plane (one U and one V sample per 2 x 2 pixel block).
        gmSrc1Size = (srcHorizSize / 2) * (srcVertSize / 2) * 2;
        dstSize = dstHorizSize * dstVertSize * cSize;
    }
    __aicore__ inline void Init(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
    {
        fmGlobal.SetGlobalBuffer((__gm__ uint8_t *)fmGm);
        dstGlobal.SetGlobalBuffer((__gm__ int8_t *)dstGm);
        pipe.InitBuffer(inQueueA1, 1, (gmSrc0Size + gmSrc1Size) * sizeof(int8_t));
        pipe.InitBuffer(outQueueUB, 1, dstSize * sizeof(int8_t));
    }
    __aicore__ inline void Process()
    {
        CopyIn();
        CopyToUB();
        CopyOut();
    }

private:
    __aicore__ inline void CopyIn()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.AllocTensor<int8_t>();
        // AIPP configuration: padding mode and padding value.
        AscendC::AippParams<int8_t> aippConfig;
        aippConfig.cPaddingParams.cPaddingMode = cPadMode;
        aippConfig.cPaddingParams.cPaddingValue = cPaddingValue;
        // fmGlobal holds the entire input image. src0 is the start of the Y plane;
        // src1 is the start of the UV plane, which follows the Y plane.
        AscendC::SetAippFunctions(fmGlobal, fmGlobal[gmSrc0Size], inputFormat, aippConfig);
        AscendC::LoadImageToLocal(featureMapA1, { horizSize, vertSize, horizStartPos, vertStartPos, srcHorizSize, topPadSize, botPadSize, leftPadSize, rightPadSize });
        inQueueA1.EnQue(featureMapA1);
    }
    __aicore__ inline void CopyToUB()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.DeQue<int8_t>();
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.AllocTensor<int8_t>();
        AscendC::DataCopy(featureMapUB, featureMapA1, dstSize);
        event_t eventIdMTE1ToMTE3 = static_cast<event_t>(AscendC::GetTPipePtr()->FetchEventID(AscendC::HardEvent::MTE1_MTE3));
        AscendC::SetFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        AscendC::WaitFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        outQueueUB.EnQue<int8_t>(featureMapUB);
        inQueueA1.FreeTensor(featureMapA1);
    }
    __aicore__ inline void CopyOut()
    {
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.DeQue<int8_t>();
        AscendC::DataCopy(dstGlobal, featureMapUB, dstSize);
        outQueueUB.FreeTensor(featureMapUB);
    }

private:
    AscendC::TPipe pipe;
    AscendC::TQue<AscendC::TPosition::A1, 1> inQueueA1;
    AscendC::TQue<AscendC::TPosition::VECOUT, 1> outQueueUB;

    AscendC::GlobalTensor<uint8_t> fmGlobal;
    AscendC::GlobalTensor<int8_t> dstGlobal;

    uint16_t horizSize = 32, vertSize = 32, horizStartPos = 0, vertStartPos = 0, srcHorizSize = 32, srcVertSize = 32, leftPadSize = 0, rightPadSize = 0;
    uint32_t dstHorizSize = 32, dstVertSize = 32, cSize = 32;
    uint8_t topPadSize = 0, botPadSize = 0;
    uint32_t gmSrc0Size = 0, gmSrc1Size = 0, dstSize = 0;
    AscendC::AippInputFormat inputFormat = AscendC::AippInputFormat::YUV420SP_U8;
    uint32_t cPadMode = 0;
    int8_t cPaddingValue = 0;
};

extern "C" __global__ __aicore__ void load_image_simple_kernel(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
{
    KernelLoadImage op;
    op.Init(fmGm, dstGm);
    op.Process();
}
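The plane sizes computed in the constructor of the example can be reproduced on the host to size the input buffer and to locate the UV plane. The sketch below is plain C++ and only repeats that arithmetic. It assumes 8-bit samples, even width and height, and a UV plane stored immediately after the Y plane, as in the example. Yuv420spLayout and ComputeYuv420spLayout are hypothetical names, not Ascend C APIs.

```cpp
#include <cstdint>

// Byte layout of a YUV420SP image held in one contiguous buffer.
struct Yuv420spLayout {
    uint32_t yBytes;    // size of the Y plane
    uint32_t uvBytes;   // size of the interleaved UV plane
    uint32_t uvOffset;  // byte offset of the UV plane from the image start
};

// Mirrors the gmSrc0Size and gmSrc1Size arithmetic of the example.
// Assumes 8-bit samples and even srcHorizSize and srcVertSize.
Yuv420spLayout ComputeYuv420spLayout(uint32_t srcHorizSize, uint32_t srcVertSize)
{
    Yuv420spLayout layout;
    // One Y sample per pixel.
    layout.yBytes = srcHorizSize * srcVertSize;
    // One U and one V sample per 2 x 2 pixel block.
    layout.uvBytes = (srcHorizSize / 2) * (srcVertSize / 2) * 2;
    // The UV plane starts right after the Y plane.
    layout.uvOffset = layout.yBytes;
    return layout;
}
```

For the 32 x 32 source image used in the example, this gives a 1024-byte Y plane and a 512-byte UV plane, so the whole image occupies 1536 bytes and the UV plane starts at offset 1024, which is the index passed as fmGlobal[gmSrc0Size].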