LoadImageToLocal
Supported Products
| Product | Supported/Unsupported |
|---|---|
|  | √ |
|  | √ |
|  | √ |
|  | √ |
|  | x |
|  | x |
Function Usage
Transfers image data from global memory to local memory. During the transfer, the image can be preprocessed: flipping, size changes (cropping, resizing, and stretching), color space conversion (CSC), and type conversion. The preprocessing parameters are configured through SetAippFunctions.
Prototype
```cpp
template <typename T>
__aicore__ inline void LoadImageToLocal(const LocalTensor<T>& dst, const LoadImageToLocalParams& loadDataParams)
```
Parameters
| Parameter | Input/Output | Meaning |
|---|---|---|
| dst | Output | Destination operand, of the LocalTensor type. The start address of the LocalTensor must be 32-byte aligned. |
| loadDataParams | Input | Parameter structure of the LoadImageToLocalParams type. The structure is defined in ${INSTALL_DIR}/include/ascendc/basic_api/interface/kernel_struct_mm.h; replace ${INSTALL_DIR} with the actual CANN component directory. For a description of each field, see Table 2. |
Table 2 Parameters in the LoadImageToLocalParams structure

| Parameter | Input/Output | Meaning |
|---|---|---|
| horizSize | Input | Width of the region loaded from the source image, in pixels. Value range: horizSize ∈ [2, 4095]. |
| vertSize | Input | Height of the region loaded from the source image, in pixels. Value range: vertSize ∈ [2, 4095]. |
| horizStartPos | Input | Horizontal start position of the loaded region on the source image, in pixels. Value range: horizStartPos ∈ [0, 4095]. The default value is 0. Note: When the input image format is YUV420SP, XRGB8888, RGB888, or YUV400, the value must be an even number. |
| vertStartPos | Input | Vertical start position of the loaded region on the source image, in pixels. Value range: vertStartPos ∈ [0, 4095]. The default value is 0. Note: When the input image format is YUV420SP, the value must be an even number. |
| srcHorizSize | Input | Width of the source image, in pixels. Value range: srcHorizSize ∈ [2, 4095]. Note: When the input image format is YUV420SP, the value must be an even number. |
| topPadSize | Input | Number of pixels padded at the top of the destination image. Value range: topPadSize ∈ [0, 32]. The default value is 0. To pad data, first call SetAippFunctions with AippPaddingParams to configure the padding value, then use topPadSize, botPadSize, leftPadSize, and rightPadSize to set the padding sizes. |
| botPadSize | Input | Number of pixels padded at the bottom of the destination image. Value range: botPadSize ∈ [0, 32]. The default value is 0. |
| leftPadSize | Input | Number of pixels padded on the left of the destination image. Value range: leftPadSize ∈ [0, 32]. The default value is 0. |
| rightPadSize | Input | Number of pixels padded on the right of the destination image. Value range: rightPadSize ∈ [0, 32]. The default value is 0. |
| sid | Input | Reserved for future use. Use the default value. |
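The value ranges and even-number rules in the parameter descriptions above can be checked on the host before a kernel is launched. The following is a minimal sketch in standard C++, not part of the Ascend C API: `ImageLoadArgs`, `PixelFormat`, `InRange`, `IsEven`, and `IsValid` are hypothetical names introduced for illustration, and the field widths are assumed from the member declarations in the example on this page.

```cpp
#include <cstdint>

// Hypothetical host-side mirror of the fields described above. The real
// structure is LoadImageToLocalParams in kernel_struct_mm.h; field widths
// here are an assumption taken from the example on this page.
struct ImageLoadArgs {
    uint16_t horizSize;
    uint16_t vertSize;
    uint16_t horizStartPos;
    uint16_t vertStartPos;
    uint16_t srcHorizSize;
    uint8_t topPadSize;
    uint8_t botPadSize;
    uint16_t leftPadSize;
    uint16_t rightPadSize;
};

// Hypothetical format tag covering only the formats named in the notes.
enum class PixelFormat { YUV420SP, XRGB8888, RGB888, YUV400, Other };

inline bool InRange(uint32_t v, uint32_t lo, uint32_t hi) { return v >= lo && v <= hi; }
inline bool IsEven(uint32_t v) { return v % 2 == 0; }

// Returns true when every field satisfies the documented range and parity rules.
inline bool IsValid(const ImageLoadArgs& a, PixelFormat fmt)
{
    const bool rangesOk =
        InRange(a.horizSize, 2, 4095) && InRange(a.vertSize, 2, 4095) &&
        InRange(a.horizStartPos, 0, 4095) && InRange(a.vertStartPos, 0, 4095) &&
        InRange(a.srcHorizSize, 2, 4095) &&
        InRange(a.topPadSize, 0, 32) && InRange(a.botPadSize, 0, 32) &&
        InRange(a.leftPadSize, 0, 32) && InRange(a.rightPadSize, 0, 32);
    if (!rangesOk) {
        return false;
    }
    // horizStartPos must be even for YUV420SP, XRGB8888, RGB888, and YUV400.
    const bool horizStartMustBeEven =
        fmt == PixelFormat::YUV420SP || fmt == PixelFormat::XRGB8888 ||
        fmt == PixelFormat::RGB888 || fmt == PixelFormat::YUV400;
    if (horizStartMustBeEven && !IsEven(a.horizStartPos)) {
        return false;
    }
    // vertStartPos and srcHorizSize must be even for YUV420SP only.
    if (fmt == PixelFormat::YUV420SP &&
        (!IsEven(a.vertStartPos) || !IsEven(a.srcHorizSize))) {
        return false;
    }
    return true;
}
```

For instance, `IsValid(ImageLoadArgs{32, 32, 0, 0, 32, 0, 0, 0, 0}, PixelFormat::YUV420SP)` accepts the values used in the example on this page, while an odd `horizStartPos` is rejected for RGB888 input.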
Restrictions
- For details about the operand address alignment requirements, see General Address Alignment Restrictions.
- The loaded image together with its padding must fit in the destination: the size of the loaded image plus the padding size must be less than or equal to the size of the destination storage space.
- When the padding mode is set to block padding or mirrored block padding by using SetAippFunctions, the left and right padding sizes (leftPadSize and rightPadSize) must be less than or equal to the horizontal size (horizSize) of the cropped image, and the top and bottom padding sizes (topPadSize and botPadSize) must be less than or equal to the vertical size (vertSize) of the cropped image, because the padded data comes from the cropped image.
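The two size restrictions above can be expressed as simple host-side checks. This is a sketch under stated assumptions rather than the library's own validation: `PadSizes`, `PaddedWidth`, `PaddedHeight`, `FitsInStorage`, and `PadsWithinCrop` are hypothetical names, and the `bytesPerPixel` model of the destination footprint is an assumption, since this page does not specify how the destination layout maps to bytes.

```cpp
#include <cstdint>

// Hypothetical container for the four padding sizes, in pixels.
struct PadSizes {
    uint32_t top;
    uint32_t bot;
    uint32_t left;
    uint32_t right;
};

// Width of the destination image after left and right padding, in pixels.
inline uint32_t PaddedWidth(uint32_t horizSize, const PadSizes& p)
{
    return horizSize + p.left + p.right;
}

// Height of the destination image after top and bottom padding, in pixels.
inline uint32_t PaddedHeight(uint32_t vertSize, const PadSizes& p)
{
    return vertSize + p.top + p.bot;
}

// Checks that the padded image fits in the destination storage.
// Assumption: the footprint is width * height * bytesPerPixel; the actual
// on-chip layout may differ.
inline bool FitsInStorage(uint32_t horizSize, uint32_t vertSize, const PadSizes& p,
                          uint32_t bytesPerPixel, uint64_t storageBytes)
{
    const uint64_t needed = static_cast<uint64_t>(PaddedWidth(horizSize, p)) *
                            PaddedHeight(vertSize, p) * bytesPerPixel;
    return needed <= storageBytes;
}

// Checks the block / mirrored block padding rule: each pad must not exceed
// the matching dimension of the cropped image that supplies the padded data.
inline bool PadsWithinCrop(uint32_t horizSize, uint32_t vertSize, const PadSizes& p)
{
    return p.left <= horizSize && p.right <= horizSize &&
           p.top <= vertSize && p.bot <= vertSize;
}
```

With a 32 x 32 crop and `PadSizes{2, 2, 4, 4}`, the padded image is 40 x 36 pixels. `PadsWithinCrop` only needs to be applied when block padding or mirrored block padding is selected.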
Returns
None
Example
```cpp
#include "kernel_operator.h"

class KernelLoadImage {
public:
    __aicore__ inline KernelLoadImage()
    {
        // Size of the Y dimension of the image in YUV420SP format.
        gmSrc0Size = srcHorizSize * srcVertSize;
        // Size of the UV dimension of the image in YUV420SP format.
        gmSrc1Size = (srcHorizSize / 2) * (srcVertSize / 2) * 2;
        dstSize = dstHorizSize * dstVertSize * cSize;
    }
    __aicore__ inline void Init(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
    {
        fmGlobal.SetGlobalBuffer((__gm__ uint8_t *)fmGm);
        dstGlobal.SetGlobalBuffer((__gm__ int8_t *)dstGm);
        pipe.InitBuffer(inQueueA1, 1, (gmSrc0Size + gmSrc1Size) * sizeof(int8_t));
        pipe.InitBuffer(outQueueUB, 1, dstSize * sizeof(int8_t));
    }
    __aicore__ inline void Process()
    {
        CopyIn();
        CopyToUB();
        CopyOut();
    }

private:
    __aicore__ inline void CopyIn()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.AllocTensor<int8_t>();
        uint64_t fm_addr = static_cast<uint64_t>(reinterpret_cast<uintptr_t>(fmGlobal.GetPhyAddr()));
        // aipp config
        AscendC::AippParams<int8_t> aippConfig;
        aippConfig.cPaddingParams.cPaddingMode = cPadMode;
        aippConfig.cPaddingParams.cPaddingValue = cPaddingValue;
        // fmGlobal is the entire input image. Set src1 to the start address of the UV dimension of the image.
        AscendC::SetAippFunctions(fmGlobal, fmGlobal[gmSrc0Size], inputFormat, aippConfig);
        AscendC::LoadImageToLocal(featureMapA1, { horizSize, vertSize, horizStartPos, vertStartPos, srcHorizSize,
                                                  topPadSize, botPadSize, leftPadSize, rightPadSize });
        inQueueA1.EnQue(featureMapA1);
    }
    __aicore__ inline void CopyToUB()
    {
        AscendC::LocalTensor<int8_t> featureMapA1 = inQueueA1.DeQue<int8_t>();
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.AllocTensor<int8_t>();
        AscendC::DataCopy(featureMapUB, featureMapA1, dstSize);
        event_t eventIdMTE1ToMTE3 = static_cast<event_t>(GetTPipePtr()->FetchEventID(AscendC::HardEvent::MTE1_MTE3));
        AscendC::SetFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        AscendC::WaitFlag<AscendC::HardEvent::MTE1_MTE3>(eventIdMTE1ToMTE3);
        outQueueUB.EnQue<int8_t>(featureMapUB);
        inQueueA1.FreeTensor(featureMapA1);
    }
    __aicore__ inline void CopyOut()
    {
        AscendC::LocalTensor<int8_t> featureMapUB = outQueueUB.DeQue<int8_t>();
        AscendC::DataCopy(dstGlobal, featureMapUB, dstSize);
        outQueueUB.FreeTensor(featureMapUB);
    }

private:
    AscendC::TPipe pipe;
    AscendC::TQue<AscendC::TPosition::A1, 1> inQueueA1;
    AscendC::TQue<AscendC::TPosition::VECOUT, 1> outQueueUB;
    AscendC::GlobalTensor<uint8_t> fmGlobal;
    AscendC::GlobalTensor<int8_t> dstGlobal;
    uint16_t horizSize = 32, vertSize = 32, horizStartPos = 0, vertStartPos = 0, srcHorizSize = 32, srcVertSize = 32,
             leftPadSize = 0, rightPadSize = 0;
    uint32_t dstHorizSize = 32, dstVertSize = 32, cSize = 32;
    uint8_t topPadSize = 0, botPadSize = 0;
    uint32_t gmSrc0Size = 0, gmSrc1Size = 0, dstSize = 0;
    AscendC::AippInputFormat inputFormat = AscendC::AippInputFormat::YUV420SP_U8;
    uint32_t cPadMode = 0;
    int8_t cPaddingValue = 0;
};

extern "C" __global__ __aicore__ void load_image_simple_kernel(__gm__ uint8_t *fmGm, __gm__ uint8_t *dstGm)
{
    KernelLoadImage op;
    op.Init(fmGm, dstGm);
    op.Process();
}
```
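The example sizes its input buffer from the YUV420SP plane layout and passes `fmGlobal[gmSrc0Size]` as the start of the UV data. The arithmetic can be reproduced on the host with the sketch below, which is plain C++ and independent of the Ascend C headers. `Yuv420spLayout`, `ComputeYuv420spLayout`, and `TotalBytes` are hypothetical names, and the sketch assumes 8-bit samples, even image dimensions, and a UV plane stored immediately after the Y plane, as in the example.

```cpp
#include <cstdint>

// Hypothetical description of a contiguous YUV420SP buffer with 8-bit samples.
struct Yuv420spLayout {
    uint32_t yBytes;    // size of the Y (luma) plane
    uint32_t uvBytes;   // size of the interleaved UV (chroma) plane
    uint32_t uvOffset;  // offset of the UV plane from the start of the buffer
};

// Mirrors the size computation in the example's constructor.
// Assumption: width and height are even.
inline Yuv420spLayout ComputeYuv420spLayout(uint32_t width, uint32_t height)
{
    Yuv420spLayout layout;
    // One luma sample per pixel.
    layout.yBytes = width * height;
    // One U and one V sample for every 2 x 2 block of pixels.
    layout.uvBytes = (width / 2) * (height / 2) * 2;
    // The UV plane starts right after the Y plane.
    layout.uvOffset = layout.yBytes;
    return layout;
}

// Total number of bytes occupied by both planes.
inline uint32_t TotalBytes(const Yuv420spLayout& layout)
{
    return layout.yBytes + layout.uvBytes;
}
```

For the 32 x 32 source image used in the example, this gives a 1024-byte Y plane, a 512-byte UV plane starting at offset 1024, and 1536 bytes in total, which matches `gmSrc0Size + gmSrc1Size`.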