BilinearInterpolation (ISASI)
Function Usage
The computation is organized into horizontal iterations and vertical iterations. In each horizontal iteration, eight offset values are read in sequence from src0Offset. Each offset value points to the start address of a data block in src0. If repeatMode is set to false, one value is read from src1 and multiplied by every value in the eight src0 data blocks. If repeatMode is set to true, eight values are read from src1 and multiplied, in sequence, by the values in the eight src0 data blocks (one src1 value per data block). The result of the current horizontal iteration is accumulated, data block by data block, onto the previous dst result and stored at the destination address. The dst address remains unchanged across the horizontal iterations of the same vertical iteration.

Each vertical iteration consists of hRepeat horizontal iterations. The dst start address of a vertical iteration is the dst start address of the previous vertical iteration plus vROffset. The dst space occupied by a vertical iteration is the eight data blocks starting at that address.
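The iteration scheme described above can be sketched as a scalar reference model in plain C++. This is only an illustration, not the Ascend C implementation: the function name BilinearInterpolationRef and the elemBytes parameter are hypothetical, offsets are assumed to be byte offsets (as the sample data in Examples suggests), dstBlkStride is assumed to be 1, the mask is assumed to enable every element, and dst is assumed to start at zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockBytes = 32;      // size of one data block
constexpr std::size_t kBlocksPerRepeat = 8;  // data blocks per horizontal iteration

// Hypothetical host-side reference model; not part of the Ascend C API.
// Assumes byte offsets, dstBlkStride == 1, a full mask, and vRepeat >= 1.
std::vector<float> BilinearInterpolationRef(
    const std::vector<float>& src0, const std::vector<std::uint32_t>& src0Offset,
    const std::vector<float>& src1, std::size_t elemBytes, unsigned hRepeat,
    bool repeatMode, unsigned vROffset, unsigned vRepeat)
{
    const std::size_t blockElems = kBlockBytes / elemBytes;
    std::vector<float> dst((vRepeat - 1) * static_cast<std::size_t>(vROffset) +
                               kBlocksPerRepeat * blockElems, 0.0f);
    std::size_t offsetIdx = 0;  // next offset to read from src0Offset
    std::size_t src1Idx = 0;    // next value to read from src1
    for (unsigned v = 0; v < vRepeat; ++v) {
        // dst start moves by vROffset elements per vertical iteration.
        const std::size_t dstBase = static_cast<std::size_t>(v) * vROffset;
        for (unsigned h = 0; h < hRepeat; ++h) {
            for (std::size_t b = 0; b < kBlocksPerRepeat; ++b) {
                const std::size_t srcBase = src0Offset[offsetIdx++] / elemBytes;
                // repeatMode false: one src1 value for all eight blocks;
                // repeatMode true: one src1 value per block.
                const float w = src1[repeatMode ? src1Idx + b : src1Idx];
                for (std::size_t e = 0; e < blockElems; ++e) {
                    // Accumulate onto the result of earlier horizontal iterations.
                    dst[dstBase + b * blockElems + e] += src0[srcBase + e] * w;
                }
            }
            src1Idx += repeatMode ? kBlocksPerRepeat : 1;
        }
    }
    return dst;
}
```

Under these assumptions, one call consumes hRepeat * vRepeat * 8 offsets, and either hRepeat * vRepeat (repeatMode false) or hRepeat * vRepeat * 8 (repeatMode true) values from src1.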

Prototype
- Bitwise mask mode:
template <typename T> __aicore__ inline void BilinearInterpolation(const LocalTensor<T> &dstLocal, const LocalTensor<T> &src0Local, const LocalTensor<uint32_t> &src0OffsetLocal, const LocalTensor<T> &src1Local, uint64_t mask[], uint8_t hRepeat, bool repeatMode, uint16_t dstBlkStride, uint16_t vROffset, uint8_t vRepeat, const LocalTensor<uint8_t> &sharedTmpBuffer)
- Contiguous mask mode:
template <typename T> __aicore__ inline void BilinearInterpolation(const LocalTensor<T> &dstLocal, const LocalTensor<T> &src0Local, const LocalTensor<uint32_t> &src0OffsetLocal, const LocalTensor<T> &src1Local, uint64_t mask, uint8_t hRepeat, bool repeatMode, uint16_t dstBlkStride, uint16_t vROffset, uint8_t vRepeat, const LocalTensor<uint8_t> &sharedTmpBuffer)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| dstLocal | Output | Destination operand. The type is LocalTensor, and the supported TPosition is VECIN, VECCALC, or VECOUT. The start address of the LocalTensor must be 32-byte aligned. |
| src0Local and src1Local | Input | Source operands. The type is LocalTensor, and the supported TPosition is VECIN, VECCALC, or VECOUT. The start address of the LocalTensor must be 32-byte aligned. The source operands must have the same data type as the destination operand. |
| src0OffsetLocal | Input | Source operand that holds the offsets into src0Local. The type is LocalTensor<uint32_t>, and the supported TPosition is VECIN, VECCALC, or VECOUT. The start address of the LocalTensor must be 32-byte aligned. |
| mask | Input | Controls which elements take part in the computation in each iteration. It is passed as a single uint64_t value in contiguous mask mode and as a uint64_t array in bitwise mask mode. |
| hRepeat | Input | Number of horizontal iterations. The value range is [1, 255]. |
| repeatMode | Input | Repeat mode, of type bool. false: one value is read from src1 in each horizontal iteration and applied to all eight src0 data blocks. true: eight values are read from src1 in each horizontal iteration, one for each src0 data block. |
| dstBlkStride | Input | Address stride of the destination operand between data blocks within a single repeat, in units of 32 bytes. |
| vROffset | Input | Address offset of the destination operand between vertical repeats, in units of elements. The value range is [128, 65535]. |
| vRepeat | Input | Number of vertical iterations. The value range is [1, 255]. |
| sharedTmpBuffer | Input | Temporary space. The type is LocalTensor<uint8_t>. |
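To illustrate how the two mask forms relate, the sketch below expands a contiguous-mode element count into the two 64-bit words of a bitwise mask for a 16-bit data type such as half. It is a hedged illustration only: ContiguousToBitwiseMask is a hypothetical helper, not an Ascend C API, and it assumes that bit i of the 128-bit mask enables element i of an iteration and that the count does not exceed 128.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper, not an Ascend C API.
// Assumption: bit i of the 128-bit bitwise mask enables element i, so a
// contiguous count of n corresponds to the n lowest bits being set.
std::array<std::uint64_t, 2> ContiguousToBitwiseMask(unsigned count)
{
    // Returns a word with the n lowest bits set (n is clamped to 64).
    auto lowBits = [](unsigned n) -> std::uint64_t {
        return n >= 64 ? UINT64_MAX : ((std::uint64_t{1} << n) - 1);
    };
    const unsigned lo = count < 64 ? count : 64;      // bits in mask[0]
    const unsigned hi = count > 64 ? count - 64 : 0;  // bits in mask[1]
    return {lowBits(lo), lowBits(hi)};
}
```

With this reading, a contiguous mask of 128 and a bitwise mask of { UINT64_MAX, UINT64_MAX } select the same 128 half elements, which matches the two mask values used in Examples.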
Returns
None
Availability
Constraints
- The addresses of src0Local, src1Local, and src0OffsetLocal cannot overlap. In addition, the destination addresses of two vertical repeats cannot overlap.
- For details about the operand address alignment requirements, see General Address Alignment Restrictions.
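The rule that destination ranges of two vertical repeats must not overlap can be checked on the host before the call. The following is a minimal, conservative sketch under stated assumptions: VerticalRepeatsDoNotOverlap and elemBytes are hypothetical names, a data block is taken to be 32 bytes, a vertical iteration is taken to write eight blocks spaced dstBlkStride blocks apart (with dstBlkStride at least 1), and vROffset is counted in elements.

```cpp
#include <cstddef>

// Hypothetical pre-flight check, not an Ascend C API.
// Conservative: it treats the whole span from block 0 to block 7 as occupied,
// including any gaps introduced by dstBlkStride > 1.
bool VerticalRepeatsDoNotOverlap(std::size_t elemBytes, unsigned dstBlkStride,
                                 unsigned vROffset, unsigned vRepeat)
{
    const std::size_t blockElems = 32 / elemBytes;  // elements per data block
    // Elements from the first element of block 0 to the last element of block 7.
    const std::size_t span = (7u * static_cast<std::size_t>(dstBlkStride) + 1u) * blockElems;
    return vRepeat <= 1 || vROffset >= span;
}
```

For half data with dstBlkStride set to 1, the span is 128 elements, so the minimum vROffset of 128 places consecutive vertical iterations back to back without overlap.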
Examples
- API example - contiguous mask mode
    AscendC::LocalTensor<half> dstLocal, src0Local, src1Local;
    AscendC::LocalTensor<uint32_t> src0OffsetLocal;
    AscendC::LocalTensor<uint8_t> tmpLocal;
    uint64_t mask = 128;        // Contiguous mask mode
    uint8_t hRepeat = 2;        // Two horizontal iterations
    bool repeatMode = false;    // Iteration mode
    uint16_t dstBlkStride = 1;  // Data is written contiguously within a single iteration.
    uint16_t vROffset = 128;    // Data is written contiguously between adjacent vertical iterations.
    uint8_t vRepeat = 2;        // Two vertical iterations
    AscendC::BilinearInterpolation(dstLocal, src0Local, src0OffsetLocal, src1Local, mask, hRepeat, repeatMode, dstBlkStride, vROffset, vRepeat, tmpLocal);
- API example - bitwise mask mode
    AscendC::LocalTensor<half> dstLocal, src0Local, src1Local;
    AscendC::LocalTensor<uint32_t> src0OffsetLocal;
    AscendC::LocalTensor<uint8_t> tmpLocal;
    uint64_t mask[2] = { UINT64_MAX, UINT64_MAX };  // Bitwise mask mode
    uint8_t hRepeat = 2;        // Two horizontal iterations
    bool repeatMode = false;    // Iteration mode
    uint16_t dstBlkStride = 1;  // Data is written contiguously within a single iteration.
    uint16_t vROffset = 128;    // Data is written contiguously between adjacent vertical iterations.
    uint8_t vRepeat = 2;        // Two vertical iterations
    AscendC::BilinearInterpolation(dstLocal, src0Local, src0OffsetLocal, src1Local, mask, hRepeat, repeatMode, dstBlkStride, vROffset, vRepeat, tmpLocal);
Input (src0Local, half): [1, 2, 3, ..., 512]
Input (src1Local, half): [2, 3, 4, ..., 17]
Input (src0OffsetLocal, uint32_t): [0, 32, 64, ..., 992]
Output (dstLocal, half): [389, 394, 399, 404, ..., 4096]
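The sample output can be traced by hand. The closed form below is an interpretation of the sample, not vendor code: ExpectedSampleOutput is a hypothetical name, and it assumes the offsets are byte offsets (so each horizontal iteration covers 128 consecutive half elements of src0) and that, with repeatMode set to false, each horizontal iteration consumes the next single value of src1.

```cpp
// Closed form for the sample data above (hRepeat = 2, vRepeat = 2,
// vROffset = 128, repeatMode = false). Hypothetical helper for illustration.
int ExpectedSampleOutput(int i)
{
    const int v = i / 128;     // vertical iteration that writes dst[i]
    const int k = i % 128;     // position inside the eight dst data blocks
    const int firstH = 2 * v;  // global index of its first horizontal iteration
    const int a = 128 * firstH + k + 1;        // src0 value used by that iteration
    const int b = 128 * (firstH + 1) + k + 1;  // src0 value used by the next one
    // src1 = [2, 3, 4, 5, ...], so horizontal iteration h is scaled by h + 2.
    return a * (firstH + 2) + b * (firstH + 3);
}
```

For example, the first element is 1 * 2 + 129 * 3 = 389 and the last element is 384 * 4 + 512 * 5 = 4096, in agreement with the listed output.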