Conv2D (Deprecated)
Supported Products
| Product | Supported (√/x) |
|---|---|
| | x |
| | x |
| | x |
| | √ |
| | x |
| | √ |
Functions
This API has been deprecated and will be removed in later versions. Do not use this API.
Performs 2D convolution on an input tensor and a weight tensor and outputs the result tensor. Conv2D convolution layers are mostly used in image recognition, where a filter extracts features from an image.
Prototype
template <typename T, typename U> __aicore__ inline void Conv2D(const LocalTensor<T>& dst, const LocalTensor<U>& featureMap, const LocalTensor<U>& weight, Conv2dParams& conv2dParams, Conv2dTilling& tilling)
template <typename T> __aicore__ inline Conv2dTilling GetConv2dTiling(Conv2dParams& conv2dParams)
Parameters
| Parameter | Input/Output | Meaning |
|---|---|---|
| dst | Output | Destination operand, in the format [Cout/16, Ho, Wo, 16] with Cout * Ho * Wo elements. Ho and Wo are calculated as follows: Ho = floor((H + pad_top + pad_bottom - dilation_h * (Kh - 1) - 1) / stride_h + 1) and Wo = floor((W + pad_left + pad_right - dilation_w * (Kw - 1) - 1) / stride_w + 1). The hardware requires Ho * Wo to be a multiple of 16, so round the shape up to a multiple of 16 when defining the dst tensor. The actual size is Cout * round_howo, where round_howo = ceil(Ho * Wo / 16) * 16. |
| featureMap | Input | Input tensor. The TPosition of the tensor is A1. The shape is in the format [C1, H, W, C0], where C1 * C0 is the number of input channels, H is the height (value range: [1, 40]), and W is the width (value range: [1, 40]). |
| weight | Input | Convolution kernel (weight) tensor. The TPosition of the tensor is B1. The shape is in the format [C1, Kh, Kw, Cout, C0], where C1 * C0 is the number of input channels. Cout is the number of convolution kernels and must be 16, 32, 64, or 128. Kh is the height of the convolution kernel (value range: [1, 5]). Kw is the width of the convolution kernel (value range: [1, 5]). |
| conv2dParams | Input | Status parameters such as the input matrix shape. The type is Conv2dParams. Its fields are listed in the table that begins with imgShape. |
| tilling | Input | Fractal control parameter. The type is Conv2dTilling. Its fields are listed in the table that begins with blockSize. |
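The Ho, Wo, and round_howo formulas given for dst can be checked on the host before a dst tensor is sized. The following is a minimal sketch of that arithmetic only. `ConvOutShape`, `OutDim`, `ComputeOutShape`, and `DstElementCount` are hypothetical names introduced for illustration and are not part of the Conv2D API.

```cpp
#include <cstdint>

// Hypothetical helper type for this sketch; not part of the Conv2D API.
struct ConvOutShape {
    uint32_t ho;        // output height Ho
    uint32_t wo;        // output width Wo
    uint32_t roundHowo; // Ho * Wo rounded up to a multiple of 16
};

// One spatial dimension:
//   floor((in + padBefore + padAfter - dilation * (k - 1) - 1) / stride + 1)
// Assumes the dilated kernel fits inside the padded input, so the numerator
// is non-negative and integer division matches the floor in the formula.
constexpr uint32_t OutDim(uint32_t in, uint32_t padBefore, uint32_t padAfter,
                          uint32_t k, uint32_t dilation, uint32_t stride)
{
    return (in + padBefore + padAfter - dilation * (k - 1) - 1) / stride + 1;
}

// Applies the formula to both dimensions and rounds Ho * Wo up to 16.
constexpr ConvOutShape ComputeOutShape(uint32_t h, uint32_t w,
                                       uint32_t kh, uint32_t kw,
                                       uint32_t padLeft, uint32_t padRight,
                                       uint32_t padTop, uint32_t padBottom,
                                       uint32_t strideH, uint32_t strideW,
                                       uint32_t dilationH, uint32_t dilationW)
{
    const uint32_t ho = OutDim(h, padTop, padBottom, kh, dilationH, strideH);
    const uint32_t wo = OutDim(w, padLeft, padRight, kw, dilationW, strideW);
    const uint32_t roundHowo = (ho * wo + 15) / 16 * 16;
    return ConvOutShape{ho, wo, roundHowo};
}

// Number of elements to reserve for dst: Cout * round_howo.
constexpr uint32_t DstElementCount(uint32_t cout, const ConvOutShape& shape)
{
    return cout * shape.roundHowo;
}
```

For example, a 5 x 5 input with a 3 x 3 kernel, a padding of 1 on every side, a stride of 2, and a dilation of 1 gives Ho = Wo = 3, so Ho * Wo = 9 is rounded up to 16 and a dst with Cout = 16 needs 256 elements.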
| Field | Type | Meaning |
|---|---|---|
| imgShape | vector<int> | Shape of feature_map, in the format [H, W]. |
| kernelShape | vector<int> | Shape of weight, in the format [Kh, Kw]. |
| stride | vector<int> | Convolution stride, in the format [stride_h, stride_w]. |
| cin | int | Fractal layout parameter. Cin = C1 * C0, where Cin is the number of input channels. The value range of C1 is [1, 4]. |
| cout | int | Number of convolution kernels. The value must be 16, 32, 64, or 128. |
| padList | vector<int> | Padding factors, in the format [pad_left, pad_right, pad_top, pad_bottom]. |
| dilation | vector<int> | Convolution dilation factors, in the format [dilation_h, dilation_w]. The width of the dilated convolution kernel is dilation_w * (Kw - 1) + 1, and its height is dilation_h * (Kh - 1) + 1. |
| initY | uint32_t | Enables initialization of dst. |
| partialSum | uint32_t | When the TPosition of dst is CO2, this parameter controls whether the computation result is moved out. |
| Field | Type | Meaning |
|---|---|---|
| blockSize | uint32_t | Number of elements stored in a dimension. The value is fixed at 16. |
| loopMode | LoopMode | Traversal mode. The type is LoopMode. |
| c0Size | uint32_t | Length of a block. The value can be 16 or 32. |
| dtypeSize | uint32_t | Length of the input data, in bytes. The value range is [1, 2]. |
| strideH | uint32_t | Height of the convolution stride. The value range is [1, 4]. |
| strideW | uint32_t | Width of the convolution stride. The value range is [1, 4]. |
| dilationH | uint32_t | Height of the convolution dilation factor. The value range is [1, 4]. |
| dilationW | uint32_t | Width of the convolution dilation factor. The value range is [1, 4]. |
| hi | uint32_t | Height of the feature_map shape. The value range is [1, 40]. |
| wi | uint32_t | Width of the feature_map shape. The value range is [1, 40]. |
| ho | uint32_t | Height of the dst shape (Ho). The value range is [1, 40]. |
| wo | uint32_t | Width of the dst shape (Wo). The value range is [1, 40]. |
| height | uint32_t | Height of the weight shape. The value range is [1, 5]. |
| width | uint32_t | Width of the weight shape. The value range is [1, 5]. |
| howo | uint32_t | Size of the dst shape (ho * wo). |
| mNum | uint32_t | Equivalent data length of the M axis. The value range is [1, 4096]. |
| nNum | uint32_t | Equivalent data length of the N axis. The value range is [1, 4096]. |
| kNum | uint32_t | Equivalent data length of the K axis. The value range is [1, 4096]. |
| roundM | uint32_t | Equivalent data length of the M axis, rounded up to an integer multiple of blockSize. The value range is [1, 4096]. |
| roundN | uint32_t | Equivalent data length of the N axis, rounded up to an integer multiple of blockSize. The value range is [1, 4096]. |
| roundK | uint32_t | Equivalent data length of the K axis, rounded up to an integer multiple of c0Size. The value range is [1, 4096]. |
| mBlockNum | uint32_t | Number of blocks on the M axis. mBlockNum = mNum/blockSize. The value range is [1, 4096]. |
| nBlockNum | uint32_t | Number of blocks on the N axis. nBlockNum = nNum/blockSize. The value range is [1, 4096]. |
| kBlockNum | uint32_t | Number of blocks on the K axis. kBlockNum = kNum/blockSize. The value range is [1, 4096]. |
| mIterNum | uint32_t | Number of dimensions traversed on the M axis. The value range is [1, 4096]. |
| nIterNum | uint32_t | Number of dimensions traversed on the N axis. The value range is [1, 4096]. |
| kIterNum | uint32_t | Number of dimensions traversed on the K axis. The value range is [1, 4096]. |
| mTileBlock | uint32_t | Number of split blocks on the M axis. The value range is [1, 4096]. |
| nTileBlock | uint32_t | Number of split blocks on the N axis. The value range is [1, 4096]. |
| kTileBlock | uint32_t | Number of split blocks on the K axis. The value range is [1, 4096]. |
| kTailBlock | uint32_t | Number of tail blocks on the K axis. The value range is [1, 4096]. |
| mTailBlock | uint32_t | Number of tail blocks on the M axis. The value range is [1, 4096]. |
| nTailBlock | uint32_t | Number of tail blocks on the N axis. The value range is [1, 4096]. |
| kHasTail | bool | Indicates whether a tail block exists on the K axis. |
| mHasTail | bool | Indicates whether a tail block exists on the M axis. |
| nHasTail | bool | Indicates whether a tail block exists on the N axis. |
| mTileNums | uint32_t | Length of split blocks on the M axis. The value range is [1, 4096]. |
| mTailNums | uint32_t | Length of tail blocks on the M axis. The value range is [1, 4096]. |
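The rounding, block-count, and tail fields of Conv2dTilling follow a common split-an-axis pattern. This page does not state how GetConv2dTiling derives them, so the following is only an assumed illustration of how such values could relate to each other. `AxisSplit` and `SplitAxis` are hypothetical names, and the choice to count blocks over the rounded length is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical summary of one axis (M, N, or K); not the Conv2dTilling layout.
struct AxisSplit {
    uint32_t round;     // length rounded up to a multiple of the unit
    uint32_t blockNum;  // unit-sized blocks covering the rounded length
    uint32_t iterNum;   // tiles traversed, including a tail tile if present
    uint32_t tailBlock; // blocks in the tail tile (0 when there is no tail)
    bool hasTail;       // true when the last tile is shorter than tileBlock
};

// num:       equivalent data length of the axis (for example mNum)
// unit:      blockSize for the M and N axes, c0Size for the K axis
// tileBlock: assumed number of blocks processed per tile (must be non-zero)
constexpr AxisSplit SplitAxis(uint32_t num, uint32_t unit, uint32_t tileBlock)
{
    const uint32_t blockNum = (num + unit - 1) / unit; // assumption: ceiling
    const uint32_t tailBlock = blockNum % tileBlock;
    return AxisSplit{blockNum * unit,
                     blockNum,
                     (blockNum + tileBlock - 1) / tileBlock,
                     tailBlock,
                     tailBlock != 0};
}
```

Under these assumptions, an axis of length 100 with a unit of 16 and 4 blocks per tile rounds up to 112, spans 7 blocks, and is traversed in 2 tiles, the second being a tail of 3 blocks.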
| feature_map.dtype | weight.dtype | dst.dtype |
|---|---|---|
| int8_t | int8_t | int32_t |
| half | half | float |
| half | half | half |
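The supported data type combinations can be expressed as a compile-time predicate over the two template parameters of the prototype, where T is the dst element type and U is the element type shared by featureMap and weight. The following is a host-side sketch only. `HalfPlaceholder` is a hypothetical stand-in for the device-side half type, and `IsSupportedConv2dDtype` is a hypothetical name that is not part of the API.

```cpp
#include <cstdint>
#include <type_traits>

// Placeholder for the device-side half type, which plain host C++ lacks.
struct HalfPlaceholder {
    uint16_t bits;
};

// T: dst element type. U: element type of both featureMap and weight.
// Returns true only for the three combinations listed in the table:
//   (int8_t, int8_t) -> int32_t, (half, half) -> float, (half, half) -> half
template <typename T, typename U>
constexpr bool IsSupportedConv2dDtype()
{
    return (std::is_same<U, int8_t>::value && std::is_same<T, int32_t>::value) ||
           (std::is_same<U, HalfPlaceholder>::value &&
            (std::is_same<T, float>::value ||
             std::is_same<T, HalfPlaceholder>::value));
}
```

A wrapper could place such a predicate in a static_assert so that an unsupported combination, such as int8_t inputs with a float dst, fails at compile time.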
Restrictions
- This instruction does not support the scenario where W is equal to Kw and H is greater than Kh. This will produce unexpected results.
- For details about the operand address alignment requirements, see General Address Alignment Restrictions.
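The value ranges from the parameter tables and the first restriction above can be combined into a single host-side pre-check. The following is a minimal sketch; `InRange`, `IsCoutValid`, and `IsConv2dShapeAccepted` are hypothetical names, and the check covers only the limits stated on this page, not address alignment.

```cpp
#include <cstdint>

// Inclusive range test used for the documented [lo, hi] value ranges.
constexpr bool InRange(uint32_t v, uint32_t lo, uint32_t hi)
{
    return v >= lo && v <= hi;
}

// Cout must be one of 16, 32, 64, or 128.
constexpr bool IsCoutValid(uint32_t cout)
{
    return cout == 16 || cout == 32 || cout == 64 || cout == 128;
}

// Returns true when the shape is inside the documented ranges and does not
// hit the unsupported case where W equals Kw while H is greater than Kh.
constexpr bool IsConv2dShapeAccepted(uint32_t h, uint32_t w,
                                     uint32_t kh, uint32_t kw,
                                     uint32_t c1, uint32_t cout)
{
    return InRange(h, 1, 40) && InRange(w, 1, 40) &&
           InRange(kh, 1, 5) && InRange(kw, 1, 5) &&
           InRange(c1, 1, 4) && IsCoutValid(cout) &&
           !(w == kw && h > kh);
}
```

For instance, a 10 x 3 feature map with a 3 x 3 kernel is rejected because W equals Kw and H is greater than Kh, while a 3 x 3 feature map with the same kernel passes.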