aclnnDeformableConv2d

Product Support

Product	Supported
Atlas A3 training series products / Atlas A3 inference series products	√
Atlas A2 training series products / Atlas 800I A2 inference products / A200I A2 Box heterogeneous components	√
Atlas 200I/500 A2 inference products	×
Atlas inference series products	×
Atlas training series products	×

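Function Description

Operator function: implements 2D convolution, with support for deformable convolution and grouped convolution.
Calculation formulas:

Assume the input has shape [N, inC, inH, inW] and the output (out) has shape [N, outC, outH, outW]. outH and outW are computed from the given parameters:
$outH = (inH + padding[0] + padding[1] - ((K_H - 1) * dilation[2] + 1)) // stride[2] + 1$
$outW = (inW + padding[2] + padding[3] - ((K_W - 1) * dilation[3] + 1)) // stride[3] + 1$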
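The output-size formula can be checked on the host before any tensors are allocated. The following is a minimal sketch, not part of the aclnn API: the helper names `ConvOutDim`, `OutSize`, and `DeformableConvOutSize` are hypothetical, and the code assumes the padded input is at least as large as the dilated kernel, so that C++ integer division agrees with the floor division `//` in the formula.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Output size along one axis, following the outH/outW formula.
// padBegin/padEnd are the paddings on the two sides of that axis.
// Assumes in + padBegin + padEnd >= (kernel - 1) * dilation + 1.
int64_t ConvOutDim(int64_t in, int64_t padBegin, int64_t padEnd,
                   int64_t kernel, int64_t dilation, int64_t stride) {
    const int64_t effectiveKernel = (kernel - 1) * dilation + 1;
    return (in + padBegin + padEnd - effectiveKernel) / stride + 1;
}

// Hypothetical result type for this sketch.
struct OutSize {
    int64_t outH;
    int64_t outW;
};

// padding is {top, bottom, left, right}; stride and dilation have four
// elements in NCHW order, so index 2 is the H axis and index 3 is the W axis.
OutSize DeformableConvOutSize(int64_t inH, int64_t inW, int64_t kH, int64_t kW,
                              const std::array<int64_t, 4>& padding,
                              const std::array<int64_t, 4>& stride,
                              const std::array<int64_t, 4>& dilation) {
    return {ConvOutDim(inH, padding[0], padding[1], kH, dilation[2], stride[2]),
            ConvOutDim(inW, padding[2], padding[3], kW, dilation[3], stride[3])};
}
```

For example, a 7 x 9 input with a 3 x 3 kernel, padding {0, 0, 1, 1}, stride {1, 1, 2, 2}, and dilation {1, 1, 1, 1} gives outH = 3 and outW = 5.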
Sampling-point indices for a standard convolution:
$x = -padding[2] + ow*stride[3] + kw*dilation[3]$
$y = -padding[0] + oh*stride[2] + kh*dilation[2]$
Using the offset passed in, the deformable convolution computes the shifted indices:
$(x,y) = (x + offsetX, y + offsetY)$
The value at the shifted point is computed by bilinear interpolation:
$(x_{0}, y_{0}) = (int(x), int(y))$
$(x_{1}, y_{1}) = (x_{0} + 1, y_{0} + 1)$
$weight_{00} = (x_{1} - x) * (y_{1} - y)$
$weight_{01} = (x_{1} - x) * (y - y_{0})$
$weight_{10} = (x - x_{0}) * (y_{1} - y)$
$weight_{11} = (x - x_{0}) * (y - y_{0})$
$deformOut(x, y) = weight_{00} * input(x_{0}, y_{0}) + weight_{01} * input(x_{0}, y_{1}) + weight_{10} * input(x_{1}, y_{0}) + weight_{11} * input(x_{1}, y_{1})$
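The sampling and interpolation steps above can be sketched for a single channel plane as follows. This is an illustrative host-side reference, not the operator's implementation. The names `Point`, `SamplePoint`, `At`, and `BilinearSample` are hypothetical, `int()` is interpreted here as floor, and reads outside the plane are assumed to contribute zero; neither assumption is stated in the formulas themselves.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point {
    float x;
    float y;
};

// Shifted sampling position for output pixel (oh, ow) and kernel tap (kh, kw):
// the standard convolution index plus the learned offset.
Point SamplePoint(int oh, int ow, int kh, int kw,
                  int padTop, int padLeft,
                  int strideH, int strideW,
                  int dilationH, int dilationW,
                  float offsetX, float offsetY) {
    const float x = static_cast<float>(-padLeft + ow * strideW + kw * dilationW) + offsetX;
    const float y = static_cast<float>(-padTop + oh * strideH + kh * dilationH) + offsetY;
    return {x, y};
}

// Reads input(x, y) from a row-major h x w plane.
// Assumption of this sketch: positions outside the plane read as 0.
float At(const std::vector<float>& plane, int h, int w, int x, int y) {
    if (x < 0 || x >= w || y < 0 || y >= h) {
        return 0.0f;
    }
    return plane[static_cast<std::size_t>(y) * w + x];
}

// Bilinear interpolation with the four weights weight_00 .. weight_11.
float BilinearSample(const std::vector<float>& plane, int h, int w, float x, float y) {
    const int x0 = static_cast<int>(std::floor(x));  // assumption: int() means floor
    const int y0 = static_cast<int>(std::floor(y));
    const int x1 = x0 + 1;
    const int y1 = y0 + 1;
    const float w00 = (x1 - x) * (y1 - y);
    const float w01 = (x1 - x) * (y - y0);
    const float w10 = (x - x0) * (y1 - y);
    const float w11 = (x - x0) * (y - y0);
    return w00 * At(plane, h, w, x0, y0) + w01 * At(plane, h, w, x0, y1) +
           w10 * At(plane, h, w, x1, y0) + w11 * At(plane, h, w, x1, y1);
}
```

On a 2 x 2 plane holding {1, 2, 3, 4} in row-major order, sampling at (0.5, 0.5) weights all four neighbors by 0.25 and yields 2.5.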
The convolution is then applied to obtain the final output:
$\text{out}(N_i, C_{\text{out}_j}) = \text{bias}(C_{\text{out}_j}) + \sum_{k = 0}^{C_{\text{in}} - 1} \text{weight}(C_{\text{out}_j}, k) \star \text{deformOut}(N_i, k)$

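Function Prototype

Each operator uses a two-phase interface: first call aclnnDeformableConv2dGetWorkspaceSize to obtain the workspace size required for the computation and an executor that encapsulates the operator's computation flow, then call aclnnDeformableConv2d to execute the computation.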

aclnnStatus aclnnDeformableConv2dGetWorkspaceSize(const aclTensor* x, const aclTensor* weight, const aclTensor* offset, const aclTensor* biasOptional, const aclIntArray* kernelSize, const aclIntArray* stride, const aclIntArray* padding, const aclIntArray* dilation, int64_t groups, int64_t deformableGroups, bool modulated, aclTensor* out, aclTensor* deformOutOptional, uint64_t* workspaceSize, aclOpExecutor** executor)
aclnnStatus aclnnDeformableConv2d(void* workspace, uint64_t workspaceSize, aclOpExecutor* executor, aclrtStream stream)

aclnnDeformableConv2dGetWorkspaceSize

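Parameters:
- x (aclTensor*, computation input): the original input data, corresponding to input in the formulas; an aclTensor on the Device side. The shape is [N, inC, inH, inW], where inH * inW must not exceed 2147483647. Supported data formats are ND and NCHW; supported data types are FLOAT32, FLOAT16, and BFLOAT16.
- weight (aclTensor*, computation input): a 4D tensor of learnable filters, corresponding to weight in the formulas; an aclTensor on the Device side. The shape is [outC, inC/groups, K_H, K_W]. Supported data formats are ND and NCHW; supported data types are FLOAT32, FLOAT16, and BFLOAT16. The data type and data format must match those of x.
- offset (aclTensor*, computation input): a 4D tensor holding the x-y coordinate offsets and the mask, corresponding to offset in the formulas; an aclTensor on the Device side. When modulated is true, the shape is [N, 3 * deformableGroups * K_H * K_W, outH, outW]; when modulated is false, the shape is [N, 2 * deformableGroups * K_H * K_W, outH, outW]. Supported data formats are ND and NCHW; supported data types are FLOAT32, FLOAT16, and BFLOAT16. The data type and data format must match those of x.
- biasOptional (aclTensor*, computation input): a 1D tensor of biases added to the filter outputs, corresponding to bias in the formulas; an aclTensor on the Device side. This input is optional; pass a null pointer when it is not needed. When present, the shape is [outC]. The supported data format is ND; supported data types are FLOAT32, FLOAT16, and BFLOAT16. The data type must match that of x.
- kernelSize (aclIntArray*, computation input): the convolution kernel size, corresponding to kernelSize in the formulas; an aclIntArray on the Host side. It has 2 elements (K_H, K_W), each greater than zero. K_H*K_W must not exceed 2048, and K_H*K_W*inC/groups must not exceed 65535.
- stride (aclIntArray*, computation input): the sliding-window stride for each input dimension, corresponding to stride in the formulas; an aclIntArray on the Host side. It has 4 elements, each greater than zero, and the dimension order is interpreted according to the data format of x. The N and C dimensions must be set to 1.
- padding (aclIntArray*, computation input): the number of pixels added to each side of the input (top, bottom, left, right), corresponding to padding in the formulas; an aclIntArray on the Host side with 4 elements.
- dilation (aclIntArray*, computation input): the dilation factor for each input dimension, corresponding to dilation in the formulas; an aclIntArray on the Host side. It has 4 elements, each greater than zero, and the dimension order is interpreted according to the data format of x. The N and C dimensions must be set to 1.
- groups (int64_t, computation input): the number of blocked connections from input channels to output channels, corresponding to groups in the formulas. The data type is int64_t. Both inC and outC must be divisible by groups, and groups must be greater than zero.
- deformableGroups (int64_t, computation input): the number of deformable group partitions, corresponding to deformableGroups in the formulas. The data type is int64_t. inC must be divisible by deformableGroups, and deformableGroups must be greater than zero.
- modulated (bool, computation input): a reserved parameter; only true is currently supported. If true, offset contains the mask; if false, it does not.
- out (aclTensor*, computation output): the output data, corresponding to out in the formulas; an aclTensor on the Device side. The shape is [N, outC, outH, outW]. Supported data formats are ND and NCHW; supported data types are FLOAT32, FLOAT16, and BFLOAT16. The data type and data format must match those of x.
- deformOutOptional (aclTensor*, computation output): an optional output, corresponding to deformOut in the formulas; an aclTensor on the Device side. The shape is [N, inC, outH * K_H, outW * K_W]. Supported data formats are ND and NCHW; supported data types are FLOAT32, FLOAT16, and BFLOAT16. The data type and data format must match those of x.
- workspaceSize (uint64_t*, output parameter): returns the size of the workspace that must be allocated on the Device side.
- executor (aclOpExecutor**, output parameter): returns the op executor, which encapsulates the operator's computation flow.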
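The parameter constraints listed above can be verified on the host before calling the first-phase interface. The sketch below is one possible pre-check, not part of the aclnn API: `DeformConvArgs`, `CheckArgs`, `OffsetShape`, and `DeformOutShape` are hypothetical names, the NCHW layout is assumed (so stride[0] and stride[1] are the N and C entries), and the operator itself remains the authority on what it accepts.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical bundle of the host-side arguments, NCHW layout assumed.
struct DeformConvArgs {
    int64_t n, inC, inH, inW, outC, kH, kW;
    std::array<int64_t, 4> stride, dilation;
    int64_t groups, deformableGroups;
    bool modulated;
};

// Returns an empty string when the documented constraints hold,
// otherwise a short description of the first violation found.
std::string CheckArgs(const DeformConvArgs& a) {
    if (a.kH <= 0 || a.kW <= 0) return "kernelSize elements must be > 0";
    if (a.groups <= 0) return "groups must be > 0";
    if (a.deformableGroups <= 0) return "deformableGroups must be > 0";
    if (a.inH * a.inW > 2147483647LL) return "inH * inW exceeds 2147483647";
    if (a.kH * a.kW > 2048) return "K_H * K_W exceeds 2048";
    if (a.inC % a.groups != 0 || a.outC % a.groups != 0)
        return "inC and outC must be divisible by groups";
    if (a.kH * a.kW * (a.inC / a.groups) > 65535)
        return "K_H * K_W * inC / groups exceeds 65535";
    if (a.inC % a.deformableGroups != 0)
        return "inC must be divisible by deformableGroups";
    for (int i = 0; i < 4; ++i) {
        if (a.stride[i] <= 0 || a.dilation[i] <= 0)
            return "stride and dilation elements must be > 0";
    }
    if (a.stride[0] != 1 || a.stride[1] != 1 || a.dilation[0] != 1 || a.dilation[1] != 1)
        return "N and C entries of stride and dilation must be 1";
    if (!a.modulated) return "only modulated == true is currently supported";
    return "";
}

// Expected offset shape: 3 values per kernel tap (x, y, mask) when modulated,
// otherwise 2 values per kernel tap.
std::array<int64_t, 4> OffsetShape(const DeformConvArgs& a, int64_t outH, int64_t outW) {
    const int64_t perTap = a.modulated ? 3 : 2;
    return {a.n, perTap * a.deformableGroups * a.kH * a.kW, outH, outW};
}

// Expected shape of the optional deformOut output.
std::array<int64_t, 4> DeformOutShape(const DeformConvArgs& a, int64_t outH, int64_t outW) {
    return {a.n, a.inC, outH * a.kH, outW * a.kW};
}
```

With N = 1, inC = 4, a 3 x 3 kernel, deformableGroups = 1, modulated = true, and an 8 x 8 output, the offset tensor is expected to have shape [1, 27, 8, 8] and deformOut shape [1, 4, 24, 24].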
Return value:

aclnnStatus: the returned status code.


aclnnDeformableConv2d

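Parameters:
- workspace (void*, input parameter): the address of the workspace memory allocated on the Device side.
- workspaceSize (uint64_t, input parameter): the size of the workspace allocated on the Device side, obtained from the first-phase interface aclnnDeformableConv2dGetWorkspaceSize.
- executor (aclOpExecutor*, input parameter): the op executor, which encapsulates the operator's computation flow.
- stream (aclrtStream, input parameter): the Stream on which the task is executed.
Return value:

aclnnStatus: the returned status code.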

Constraints

None.

Usage Example

The sample code is for reference only.
