aclnnWeightQuantBatchMatmul

This interface will be deprecated in a later release; use the newer replacement interfaces instead.

Supported Products

- Atlas A2 training series products / Atlas 800I A2 inference products / A200I A2 Box heterogeneous components.
- Atlas A3 training series products / Atlas A3 inference series products.

Function Description

Operator function: weight-only (pseudo) quantization, in which mat2 in self * mat2 (matmul/batchmatmul) is quantized.
Formula: $result = self @ mat2 + bias$
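
The formula can be illustrated with a small host-side reference, offered as a minimal sketch rather than the operator's implementation. It uses `float` in place of FLOAT16 and row-major buffers. The helper names `DequantizeWeight` and `ReferenceMatmulBias` are hypothetical, and the affine dequantization `(q + offset) * scale` is an assumption, since this page does not state the exact dequantization formula.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// ASSUMPTION: affine dequantization w = (q + offset) * scale. The page does
// not give the exact formula the operator applies to x2.
std::vector<float> DequantizeWeight(const std::vector<int8_t>& q,
                                    float scale, float offset) {
    std::vector<float> w(q.size());
    for (std::size_t i = 0; i < q.size(); ++i) {
        w[i] = (static_cast<float>(q[i]) + offset) * scale;
    }
    return w;
}

// result[m, n] = self[m, k] @ mat2[k, n] + bias[n], all row-major.
// An empty bias vector is treated as "no bias".
std::vector<float> ReferenceMatmulBias(const std::vector<float>& self,
                                       const std::vector<float>& mat2,
                                       const std::vector<float>& bias,
                                       std::size_t m, std::size_t k,
                                       std::size_t n) {
    assert(self.size() == m * k);
    assert(mat2.size() == k * n);
    assert(bias.empty() || bias.size() == n);
    std::vector<float> result(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = bias.empty() ? 0.0f : bias[j];
            for (std::size_t p = 0; p < k; ++p) {
                acc += self[i * k + p] * mat2[p * n + j];
            }
            result[i * n + j] = acc;
        }
    }
    return result;
}
```

A reference of this kind is mainly useful for checking device results on small shapes within a FLOAT16 tolerance.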

Function Prototypes

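The operator uses a two-phase interface: first call aclnnWeightQuantBatchMatmulGetWorkspaceSize, which takes the input parameters and computes the workspace size required by the computation flow, and then call aclnnWeightQuantBatchMatmul to execute the computation.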
- aclnnStatus aclnnWeightQuantBatchMatmulGetWorkspaceSize(const aclTensor *x1, const aclTensor *x2, const aclTensor *diagonalMatrix, const aclTensor *deqOffset, const aclTensor *deqScale, const aclTensor *addOffset, const aclTensor *mulScale, const aclTensor *bias, bool transposeX1, bool transposeX2, float antiquantScale, float antiquantOffset, aclTensor *out, uint64_t *workspaceSize, aclOpExecutor **executor)
- aclnnStatus aclnnWeightQuantBatchMatmul(void *workspace, uint64_t workspaceSize, aclOpExecutor *executor, const aclrtStream stream)
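
The calling order of the two prototypes (query the workspace size, allocate, then execute) can be sketched with a self-contained mock. Everything here is a hypothetical stand-in for illustration: `MockExecutor`, `MockGetWorkspaceSize`, `MockExecute`, and `RunTwoPhase` are not part of the aclnn API, the workspace size formula is arbitrary, and a host `std::vector` stands in for device memory.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these names belong
// to the aclnn API.
struct MockExecutor {
    uint64_t requiredBytes;  // workspace size this execution plan needs
};

// Phase 1: validate arguments, build the plan, report the workspace size.
int MockGetWorkspaceSize(uint64_t m, uint64_t n, uint64_t* workspaceSize,
                         MockExecutor** executor) {
    if (workspaceSize == nullptr || executor == nullptr || m == 0 || n == 0) {
        return 1;  // nonzero means failure in this mock
    }
    *executor = new MockExecutor{m * n * sizeof(float)};  // arbitrary size rule
    *workspaceSize = (*executor)->requiredBytes;
    return 0;
}

// Phase 2: run the plan with caller-provided workspace; the plan is consumed.
int MockExecute(void* workspace, uint64_t workspaceSize, MockExecutor* executor) {
    if (executor == nullptr) {
        return 1;
    }
    const bool ok = workspace != nullptr &&
                    workspaceSize >= executor->requiredBytes;
    delete executor;
    return ok ? 0 : 1;
}

// Caller-side sequence: query size, allocate workspace, then execute.
int RunTwoPhase(uint64_t m, uint64_t n) {
    uint64_t workspaceSize = 0;
    MockExecutor* executor = nullptr;
    const int status = MockGetWorkspaceSize(m, n, &workspaceSize, &executor);
    if (status != 0) {
        return status;
    }
    assert(executor != nullptr);
    std::vector<uint8_t> workspace(workspaceSize);  // stands in for device memory
    return MockExecute(workspace.data(), workspaceSize, executor);
}
```

The point of the split is that the caller owns the workspace allocation: the first phase only reports how much memory is needed, and the second phase fails if less is supplied.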

aclnnWeightQuantBatchMatmulGetWorkspaceSize

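Parameters
- x1 (aclTensor*, compute input): the input self in the formula. Supported data type: FLOAT16. Supported data format: ND. Non-contiguous tensors are not supported. Only 2-D tensors are supported, with no batch axis; the shape must satisfy the broadcast relationship with x2.
- x2 (aclTensor*, compute input): after processing, yields the input mat2 in the formula. Supported data type: INT8. Supported data format: ND. Non-contiguous tensors are not supported. Only 2-D tensors are supported, with no batch axis; the shape must satisfy the broadcast relationship with x1.
- diagonalMatrix (aclTensor*, compute input): used to dequantize x2 into the input mat2 in the formula. Supported data type: INT8. Supported data format: ND. Non-contiguous tensors are not supported. The tensor is always 2-D with shape (32, 32) and must be an identity matrix. When m > 64 it does not take part in the computation and may be null.
- deqOffset (aclTensor*, compute input): used to dequantize x2 into the input mat2 in the formula. It is computed from addOffset, antiquantOffset, and antiquantScale; see the sample code for the calculation. Supported data type: INT32. Supported data format: ND. Non-contiguous tensors are not supported. Supported shapes: 1, n, (1, 1), (1, n), or (n, 1); the shape must satisfy the broadcast relationship with x2. When m > 64 it does not take part in the computation and may be null.
- deqScale (aclTensor*, compute input): used to dequantize x2 into the input mat2 in the formula. It is computed by the aclnnTransQuantParam interface; see the sample code for the calculation. Supported data type: UINT64. Supported data format: ND. Non-contiguous tensors are not supported. Supported shapes: 1, n, (1, 1), (1, n), or (n, 1); the shape must satisfy the broadcast relationship with x2. When m > 64 it does not take part in the computation and may be null.
- addOffset (aclTensor*, compute input): used to dequantize x2 into the input mat2 in the formula. Supported data type: FLOAT16. Supported data format: ND. Non-contiguous tensors are not supported. Supported shapes: 1, n, (1, 1), (1, n), or (n, 1); the shape must satisfy the broadcast relationship with x2. When m < 64 it does not take part in the computation; it may be null in all cases.
- mulScale (aclTensor*, compute input): used to dequantize x2 into the input mat2 in the formula. Supported data type: FLOAT16. Supported data format: ND. Non-contiguous tensors are not supported. Supported shapes: 1, n, (1, 1), (1, n), or (n, 1); the shape must satisfy the broadcast relationship with x2. When m < 64 it does not take part in the computation; it may be null in all cases.
- bias (aclTensor*, compute input): the input bias in the formula. Supported data type: FLOAT. Supported data format: ND. Non-contiguous tensors are not supported. The tensor is 1-D with length equal to N and may be null.
- transposeX1 (bool, compute input): specifies whether x1 is transposed.
- transposeX2 (bool, compute input): specifies whether x2 is transposed.
- antiquantScale (float, compute input): used to dequantize x2 into the input mat2 in the formula.
- antiquantOffset (float, compute input): used to dequantize x2 into the input mat2 in the formula.
- out (aclTensor*, compute output): result in the formula. Supported data types: FLOAT16 and INT8; the data type must be one to which the type inferred from x1 and x2 can be converted, and the shape must be the shape obtained by broadcasting x1 and x2. Supported data format: ND.
- workspaceSize (uint64_t*, output): returns the size of the workspace that must be allocated on the device.
- executor (aclOpExecutor**, output): returns the op executor, which contains the operator's computation flow.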
Return value:

aclnnStatus: the status code; for details, see the aclnn return code reference.
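
The shape rules stated in the parameter list can be checked on the host before calling the first-phase interface. The following is a minimal sketch under the assumption that shapes are available as plain integer vectors. The helper names are hypothetical and are not part of the aclnn API, and the operator's own validation may be stricter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;

// deqOffset, deqScale, addOffset, mulScale: shape must be one of
// 1, n, (1, 1), (1, n), (n, 1), where n is the N dimension of the matmul.
bool IsValidQuantParamShape(const Shape& shape, int64_t n) {
    assert(n > 0);
    if (shape.size() == 1) {
        return shape[0] == 1 || shape[0] == n;
    }
    if (shape.size() == 2) {
        return (shape[0] == 1 && (shape[1] == 1 || shape[1] == n)) ||
               (shape[0] == n && shape[1] == 1);
    }
    return false;
}

// bias: 1-D with length equal to N.
bool IsValidBiasShape(const Shape& shape, int64_t n) {
    return shape.size() == 1 && shape[0] == n;
}

// diagonalMatrix: fixed 2-D shape (32, 32).
bool IsValidDiagonalMatrixShape(const Shape& shape) {
    return shape == Shape{32, 32};
}
```

These checks cover shapes only; they do not verify data types or that diagonalMatrix actually holds an identity matrix.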


aclnnWeightQuantBatchMatmul

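Parameters
- workspace (void*, input): address of the workspace memory allocated on the device.
- workspaceSize (uint64_t, input): size of the workspace allocated on the device, obtained from the first-phase interface aclnnWeightQuantBatchMatmulGetWorkspaceSize.
- executor (aclOpExecutor*, input): the op executor, which contains the operator's computation flow.
- stream (aclrtStream, input): the stream on which the task is executed.
Return value:

aclnnStatus: the status code; for details, see the aclnn return code reference.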

Constraints

None

Usage Example

The sample code is for reference only; for the build and run procedure, refer to the sample build-and-run guide.