aclnnGatherV2

Supported Products

Atlas training series products.
Atlas A2 training series products.

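Function Prototypes

Each operator is exposed as a two-phase interface: first call "aclnnGatherV2GetWorkspaceSize" to obtain the workspace size required for the computation and an executor that encapsulates the operator's computation flow, then call "aclnnGatherV2" to perform the computation.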

aclnnStatus aclnnGatherV2GetWorkspaceSize(const aclTensor *self, int64_t dim, const aclTensor *index, aclTensor *out, uint64_t *workspaceSize, aclOpExecutor **executor)
aclnnStatus aclnnGatherV2(void *workspace, uint64_t workspaceSize, aclOpExecutor *executor, aclrtStream stream)

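Function Description

Operator function: extracts elements from the input tensor along the specified dimension dim, at the positions given by index, and stores them in the out tensor. For example, given the input tensor $x=\begin{bmatrix}1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9\end{bmatrix}$ and the index tensor idx=[1, 0], the result for dim=0 is $y=\begin{bmatrix}4 & 5 & 6 \\ 1 & 2 & 3\end{bmatrix}$.

The result for dim=1 is $y=\begin{bmatrix}2 & 1\\ 5 & 4\\ 8 & 7\end{bmatrix}$.

The computation proceeds as follows. Take a three-dimensional tensor as an example: x has shape (3,2,2), with $x = [[[1,2],[3,4]],\ [[5,6],[7,8]],\ [[9,10],[11,12]]]$ and idx=[1, 0]. Let $l, m, n$ denote the subscripts of x along dimensions 0, 1, and 2. idx is one-dimensional (a zero-dimensional idx is treated as a one-dimensional tensor of size 1).

dim=0: I=idx[i]; $y[i][m][n] = x[I][m][n]$

dim=1: J=idx[j]; $y[l][j][n] = x[l][J][n]$

dim=2: K=idx[k]; $y[l][m][k] = x[l][m][K]$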
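
The indexing rule above can be sketched as a host-side reference in C++. This is only an illustration of the semantics, not the device implementation: the function name `gatherRef` is hypothetical, the tensor is assumed to be a row-major `float` array, the index tensor is passed flattened, `dim` is assumed to lie in `[0, rank)`, and throwing on an out-of-range index is an assumption, since the text above does not say how such indices are handled.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Host-side reference for the gather rule y[..., i, ...] = x[..., idx[i], ...].
// x is a row-major tensor of shape xShape; index holds the flattened index tensor.
std::vector<float> gatherRef(const std::vector<float>& x,
                             const std::vector<int64_t>& xShape,
                             int64_t dim,
                             const std::vector<int64_t>& index) {
    const int64_t rank = static_cast<int64_t>(xShape.size());
    assert(dim >= 0 && dim < rank);  // assumption: non-negative, in-range dim

    // outer: product of sizes before dim; inner: product of sizes after dim.
    int64_t outer = 1, inner = 1;
    for (int64_t d = 0; d < dim; ++d) outer *= xShape[d];
    for (int64_t d = dim + 1; d < rank; ++d) inner *= xShape[d];
    const int64_t dimSize = xShape[dim];
    const int64_t n = static_cast<int64_t>(index.size());

    std::vector<float> y(static_cast<size_t>(outer * n * inner));
    for (int64_t o = 0; o < outer; ++o) {
        for (int64_t i = 0; i < n; ++i) {
            const int64_t src = index[i];
            if (src < 0 || src >= dimSize) {
                // Assumption: reject invalid indices instead of guessing a policy.
                throw std::out_of_range("index out of range for dim");
            }
            for (int64_t k = 0; k < inner; ++k) {
                y[(o * n + i) * inner + k] = x[(o * dimSize + src) * inner + k];
            }
        }
    }
    return y;
}
```

Calling `gatherRef` on the 3x3 example with idx=[1, 0] reproduces both matrices shown above: dim=0 yields rows 1 and 0, and dim=1 yields columns 1 and 0 of every row.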

aclnnGatherV2GetWorkspaceSize

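Parameters:
- self (aclTensor*, computation input): aclTensor on the device. Supported data types: FLOAT, FLOAT16, INT64, INT32, INT16, INT8, UINT8, BOOL, DOUBLE, COMPLEX64, COMPLEX128, and BFLOAT16 (BFLOAT16 only on Atlas A2 training series products). Non-contiguous tensors are supported, the data format is ND, and tensors with more than 8 dimensions are not supported.
- dim (int64_t, computation input): integer on the host, of type INT64.
- index (aclTensor*, computation input): aclTensor on the device. Supported data types: INT64 and INT32. Non-contiguous tensors are supported, the data format is ND, and tensors with more than 8 dimensions are not supported.
- out (aclTensor*, computation output): aclTensor on the device. Supported data types: FLOAT, FLOAT16, INT64, INT32, INT16, INT8, UINT8, BOOL, DOUBLE, COMPLEX64, COMPLEX128, and BFLOAT16 (BFLOAT16 only on Atlas A2 training series products). The data type must match that of self. The number of dimensions equals the number of dimensions of self plus the number of dimensions of index, minus one: dimension dim is expanded to the shape of index, and every other dimension has the same length as the corresponding dimension of self. The data format is ND.
- workspaceSize (uint64_t*, output): returns the size of the workspace that the caller must allocate on the device.
- executor (aclOpExecutor**, output): returns the op executor, which encapsulates the operator's computation flow.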
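
The shape rule stated for out can be sketched as a small host-side helper. The name `gatherOutShape` is hypothetical and not part of the aclnn API; the sketch assumes `dim` is non-negative and within the rank of self, and simply splices the shape of index in place of dimension dim.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// out shape = self.shape[0:dim] + index.shape + self.shape[dim+1:]
// so rank(out) = rank(self) + rank(index) - 1.
std::vector<int64_t> gatherOutShape(const std::vector<int64_t>& selfShape,
                                    int64_t dim,
                                    const std::vector<int64_t>& indexShape) {
    assert(dim >= 0 && dim < static_cast<int64_t>(selfShape.size()));
    // Dimensions before dim are copied unchanged.
    std::vector<int64_t> out(selfShape.begin(), selfShape.begin() + dim);
    // Dimension dim is replaced by the full shape of index.
    out.insert(out.end(), indexShape.begin(), indexShape.end());
    // Dimensions after dim are copied unchanged.
    out.insert(out.end(), selfShape.begin() + dim + 1, selfShape.end());
    return out;
}
```

For instance, a self of shape (3,2,2) gathered along dim=1 with an index of shape (2,4) gives an out of shape (3,2,4,2), which has 3 + 2 - 1 = 4 dimensions.
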
Return value:

aclnnStatus: returns a status code; for details, see the aclnn return code documentation.


aclnnGatherV2

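Parameters:
- workspace (void*, input): address of the workspace memory allocated on the device.
- workspaceSize (uint64_t, input): size of the workspace allocated on the device, obtained from the first-phase interface aclnnGatherV2GetWorkspaceSize.
- executor (aclOpExecutor*, input): op executor, which encapsulates the operator's computation flow.
- stream (aclrtStream, input): the AscendCL stream on which the task is executed.
Return value:

aclnnStatus: returns a status code; for details, see the aclnn return code documentation.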

Constraints and Limitations

None.

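Usage Example

Sample code is for reference only. For the build and execution procedure, refer to the documentation on compiling and running aclnn samples.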
