OnehotOperation

Function

One-hot encoding.

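Operator behavior

One-hot encoding is a preprocessing encoding used in machine learning. It encodes depth states: the position corresponding to a state is set to 1 and all other positions are set to 0, so each result is a vector of length depth with exactly one element equal to 1.

The operator outputs the one-hot encoding of every element of the input tensor x. The result has one additional dimension of size depth, inserted at position axis.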

Computation sketch (Python):

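import numpy as np
res = np.eye(depth, dtype=input0.dtype)[input0]  # input0: integer ndarray; depth becomes the last dimension (axis = -1)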
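The one-line sketch above places the depth dimension last, which corresponds to axis = -1 only. A minimal NumPy sketch of the general case follows, assuming the depth dimension is simply moved to position axis of the output. The helper name `onehot_reference` is hypothetical and is not part of ATB; it is a host-side reference, not the operator's implementation.

```python
import numpy as np

def onehot_reference(x, depth, axis=0):
    """Hypothetical NumPy reference; not the ATB implementation."""
    x = np.asarray(x)
    # Row k of the identity matrix is the one-hot vector for state k,
    # so indexing with x appends a trailing dimension of length depth.
    encoded = np.eye(depth, dtype=x.dtype)[x]
    # Move the trailing depth dimension to the requested output position.
    return np.moveaxis(encoded, -1, axis)
```

A reference of this kind can be useful for comparing operator output against expected values on the host.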

Example 1 (Python):

x shape: torch.Size([2, 3])
x: tensor([[4, 4, 6],
           [6, 7, 6]])
depth: 10
axis: -1

# axis = -1: the last dimension of the output, output[i][j][:], is the one-hot encoding of x[i][j]
out_tensor shape: torch.Size([2, 3, 10])
out_tensor: tensor([[[0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]],

                    [[0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
                     [0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
                     [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]]], device='npu:0')

Example 2 (Python):

x shape: torch.Size([2, 3])
x: tensor([[2, 0, 4],
           [4, 3, 0]])
depth: 5
axis: 0

# axis = 0: output[:][i][j] is the one-hot encoding of x[i][j]; with axis = 1 it would be output[i][:][j]
out_tensor shape: torch.Size([5, 2, 3])
out_tensor: tensor([[[0, 1, 0],
                    [0, 0, 1]],

                   [[0, 0, 0],
                    [0, 0, 0]],

                   [[1, 0, 0],
                    [0, 0, 0]],

                   [[0, 0, 0],
                    [0, 1, 0]],

                   [[0, 0, 1],
                    [1, 0, 0]]], device='npu:0')
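The indexing rule in Example 2 can also be read as an element-wise comparison: an output element is 1 exactly when its index along the depth dimension equals the corresponding element of x. The following NumPy sketch illustrates that reading for axis = 0 and axis = 1. It runs on the host with NumPy rather than on the NPU, and the variable names are illustrative only.

```python
import numpy as np

x = np.array([[2, 0, 4],
              [4, 3, 0]])
depth = 5

# axis = 0: out[d, i, j] is 1 exactly when x[i, j] == d.
states = np.arange(depth).reshape(depth, 1, 1)
out_axis0 = (states == x).astype(x.dtype)

# axis = 1: out[i, d, j] is 1 exactly when x[i, j] == d.
out_axis1 = (x[:, None, :] == np.arange(depth)[None, :, None]).astype(x.dtype)

print(out_axis0.shape)  # (5, 2, 3)
print(out_axis1.shape)  # (2, 5, 3)
```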

Definition

struct OnehotParam {
    int64_t axis = 0;
    int64_t depth = 0;
};

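Parameters

Member	Type	Default	Description
axis	int64_t	0	Index of the depth dimension in the output. May be negative; a negative value counts from the end, meaning the |axis|-th dimension from the end of output holds the one-hot encoding of each input element.
depth	int64_t	0	Length of the one-hot encoding of each input element.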
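How axis determines the output shape can be sketched as follows. This assumes a negative axis is resolved against the output rank (input rank plus one), which matches Example 1, where axis = -1 yields shape [2, 3, 10]. The helper `onehot_output_shape` is hypothetical and not an ATB function.

```python
def onehot_output_shape(x_shape, depth, axis=0):
    """Hypothetical helper: output shape for a given input shape."""
    out_rank = len(x_shape) + 1
    # A negative axis counts from the end of the output dimensions.
    pos = axis + out_rank if axis < 0 else axis
    # Insert the depth dimension at the resolved position.
    return tuple(x_shape[:pos]) + (depth,) + tuple(x_shape[pos:])

print(onehot_output_shape((2, 3), 10, -1))  # (2, 3, 10)
print(onehot_output_shape((2, 3), 5, 0))    # (5, 2, 3)
```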

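Inputs

Parameter	Dimensions	Data type	Format	Description
x	[dim_0, dim_1, ..., dim_n]	int32/int64	ND	Input tensor. Specifies the states whose one-hot encodings are requested.
one	[1]	int32/int64, same as x	ND	Scalar 1. Data type and format must match x. Passed to the operator only; it has no practical meaning.
zero	[1]	int32/int64, same as x	ND	Scalar 0. Data type and format must match x. Passed to the operator only; it has no practical meaning.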

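Outputs

Parameter	Dimensions	Data type	Format	Description
output	Same as x, with one additional dimension of size depth at position axis.	int32/int64, same as x	ND	Output tensor holding the one-hot encodings. Data type and format match x.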

Constraints

The absolute value of axis must be less than the number of dimensions of x.
Every element of x must be less than depth.
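The two constraints above could be checked on the host before the operator is invoked. The sketch below is one possible pre-check written with NumPy; the function `check_onehot_constraints` is hypothetical and not part of ATB, and it tests only the two documented conditions (for example, it does not test for negative elements, which this page does not address).

```python
import numpy as np

def check_onehot_constraints(x, depth, axis):
    """Hypothetical pre-check: raises ValueError if a documented constraint is violated."""
    x = np.asarray(x)
    # Constraint 1: |axis| must be less than the number of dimensions of x.
    if abs(axis) >= x.ndim:
        raise ValueError(f"|axis| = {abs(axis)} must be less than x.ndim = {x.ndim}")
    # Constraint 2: every element of x must be less than depth.
    if x.size and x.max() >= depth:
        raise ValueError(f"max element {x.max()} must be less than depth = {depth}")
```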

Parent topic: atb/infer_op_params.h