vec_conv

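Function Description

Converts precision according to the data types of the src and dst tensors.

Before the conversion modes are described, here is how floating-point numbers are represented: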

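i. float16 has 16 bits: 1 sign bit (S), 5 exponent bits (E), and 10 mantissa bits (M).

When E is neither all 0s nor all 1s, the represented value is (-1)**S * 2**(E-15) * (1+M).

When E is all 0s, the represented value is (-1)**S * 2**(-14) * M.

When E is all 1s: if M is all 0s, the value is ±inf (depending on the sign bit); if M is not all 0s, the value is nan.

For example, S=0, E=15, M=2**(-1)+2**(-2) represents 1.75.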

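ii. float32 has 32 bits: 1 sign bit (S), 8 exponent bits (E), and 23 mantissa bits (M).

When E is neither all 0s nor all 1s, the represented value is (-1)**S * 2**(E-127) * (1+M).

When E is all 0s, the represented value is (-1)**S * 2**(-126) * M.

When E is all 1s: if M is all 0s, the value is ±inf (depending on the sign bit); if M is not all 0s, the value is nan.

For example, S=0, E=127, M=2**(-1)+2**(-2) represents 1.75.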
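The S/E/M rules above can be checked on the host with a few lines of Python. This is a minimal sketch: decode_ieee is a hypothetical helper written for illustration and is not part of TIK.

```python
import math

def decode_ieee(bits, exp_width, man_width):
    """Decode a raw bit pattern with the S/E/M rules for float16/float32."""
    bias = (1 << (exp_width - 1)) - 1                      # 15 for float16, 127 for float32
    s = bits >> (exp_width + man_width)                    # sign bit S
    e = (bits >> man_width) & ((1 << exp_width) - 1)       # exponent field E
    m = (bits & ((1 << man_width) - 1)) / (1 << man_width) # fraction M in [0, 1)
    sign = -1.0 if s else 1.0
    if e == (1 << exp_width) - 1:                          # E all 1s: inf or nan
        return sign * math.inf if m == 0 else math.nan
    if e == 0:                                             # E all 0s: no implicit leading 1
        return sign * 2.0 ** (1 - bias) * m
    return sign * 2.0 ** (e - bias) * (1 + m)              # normal number

# S=0, E=15, M=0.75 as float16 and S=0, E=127, M=0.75 as float32
print(decode_ieee(0x3F00, 5, 10))
print(decode_ieee(0x3FE00000, 8, 23))
```

Both calls decode the 1.75 example given for the two formats.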

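vec_conv supports the following type conversion modes:

1. f322f32: rounds src to an integer according to round_mode and stores it in dst, still in float32 format.

Example: input 0.5

'round' outputs 0.0, 'floor' outputs 0.0, 'ceil' outputs 1.0, 'away-zero' outputs 1.0, 'to-zero' outputs 0.0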
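The integer rounding modes can be mimicked on the host to predict expected outputs. The sketch below is an assumption-laden model, not TIK code: round_to_int is a hypothetical helper, and it ignores saturation and very large magnitudes.

```python
import math

def round_to_int(x, mode):
    """Host-side model of the integer rounding modes."""
    if mode == "round":
        return round(x)                  # Python's round() is round-half-to-even
    if mode == "floor":
        return math.floor(x)             # toward negative infinity
    if mode in ("ceil", "ceiling"):
        return math.ceil(x)              # toward positive infinity
    if mode == "away-zero":
        # halves go away from zero
        return int(math.copysign(math.floor(abs(x) + 0.5), x))
    if mode == "to-zero":
        return math.trunc(x)             # toward zero
    raise ValueError(f"unsupported mode: {mode}")

modes = ["round", "floor", "ceil", "away-zero", "to-zero"]
for value in (0.5, -1.5, 1.75):
    print(value, [round_to_int(value, m) for m in modes])
```

The same helper can be reused to sanity-check the float-to-integer examples of the other conversion modes before saturation is applied.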

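2. f322f16: rounds src according to round_mode to a value representable in float16 and stores it in dst in float16 format (overflow saturates by default).

Example: input 0.5+2**(-12). In float32 form this is 2**(-1) * (1+2**(-11)), so E=-1+127=126 and M=2**(-11).

The float16 exponent field can represent 2**(-1), with E=-1+15=14,

but float16 has only 10 mantissa bits, so the mantissa bits beyond the 10th must be rounded off.

'round' yields mantissa 0000000000, E=14, M=0; the result is 0.5;

'floor' yields mantissa 0000000000, E=14, M=0; the result is 0.5;

'ceil' yields mantissa 0000000001, E=14, M=2**(-10); the result is 0.5+2**(-11);

'away-zero' yields mantissa 0000000001, E=14, M=2**(-10); the result is 0.5+2**(-11);

'to-zero' yields mantissa 0000000000, E=14, M=0; the result is 0.5;

'odd' yields mantissa 0000000001, E=14, M=2**(-10); the result is 0.5+2**(-11)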
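The 'round' result of this example can be cross-checked on the host. The sketch below relies on Python's struct module, whose "e" (half precision) format rounds to nearest even when packing; to_float16_nearest_even is a hypothetical helper, and it models only the 'round' mode without saturation.

```python
import struct

def to_float16_nearest_even(x):
    """Round a Python float to float16 (nearest even) and return it as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

tie = 0.5 + 2.0 ** -12                  # exactly halfway between two float16 values
above_tie = tie + 2.0 ** -20            # slightly more than halfway

print(to_float16_nearest_even(tie) == 0.5)
print(to_float16_nearest_even(above_tie) == 0.5 + 2.0 ** -11)
```

The first check reproduces the tie case from the example (the last mantissa bit is 0, so there is no carry); the second shows that any excess beyond the halfway point carries.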

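3. f322s64: rounds src to an integer according to round_mode and stores it in dst in int64 format (overflow saturates by default).

Example: input 2**22+0.5

'round' outputs 2**22, 'floor' outputs 2**22, 'ceil' outputs 2**22+1, 'away-zero' outputs 2**22+1, 'to-zero' outputs 2**22

4. f322s32: rounds src to an integer according to round_mode and stores it in dst in int32 format (overflow saturates by default).

Example: input 2**22+0.5

'round' outputs 2**22, 'floor' outputs 2**22, 'ceil' outputs 2**22+1, 'away-zero' outputs 2**22+1, 'to-zero' outputs 2**22

5. f322s16: rounds src to an integer according to round_mode and stores it in dst in int16 format (overflow saturates by default).

Example: input 2**22+0.5

'round' outputs 2**15-1, 'floor' outputs 2**15-1, 'ceil' outputs 2**15-1, 'away-zero' outputs 2**15-1, 'to-zero' outputs 2**15-1 (saturated on overflow)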
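Saturation on overflow simply clamps the rounded value to the range of the destination type. A minimal host-side sketch, assuming two's complement integer ranges (saturate is a hypothetical helper, not a TIK API):

```python
def saturate(value, bits, signed=True):
    """Clamp an integer to the range of a bits-wide signed or unsigned type."""
    low = -(1 << (bits - 1)) if signed else 0
    high = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    return max(low, min(high, value))

# 2**22 does not fit in int16, so every rounding mode ends at 2**15-1
print(saturate(2**22, 16) == 2**15 - 1)
print(saturate(2**22 + 1, 16) == 2**15 - 1)
```

Because the clamp is applied after rounding, all rounding modes give the same output once the input is far outside the destination range.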

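6. f162f32: stores src in dst in float32 format. No precision is lost, so no rounding mode applies.

Example: input 1.5-2**(-10), output 1.5-2**(-10)

7. f162s32: rounds src to an integer according to round_mode and stores it in dst in int32 format.

Example: input -1.5

'round' outputs -2, 'floor' outputs -2, 'ceil' outputs -1, 'away-zero' outputs -2, 'to-zero' outputs -1

8. f162s16: rounds src to an integer according to round_mode and stores it in dst in int16 format (overflow saturates by default).

Example: input 2**7-0.5

'round' outputs 2**7, 'floor' outputs 2**7-1, 'ceil' outputs 2**7, 'away-zero' outputs 2**7, 'to-zero' outputs 2**7-1

9. f162s8: rounds src to an integer according to round_mode and stores it in dst in int8 format (overflow saturates by default).

Example: input 2**7-0.5

'round' outputs 2**7-1, 'floor' outputs 2**7-1, 'ceil' outputs 2**7-1, 'away-zero' outputs 2**7-1, 'to-zero' outputs 2**7-1 (saturated on overflow)

10. f162u8: rounds src to an integer according to round_mode and stores it in dst in uint8 format (overflow saturates by default).

Example: input 1.75

'round' outputs 2, 'floor' outputs 1, 'ceil' outputs 2, 'away-zero' outputs 2, 'to-zero' outputs 1

11. u82f16: stores src in dst in float16 format. No precision is lost, so no rounding mode applies.

Example: input 1, output 1.0

12. s82f16: stores src in dst in float16 format. No precision is lost, so no rounding mode applies.

Example: input -1, output -1.0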

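13. s162f16: rounds src according to round_mode to a value representable in float16 and stores it in dst in float16 format.

Example: input 2**12+2. In float16 form this is 2**12 * (1+2**(-11)), which requires E=12+15=27 and M=2**(-11).

Because float16 has only 10 mantissa bits, the mantissa bits beyond the 10th must be rounded off.

'round' yields mantissa 0000000000, E=27, M=0; the result is 2**12;

'floor' yields mantissa 0000000000, E=27, M=0; the result is 2**12;

'ceil' yields mantissa 0000000001, E=27, M=2**(-10); the result is 2**12+4;

'away-zero' yields mantissa 0000000001, E=27, M=2**(-10); the result is 2**12+4;

'to-zero' yields mantissa 0000000000, E=27, M=0; the result is 2**12

14. s162f32: stores src in dst in float32 format. No precision is lost, so no rounding mode applies.

Example: input 2**15-1, output 2**15-1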

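15. s162s8: if deqscale is a Tensor, its first 16 elements are used, denoted {deq_factor[i], i=0,1,2,...,15}; if deqscale is an int, Scalar, or Expr, every deq_factor[i] equals deqscale. deq_factor[i][46] (bit 46) must be 1. deq_factor[i][31:13] (bits 31 to 13) is interpreted as a float (1 sign bit, 8 exponent bits, 10 mantissa bits), denoted scale[i]; deq_factor[i][45:37] (bits 45 to 37) is interpreted as an int9, denoted offset[i].

deqscale can also be passed as separate scale and offset values in a two-element list/tuple. The first element is scale[i]; it must be an int or float and must not exceed the range of a 19-bit float (1 sign bit, 8 exponent bits, 10 mantissa bits). For example, if scale is 0.5+2**(-12), the 10-bit mantissa cannot hold the value, so the excess bits are discarded (treated as all 0s) and the effective scale is 0.5.

The second element is offset[i]; it must be an int in the range [-256, 255].

Each step converts 16 src elements {src[j*16+i], i=0,1,2,...,15, j=0,1,2,...}: src[j*16+i]*scale[i] is computed (in float32), rounded to int9 in 'round' mode (saturating on overflow), offset is added to that intermediate result, and the sum is stored in dst[j*16+i] in int8 format (saturating on overflow).

Example: deqscale is a Tensor holding 16 uint64 values: 2**46+2**31+(127+i)*2**23 (scale[i]=-2**i, offset[i]=0),

input [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

output [-1,-2,-4,-8,-16,-32,-64,-128,-128,-128,-128,-128,-128,-128,-128,-128] (saturated on overflow)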

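16. s162u8: if deqscale is a Tensor, its first 16 elements are used, denoted {deq_factor[i], i=0,1,2,...,15}; if deqscale is an int, Scalar, or Expr, every deq_factor[i] equals deqscale. deq_factor[i][46] (bit 46) must be 0. deq_factor[i][31:13] (bits 31 to 13) is interpreted as a float (1 sign bit, 8 exponent bits, 10 mantissa bits), denoted scale[i]; deq_factor[i][45:37] (bits 45 to 37) is interpreted as an int9, denoted offset[i].

deqscale can also be passed as separate scale and offset values in a two-element list/tuple. The first element is scale[i]; it must be an int or float and must not exceed the range of a 19-bit float (1 sign bit, 8 exponent bits, 10 mantissa bits). The second element is offset[i]; it must be an int in the range [-256, 255].

Each step converts 16 src elements {src[j*16+i], i=0,1,2,...,15, j=0,1,2,...}: src[j*16+i]*scale[i] is computed (in float32), rounded to int9 in 'round' mode (saturating on overflow), offset is added to that intermediate result, and the sum is stored in dst[j*16+i] in uint8 format (saturating on overflow).

Example: deqscale is a Tensor holding 16 uint64 values: i*2**37+127*2**23 (scale[i]=1, offset[i]=i)

input [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

output [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]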
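The deq_factor bit layout and the scale/offset arithmetic of the s162s8 and s162u8 modes can be modeled on the host. This is a sketch under stated assumptions: all function names are hypothetical, the product is computed in Python's double precision rather than float32, and the model only mirrors the steps described in the text (scale, 'round' to int9 with saturation, add offset, saturate to the destination type).

```python
import struct

def split_deq_factor(deq):
    """Return (scale, offset, bit46) decoded from a uint64 deq_factor."""
    scale_bits = deq & 0xFFFFE000            # bits [31:13]; the low 13 bits are cleared
    scale = struct.unpack("<f", struct.pack("<I", scale_bits))[0]
    offset = (deq >> 37) & 0x1FF             # bits [45:37] read as int9
    if offset >= 256:
        offset -= 512
    return scale, offset, (deq >> 46) & 1

def effective_scale(scale):
    """Drop the mantissa bits that the 19-bit float cannot hold."""
    bits = struct.unpack("<I", struct.pack("<f", scale))[0] & 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def clamp(value, low, high):
    return max(low, min(high, value))

def deq_convert(src, deq_factors, dst_signed):
    """Model the int16 -> int8/uint8 conversion for a list of src values."""
    low, high = (-128, 127) if dst_signed else (0, 255)
    out = []
    for idx, value in enumerate(src):
        scale, offset, _ = split_deq_factor(deq_factors[idx % 16])
        scaled = clamp(round(value * scale), -256, 255)   # 'round' mode, int9 saturation
        out.append(clamp(scaled + offset, low, high))
    return out

s8_factors = [2**46 + 2**31 + (127 + i) * 2**23 for i in range(16)]
u8_factors = [i * 2**37 + 127 * 2**23 for i in range(16)]
print(deq_convert([1] * 16, s8_factors, dst_signed=True))
print(deq_convert([1] * 16, u8_factors, dst_signed=False))
print(effective_scale(0.5 + 2.0 ** -12))
```

The two lists printed reproduce the example outputs of the two modes, and the last line shows the scale 0.5+2**(-12) being truncated to 0.5.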

17. s322f32: rounds src according to round_mode to a value representable in float32 and stores it in dst in float32 format.

Example: input 2**25+3. In float32 form this is 2**25 * (1+2**(-24)+2**(-25)), which requires E=25+127=152 and M=2**(-24)+2**(-25).

Because float32 has only 23 mantissa bits, the mantissa bits beyond the 23rd must be rounded off.

'round' yields mantissa 00000000000000000000001, E=152, M=2**(-23); the result is 2**25+4;

'floor' yields mantissa 00000000000000000000000, E=152, M=0; the result is 2**25;

'ceil' yields mantissa 00000000000000000000001, E=152, M=2**(-23); the result is 2**25+4;

'away-zero' yields mantissa 00000000000000000000001, E=152, M=2**(-23); the result is 2**25+4;

'to-zero' yields mantissa 00000000000000000000000, E=152, M=0; the result is 2**25

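18. s322f16: rounds src*deqscale in 'round' mode to a value representable in float16 and stores it in dst in float16 format (overflow saturates by default).

Example: deqscale=3.0, input 2**10+1. The product is 2**11+2**10+3, which 'round' mode rounds to 2**11+2**10+4 (see the s162f16 conversion).

19. s322s64: stores src in dst in int64 format. No precision is lost, so no rounding mode applies.

Example: input 2**31-1, output 2**31-1

20. s322s16: stores src in dst in int16 format (overflow saturates by default). No rounding mode applies.

Example: input 2**31-1, output 2**15-1

21. s642s32: stores src in dst in int32 format (overflow saturates by default). No rounding mode applies.

Example: input 2**31, output 2**31-1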

22. s642f32: rounds src according to round_mode to a value representable in float32 and stores it in dst in float32 format.

Example: input 2**35+2**12+2**11. In float32 form this is 2**35 * (1+2**(-23)+2**(-24)), which requires E=35+127=162 and M=2**(-23)+2**(-24).

Because float32 has only 23 mantissa bits, the mantissa bits beyond the 23rd must be rounded off.

'round' yields mantissa 00000000000000000000010, E=162, M=2**(-22); the result is 2**35+2**13;

'floor' yields mantissa 00000000000000000000001, E=162, M=2**(-23); the result is 2**35+2**12;

'ceil' yields mantissa 00000000000000000000010, E=162, M=2**(-22); the result is 2**35+2**13;

'away-zero' yields mantissa 00000000000000000000010, E=162, M=2**(-22); the result is 2**35+2**13;

'to-zero' yields mantissa 00000000000000000000001, E=162, M=2**(-23); the result is 2**35+2**12
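The 'round' results of the integer-to-float32 examples (s322f32 and s642f32) can be cross-checked on the host. The sketch assumes that packing with struct's "f" format narrows a double to float32 with round-to-nearest-even, which is the default C conversion; to_float32_nearest_even is a hypothetical helper and covers only the 'round' mode.

```python
import struct

def to_float32_nearest_even(n):
    """Round an integer to the nearest float32 and return it as an integer."""
    packed = struct.pack("<f", float(n))     # double -> float32 narrowing
    return int(struct.unpack("<f", packed)[0])

# s322f32 example: the discarded bits are more than half a unit, so carry
print(to_float32_nearest_even(2**25 + 3) == 2**25 + 4)
# s642f32 example: exact tie and the last mantissa bit is 1, so carry
print(to_float32_nearest_even(2**35 + 2**12 + 2**11) == 2**35 + 2**13)
```

Both inputs are small enough to be exact in double precision, so the only rounding in this check is the narrowing to float32.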

Function Prototype

vec_conv(mask, round_mode, dst, src, repeat_times, dst_rep_stride, src_rep_stride, deqscale=None, ldst_high_half=False)

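Parameters

Table 1 Parameter description
Parameter	Input/Output	Description
mask	Input	See the description of the mask parameter in Table 1.
round_mode	Input	Rounding mode applied to the last retained bit during conversion. The following strings are supported: '' or 'none': 'round' mode when the conversion loses precision, no rounding when it does not; 'round': round half to even (C rint); 'floor': round toward negative infinity (C floor); 'ceil' or 'ceiling': round toward positive infinity (C ceil); 'away-zero': round half away from zero (C round); 'to-zero': round toward zero (C trunc); 'odd': round to nearest odd (Von Neumann rounding)
dst	Output	Destination operand. The Tensor scope is Unified Buffer.
src	Input	Source operand. The Tensor scope is Unified Buffer.
repeat_times	Input	Number of repeat iterations.
dst_rep_stride	Input	Address stride of the destination operand between the same block in adjacent iterations.
src_rep_stride	Input	Address stride of the source operand between the same block in adjacent iterations.
deqscale	Input	Quantization scale, an auxiliary conversion parameter. The default value is None.
ldst_high_half	Input	Specifies whether dst_list/src_list is stored to or read from the high half or the low half of each block. The default value is False. The type is bool; True/False selects the high/low half. Note: the meaning of this parameter depends on the conversion combination; it refers to either storing dst_list or reading src_list. Atlas 200/300/500 inference products do not support this parameter. Atlas training series products do not support this parameter.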

Table 2 round mode description for Atlas 200/300/500 inference products
src.dtype	dst.dtype	round_mode supported	deqscale
float16	int32	'round', 'floor', 'ceil', 'ceiling'	None
float16	float32	'', 'none'	None
float32	float16	'', 'none'	None
float16	int8	'', 'none'	None
float16	uint8	'', 'none'	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)
uint8	float16	'', 'none'	None
int8	float16	'', 'none'	None

Table 3 round mode description for Atlas training series products
src.dtype	dst.dtype	round_mode supported	deqscale
float16	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None
float32	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None
int32	float32	'', 'none'	None
float16	float32	'', 'none'	None
float32	float16	'', 'none', 'odd'	None
float16	int8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None
float16	uint8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)
uint8	float16	'', 'none'	None
int8	float16	'', 'none'	None

Table 4 round mode description for Atlas inference series products AI Core
src.dtype	dst.dtype	round_mode supported	deqscale	ldst_high_half
float16	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int16	'round'	None	None
int32	float32	'', 'none'	None	None
float16	float32	'', 'none'	None	None
float16	int8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	uint8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)	None
uint8	float16	'', 'none'	None	None
int8	float16	'', 'none'	None	None
float32	float16	'', 'none', 'odd'	None	None
int16	uint8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int16	int8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int16	float16	'', 'none'	None	None

Table 5 round mode description for Atlas inference series products Vector Core
src.dtype	dst.dtype	round_mode supported	deqscale	ldst_high_half
float16	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int16	'round'	None	None
float32	int16	'round', 'to-zero'	None	None
int32	float32	'', 'none'	None	None
float16	float32	'', 'none'	None	None
float16	int8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	uint8	'', 'none', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)	None
uint8	float16	'', 'none'	None	None
int8	float16	'', 'none'	None	None
float32	float16	'', 'none', 'odd'	None	None
int16	uint8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int16	int8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int16	float16	'', 'none'	None	None
int16	float32	'', 'none'	None	None

Table 6 round mode description for Atlas A2 training series products/Atlas 800I A2 inference products
src.dtype	dst.dtype	round_mode supported	deqscale	ldst_high_half
float32	float32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	float16	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero', 'odd'	None	None
float32	int64	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int16	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	float32	'', 'none'	None	None
float16	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int16	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int8	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	uint8	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
uint8	float16	'', 'none'	None	None
int8	float16	'', 'none'	None	None
int16	float16	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int16	float32	'', 'none'	None	None
int16	int8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int16	uint8	'', 'none'	immediate(int)/Tensor(uint64)/Scalar(uint64)/Expr(uint64)	True/False
int32	float32	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)	None
int32	int64	'', 'none'	None	None
int32	int16	'', 'none'	None	None
int64	int32	'', 'none'	None	None
int64	float32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None

Table 7 round mode description for Atlas 200/500 A2 inference products
src.dtype	dst.dtype	round_mode supported	deqscale	ldst_high_half
float32	float32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	float16	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero', 'odd'	None	None
float32	int64	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float32	int16	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	float32	'', 'none'	None	None
float16	int32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int16	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	int8	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
float16	uint8	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
uint8	float16	'', 'none'	None	None
int8	float16	'', 'none'	None	None
int16	float16	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int16	float32	'', 'none'	None	None
int16	uint8	'', 'none'	None	None
int32	float32	'', 'none', 'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None
int32	float16	'', 'none'	Scalar(float16)/immediate(float)	None
int32	int64	'', 'none'	None	None
int32	int16	'', 'none'	None	None
int64	int32	'', 'none'	None	None
int64	float32	'round', 'floor', 'ceil', 'ceiling', 'away-zero', 'to-zero'	None	None

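Returns

None

Supported Models

Atlas 200/300/500 inference products

Atlas training series products

Atlas inference series products AI Core

Atlas inference series products Vector Core

Atlas A2 training series products/Atlas 800I A2 inference products

Atlas 200/500 A2 inference products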

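Notes

repeat_times∈[0,255]. Supported data types: Scalar(int16/int32/int64/uint16/uint32/uint64), immediate(int), Expr(int16/int32/int64/uint16/uint32/uint64). When repeat_times is an immediate, 0 is not supported.
The parallelism of each repeat depends on the data precision and the chip version. For example, an f32->f16 conversion processes 64 source/destination elements per iteration.
dst_rep_stride/src_rep_stride are in units of 32B. Supported data types: Scalar(int16/int32/int64/uint16/uint32/uint64), immediate(int), Expr(int16/int32/int64/uint16/uint32/uint64).
The data types supported for dst/src depend on the chip version. If a type is not supported, the tool reports an error.
dst and src must be different tensors, or the same element of the same tensor. Different elements of the same tensor are not supported.
When src is float32 and dst is float32, the rounding mode rounds to an integer (the result is still float32). In all other cases it rounds to a value representable in the dst dtype.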
On Atlas inference series products AI Core, Atlas inference series products Vector Core, Atlas A2 training series products/Atlas 800I A2 inference products, and Atlas 200/500 A2 inference products: when converting from int16 to uint8/int8, each block processes 16 elements, and the ldst_high_half parameter is supported to select whether the 16 result elements are stored in the first half or the second half of each dst block. For example, with ldst_high_half=True the results are written to the second half of each dst block. For these conversions the deqscale (uint64) parameter is also supported; its bit [46] indicates whether the result is signed. When src.dtype=int16 and dst.dtype=int8, deqscale[46] must be 0b1. When src.dtype=int16 and dst.dtype=uint8, deqscale[46] must be 0b0. Otherwise the result will be incorrect.
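To save address space, you can define one Tensor that is used by both the source and destination operands (that is, with overlapping addresses). The constraints are as follows:
- For a single repeat (repeat_times=1), the source and destination operands must overlap 100%. Partial overlap is not supported.
- For multiple repeats (repeat_times>1), address overlap is not supported if there is a dependency between the source and destination operands, that is, if the destination operand of iteration N is the source operand of iteration N+1.
For the address offset alignment requirements of operands, see the general constraints.
Rounding in binary is similar to rounding in decimal, as follows:
- 'round': if the first bit of the part to be rounded off is 0, no carry; if the first bit is 1 and the following bits are not all 0, carry;
  if the first bit is 1 and the following bits are all 0, no carry when the last bit of M is 0, and carry when the last bit of M is 1.
- 'floor': if S is 0, no carry; if S is 1, no carry when the part to be rounded off is all 0, otherwise carry.
- 'ceil'/'ceiling': if S is 1, no carry; if S is 0, no carry when the part to be rounded off is all 0, otherwise carry.
- 'away-zero': if the first bit of the part to be rounded off is 0, no carry; otherwise carry.
- 'to-zero': never carry.
- 'odd': if the part to be rounded off is all 0, no carry;
  if the part to be rounded off is not all 0, no carry when the last bit of M is 1, and carry when the last bit of M is 0.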
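The carry rules above can be written down directly with exact rational arithmetic. The sketch below is a host-side model, not TIK code: round_to_mantissa is a hypothetical helper, it handles only nonzero values, and it ignores overflow, saturation and subnormal results.

```python
from fractions import Fraction

def round_to_mantissa(value, man_bits, mode):
    """Round a nonzero value to a binary float with man_bits mantissa bits."""
    x = Fraction(value)
    negative = x < 0                          # sign bit S is 1
    mag = abs(x)
    # Exponent e such that 2**e <= mag < 2**(e+1)
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - man_bits)       # weight of the last bit of M
    kept = mag // ulp                         # implicit 1 plus M, as an integer
    rest = mag - kept * ulp                   # the part to be rounded off
    half = ulp / 2                            # value of "first bit 1, rest 0"
    last_bit_is_one = kept % 2 == 1
    if mode == "round":
        carry = rest > half or (rest == half and last_bit_is_one)
    elif mode == "floor":
        carry = negative and rest != 0
    elif mode in ("ceil", "ceiling"):
        carry = (not negative) and rest != 0
    elif mode == "away-zero":
        carry = rest >= half
    elif mode == "to-zero":
        carry = False
    elif mode == "odd":
        carry = rest != 0 and not last_bit_is_one
    else:
        raise ValueError(f"unsupported mode: {mode}")
    result = (kept + carry) * ulp
    return -result if negative else result

# int64 -> float32 example: 2**35 + 2**12 + 2**11 with 23 mantissa bits
for mode in ("round", "floor", "ceil", "away-zero", "to-zero"):
    print(mode, round_to_mantissa(2**35 + 2**12 + 2**11, 23, mode) - 2**35)
```

Passing man_bits=10 models float16 targets and man_bits=23 models float32 targets; Fraction keeps every intermediate value exact, so ties are detected reliably.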

Examples

Example 1

from tbe import tik

tik_instance = tik.Tik()
dtype_size = {
    "int8": 1,
    "uint8": 1,
    "int16": 2,
    "uint16": 2,
    "float16": 2,
    "int32": 4,
    "uint32": 4,
    "float32": 4,
    "int64": 8,
}

src_shape = (2, 128)
dst_shape = (3, 64)
src_dtype = "float16"
dst_dtype = "int32"
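# Number of elements
elements = 2 * 128
dst_elements = 3 * 64
# Number of elements processed per iteration. This example processes 64 elements per iteration; in bitwise mask mode this can be written as [0, 2**64-1]
mask = 64
# rep_stride is the address stride of an operand between adjacent iterations. Here dst_rep_stride is 16, so the second iteration starts 16 blocks after the start of the first
dst_rep_stride = 16
src_rep_stride = 8
# Number of iterations. This example runs 2 iterations; adjust as needed
repeat_times = 2
# This example uses 'round' (round half to even)
round_mode = "round"
# Whether results go to the high half or the low half of dst; False selects the low half
ldst_high_half = False
deqscale = None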

src_gm = tik_instance.Tensor(src_dtype, src_shape, name="src_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor(dst_dtype, dst_shape, name="dst_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor(src_dtype, src_shape, name="src_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor(dst_dtype, dst_shape, name="dst_ub", scope=tik.scope_ubuf)
# Number of bursts to move
nburst = 1
# Length of each burst, in units of 32B
src_burst = elements * dtype_size[src_dtype] // 32 // nburst
dst_burst = dst_elements * dtype_size[dst_dtype] // 32 // nburst
# Gap between the tail of one burst and the head of the next, in units of 32B
dst_stride, src_stride = 0, 0
# Copy the user input data to the src UB tensor

tik_instance.data_move(src_ub, src_gm, 0, nburst, src_burst, dst_stride, src_stride)
# Zero the destination operand so the result is easier to inspect
tik_instance.vec_dup(64, dst_ub, 0, 3, 8)

# Precision conversion with vec_conv
tik_instance.vec_conv(mask, round_mode, dst_ub, src_ub, repeat_times, dst_rep_stride, src_rep_stride, deqscale=deqscale,
                      ldst_high_half=ldst_high_half)

# Move the data from UB to GM
tik_instance.data_move(dst_gm, dst_ub, 0, nburst, dst_burst, dst_stride, src_stride)
tik_instance.BuildCCE(kernel_name="vec_conv", inputs=[src_gm], outputs=[dst_gm])

Result:
Input src_gm:
[[7.996   7.875   5.14    2.266   4.844   7.492   1.845   7.492   6.824
  3.223   0.809   2.033   2.773   0.2542  7.59    4.992   2.473   3.47
  2.85    4.35    6.39    3.168   6.715   2.11    6.94    6.98    4.59
  2.883   8.21    1.8125  3.447   0.0353  5.055   1.697   8.836   1.68
  3.29    5.965   0.3535  5.6     7.977   7.902   7.56    1.571   4.504
  7.863   5.492   1.106   3.969   1.315   1.896   6.61    0.281   2.482
  5.49    4.06    3.652   6.3     3.916   8.77    2.838   6.023   4.63
  8.15    8.266   4.523   0.10114 5.04    2.479   0.5713  2.324   3.986
  6.957   0.208   2.807   8.945   2.559   1.896   2.299   5.566   2.498
  8.      8.516   2.432   4.52    5.77    2.465   2.684   4.11    3.705
  7.332   1.713   3.768   6.94    8.24    7.836   5.492   8.64    6.36
  6.098   7.1     8.62    2.082   2.15    4.188   7.33    7.723   8.086
  8.945   2.754   7.617   1.895   5.69    3.176   8.18    4.617   8.42
  8.15    4.01    1.016   4.004   7.098   7.445   7.48    5.316   7.54
  5.44    5.098  ]
 [2.795   8.516   6.      4.758   1.311   4.703   7.86    0.8057  1.796
  2.908   3.363   0.916   6.      3.2     1.468   7.125   3.213   5.32
  1.127   1.906   7.285   4.29    6.438   8.7     2.652   5.426   7.19
  2.496   2.523   6.76    0.3948  3.908   7.367   1.133   8.06    7.277
  5.445   0.0669  3.072   0.2046  6.625   8.94    5.527   8.11    7.082
  1.025   6.566   0.7217  1.268   0.8843  1.702   3.65    2.445   0.782
  5.316   0.945   7.918   0.2131  4.844   7.598   6.695   0.562   3.53
  3.822   7.152   2.793   2.121   3.65    4.08    6.83    2.617   8.59
  5.168   8.06    7.598   7.082   7.742   3.01    5.758   3.236   2.225
  0.933   3.963   3.873   7.645   3.703   2.373   1.344   8.14    5.742
  8.16    1.834   1.135   6.457   8.03    8.305   5.695   1.066   1.298
  8.61    3.057   1.526   3.59    6.316   6.992   4.258   6.617   4.81
  5.6     6.297   4.066   6.234   5.4     4.69    4.105   8.54    4.617
  3.87    1.194   5.88    7.504   2.055   6.46    5.01    4.855   2.32
  2.232   2.617  ]]
Output dst_gm:
[[8 8 5 2 5 7 2 7 7 3 1 2 3 0 8 5 2 3 3 4 6 3 7 2 7 7 5 3 8 2 3 0 5 2 9 2
  3 6 0 6 8 8 8 2 5 8 5 1 4 1 2 7 0 2 5 4 4 6 4 9 3 6 5 8]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [3 9 6 5 1 5 8 1 2 3 3 1 6 3 1 7 3 5 1 2 7 4 6 9 3 5 7 2 3 7 0 4 7 1 8 7
  5 0 3 0 7 9 6 8 7 1 7 1 1 1 2 4 2 1 5 1 8 0 5 8 7 1 4 4]]
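The zero row in dst_gm follows from the repeat strides. The sketch below computes which element ranges each repeat touches, assuming 32-byte blocks and a contiguous mask starting at the head of each repeat; repeat_ranges is a hypothetical helper, not a TIK API.

```python
DTYPE_SIZE = {"float16": 2, "int32": 4}   # bytes per element
BLOCK_BYTES = 32                          # rep_stride is counted in 32B blocks

def repeat_ranges(dtype, mask, repeat_times, rep_stride):
    """Return the [start, stop) element range touched by each repeat."""
    per_block = BLOCK_BYTES // DTYPE_SIZE[dtype]
    ranges = []
    for r in range(repeat_times):
        start = r * rep_stride * per_block
        ranges.append((start, start + mask))
    return ranges

print(repeat_ranges("float16", 64, 2, 8))    # src: [(0, 64), (128, 192)]
print(repeat_ranges("int32", 64, 2, 16))     # dst: [(0, 64), (128, 192)]
```

With src_shape (2, 128) the two repeats read the first 64 elements of each src row, and with dst_shape (3, 64) they write rows 0 and 2 of dst, so row 1 keeps the zeros written by vec_dup.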

Example 2

"""This example shows the effect of ldst_high_half and deqscale."""
from tbe import tik

tik_instance = tik.Tik()
dtype_size = {
    "int8": 1,
    "uint8": 1,
    "int16": 2,
    "uint16": 2,
    "float16": 2,
    "int32": 4,
    "uint32": 4,
    "float32": 4,
    "int64": 8,
}

src_shape = (2, 128)
dst_shape = (3, 128)
src_dtype = "int16"
dst_dtype = "int8"
elements = 2 * 128
dst_elements = 3 * 128
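# Number of elements processed per iteration. This example processes 128 elements per iteration; in bitwise mask mode this can be written as [2**64-1, 2**64-1]
mask = 128
# Distance between the head of one repeat and the head of the next for the destination operand, in units of 32B. dst is int8 (32 elements per block); src is int16 (16 elements per block)
dst_rep_stride = 4
src_rep_stride = 8

# Number of iterations. This example runs 2 iterations; adjust as needed
repeat_times = 2

# This example uses 'none'
round_mode = "none"
# Whether results go to the high half or the low half of dst; True selects the high half
ldst_high_half = True
# Converting int16 to int8 requires bit 46 of deqscale to be 0b1
# (2 ** 46 - 1 sets bits 0 to 45; bit 46 on its own is 2 ** 46)
deqscale = 2 ** 46 - 1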

src_gm = tik_instance.Tensor(src_dtype, src_shape, name="src_gm", scope=tik.scope_gm)
src1_gm = tik_instance.Tensor(dst_dtype, dst_shape, name="src1_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor(dst_dtype, dst_shape, name="dst_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor(src_dtype, src_shape, name="src_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor(dst_dtype, dst_shape, name="dst_ub", scope=tik.scope_ubuf)
# Number of bursts to move
nburst = 1
# Length of each burst, in units of 32B
src_burst = elements * dtype_size[src_dtype] // 32 // nburst
dst_burst = dst_elements * dtype_size[dst_dtype] // 32 // nburst
# Gap between the tail of one burst and the head of the next, in units of 32B
dst_stride, src_stride = 0, 0
# Copy the user input data to the src UB tensor

tik_instance.data_move(src_ub, src_gm, 0, nburst, src_burst, dst_stride, src_stride)
# Zero the destination operand (src1_gm holds zeros) so the result is easier to inspect
tik_instance.data_move(dst_ub, src1_gm, 0, nburst, dst_burst, dst_stride, src_stride)

# Precision conversion with vec_conv
tik_instance.vec_conv(mask, round_mode, dst_ub, src_ub, repeat_times, dst_rep_stride, src_rep_stride, deqscale=deqscale,
                      ldst_high_half=ldst_high_half)

# Move the data from UB to GM
tik_instance.data_move(dst_gm, dst_ub, 0, nburst, dst_burst, dst_stride, src_stride)
tik_instance.BuildCCE(kernel_name="vec_conv", inputs=[src_gm, src1_gm], outputs=[dst_gm])

Result:
Input src_gm:
[[6 8 6 7 2 5 7 0 7 8 4 1 2 1 5 1 1 8 2 5 7 5 8 6 1 7 4 6 0 5 3 1 4 6 4 0
  0 1 4 3 0 2 2 3 3 0 3 6 6 3 5 7 2 3 1 0 8 5 5 4 7 6 3 7 3 6 8 3 3 1 4 1
  1 6 7 8 1 0 0 3 3 0 3 1 1 4 0 4 2 0 6 1 8 1 4 1 7 5 7 5 0 4 6 3 3 8 3 1
  2 1 8 5 1 4 5 6 3 1 6 2 2 1 8 4 0 6 1 5]
 [8 7 1 7 0 0 2 4 1 7 2 2 7 8 2 6 3 6 0 6 2 4 0 4 7 7 8 4 2 0 1 5 1 0 3 0
  1 6 2 6 2 5 0 3 0 2 1 7 7 8 7 0 0 4 3 4 5 6 2 6 1 5 2 1 6 7 0 1 4 2 0 1
  3 8 4 0 1 1 6 1 6 8 4 0 5 8 1 1 3 2 1 2 2 8 7 2 6 8 8 5 0 3 1 4 4 0 1 3
  0 5 3 7 8 7 4 8 1 3 4 5 7 4 3 6 5 4 8 2]]
Input src1_gm:
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
Output dst_gm:
[[ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1]
 [ 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0
   0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 -1 -1 -1 -1 -1 -1 -1 -1
  -1 -1 -1 -1 -1 -1 -1 -1]]
