vcgmax

Function

vcgmax instruction abstraction.

Calculates the maximum element of each block. There are eight blocks in total. Mixed addresses are not supported.
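As a rough illustration of the block-wise reduction described above, the following pure-Python sketch splits a flat list into eight contiguous blocks and takes the maximum of each. It does not use mskpp. The helper name block_max, the equal-sized contiguous block layout, and the 16-element block length are assumptions made for illustration only.

```python
def block_max(values, num_blocks=8):
    """Return the maximum of each of `num_blocks` equal, contiguous blocks.

    Hypothetical reference helper; it is not part of mskpp.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    if len(values) == 0 or len(values) % num_blocks != 0:
        raise ValueError("length must be a non-zero multiple of num_blocks")
    block_len = len(values) // num_blocks
    # Walk the input one block at a time and reduce each block with max().
    return [max(values[i:i + block_len])
            for i in range(0, len(values), block_len)]


# 128 values split into eight blocks of 16 elements (assumed sizes).
data = [float(i) for i in range(128)]
result = block_max(data)
print(result)
```

The result holds one value per block, so its length equals the number of blocks regardless of the input length.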

Prototype

class vcgmax(x, y, reduce_num)

Parameters

| Parameter | Input/Output | Data Type | Description |
| --- | --- | --- | --- |
| x | Input | Tensor variable | Input x-vector tensor. FP16 and FP32 are supported. |
| y | Output | Tensor variable | Output y-vector tensor. FP16 and FP32 are supported. |
| reduce_num | Input | int | Number of times that the last dimension is reduced. The value of this parameter does not affect the instruction performance. |

Constraints

The value of reduce_num cannot be 0.
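One way to catch a violation of this constraint early is to validate reduce_num before constructing the instruction. The sketch below is a hypothetical helper, not part of mskpp. It checks only what this page states: the value is an int and is not 0.

```python
def check_reduce_num(reduce_num):
    """Hypothetical pre-check for reduce_num; not part of mskpp."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(reduce_num, bool) or not isinstance(reduce_num, int):
        raise TypeError("reduce_num must be an int")
    if reduce_num == 0:
        raise ValueError("reduce_num cannot be 0")
    return reduce_num


reduce_num = check_reduce_num(16)
```

The validated value can then be passed as the third argument of vcgmax.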

Example

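from mskpp import vcgmax, Tensor
# Input and output tensors located in the Unified Buffer (UB).
ub_x, ub_y = Tensor("UB"), Tensor("UB")
# Source tensor located in global memory (GM).
gm_x = Tensor("GM")
reduce_num = 16  # must not be 0
# Load the input from GM into UB.
ub_x.load(gm_x)
# Construct the instruction, then call the instance to run it.
out = vcgmax(ub_x, ub_y, reduce_num)()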