vcgmax
Function
Abstraction of the vcgmax instruction.
Calculates the maximum element of each block. There are eight blocks in total. Mixed addresses are not supported.
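The per-block maximum described above can be illustrated in plain Python. This is only a conceptual sketch, not the msKPP implementation: the helper name `block_max` is hypothetical, and it assumes the eight blocks are contiguous and equally sized, which this page does not state.

```python
def block_max(values, num_blocks=8):
    """Return the maximum of each contiguous, equally sized block.

    Hypothetical helper for illustration only; it is not part of mskpp.
    Assumes the input splits evenly into `num_blocks` contiguous blocks.
    """
    if not values:
        raise ValueError("values must not be empty")
    if len(values) % num_blocks != 0:
        raise ValueError("length of values must be divisible by num_blocks")
    size = len(values) // num_blocks
    # Slice out each block and reduce it to its largest element.
    return [max(values[i * size:(i + 1) * size]) for i in range(num_blocks)]


# 16 elements split into 8 blocks of 2 elements each.
data = [float(i % 5) for i in range(16)]
result = block_max(data)
print(result)
```

The output holds one value per block, so its length is always eight under these assumptions, regardless of the input length.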
Prototype
```python
class vcgmax(x, y, reduce_num)
```
Parameters
| Parameter | Input/Output | Data Type | Description |
|---|---|---|---|
| x | Input | Tensor variable | Input x-vector tensor. FP16 and FP32 are supported. |
| y | Output | Tensor variable | Output y-vector tensor. FP16 and FP32 are supported. |
| reduce_num | Input | int | Number of times that the last dimension is reduced. The value of this parameter does not affect the instruction performance. |
Constraints
The value of reduce_num cannot be 0.
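A caller might guard against this constraint before constructing the instruction. The following is a minimal sketch; `check_reduce_num` is a hypothetical helper, not an msKPP API, and it checks only what this page states (an int that is not 0).

```python
def check_reduce_num(reduce_num):
    """Validate reduce_num against the documented constraint.

    Hypothetical helper for illustration; mskpp may perform its own checks.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(reduce_num, bool) or not isinstance(reduce_num, int):
        raise TypeError("reduce_num must be an int")
    if reduce_num == 0:
        raise ValueError("reduce_num cannot be 0")
    return reduce_num


reduce_num = check_reduce_num(16)
```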
Example
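```python
from mskpp import vcgmax, Tensor

# Source and destination tensors in the Unified Buffer (UB).
ub_x, ub_y = Tensor("UB"), Tensor("UB")
# Input tensor in Global Memory (GM).
gm_x = Tensor("GM")
reduce_num = 16
# Move the input from GM into UB before computing.
ub_x.load(gm_x)
out = vcgmax(ub_x, ub_y, reduce_num)()
```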