vaxpy

Function Description

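  • When src and dst have the same type (both half or both float), the following computation is performed in units of blocks (32 bytes each), with 8 blocks processed at a time.
    each_element_of([dst]) = a * each_element_of([src]) + each_element_of([dst])
  • When src and a are of type half and dst is of type float, only the low 4 blocks of src (64 half elements in total) are read; the high 4 blocks are ignored. Each of these elements is multiplied by a and added to the corresponding one of the 64 float elements (8 blocks) of dst.
    each_element_of([dst]) = a * each_element_of([src (64 half elements in the low 4 blocks)]) + each_element_of([dst])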
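The two cases above can be sketched as a host-side reference model for a single 8-block operation. This is only an illustration, not the intrinsic itself: the function names `axpy_f32_model` and `axpy_mixed_model` are hypothetical, blocks are assumed to be contiguous, and half values are stored in `float` as a stand-in, so half-precision rounding is not modeled.

```cpp
#include <array>
#include <cstddef>

// Hypothetical host-side model of one 8-block operation; not the vaxpy intrinsic.
constexpr std::size_t kBlockBytes = 32;   // one block is 32 bytes
constexpr std::size_t kBlocksPerOp = 8;   // 8 blocks are processed at a time
constexpr std::size_t kFloatsPerOp = kBlocksPerOp * kBlockBytes / sizeof(float); // 64
constexpr std::size_t kHalvesPerOp = kBlocksPerOp * kBlockBytes / 2;             // 128

// Same-type case (float shown): dst[i] = a * src[i] + dst[i] over all 8 blocks.
void axpy_f32_model(std::array<float, kFloatsPerOp>& dst,
                    const std::array<float, kFloatsPerOp>& src, float a) {
    for (std::size_t i = 0; i < kFloatsPerOp; ++i) {
        dst[i] = a * src[i] + dst[i];
    }
}

// Mixed case: src holds 128 half values (float used as a stand-in, an assumption).
// Only the low 4 blocks (first 64 elements) are read; the high 4 blocks are ignored.
void axpy_mixed_model(std::array<float, kFloatsPerOp>& dst,
                      const std::array<float, kHalvesPerOp>& src_half_values, float a) {
    for (std::size_t i = 0; i < kFloatsPerOp; ++i) {
        dst[i] = a * src_half_values[i] + dst[i];
    }
}
```

In the mixed case the element count is set by dst: 8 blocks of float hold 64 elements, which matches the 64 half elements in the low 4 blocks of src.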

Function Prototype

void vaxpy(__ubuf__ half *dst, __ubuf__ half *src, half a, uint8_t repeat, uint16_t dstBlockStride, uint16_t srcBlockStride, uint16_t dstRepeatStride, uint16_t srcRepeatStride); 
void vaxpy(__ubuf__ float *dst, __ubuf__ float *src, float a, uint8_t repeat, uint16_t dstBlockStride, uint16_t srcBlockStride, uint16_t dstRepeatStride, uint16_t srcRepeatStride);

void vaxpy(__ubuf__ float *dst, __ubuf__ half *src, half a, uint8_t repeat, uint16_t dstBlockStride, uint16_t srcBlockStride, uint16_t dstRepeatStride, uint16_t srcRepeatStride);
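This section does not define the repeat and stride parameters. The sketch below shows one plausible reading for the float/float overload, under the assumption that both block strides and repeat strides are counted in 32-byte blocks and that each repeat iteration processes 8 blocks. The function `vaxpy_f32_model` is a hypothetical host-side model using `std::vector` in place of `__ubuf__` memory; the addressing rule should be checked against the official parameter description before relying on it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of the float/float overload; not the intrinsic.
// Assumption: all strides are expressed in units of 32-byte blocks.
void vaxpy_f32_model(std::vector<float>& dst, const std::vector<float>& src, float a,
                     std::uint8_t repeat,
                     std::uint16_t dstBlockStride, std::uint16_t srcBlockStride,
                     std::uint16_t dstRepeatStride, std::uint16_t srcRepeatStride) {
    constexpr std::size_t kElemsPerBlock = 32 / sizeof(float); // 8 floats per block
    constexpr std::size_t kBlocksPerRepeat = 8;

    for (std::size_t r = 0; r < repeat; ++r) {
        for (std::size_t b = 0; b < kBlocksPerRepeat; ++b) {
            // Start element of block b in repeat r (assumed addressing rule).
            const std::size_t d0 = (r * dstRepeatStride + b * dstBlockStride) * kElemsPerBlock;
            const std::size_t s0 = (r * srcRepeatStride + b * srcBlockStride) * kElemsPerBlock;
            for (std::size_t e = 0; e < kElemsPerBlock; ++e) {
                // at() throws std::out_of_range if the strides step past the buffer.
                dst.at(d0 + e) = a * src.at(s0 + e) + dst.at(d0 + e);
            }
        }
    }
}
```

Under this assumption, a block stride of 1 with a repeat stride of 8 walks a contiguous buffer, so `repeat = 2` covers 128 consecutive float elements.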

Pipeline Type

PIPE_V