vec_adds

Description

Adds a scalar to each element of a vector: dst[i] = src[i] + scalar.

Prototype

vec_adds(mask, dst, src, scalar, repeat_times, dst_rep_stride, src_rep_stride, mask_mode="normal")
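
As a rough illustration of how the arguments interact, the pure-Python model below mimics the call on ordinary lists. It is a sketch under stated assumptions, not the TIK implementation: the name vec_adds_model and the elems_per_block parameter are hypothetical, mask is assumed to be an integer count of contiguous elements processed in each repeat, mask_mode is left out, and both repeat strides are assumed to be measured in 32-byte blocks (16 float16 elements per block).

```python
def vec_adds_model(mask, dst, src, scalar, repeat_times,
                   dst_rep_stride, src_rep_stride, elems_per_block=16):
    """Hypothetical host-side model of vec_adds on plain Python lists.

    Assumptions (not taken from this page): mask is an integer count of
    contiguous elements per repeat, and strides are in 32-byte blocks.
    """
    for rep in range(repeat_times):
        # Start offsets of this repeat, converted from blocks to elements.
        dst_base = rep * dst_rep_stride * elems_per_block
        src_base = rep * src_rep_stride * elems_per_block
        for i in range(mask):
            dst[dst_base + i] = src[src_base + i] + scalar
    return dst


# Same argument shape as vec_adds(128, dst, src, 2.0, 1, 8, 8).
src = [float(i) for i in range(128)]
dst = [0.0] * 128
vec_adds_model(128, dst, src, 2.0, 1, 8, 8)
```

With repeat_times=1 and mask=128, the model touches all 128 elements once, so the stride values do not affect the result in this case.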

Parameters

For details, see Parameters.

dst, src, and scalar must have the same data type. The supported data types are as follows:

Atlas 200/300/500 Inference Product: tensors of type float16 or float32

Atlas Training Series Product: tensors of type float16 or float32

Returns

None

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Restrictions

For details, see Restrictions.

Example

This example processes a small amount of data that fits in a single move, to illustrate how the API works. For more complex samples that process larger amounts of data, see Example.

from tbe import tik
tik_instance = tik.Tik()
src_gm = tik_instance.Tensor("float16", (128,), name="src_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor("float16", (128,), name="dst_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor("float16", (128,), name="src_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor("float16", (128,), name="dst_ub", scope=tik.scope_ubuf)
# Move the user input from the Global Memory to the Unified Buffer.
tik_instance.data_move(src_ub, src_gm, 0, 1, 8, 0, 0)
# Define a Scalar and assign 2.0 to it as its initial value.
scalar = tik_instance.Scalar(dtype="float16", init_value=2.0)
# Add the scalar to each of the 128 elements in a single repeat.
tik_instance.vec_adds(128, dst_ub, src_ub, scalar, 1, 8, 8)
# Move the compute result from the Unified Buffer to the Global Memory.
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 8, 0, 0)

tik_instance.BuildCCE(kernel_name="vec_adds", inputs=[src_gm], outputs=[dst_gm])
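
The burst length of 8 passed to data_move follows from the tensor size, assuming the burst length is counted in 32-byte blocks: 128 float16 elements occupy 256 bytes, or 8 blocks. A minimal sketch of that arithmetic, where burst_blocks is a hypothetical helper and not part of TIK:

```python
BLOCK_BYTES = 32  # assumed unit of the data_move burst length
DTYPE_BYTES = {"float16": 2, "float32": 4}


def burst_blocks(num_elems, dtype):
    """Hypothetical helper: number of 32-byte blocks needed for num_elems."""
    total_bytes = num_elems * DTYPE_BYTES[dtype]
    # Round up so a partially filled block is still moved.
    return (total_bytes + BLOCK_BYTES - 1) // BLOCK_BYTES


print(burst_blocks(128, "float16"))  # 8
```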

Result example:

Input (src_gm):
[ -4.5     -8.7      7.16    -3.89     5.47     5.992    1.921   -2.672
   6.2     -7.695   -9.86     0.8354  -4.52     9.03     0.689    1.7
  -7.03    -0.658   -6.13    -7.246   -7.86    -2.34     1.379   -6.49
   5.125   -9.73     8.12    -5.062   -7.562    9.1      4.184    6.93
  -2.678    4.344    3.904   -3.123    9.08     3.986    2.295   -4.234
  -5.605    9.52     8.08     3.727   -1.413   -2.062   -6.21    -5.676
   9.28    -7.13    -1.329   -1.236    3.137   -0.5293   4.96     5.332
  -1.962    4.133   -6.617    3.383    4.06    -4.16     1.146   -7.043
  -4.773    3.049    0.757    9.5     -2.018    3.41    -6.316    7.37
   9.93    -5.133   -5.305    8.59    -5.727   -1.143   -2.2      4.766
   3.695   -7.438    7.645   -0.508   -2.752    3.838    2.135    2.64
  -8.82    -4.75    -4.89     2.37    -1.686   -1.867    8.89    -0.6562
   3.115   -6.953    2.307    4.94    -7.63    -8.086    8.82     0.1056
  -0.3682  -3.342    5.77     6.016   -3.295    9.79    -2.889   -1.579
  -2.092   -3.066    9.91   -10.      -6.516    5.176   -2.08    -5.04
   9.75    -3.018    4.105    6.77    -1.656   -6.324   -8.31     1.606 ]

Output (dst_gm):
[-2.5     -6.703    9.16    -1.891    7.47     7.992    3.922   -0.672
  8.2     -5.695   -7.86     2.836   -2.52    11.03     2.69     3.7
 -5.03     1.342   -4.13    -5.246   -5.86    -0.3398   3.379   -4.49
  7.125   -7.727   10.12    -3.062   -5.562   11.1      6.184    8.93
 -0.6777   6.344    5.906   -1.123   11.08     5.984    4.297   -2.234
 -3.605   11.52    10.08     5.727    0.587   -0.0625  -4.21    -3.676
 11.28    -5.13     0.671    0.7637   5.137    1.471    6.96     7.332
  0.0381   6.133   -4.617    5.383    6.06    -2.16     3.146   -5.043
 -2.773    5.047    2.758   11.5     -0.01758  5.41    -4.316    9.375
 11.93    -3.133   -3.305   10.59    -3.727    0.8574  -0.1992   6.766
  5.695   -5.438    9.64     1.492   -0.752    5.836    4.133    4.64
 -6.82    -2.75    -2.89     4.367    0.3145   0.1328  10.89     1.344
  5.117   -4.953    4.305    6.94    -5.63    -6.086   10.82     2.105
  1.632   -1.342    7.77     8.016   -1.295   11.79    -0.8887   0.421
 -0.0918  -1.066   11.91    -8.      -4.516    7.176   -0.0801  -3.04
 11.75    -1.018    6.105    8.766    0.3438  -4.324   -6.312    3.605  ]
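
The listed values can be spot-checked on the host. The sketch below rounds to float16 with the standard struct module and compares the first row of the input, plus 2.0, against the first row of the output. The helper to_fp16 is hypothetical, and the check assumes round-to-nearest-even float16 arithmetic, which may not match the device in every case.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# First row of src_gm and dst_gm as printed above.
src_row = [-4.5, -8.7, 7.16, -3.89, 5.47, 5.992, 1.921, -2.672]
dst_row = [-2.5, -6.703, 9.16, -1.891, 7.47, 7.992, 3.922, -0.672]

scalar = to_fp16(2.0)
# Round the input, add the scalar, then round the sum back to float16.
computed = [to_fp16(to_fp16(x) + scalar) for x in src_row]
expected = [to_fp16(y) for y in dst_row]
# Indices where the host-side sum differs from the printed output.
mismatches = [i for i, (c, e) in enumerate(zip(computed, expected)) if c != e]
```

An empty mismatches list means the printed output row is consistent with adding 2.0 in float16.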