vec_rec
Description
Computes the reciprocal element-wise: dst_i = 1 / src_i
Prototype
vec_rec(mask, dst, src, repeat_times, dst_rep_stride, src_rep_stride)
Parameters
For details, see Parameters.
dst and src must have the same data type.
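The interaction of mask, repeat_times, and the two repeat strides can be sketched with a host-side model in plain Python. This is not TIK code and not the library's implementation. It assumes the conventions that the examples on this page suggest: an integer mask selects the first mask elements of each repeat, and each repeat stride is counted in 32-byte blocks. The function name vec_rec_reference and the dtype_bytes argument are hypothetical helpers introduced for illustration.

```python
BLOCK_BYTES = 32  # assumed block size; repeat strides are counted in blocks


def vec_rec_reference(mask, dst, src, repeat_times,
                      dst_rep_stride, src_rep_stride, dtype_bytes):
    """Host-side model of vec_rec for an integer mask (hypothetical helper)."""
    elems_per_block = BLOCK_BYTES // dtype_bytes
    for r in range(repeat_times):
        # Start offset of this repeat, in elements.
        dst_base = r * dst_rep_stride * elems_per_block
        src_base = r * src_rep_stride * elems_per_block
        # An integer mask is modeled as "the first `mask` elements".
        for i in range(mask):
            dst[dst_base + i] = 1.0 / src[src_base + i]
    return dst


# Same arguments as the float32 example on this page: vec_rec(64, dst, src, 2, 8, 8).
src = [float(i + 1) for i in range(128)]
dst = [0.0] * 128
vec_rec_reference(64, dst, src, 2, 8, 8, dtype_bytes=4)
print(dst[0], dst[1], dst[127])  # 1.0 0.5 0.0078125
```

With a stride of 8 blocks and 4-byte elements, the second repeat starts at element 64, so two repeats of 64 elements cover all 128 elements without gaps or overlap.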
Returns
None
Applicability
Restrictions
- If an element of src is 0, the result for that element is unpredictable.
- For the Atlas 200/300/500 Inference Product, the compute result using this API fails to meet the dual-0.1% error limit (both the error ratio and relative error are within 0.1%) with float16 input, and fails to meet the dual-0.01% error limit with float32 input. For high accuracy requirements, the vec_rec_high_preci API is recommended.
- For the Atlas Training Series Product, the compute result using this API fails to meet the dual-0.1% error limit (both the error ratio and relative error are within 0.1%) with float16 input, and fails to meet the dual-0.01% error limit with float32 input. For high accuracy requirements, the vec_rec_high_preci API is recommended.
- For other restrictions, see Restrictions.
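One way to read the dual error limit is sketched below in plain Python. It is an interpretation, not an official definition. The assumption is that an element counts as erroneous when its relative error exceeds the tolerance, and that the limit is met when the share of erroneous elements is itself within the same tolerance. The function name meets_dual_limit and the test data are hypothetical.

```python
def meets_dual_limit(actual, expected, tol=1e-3):
    """Check a dual error limit under the interpretation assumed above.

    An element is erroneous if its relative error exceeds `tol`;
    the check passes if the fraction of erroneous elements is at most `tol`.
    """
    if len(actual) != len(expected):
        raise ValueError("length mismatch")
    bad = 0
    for a, e in zip(actual, expected):
        # Inputs are assumed non-zero, so every expected reciprocal is finite and non-zero.
        if abs(a - e) > tol * abs(e):
            bad += 1
    return bad <= tol * len(expected)


src = [0.5 + 0.01 * i for i in range(1000)]  # hypothetical non-zero inputs
expected = [1.0 / x for x in src]            # host-side reference reciprocals
skewed = [y * 1.002 for y in expected]       # simulated 0.2% relative error everywhere

print(meets_dual_limit(expected, expected))  # True
print(meets_dual_limit(skewed, expected))    # False
```

Passing tol=1e-4 gives the corresponding check for the dual-0.01% limit quoted for float32 input.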
Example 1
This example uses a small amount of data that can be moved in a single transfer, to help you understand how the API works. For more complex samples with a large amount of data, see Example.
from tbe import tik
# Define a container.
tik_instance = tik.Tik()
# Define the tensors.
src_gm = tik_instance.Tensor("float32", (128,), name="src_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor("float32", (128,), name="dst_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor("float32", (128,), name="src_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor("float32", (128,), name="dst_ub", scope=tik.scope_ubuf)
# Move data from the Global Memory to the Unified Buffer.
tik_instance.data_move(src_ub, src_gm, 0, 1, 128*4 // 32, 0, 0)
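# Compute the reciprocal: mask=64 covers 64 float32 elements per repeat, so 2 repeats cover all 128 elements.
tik_instance.vec_rec(64, dst_ub, src_ub, 2, 8, 8)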
# Move data from the Unified Buffer to the Global Memory.
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 128*4 // 32, 0, 0)
tik_instance.BuildCCE(kernel_name="vec_rec", inputs=[src_gm], outputs=[dst_gm])
Result example:
Input: [1.2017815 -8.758528 -3.9551935 ... -1.3599057 -2.319316]
Output: [0.83203125 -0.11401367 -0.2529297 ... -0.734375 -0.43164062]
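To get a feel for the accuracy restriction, the values printed in the float32 result above can be compared with reciprocals computed on the host. The plain-Python sketch below uses only the five values that are visible; the elided middle of the arrays is not reconstructed. Because the printed values are rounded for display, the relative errors it reports are approximate. The function name relative_errors is a hypothetical helper.

```python
# Leading and trailing values printed in the float32 result above.
inputs = [1.2017815, -8.758528, -3.9551935, -1.3599057, -2.319316]
outputs = [0.83203125, -0.11401367, -0.2529297, -0.734375, -0.43164062]


def relative_errors(src, dst):
    """Relative error of each dst value against the host-side reciprocal 1/src."""
    errors = []
    for x, y in zip(src, dst):
        exact = 1.0 / x
        errors.append(abs(y - exact) / abs(exact))
    return errors


errors = relative_errors(inputs, outputs)
for x, y, err in zip(inputs, outputs, errors):
    print(f"src={x:>11} dst={y:>12} rel_err={err:.2e}")
```

Several of these elements show a relative error above 0.01%, which is consistent with the recommendation to use vec_rec_high_preci when high accuracy is required.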
Example 2
This example uses a small amount of data that can be moved in a single transfer, to help you understand how the API works. For more complex samples with a large amount of data, see Example.
from tbe import tik
# Define a container.
tik_instance = tik.Tik()
# Define the tensors.
src_gm = tik_instance.Tensor("float16", (128,), name="src_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor("float16", (128,), name="dst_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor("float16", (128,), name="src_ub", scope=tik.scope_ubuf)
dst_ub = tik_instance.Tensor("float16", (128,), name="dst_ub", scope=tik.scope_ubuf)
# Move data from the Global Memory to the Unified Buffer.
tik_instance.data_move(src_ub, src_gm, 0, 1, 128*2 // 32, 0, 0)
# Compute the reciprocal: mask=128 covers all 128 float16 elements in a single repeat.
tik_instance.vec_rec(128, dst_ub, src_ub, 1, 8, 8)
# Move data from the Unified Buffer to the Global Memory.
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 128*2 // 32, 0, 0)
tik_instance.BuildCCE(kernel_name="vec_rec", inputs=[src_gm], outputs=[dst_gm])
Result example:
Input: [-7.152 -7.24 1.771 ... -1.339 4.473]
Output: [-0.1396 -0.1382 0.5645 ... -0.748 0.2231]
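The numeric arguments in the two examples follow the same arithmetic, which the plain-Python sketch below makes explicit. It assumes that one repeat processes 256 bytes and that data_move counts its burst length in 32-byte blocks; both assumptions match the values used in the examples but are not stated on this page. The helper name vec_rec_args is hypothetical, and the sketch only handles element counts that fill whole repeats.

```python
BYTES_PER_REPEAT = 256  # assumed: one repeat covers 8 blocks of 32 bytes
BLOCK_BYTES = 32        # assumed: unit of the data_move burst length
DTYPE_BYTES = {"float16": 2, "float32": 4}


def vec_rec_args(num_elems, dtype):
    """Derive mask, repeat_times, and data_move burst length (hypothetical helper)."""
    size = DTYPE_BYTES[dtype]
    mask = BYTES_PER_REPEAT // size  # elements handled per repeat
    if num_elems % mask != 0:
        raise ValueError("this sketch only handles whole repeats")
    repeat_times = num_elems // mask
    burst = num_elems * size // BLOCK_BYTES  # 32-byte blocks to move
    return {"mask": mask, "repeat_times": repeat_times, "burst": burst}


print(vec_rec_args(128, "float32"))  # {'mask': 64, 'repeat_times': 2, 'burst': 16}
print(vec_rec_args(128, "float16"))  # {'mask': 128, 'repeat_times': 1, 'burst': 8}
```

The results match the arguments in Example 1 (mask 64, 2 repeats, 128*4 // 32 blocks) and Example 2 (mask 128, 1 repeat, 128*2 // 32 blocks).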