ScaleAdd

Function Usage

Performs a scaled tensor addition: dst = src1 × scale + src2. The float16, float32, and uint8 data types are supported. Asynchronous calls are supported. In-place operation is not supported.
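To make the formula concrete, the following is a minimal host-side sketch of the same element-wise computation on flat float buffers. It is only a reference for the arithmetic, not the device implementation: ScaleAddReference is a hypothetical helper name and does not belong to the SDK, and the tensors are modeled as std::vector<float> for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical host-side reference (not an SDK function):
// computes dst[i] = src1[i] * scale + src2[i] for every element.
std::vector<float> ScaleAddReference(const std::vector<float> &src1, float scale,
                                     const std::vector<float> &src2)
{
    // Both inputs must hold the same number of elements.
    if (src1.size() != src2.size()) {
        throw std::invalid_argument("src1 and src2 must have the same size");
    }
    std::vector<float> dst(src1.size());
    for (std::size_t i = 0; i < src1.size(); ++i) {
        dst[i] = src1[i] * scale + src2[i];
    }
    return dst;
}
```

A reference like this can be useful for checking device results on small inputs, within a floating-point tolerance.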

This API is supported on the Atlas inference product and the Atlas 200I/500 A2 inference product.

For the Atlas 200I/500 A2 inference product, preloading is supported. (The attr attribute needs to be added during preloading. For details, see Example of the Preloading File of the Initialization Operator.)

The following conditions must be met:

  • The input and output tensors must be on the device or DVPP side, and the parameters (stream and data memory) must be on the same device.
  • For synchronous calls, the device where the data memory is located must be the same as the initialized device.
  • The caller must handle any out-of-range data (for example, results that fall outside the representable range of the data type).
  • The input and output tensors cannot exceed four dimensions, and their shapes and data types must match.
  • For the Atlas inference product, when the data type of the input tensor is float32 or float16 and the size is greater than 480p (640 x 480), or when the data type of the input tensor is uint8 and the size is greater than 1080p (1920 x 1080), the compute performance of ScaleAdd is better than that of cv::scaleAdd on the CPU.
  • For the Atlas 200I/500 A2 inference product, when the input size is greater than 720p (1280 x 720), the compute performance is better than that of cv::scaleAdd on the CPU.
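The two performance thresholds in the list above can be expressed as a small dispatch heuristic. This is a sketch under stated assumptions: Product, DType, and LikelyFasterOnDevice are hypothetical names that do not exist in the SDK, and "size greater than 480p/720p/1080p" is interpreted here as a comparison of total pixel count (width x height), which the text does not state explicitly.

```cpp
#include <cstdint>

// Hypothetical enums for illustration only; they are not SDK types.
enum class Product { AtlasInference, Atlas200I500A2 };
enum class DType { Float16, Float32, Uint8 };

// Returns true when the input exceeds the size above which ScaleAdd is
// described as outperforming cv::scaleAdd on the CPU.
// Assumption: "size" is compared as width * height.
bool LikelyFasterOnDevice(Product product, DType dtype,
                          std::uint32_t width, std::uint32_t height)
{
    const std::uint64_t pixels = static_cast<std::uint64_t>(width) * height;
    if (product == Product::Atlas200I500A2) {
        return pixels > 1280ULL * 720ULL;   // 720p threshold, any data type
    }
    if (dtype == DType::Uint8) {
        return pixels > 1920ULL * 1080ULL;  // 1080p threshold for uint8
    }
    return pixels > 640ULL * 480ULL;        // 480p threshold for float16/float32
}
```

An application might use such a check to choose between the device operator and a CPU path; actual crossover points should be measured on the target hardware.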

Prototype

APP_ERROR ScaleAdd(const Tensor &src1, float scale, const Tensor &src2, Tensor &dst, AscendStream& stream = AscendStream::DefaultStream());

Parameters

| Parameter | Input/Output | Description |
|-----------|--------------|-------------|
| src1 | Input | Tensor class, input tensor, supporting the float16, float32, and uint8 data types. The data memory must be on the device or DVPP side. |
| scale | Input | Scaling parameter, input scalar of the float type. |
| src2 | Input | Tensor class, input tensor, supporting the float16, float32, and uint8 data types. The data memory must be on the device or DVPP side. |
| dst | Output | Tensor class, output tensor, supporting the float16, float32, and uint8 data types. An empty tensor can be passed. If dst is not empty, the shape of dst must be the same as that of src1 or src2. Call Tensor.Malloc() to allocate memory in advance. The data memory must be on the device (the same device as that of src) or DVPP side. |
| stream | Input | AscendStream type. The default value is AscendStream::DefaultStream(). When the default value is used, the API call is synchronous. Otherwise, the API call is asynchronous. |
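The shape rules for src1, src2, and dst described above can be checked before the call. The following is a minimal sketch, assuming shapes are represented as plain vectors of dimension sizes and that an empty vector stands for an empty tensor. Shape and ShapesAreValid are hypothetical names introduced for illustration and are not part of the SDK; rejecting an empty input shape is also an assumption of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tensor shape; an empty vector models an empty tensor.
using Shape = std::vector<std::uint32_t>;

// Returns true when:
//   - src1 is non-empty and has at most four dimensions,
//   - src1 and src2 have identical shapes,
//   - dst is either empty or has the same shape as the inputs.
bool ShapesAreValid(const Shape &src1, const Shape &src2, const Shape &dst)
{
    constexpr std::size_t kMaxDims = 4;
    if (src1.empty() || src1.size() > kMaxDims) {
        return false;
    }
    if (src1 != src2) {
        return false;
    }
    return dst.empty() || dst == src1;
}
```

Data type matching and memory location are not covered by this check and would need to be validated separately.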

Response Parameters

| Data Structure | Description |
|----------------|-------------|
| APP_ERROR | For details about the returned error codes, see APP_ERROR Description. |