Overview
This section describes the operator information defined based on the Ascend IR. Before using an operator, read the following restrictions and other descriptions.
| Section | Description |
| --- | --- |
| General Restrictions | Describes the overall constraints of operators. |
| Format | Describes the formats listed in the operator specifications. |
| TensorType | Describes the tensor types listed in the operator specifications. |
| Type Promotion | Describes the rules for data type promotion: if the input tensor data types of some operators (such as Add and Mul) are different, the data type is automatically promoted during operator computation. |
| Deterministic Computing | Lists the operators involved in and supported by deterministic computing. |
General Restrictions
- For details about the implementation formulas of the DynamicRNN, DynamicGRUV2, and BNInference operators, click Link.
- The maximum value of the groups attribute of the following operators is 65535:
TransData, Deconvolution, Conv2D, Conv2DTranspose, Conv2DBackpropInput, Conv2DBackpropFilter, Conv3D, Conv3DTranspose, Conv3DBackpropInput, Conv3DBackpropFilter, DeformableConv2D, and Correlation.
- Constraints on the Sin, Cos, and Tan operators: due to hardware restrictions, the precision requirements are met only when inputs of the FLOAT32, BFLOAT16, FLOAT16, INT32, and INT64 data types are within [-65504, 65504]. Beyond this range, the precision cannot be ensured; in that case, use the CPU for computing.
- The Conv2D and DepthwiseConv2D operators support different data types for different product models.
Atlas 200/300/500 Inference Product:

| Tensor | x | filter | bias | y |
| --- | --- | --- | --- | --- |
| Data Type | float16 | float16 | float16 | float16 |
| Data Type | int8 | int8 | int32 | int32 |

Atlas Training Series Product:

| Tensor | x | filter | bias | y |
| --- | --- | --- | --- | --- |
| Data Type | float16 | float16 | float16 | float16 |
| Data Type | float16 | float16 | float32 | float32 |
| Data Type | int8 | int8 | int32 | int32 |
Format
- ND: any format; applicable to operators that take a single input, such as Square and Tanh.
- NC1HWC0: self-developed 5D data format. C0 is closely related to the micro-architecture and equals the Cube Unit size, for example, 16. C1 is obtained by dividing the C dimension by C0, that is, C1 = C/C0; when the division is not exact, the last data segment is padded to C0. (A minimal index-computation sketch follows this list.)
- FRACTAL_Z: a format of the convolution weight.
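To make the NC1HWC0 description concrete, the following C++ sketch (hypothetical helper names; C0 fixed to 16, the Cube Unit size) computes C1 with padding and maps an NCHW coordinate to its offset in an NC1HWC0 buffer.

```cpp
#include <cstddef>

constexpr std::size_t kC0 = 16;  // assumed Cube Unit size

// C1 = C / C0, rounded up so that a partial final segment is padded to C0.
constexpr std::size_t C1From(std::size_t c) { return (c + kC0 - 1) / kC0; }

// Offset of the NCHW element (n, c, h, w) inside an NC1HWC0 buffer of
// logical shape [N, C1, H, W, C0].
std::size_t Nc1hwc0Offset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                          std::size_t C, std::size_t H, std::size_t W) {
  const std::size_t c1 = c / kC0;  // which C0-sized channel block
  const std::size_t c0 = c % kC0;  // position inside that block
  return (((n * C1From(C) + c1) * H + h) * W + w) * kC0 + c0;
}
```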
TensorType
```cpp
struct TensorType {
  explicit TensorType(DataType dt);
  TensorType(const std::initializer_list<DataType> &types);

  // All supported data types, including resource and string.
  static TensorType ALL() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32,
                      DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8, DT_RESOURCE,
                      DT_STRING, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Quantized data types only.
  static TensorType QuantifiedType() { return TensorType{DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8}; }
  // Non-quantized data types (no resource or string).
  static TensorType OrdinaryType() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Numeric and quantized data types (no bool, resource, or string).
  static TensorType BasicType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8,
                      DT_QUINT16, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Numeric data types plus the most common quantized types.
  static TensorType NumberType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_QINT32, DT_QINT8, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Real (non-complex) numeric data types.
  static TensorType RealNumberType() {
    return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Complex data types.
  static TensorType ComplexDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64}; }
  // Signed and unsigned integer data types.
  static TensorType IntegerDataType() {
    return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Signed integer data types.
  static TensorType SignedDataType() { return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8}; }
  // Unsigned integer data types.
  static TensorType UnsignedDataType() { return TensorType{DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8}; }
  // Floating-point data types.
  static TensorType FloatingDataType() { return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }
  // Integer data types usable as indices.
  static TensorType IndexNumberType() { return TensorType{DT_INT32, DT_INT64}; }
  // Floating-point and complex data types.
  static TensorType UnaryDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }
  // Single- and half-precision floating point.
  static TensorType FLOAT() { return TensorType{DT_FLOAT, DT_FLOAT16}; }

  std::shared_ptr<TensorTypeImpl> tensor_type_impl_;
};
```
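For context, a TensorType is passed to the Ascend IR operator registration macros to declare which data types an input or output accepts. The sketch below is illustrative only, assuming the REG_OP macros from graph/operator_reg.h; the operator name MyAbs is hypothetical.

```cpp
#include "graph/operator_reg.h"

namespace ge {
// Illustrative registration: a unary operator whose input and output accept
// the floating-point types bundled by TensorType::FloatingDataType().
REG_OP(MyAbs)
    .INPUT(x, TensorType::FloatingDataType())
    .OUTPUT(y, TensorType::FloatingDataType())
    .OP_END_FACTORY_REG(MyAbs)
}  // namespace ge
```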
Type Promotion
If the input tensor data types of some operators (such as Add and Mul) are different, the data type is automatically promoted during operator computation. The following table lists the rules for data type promotion.
| Data Type | f32 | f16 | bf16 | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 | bool | c32 | c64 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f32 | f32 | f32 | f32 | f32 | f32 | f32 | × | f32 | × | f32 | × | f32 | c64 | c64 |
| f16 | f32 | f16 | f32 | f16 | f16 | f16 | × | f16 | × | f16 | × | f16 | c32 | c64 |
| bf16 | f32 | f32 | bf16 | bf16 | bf16 | bf16 | × | bf16 | × | bf16 | × | bf16 | c32 | c64 |
| s8 | f32 | f16 | bf16 | s8 | s16 | s16 | × | s32 | × | s64 | × | s8 | c32 | c64 |
| u8 | f32 | f16 | bf16 | s16 | u8 | s16 | × | s32 | × | s64 | × | u8 | c32 | c64 |
| s16 | f32 | f16 | bf16 | s16 | s16 | s16 | × | s32 | × | s64 | × | s16 | c32 | c64 |
| u16 | × | × | × | × | × | × | u16 | × | × | × | × | × | × | × |
| s32 | f32 | f16 | bf16 | s32 | s32 | s32 | × | s32 | × | s64 | × | s32 | c32 | c64 |
| u32 | × | × | × | × | × | × | × | × | u32 | × | × | × | × | × |
| s64 | f32 | f16 | bf16 | s64 | s64 | s64 | × | s64 | × | s64 | × | s64 | c32 | c64 |
| u64 | × | × | × | × | × | × | × | × | × | × | u64 | × | × | × |
| bool | f32 | f16 | bf16 | s8 | u8 | s16 | × | s32 | × | s64 | × | bool | c32 | c64 |
| c32 | c64 | c32 | c32 | c32 | c32 | c32 | × | c32 | × | c32 | × | c32 | c32 | c64 |
| c64 | c64 | c64 | c64 | c64 | c64 | c64 | × | c64 | × | c64 | × | c64 | c64 | c64 |
- For ease of description, the data types used in the table are abbreviated: DT_FLOAT (f32), DT_FLOAT16 (f16), DT_BF16 (bf16), DT_INT8 (s8), DT_UINT8 (u8), DT_INT16 (s16), DT_UINT16 (u16), DT_INT32 (s32), DT_UINT32 (u32), DT_INT64 (s64), DT_UINT64 (u64), DT_BOOL (bool), DT_COMPLEX64 (c32), and DT_COMPLEX128 (c64).
- Currently, the AI Core engine does not support promotion to the DT_DOUBLE and DT_COMPLEX128 types. For example, if the input data types of the Mul operator are float32 and double, float32 cannot be promoted to double on the AI Core engine, and the operator will be assigned to the AI CPU engine.
- The table heading and the leftmost column in the table indicate the two input data types to be deduced. The corresponding intersections in the table indicate the deduced data types.
- × indicates that the two data types cannot be deduced.
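Because the table is symmetric, the rules can be expressed as a small lookup. The following self-contained C++ sketch is illustrative only (not the framework's implementation) and encodes a subset of the rows above; the × cells map to std::nullopt.

```cpp
#include <optional>
#include <utility>

// Abbreviated data types from the table, ordered so that the symmetry swap
// below always puts the dominant class first.
enum class Dt { f32, f16, bf16, s8, u8, s16, s32, s64, Bool };

// Hypothetical lookup covering a subset of the promotion table. Pairs marked
// x in the table (for example, anything with u16/u32/u64) would map to
// std::nullopt in a full version.
std::optional<Dt> Promote(Dt a, Dt b) {
  if (a == b) return a;                // identical types: no promotion needed
  if (a > b) std::swap(a, b);          // the table is symmetric
  if (a == Dt::f32) return Dt::f32;    // f32 row: f32 wins over the rest
  if (a == Dt::f16)                    // f16 row: f16 x bf16 promotes to f32
    return b == Dt::bf16 ? Dt::f32 : Dt::f16;
  if (a == Dt::bf16) return Dt::bf16;  // bf16 row
  if (b == Dt::Bool) return a;         // bool row: bool adopts the other type
  if (a == Dt::s8 && b == Dt::u8) return Dt::s16;  // mixed-sign widening
  return b;                            // remaining integer pairs take the wider type
}
```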
Deterministic Computing
Asynchronous multi-threaded execution inside an operator can change the order in which floating-point values are accumulated, so running an operator multiple times on the same hardware with the same input may produce different results. When deterministic computing is enabled, repeated executions of an operator on the same hardware with the same input generate the same output.
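The root cause of the variation is that floating-point addition is not associative: accumulating the same values in a different order can round differently, as the short illustrative example below shows.

```cpp
#include <cstdio>

int main() {
  // The same values, accumulated in two different orders.
  float big = 1.0e8f, small = 1.0f;
  float left = (big + small) - big;   // small is absorbed by big: 0.0f
  float right = (big - big) + small;  // small survives: 1.0f
  // Different accumulation orders give different results, which is why an
  // operator whose threads reduce in a nondeterministic order can return
  // different outputs for identical inputs.
  std::printf("%g vs %g\n", left, right);
  return 0;
}
```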
If an operator in the following lists does not appear in the specification list for the current processor version, that version does not support the operator.
- The following operators involve but do not support deterministic computing:
- ResizeGradD
- WeightQuantBatchMatmulV2
- The following operators involve and support deterministic computing:
- AvgPool3DGrad
- BatchMatMul
- BatchMatMulV2
- BiasAddGrad
- BinaryCrossEntropy
- BN3DTrainingReduce
- BN3DTrainingUpdateGrad
- BNTrainingReduce
- BNTrainingUpdateGrad
- Conv2DBackpropFilter
- Conv3DBackpropFilter
- EmbeddingDenseGrad
- FusedMulAddNL2loss
- FullyConnection
- GroupNormGrad
- Histogram
- IndexPut
- IndexPutV2
- InplaceIndexAdd
- KLDiv
- LayerNormBetaGammaBackpropV2
- LayerNormGradV3
- LpNormReduceV2
- LpNormV2
- MatMul
- MatMulV2
- MseLoss
- NLLLoss
- ReduceMean
- ReduceMeanD
- ReduceSum
- ReduceSumD
- ScatterAdd
- ScatterElements
- ScatterNd
- ScatterNdAdd
- SquareSumV1
- UnsortedSegmentSum
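As a usage sketch, recent CANN versions expose deterministic computing as a context-level system parameter through AscendCL. The call below assumes the aclrtCtxSetSysParamOpt interface and the ACL_OPT_DETERMINISTIC option are available in the CANN version in use.

```cpp
#include "acl/acl.h"

// Illustrative only: enable deterministic computing for the current context.
// aclrtCtxSetSysParamOpt and ACL_OPT_DETERMINISTIC are assumed to exist in
// the installed CANN version; check its AscendCL reference before relying on them.
bool EnableDeterministic() {
  aclError ret = aclrtCtxSetSysParamOpt(ACL_OPT_DETERMINISTIC, 1);
  return ret == ACL_SUCCESS;
}
```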