Overview

This section describes the operators defined based on the Ascend IR. Before using an operator, read the following restrictions and descriptions.

General Restrictions

Describes the overall constraints of operators.

Format

Describes the formats listed in the operator specifications.

TensorType

Describes the tensor types listed in the operator specifications.

Type Promotion

Describes the rules for data type promotion, which automatically promotes the data type during operator computation when the input tensor data types of some operators (such as Add and Mul) differ.

Deterministic Computing

Lists the operators that involve and support deterministic computing.

General Restrictions

  • For details about the implementation formulas of the DynamicRNN, DynamicGRUV2, and BNInference operators, click Link.
  • The groups attribute of the following operators supports a maximum value of 65535:

    TransData, Deconvolution, Conv2D, Conv2DTranspose, Conv2DBackpropInput, Conv2dBackpropFilter, Conv3D, Conv3DTranspose, Conv3DBackpropInput, Conv3DBackpropFilter, DeformableConv2D, Correlation.

  • Constraints on the Sin, Cos, and Tan operators:

    Due to hardware restrictions, precision requirements are met only when inputs of the FLOAT32, BFLOAT16, FLOAT16, INT32, and INT64 data types fall within [-65504, 65504]. Beyond this range, precision cannot be guaranteed; in that case, use the CPU for computing.

  • The Conv2D and DepthwiseConv2D operators support different data types for different product models.
    • Atlas 200/300/500 Inference Product:

      | Tensor    | x       | filter  | bias    | y       |
      |-----------|---------|---------|---------|---------|
      | Data Type | float16 | float16 | float16 | float16 |
      |           | int8    | int8    | int32   | int32   |

    • Atlas Training Series Product:

      | Tensor    | x       | filter  | bias    | y       |
      |-----------|---------|---------|---------|---------|
      | Data Type | float16 | float16 | float16 | float16 |
      |           | float16 | float16 | float32 | float32 |
      |           | int8    | int8    | int32   | int32   |
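The input-range constraint on the Sin, Cos, and Tan operators above can be checked before dispatch. A minimal sketch, assuming the bound applies to the absolute value of each input element (the function name `needsCpuFallback` is illustrative, not part of the Ascend API):

```cpp
#include <cmath>

// Inputs outside [-65504, 65504] lose precision on the NPU for
// Sin/Cos/Tan, so such inputs should be routed to the CPU instead.
// 65504 is the largest finite FP16 value.
constexpr double kPrecisionBound = 65504.0;

bool needsCpuFallback(double x) {
    return std::abs(x) > kPrecisionBound;
}
```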

Format

  • ND: any format. Applicable to operators whose computation does not depend on a specific layout, such as Square and Tanh.
  • NC1HWC0: self-developed 5D data format. C0 is closely tied to the micro-architecture and equals the Cube Unit size, for example, 16. C1 is obtained by splitting the C dimension into C0-sized segments, that is, C1 = ceil(C/C0). When the division is not exact, the last data segment is padded to C0.
  • FRACTAL_Z: a format of the convolution weight.
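The C1 = ceil(C/C0) split described above can be written out directly. A minimal sketch mapping a logical NCHW coordinate to its linear offset in an NC1HWC0 buffer, assuming C0 = 16 and zero padding of the tail segment (the helper names are illustrative):

```cpp
#include <cstdint>

constexpr int64_t kC0 = 16;  // Cube Unit size; 16 in this sketch

// Number of C0-sized segments the C dimension is split into;
// the last segment is padded when C is not a multiple of C0.
int64_t c1Of(int64_t c) { return (c + kC0 - 1) / kC0; }

// Linear offset of element (n, c, h, w) in an NC1HWC0 buffer
// holding a tensor of logical shape [N, C, H, W].
int64_t nc1hwc0Offset(int64_t n, int64_t c, int64_t h, int64_t w,
                      int64_t C, int64_t H, int64_t W) {
    const int64_t c1 = c / kC0;  // which C0-sized segment
    const int64_t c0 = c % kC0;  // position inside the segment
    return (((n * c1Of(C) + c1) * H + h) * W + w) * kC0 + c0;
}
```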

TensorType

struct TensorType {
  explicit TensorType(DataType dt);
  TensorType(const std::initializer_list<DataType> &types);

  static TensorType ALL() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, 
                      DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8, DT_RESOURCE,
                      DT_STRING, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType QuantifiedType() { return TensorType{DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8}; }

  static TensorType OrdinaryType() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType BasicType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8,
                      DT_QUINT16, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType NumberType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_QINT32, DT_QINT8, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType RealNumberType() {
    return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType ComplexDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64}; }

  static TensorType IntegerDataType() {
    return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType SignedDataType() { return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8}; }

  static TensorType UnsignedDataType() { return TensorType{DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8}; }

  static TensorType FloatingDataType() { return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }

  static TensorType IndexNumberType() { return TensorType{DT_INT32, DT_INT64}; }

  static TensorType UnaryDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }

  static TensorType FLOAT() { return TensorType{DT_FLOAT, DT_FLOAT16}; }

  std::shared_ptr<TensorTypeImpl> tensor_type_impl_; 
};
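Each factory method above returns a named group of accepted data types. The idea can be modeled with a plain set; a simplified, self-contained sketch under that assumption (this toy TypeSet and its helpers are illustrative, not the actual TensorTypeImpl):

```cpp
#include <set>

// Small subset of the DataType enum, for illustration only.
enum DataType { DT_FLOAT, DT_FLOAT16, DT_DOUBLE, DT_INT32, DT_INT64 };

using TypeSet = std::set<DataType>;

// Mirrors the idea behind TensorType::IndexNumberType() and
// TensorType::FloatingDataType(): named groups of accepted types.
TypeSet IndexNumberType()  { return {DT_INT32, DT_INT64}; }
TypeSet FloatingDataType() { return {DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }

// True if the group accepts the given data type.
bool accepts(const TypeSet &ts, DataType dt) { return ts.count(dt) > 0; }
```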

Type Promotion

If the input tensor data types of some operators (such as Add and Mul) are different, the data type is automatically promoted during operator computation. The following table lists the rules for data type promotion.

| Data Type | f32 | f16 | bf16 | s8  | u8  | s16 | u16 | s32 | u32 | s64 | u64 | bool | c32 | c64 |
|-----------|-----|-----|------|-----|-----|-----|-----|-----|-----|-----|-----|------|-----|-----|
| f32       | f32 | f32 | f32  | f32 | f32 | f32 | ×   | f32 | ×   | f32 | ×   | f32  | c64 | c64 |
| f16       | f32 | f16 | f32  | f16 | f16 | f16 | ×   | f16 | ×   | f16 | ×   | f16  | c32 | c64 |
| bf16      | f32 | f32 | bf16 | bf16 | bf16 | bf16 | × | bf16 | ×  | bf16 | ×  | bf16 | c32 | c64 |
| s8        | f32 | f16 | bf16 | s8  | s16 | s16 | ×   | s32 | ×   | s64 | ×   | s8   | c32 | c64 |
| u8        | f32 | f16 | bf16 | s16 | u8  | s16 | ×   | s32 | ×   | s64 | ×   | u8   | c32 | c64 |
| s16       | f32 | f16 | bf16 | s16 | s16 | s16 | ×   | s32 | ×   | s64 | ×   | s16  | c32 | c64 |
| u16       | ×   | ×   | ×    | ×   | ×   | ×   | u16 | ×   | ×   | ×   | ×   | ×    | ×   | ×   |
| s32       | f32 | f16 | bf16 | s32 | s32 | s32 | ×   | s32 | ×   | s64 | ×   | s32  | c32 | c64 |
| u32       | ×   | ×   | ×    | ×   | ×   | ×   | ×   | ×   | u32 | ×   | ×   | ×    | ×   | ×   |
| s64       | f32 | f16 | bf16 | s64 | s64 | s64 | ×   | s64 | ×   | s64 | ×   | s64  | c32 | c64 |
| u64       | ×   | ×   | ×    | ×   | ×   | ×   | ×   | ×   | ×   | ×   | u64 | ×    | ×   | ×   |
| bool      | f32 | f16 | bf16 | s8  | u8  | s16 | ×   | s32 | ×   | s64 | ×   | bool | c32 | c64 |
| c32       | c64 | c32 | c32  | c32 | c32 | c32 | ×   | c32 | ×   | c32 | ×   | c32  | c32 | c64 |
| c64       | c64 | c64 | c64  | c64 | c64 | c64 | ×   | c64 | ×   | c64 | ×   | c64  | c64 | c64 |

  • For ease of description, the data types in the table are abbreviated: DT_FLOAT (f32), DT_FLOAT16 (f16), DT_BF16 (bf16), DT_INT8 (s8), DT_UINT8 (u8), DT_INT16 (s16), DT_UINT16 (u16), DT_INT32 (s32), DT_UINT32 (u32), DT_INT64 (s64), DT_UINT64 (u64), DT_BOOL (bool), DT_COMPLEX64 (c32), and DT_COMPLEX128 (c64).
  • Currently, the AI Core engine does not support promotion to the DT_DOUBLE and DT_COMPLEX128 types. For example, if the input data types of the Mul operator are float32 and double, float32 cannot be promoted to double on the AI Core engine, and the operator is instead assigned to the AI CPU engine.
  • The table heading and the leftmost column give the two input data types to be deduced; the intersection cell gives the deduced (promoted) data type.
  • × indicates that the two data types cannot be deduced.
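Because the table is symmetric, a few of its rows can be encoded as a lookup over sorted type pairs. A minimal sketch covering only a handful of entries from the table (illustrative only, not the runtime's implementation):

```cpp
#include <map>
#include <string>
#include <utility>

// Promotion results for a small subset of the table above.
// The table is symmetric, so each pair is stored in sorted order.
std::string promote(std::string a, std::string b) {
    if (a > b) std::swap(a, b);
    static const std::map<std::pair<std::string, std::string>, std::string> rules = {
        {{"f16", "f32"}, "f32"}, {{"f32", "s8"}, "f32"},
        {{"f16", "s8"},  "f16"}, {{"s8",  "u8"}, "s16"},
        {{"f32", "u8"},  "f32"}, {{"f16", "u8"}, "f16"},
    };
    auto it = rules.find({a, b});
    return it == rules.end() ? "x" : it->second;  // "x": not deducible here
}
```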

Deterministic Computing

Asynchronous multi-thread execution during operator implementation changes the accumulation order of floating-point numbers, so multiple executions of an operator with the same hardware and input may produce different results. When deterministic computing is enabled, multiple executions of an operator with the same hardware and input produce the same output.
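The effect described above is easy to reproduce: floating-point addition is not associative, so two accumulation orders over the same data can differ. A minimal sketch:

```cpp
// Floating-point addition is not associative, so the order in which
// partial sums are combined can change the result.
float sumForward(const float *v, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i) s += v[i];
    return s;
}

float sumBackward(const float *v, int n) {
    float s = 0.0f;
    for (int i = n - 1; i >= 0; --i) s += v[i];
    return s;
}
```

For example, with {1.0f, 1e8f, -1e8f} the forward sum absorbs the 1.0f into 1e8f and yields 0, while the backward sum cancels the large terms first and yields 1.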

If an operator listed in the following table does not appear in the operator specifications for the current processor version, that version does not support the operator.

  • The following operators involve but do not support deterministic computing:
    • resizegradD
    • WeightQuantBatchMatmulV2
  • The following operators involve and support deterministic computing:
    • AvgPool3DGrad
    • BatchMatMul
    • BatchMatMulV2
    • BiasAddGrad
    • BinaryCrossEntropy
    • BN3DTrainingReduce
    • BN3DTrainingUpdateGrad
    • BNTrainingReduce
    • BNTrainingUpdateGrad
    • Conv2DBackpropFilter
    • Conv3DBackpropFilter
    • EmbeddingDenseGrad
    • FusedMulAddNL2loss
    • FullyConnection
    • GroupNormGrad
    • Histogram
    • IndexPut
    • IndexPutV2
    • InplaceIndexAdd
    • KLDiv
    • LayerNormBetaGammaBackpropV2
    • LayerNormGradV3
    • LpNormReduceV2
    • LpNormV2
    • MseLoss
    • MatMul
    • MatMulV2
    • NLLLoss
    • ReduceMean
    • ReduceMeanD
    • ReduceSum
    • ReduceSumD
    • ScatterAdd
    • ScatterElements
    • ScatterNd
    • ScatterNdAdd
    • SquareSumV1
    • UnsortedSegmentSum