Specifications

This section describes the operator information defined in the Ascend IR. Read the following descriptions before using an operator.

Format	Describes the formats listed in the operator specifications.
TensorType	Describes the tensor types listed in the operator specifications.
Type Promotion	When the input tensors of certain operators (such as Add and Mul) have different data types, the data type is automatically promoted during operator computation. This part describes the promotion rules.
Deterministic Computing	Lists the operators involved in and supported by deterministic computing.

Format

ND: any format, applicable to operators that take a single input, such as Square and Tanh.
NC1HWC0: a self-developed 5D data format. C0 is closely tied to the micro-architecture and equals the Cube Unit size, for example, 16. C1 is obtained by dividing the C dimension by C0, that is, C1 = C/C0. When C is not exactly divisible by C0, the last data segment is padded to C0.
FRACTAL_Z: a format of the convolution weight.
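
The C1 calculation for NC1HWC0 can be sketched as a ceiling division. This is a minimal illustration, not code from the Ascend headers: the helper names ComputeC1 and ToNc1hwc0Shape are hypothetical, C0 = 16 is taken from the example value above, and the source layout is assumed to be NCHW.

```cpp
#include <array>
#include <cstdint>

// Assumption: C0 = 16, the example value given above. The real value
// depends on the Cube Unit size of the target hardware.
constexpr int64_t kC0 = 16;

// Number of C0-sized segments needed to hold c channels. Ceiling division
// models the padding of the last segment when c is not a multiple of c0.
constexpr int64_t ComputeC1(int64_t c, int64_t c0 = kC0) {
  return (c + c0 - 1) / c0;
}

// Maps an assumed NCHW shape {N, C, H, W} to the 5D shape {N, C1, H, W, C0}.
inline std::array<int64_t, 5> ToNc1hwc0Shape(const std::array<int64_t, 4> &nchw,
                                             int64_t c0 = kC0) {
  return {nchw[0], ComputeC1(nchw[1], c0), nchw[2], nchw[3], c0};
}
```

Under these assumptions, a tensor with C = 17 needs C1 = 2 segments, and the second segment carries 15 padded channels.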

TensorType

DT_COMPLEX32 is listed in OrdinaryType, BasicType, NumberType, ComplexDataType, and UnaryDataType, but currently only the StridedSlice, StridedSliceGrad, and AsStrided operators support this data type.

struct TensorType {
  explicit TensorType(DataType dt);

  TensorType(const std::initializer_list<DataType> &initial_types);

  static TensorType ALL() {
    return TensorType{DT_BOOL,   DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT,  DT_FLOAT16, DT_INT16,
                      DT_INT32,  DT_INT64,      DT_INT8,      DT_QINT16, DT_QINT32, DT_QINT8,   DT_QUINT16,
                      DT_QUINT8, DT_RESOURCE,   DT_STRING,    DT_UINT16, DT_UINT32, DT_UINT64,  DT_UINT8,
                      DT_BF16, DT_COMPLEX32};
  }

  static TensorType QuantifiedType() { return TensorType{DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8}; }

  static TensorType OrdinaryType() {
    return TensorType{DT_BOOL,  DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT,  DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64,      DT_INT8,      DT_UINT16, DT_UINT32, DT_UINT64,  DT_UINT8,
                      DT_BF16, DT_COMPLEX32};
  }

  static TensorType BasicType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT,  DT_FLOAT16, DT_INT16,
                      DT_INT32,      DT_INT64,     DT_INT8,   DT_QINT16, DT_QINT32,  DT_QINT8,
                      DT_QUINT16,    DT_QUINT8,    DT_UINT16, DT_UINT32, DT_UINT64,  DT_UINT8,
                      DT_BF16, DT_COMPLEX32};
  }

  static TensorType NumberType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT,  DT_FLOAT16, DT_INT16,  DT_INT32,  DT_INT64,
                      DT_INT8,       DT_QINT32,    DT_QINT8,  DT_QUINT8, DT_UINT16,  DT_UINT32, DT_UINT64, DT_UINT8,
                      DT_BF16, DT_COMPLEX32};
  }

  static TensorType RealNumberType() {
    return TensorType{DT_DOUBLE, DT_FLOAT,  DT_FLOAT16, DT_INT16,  DT_INT32, DT_INT64,
                      DT_INT8,   DT_UINT16, DT_UINT32,  DT_UINT64, DT_UINT8, DT_BF16};
  }

  static TensorType ComplexDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_COMPLEX32}; }

  static TensorType IntegerDataType() {
    return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }

  static TensorType SignedDataType() { return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8}; }

  static TensorType UnsignedDataType() { return TensorType{DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8}; }

  static TensorType FloatingDataType() { return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }

  static TensorType IndexNumberType() { return TensorType{DT_INT32, DT_INT64}; }

  static TensorType UnaryDataType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_BF16, DT_COMPLEX32};
  }

  static TensorType FLOAT() { return TensorType{DT_FLOAT, DT_FLOAT16, DT_BF16}; }

  std::shared_ptr<TensorTypeImpl> tensor_type_impl_;
};
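
The type groups above behave like sets of data types. The following sketch models three of them with std::set to show how membership could be checked and how the groups relate to each other. It is a stand-in only: the DataType enum here is a hypothetical subset, and TypeSet, SignedTypes, UnsignedTypes, IntegerTypes, and Accepts are illustrative names, not part of the real TensorType interface.

```cpp
#include <set>

// Hypothetical stand-in for the real DataType enum; only the names
// needed for this illustration are declared.
enum DataType {
  DT_INT8, DT_INT16, DT_INT32, DT_INT64,
  DT_UINT8, DT_UINT16, DT_UINT32, DT_UINT64
};

using TypeSet = std::set<DataType>;

// Type lists copied from SignedDataType(), UnsignedDataType(), and
// IntegerDataType() in the listing above.
inline TypeSet SignedTypes() { return {DT_INT16, DT_INT32, DT_INT64, DT_INT8}; }
inline TypeSet UnsignedTypes() { return {DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8}; }
inline TypeSet IntegerTypes() {
  return {DT_INT16, DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
}

// Returns true if dt is a member of the given type group.
inline bool Accepts(const TypeSet &types, DataType dt) { return types.count(dt) > 0; }
```

With these lists, the integer group is exactly the union of the signed and unsigned groups, which matches the listing above.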

Type Promotion

When the input tensors of certain operators (such as Add and Mul) have different data types, the data type is automatically promoted during operator computation. The following table lists the promotion rules.

Data Type	f32	f16	bf16	s8	u8	s16	u16	s32	u32	s64	u64	bool	c32	c64
f32	f32	f32	f32	f32	f32	f32	×	f32	×	f32	×	f32	c64	c64
f16	f32	f16	f32	f16	f16	f16	×	f16	×	f16	×	f16	c32	c64
bf16	f32	f32	bf16	bf16	bf16	bf16	×	bf16	×	bf16	×	bf16	c32	c64
s8	f32	f16	bf16	s8	s16	s16	×	s32	×	s64	×	s8	c32	c64
u8	f32	f16	bf16	s16	u8	s16	×	s32	×	s64	×	u8	c32	c64
s16	f32	f16	bf16	s16	s16	s16	×	s32	×	s64	×	s16	c32	c64
u16	×	×	×	×	×	×	u16	×	×	×	×	×	×	×
s32	f32	f16	bf16	s32	s32	s32	×	s32	×	s64	×	s32	c32	c64
u32	×	×	×	×	×	×	×	×	u32	×	×	×	×	×
s64	f32	f16	bf16	s64	s64	s64	×	s64	×	s64	×	s64	c32	c64
u64	×	×	×	×	×	×	×	×	×	×	u64	×	×	×
bool	f32	f16	bf16	s8	u8	s16	×	s32	×	s64	×	bool	c32	c64
c32	c64	c32	c32	c32	c32	c32	×	c32	×	c32	×	c32	c32	c64
c64	c64	c64	c64	c64	c64	c64	×	c64	×	c64	×	c64	c64	c64

For ease of description, the data types used in the table are abbreviated: DT_FLOAT (f32), DT_FLOAT16 (f16), DT_BF16 (bf16), DT_INT8 (s8), DT_UINT8 (u8), DT_INT16 (s16), DT_UINT16 (u16), DT_INT32 (s32), DT_UINT32 (u32), DT_INT64 (s64), DT_UINT64 (u64), DT_BOOL (bool), DT_COMPLEX32 (c32), and DT_COMPLEX64 (c64).
Currently, the AI Core engine does not support the DT_DOUBLE and DT_COMPLEX128 types for operator precision promotion. For example, if the data types of the input parameters of the Mul operator are float32 and double, the float32 data type cannot be promoted to double for the AI Core engine, and the input will be allocated to the AI CPU engine.
The table header row and the leftmost column give the two input data types. The cell at their intersection gives the promoted data type.
× indicates that no promoted data type can be deduced for the two input data types.
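
The promotion table can be read as a two-dimensional lookup. The sketch below transcribes it into a C++ array for illustration. The enum Dt, the table kPromote, and the function Promote are hypothetical names and not part of the Ascend API. BL stands for bool (which is a C++ keyword), and NA stands for the × entries.

```cpp
#include <cstddef>

// Abbreviations follow the table above. BL = bool, NA = × (no promotion).
enum Dt { F32, F16, BF16, S8, U8, S16, U16, S32, U32, S64, U64, BL, C32, C64, NA };

constexpr std::size_t kNumTypes = 14;

// Rows and columns are in the same order as the table above.
constexpr Dt kPromote[kNumTypes][kNumTypes] = {
    /* f32  */ {F32, F32, F32, F32, F32, F32, NA, F32, NA, F32, NA, F32, C64, C64},
    /* f16  */ {F32, F16, F32, F16, F16, F16, NA, F16, NA, F16, NA, F16, C32, C64},
    /* bf16 */ {F32, F32, BF16, BF16, BF16, BF16, NA, BF16, NA, BF16, NA, BF16, C32, C64},
    /* s8   */ {F32, F16, BF16, S8, S16, S16, NA, S32, NA, S64, NA, S8, C32, C64},
    /* u8   */ {F32, F16, BF16, S16, U8, S16, NA, S32, NA, S64, NA, U8, C32, C64},
    /* s16  */ {F32, F16, BF16, S16, S16, S16, NA, S32, NA, S64, NA, S16, C32, C64},
    /* u16  */ {NA, NA, NA, NA, NA, NA, U16, NA, NA, NA, NA, NA, NA, NA},
    /* s32  */ {F32, F16, BF16, S32, S32, S32, NA, S32, NA, S64, NA, S32, C32, C64},
    /* u32  */ {NA, NA, NA, NA, NA, NA, NA, NA, U32, NA, NA, NA, NA, NA},
    /* s64  */ {F32, F16, BF16, S64, S64, S64, NA, S64, NA, S64, NA, S64, C32, C64},
    /* u64  */ {NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, U64, NA, NA, NA},
    /* bool */ {F32, F16, BF16, S8, U8, S16, NA, S32, NA, S64, NA, BL, C32, C64},
    /* c32  */ {C64, C32, C32, C32, C32, C32, NA, C32, NA, C32, NA, C32, C32, C64},
    /* c64  */ {C64, C64, C64, C64, C64, C64, NA, C64, NA, C64, NA, C64, C64, C64},
};

// Returns the promoted type for inputs a and b, or NA if the table has ×.
inline Dt Promote(Dt a, Dt b) {
  if (a == NA || b == NA) return NA;
  return kPromote[a][b];
}
```

For example, Promote(F16, BF16) yields F32 and Promote(S8, U8) yields S16, while Promote(U16, F32) yields NA because u16 can only be combined with itself.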

Deterministic Computing

Asynchronous multi-threaded execution inside an operator implementation can change the order in which floating-point numbers are accumulated. As a result, repeated executions of an operator on the same hardware with the same input may produce different results. When deterministic computing is enabled, repeated executions of an operator on the same hardware with the same input produce the same output.
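
The effect of accumulation order can be illustrated on the host with plain C++, independent of any Ascend API. The helper SumInOrder below is a hypothetical name used only for this sketch. It adds the same three single-precision values in two different orders.

```cpp
#include <vector>

// Sums the values strictly left to right in single precision.
// Floating-point addition is not associative, so the result can
// depend on the order of the elements.
inline float SumInOrder(const std::vector<float> &values) {
  float acc = 0.0f;
  for (float v : values) {
    acc += v;
  }
  return acc;
}
```

Assuming IEEE 754 single-precision arithmetic, SumInOrder({1e8f, -1e8f, 1.0f}) returns 1.0f, whereas SumInOrder({1e8f, 1.0f, -1e8f}) returns 0.0f, because 1.0f is lost when it is added to 1e8f first. A multi-threaded reduction whose thread completion order varies between runs can reorder additions in the same way.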

If an operator listed below is not in the corresponding specification list, the current processor version does not support the operator.

The following operators involve but do not support deterministic computing:
- resizegradD
- WeightQuantBatchMatmulV2

The following operators involve and support deterministic computing:
- AvgPool3DGrad
- BatchMatMul
- BatchMatMulV2
- BiasAddGrad
- BinaryCrossEntropy
- BN3DTrainingReduce
- BN3DTrainingUpdateGrad
- BNTrainingReduce
- BNTrainingUpdateGrad
- Conv2DBackpropFilter: This operator is supported only by Atlas training products, Atlas A3 training products/Atlas A3 inference products, and Atlas A2 training products/Atlas A2 inference products.
- Conv3DBackpropFilter: This operator is supported only by Atlas training products, Atlas A3 training products/Atlas A3 inference products, and Atlas A2 training products/Atlas A2 inference products.
- EmbeddingDenseGrad
- FullyConnection
- GroupNormGrad
- Histogram
- InplaceIndexAdd
- KLDiv
- LpNormReduceV2
- LpNormV2
- MseLoss
- MatMul
- MatMulV2
- NLLLoss
- ReduceMean
- ReduceSum
- ScatterAdd
- ScatterElements
- ScatterNd
- ScatterNdAdd
- UnsortedSegmentSum
