Overview
This section describes the operator information defined based on the Ascend IR. Before using an operator, read the following restrictions and other descriptions.
| Section | Description |
| --- | --- |
| General Restrictions | Describes the overall constraints of operators. |
| Format | Describes the formats listed in the operator specifications. |
| TensorType | Describes the tensor types listed in the operator specifications. |
| Type Promotion | Describes the rules for data type promotion: if the input tensor data types of some operators (such as Add and Mul) are different, the data type is automatically promoted during operator computation. |
| Deterministic Computing | Lists the operators involved in and supported by deterministic computing. |
General Restrictions
- For details about the implementation formulas of the DynamicRNN, DynamicGRUV2, and BNInference operators, click Link.
- The maximum value of the groups attribute of the following operators is 65535:
TransData, Deconvolution, Conv2D, Conv2DTranspose, Conv2DBackpropInput, Conv2DBackpropFilter, Conv3D, Conv3DTranspose, Conv3DBackpropInput, Conv3DBackpropFilter, DeformableConv2D, and Correlation.
- Constraints on the Sin, Cos, and Tan operators: due to hardware restrictions, the precision requirements are met only when inputs of the FLOAT32, BFLOAT16, FLOAT16, INT32, and INT64 data types are within [-65504, 65504]. Beyond this range, the precision cannot be ensured; in that case, use the CPU for computing.
- The Conv2D and DepthwiseConv2D operators support different data types for different product models.
Atlas 200/300/500 Inference Product:

| Tensor | x | filter | bias | y |
| --- | --- | --- | --- | --- |
| Data Type | float16 | float16 | float16 | float16 |
| Data Type | int8 | int8 | int32 | int32 |

Atlas Training Series Product:

| Tensor | x | filter | bias | y |
| --- | --- | --- | --- | --- |
| Data Type | float16 | float16 | float16 | float16 |
| Data Type | float16 | float16 | float32 | float32 |
| Data Type | int8 | int8 | int32 | int32 |
Format
- ND: any format; applicable to operators that take a single input, such as Square and Tanh.
- NC1HWC0: self-developed 5D data format. C0 is closely related to the micro-architecture and equals the Cube Unit size, for example, 16. C1 is obtained by dividing the C dimension by C0, that is, C1 = C/C0; when the division is not exact, the last data segment is padded to C0. (A minimal index-computation sketch follows this list.)
- FRACTAL_Z: a format of the convolution weight.
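To make the NC1HWC0 description concrete, the following C++ sketch (hypothetical helper names; C0 fixed to 16, the Cube Unit size) computes C1 with padding and maps an NCHW coordinate to its offset in an NC1HWC0 buffer.

```cpp
#include <cstddef>

constexpr std::size_t kC0 = 16;  // assumed Cube Unit size

// C1 = C / C0, rounded up so that a partial final segment is padded to C0.
constexpr std::size_t C1From(std::size_t c) { return (c + kC0 - 1) / kC0; }

// Offset of the NCHW element (n, c, h, w) inside an NC1HWC0 buffer of
// logical shape [N, C1, H, W, C0].
std::size_t Nc1hwc0Offset(std::size_t n, std::size_t c, std::size_t h, std::size_t w,
                          std::size_t C, std::size_t H, std::size_t W) {
  const std::size_t c1 = c / kC0;  // which C0-sized channel block
  const std::size_t c0 = c % kC0;  // position inside that block
  return (((n * C1From(C) + c1) * H + h) * W + w) * kC0 + c0;
}
```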
TensorType
```cpp
struct TensorType {
  explicit TensorType(DataType dt);
  TensorType(const std::initializer_list<DataType> &types);

  // All supported data types, including resource and string.
  static TensorType ALL() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32,
                      DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8, DT_RESOURCE,
                      DT_STRING, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Quantized data types only.
  static TensorType QuantifiedType() { return TensorType{DT_QINT16, DT_QINT32, DT_QINT8, DT_QUINT16, DT_QUINT8}; }
  // Non-quantized data types (no resource or string).
  static TensorType OrdinaryType() {
    return TensorType{DT_BOOL, DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Numeric and quantized data types (no bool, resource, or string).
  static TensorType BasicType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16,
                      DT_INT32, DT_INT64, DT_INT8, DT_QINT16, DT_QINT32, DT_QINT8,
                      DT_QUINT16, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Numeric data types plus the most common quantized types.
  static TensorType NumberType() {
    return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_QINT32, DT_QINT8, DT_QUINT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Real (non-complex) numeric data types.
  static TensorType RealNumberType() {
    return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16, DT_INT16, DT_INT32, DT_INT64,
                      DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Complex data types.
  static TensorType ComplexDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64}; }
  // Signed and unsigned integer data types.
  static TensorType IntegerDataType() {
    return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8, DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8};
  }
  // Signed integer data types.
  static TensorType SignedDataType() { return TensorType{DT_INT16, DT_INT32, DT_INT64, DT_INT8}; }
  // Unsigned integer data types.
  static TensorType UnsignedDataType() { return TensorType{DT_UINT16, DT_UINT32, DT_UINT64, DT_UINT8}; }
  // Floating-point data types.
  static TensorType FloatingDataType() { return TensorType{DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }
  // Integer data types usable as indices.
  static TensorType IndexNumberType() { return TensorType{DT_INT32, DT_INT64}; }
  // Floating-point and complex data types.
  static TensorType UnaryDataType() { return TensorType{DT_COMPLEX128, DT_COMPLEX64, DT_DOUBLE, DT_FLOAT, DT_FLOAT16}; }
  // Single- and half-precision floating point.
  static TensorType FLOAT() { return TensorType{DT_FLOAT, DT_FLOAT16}; }

  std::shared_ptr<TensorTypeImpl> tensor_type_impl_;
};
```
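For context, a TensorType is passed to the Ascend IR operator registration macros to declare which data types an input or output accepts. The sketch below is illustrative only, assuming the REG_OP macros from graph/operator_reg.h; the operator name MyAbs is hypothetical.

```cpp
#include "graph/operator_reg.h"

namespace ge {
// Illustrative registration: a unary operator whose input and output accept
// the floating-point types bundled by TensorType::FloatingDataType().
REG_OP(MyAbs)
    .INPUT(x, TensorType::FloatingDataType())
    .OUTPUT(y, TensorType::FloatingDataType())
    .OP_END_FACTORY_REG(MyAbs)
}  // namespace ge
```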
Type Promotion
If the input tensor data types of some operators (such as Add and Mul) are different, the data type is automatically promoted during operator computation. The following table lists the rules for data type promotion.
| Data Type | f32 | f16 | bf16 | s8 | u8 | s16 | u16 | s32 | u32 | s64 | u64 | bool | c32 | c64 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| f32 | f32 | f32 | f32 | f32 | f32 | f32 | × | f32 | × | f32 | × | f32 | c64 | c64 |
| f16 | f32 | f16 | f32 | f16 | f16 | f16 | × | f16 | × | f16 | × | f16 | c32 | c64 |
| bf16 | f32 | f32 | bf16 | bf16 | bf16 | bf16 | × | bf16 | × | bf16 | × | bf16 | c32 | c64 |
| s8 | f32 | f16 | bf16 | s8 | s16 | s16 | × | s32 | × | s64 | × | s8 | c32 | c64 |
| u8 | f32 | f16 | bf16 | s16 | u8 | s16 | × | s32 | × | s64 | × | u8 | c32 | c64 |
| s16 | f32 | f16 | bf16 | s16 | s16 | s16 | × | s32 | × | s64 | × | s16 | c32 | c64 |
| u16 | × | × | × | × | × | × | u16 | × | × | × | × | × | × | × |
| s32 | f32 | f16 | bf16 | s32 | s32 | s32 | × | s32 | × | s64 | × | s32 | c32 | c64 |
| u32 | × | × | × | × | × | × | × | × | u32 | × | × | × | × | × |
| s64 | f32 | f16 | bf16 | s64 | s64 | s64 | × | s64 | × | s64 | × | s64 | c32 | c64 |
| u64 | × | × | × | × | × | × | × | × | × | × | u64 | × | × | × |
| bool | f32 | f16 | bf16 | s8 | u8 | s16 | × | s32 | × | s64 | × | bool | c32 | c64 |
| c32 | c64 | c32 | c32 | c32 | c32 | c32 | × | c32 | × | c32 | × | c32 | c32 | c64 |
| c64 | c64 | c64 | c64 | c64 | c64 | c64 | × | c64 | × | c64 | × | c64 | c64 | c64 |
- For ease of description, the data types used in the table are abbreviated: DT_FLOAT (f32), DT_FLOAT16 (f16), DT_BF16 (bf16), DT_INT8 (s8), DT_UINT8 (u8), DT_INT16 (s16), DT_UINT16 (u16), DT_INT32 (s32), DT_UINT32 (u32), DT_INT64 (s64), DT_UINT64 (u64), DT_BOOL (bool), DT_COMPLEX64 (c32), and DT_COMPLEX128 (c64).
- Currently, the AI Core engine does not support promotion to the DT_DOUBLE and DT_COMPLEX128 types. For example, if the input data types of the Mul operator are float32 and double, float32 cannot be promoted to double on the AI Core engine, and the operator will be assigned to the AI CPU engine.
- The table heading and the leftmost column in the table indicate the two input data types to be deduced. The corresponding intersections in the table indicate the deduced data types.
- × indicates that the two data types cannot be deduced.
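Because the table is symmetric, the rules can be expressed as a small lookup. The following self-contained C++ sketch is illustrative only (not the framework's implementation) and encodes a subset of the rows above; the × cells map to std::nullopt.

```cpp
#include <optional>
#include <utility>

// Abbreviated data types from the table, ordered so that the symmetry swap
// below always puts the dominant class first.
enum class Dt { f32, f16, bf16, s8, u8, s16, s32, s64, Bool };

// Hypothetical lookup covering a subset of the promotion table. Pairs marked
// x in the table (for example, anything with u16/u32/u64) would map to
// std::nullopt in a full version.
std::optional<Dt> Promote(Dt a, Dt b) {
  if (a == b) return a;                // identical types: no promotion needed
  if (a > b) std::swap(a, b);          // the table is symmetric
  if (a == Dt::f32) return Dt::f32;    // f32 row: f32 wins over the rest
  if (a == Dt::f16)                    // f16 row: f16 x bf16 promotes to f32
    return b == Dt::bf16 ? Dt::f32 : Dt::f16;
  if (a == Dt::bf16) return Dt::bf16;  // bf16 row
  if (b == Dt::Bool) return a;         // bool row: bool adopts the other type
  if (a == Dt::s8 && b == Dt::u8) return Dt::s16;  // mixed-sign widening
  return b;                            // remaining integer pairs take the wider type
}
```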
Deterministic Computing
Asynchronous multi-threaded execution inside an operator can change the order in which floating-point values are accumulated, so running an operator multiple times on the same hardware with the same input may produce different results. When deterministic computing is enabled, repeated executions of an operator on the same hardware with the same input generate the same output.
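The root cause of the variation is that floating-point addition is not associative: accumulating the same values in a different order can round differently, as the short illustrative example below shows.

```cpp
#include <cstdio>

int main() {
  // The same values, accumulated in two different orders.
  float big = 1.0e8f, small = 1.0f;
  float left = (big + small) - big;   // small is absorbed by big: 0.0f
  float right = (big - big) + small;  // small survives: 1.0f
  // Different accumulation orders give different results, which is why an
  // operator whose threads reduce in a nondeterministic order can return
  // different outputs for identical inputs.
  std::printf("%g vs %g\n", left, right);
  return 0;
}
```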
If an operator in the following lists does not appear in the specification list for the current processor version, that version does not support the operator.
- The following operators involve but do not support deterministic computing:
- ResizeGradD
- WeightQuantBatchMatmulV2
- The following operators involve and support deterministic computing:
- AvgPool3DGrad
- BatchMatMul
- BatchMatMulV2
- BiasAddGrad
- BinaryCrossEntropy
- BN3DTrainingReduce
- BN3DTrainingUpdateGrad
- BNTrainingReduce
- BNTrainingUpdateGrad
- Conv2DBackpropFilter
- Conv3DBackpropFilter
- EmbeddingDenseGrad
- FusedMulAddNL2loss
- FullyConnection
- GroupNormGrad
- Histogram
- IndexPut
- IndexPutV2
- InplaceIndexAdd
- KLDiv
- LayerNormBetaGammaBackpropV2
- LayerNormGradV3
- LpNormReduceV2
- LpNormV2
- MatMul
- MatMulV2
- MseLoss
- NLLLoss
- ReduceMean
- ReduceMeanD
- ReduceSum
- ReduceSumD
- ScatterAdd
- ScatterElements
- ScatterNd
- ScatterNdAdd
- SquareSumV1
- UnsortedSegmentSum
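As a usage sketch, recent CANN versions expose deterministic computing as a context-level system parameter through AscendCL. The call below assumes the aclrtCtxSetSysParamOpt interface and the ACL_OPT_DETERMINISTIC option are available in the CANN version in use.

```cpp
#include "acl/acl.h"

// Illustrative only: enable deterministic computing for the current context.
// aclrtCtxSetSysParamOpt and ACL_OPT_DETERMINISTIC are assumed to exist in
// the installed CANN version; check its AscendCL reference before relying on them.
bool EnableDeterministic() {
  aclError ret = aclrtCtxSetSysParamOpt(ACL_OPT_DETERMINISTIC, 1);
  return ret == ACL_SUCCESS;
}
```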