Terminology
This section describes the terms, acronyms, and abbreviations that you may encounter when using the ATC tool.
- AIPP
The Artificial Intelligence Pre-Processing (AIPP) module is introduced to the Ascend AI Processor for hardware-based image preprocessing, including color space conversion (CSC), image normalization (by subtracting the mean or multiplying by a factor), image cropping (by specifying a crop start position and cropping the image to the size required by the neural network), and more.
Atlas 200/300/500 Inference Product and Atlas Training Series Product: DVPP outputs aligned YUV420SP images rather than RGB images. If the model requires RGB images, AIPP can be used to convert the aligned YUV420SP images to RGB and then crop them to the size required by the model.
- YUV420SP
It is a lossy image color encoding format. It can be YUV420SP_UV or YUV420SP_VU, depending on the interleaving order of the U and V chroma components.
- Repository
It stores tuned tiling policies so that they can be called directly at operator build time.
- Cost model
It is an evaluator that selects the optimal tiling policy from the tiling space when no match is found in the existing repository during tuning.
- Format
Format is the physical layout of data in memory; it defines how many dimensions the data is interpreted as having, such as 1D, 2D, 3D, 4D, or 5D.
- NCHW and NHWC
In deep learning frameworks, n-dimensional data is stored by using an n-dimensional array. For example, a feature map of a convolutional neural network (CNN) is stored by using a 4D array, including:
- N: batch size, for example, the number of images.
- H: height of the feature map, that is, the number of pixels in the vertical direction.
- W: width of the feature map, that is, the number of pixels in the horizontal direction.
- C: channels. For example, an RGB image has 3 channels.
Because memory is linear, multi-dimensional data must be flattened into a fixed dimension order for storage. Different deep learning frameworks store feature maps in different layouts. For example, Caffe uses the layout [Batch, Channels, Height, Width], that is, NCHW, and TensorFlow uses the layout [Batch, Height, Width, Channels], that is, NHWC.
Figure 1 uses an RGB image as an example. In NCHW format, C is the outermost dimension, so the pixels of each channel are stored contiguously, giving the layout RRRRRRGGGGGGBBBBBB. In NHWC format, C is the innermost dimension, so the channel values of each pixel are interleaved, giving the layout RGBRGBRGBRGBRGBRGB.
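The layout difference above can be seen directly in NumPy. This is an illustrative sketch (the 2x2 image and its values are made up for readability, not taken from the documentation):

```python
import numpy as np

# A hypothetical 2x2 RGB image: N=1, H=2, W=2, C=3.
nhwc = np.arange(1 * 2 * 2 * 3).reshape(1, 2, 2, 3)  # TensorFlow-style layout

# Convert to NCHW (Caffe-style layout) by permuting the axes.
nchw = nhwc.transpose(0, 3, 1, 2)

# Flattening exposes the memory order:
# NHWC interleaves the channels of each pixel (RGBRGB...),
# NCHW groups all pixels of one channel together (RR... GG... BB...).
print(nhwc.flatten())  # channel index varies fastest
print(nchw.flatten())  # spatial indices vary fastest
```

Running this shows that the NCHW flattening groups the values of each channel, while the NHWC flattening cycles through the channels pixel by pixel.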
- NC1HWC0
To improve data access efficiency of General Matrix Multiply (GEMM) data blocks, tensors in the Ascend AI Processor are stored in the 5D format NC1HWC0, as shown in the following figure.
Figure 2 NC1HWC0
C0, which is closely tied to the micro-architecture, is the size of the Cube unit in the AI Core. C1 = (C + C0 - 1) / C0, where the division is integer division: if the division is not exact, the result is rounded down, which makes the formula equivalent to rounding C / C0 up.
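The C1 formula above is ordinary ceiling division and can be sketched in a few lines. The value C0 = 16 below is an assumption for illustration; the real value depends on the AI Core's Cube size and data type:

```python
def c1_of(c: int, c0: int = 16) -> int:
    # C1 = (C + C0 - 1) // C0, i.e. ceil(C / C0) using integer division.
    # c0=16 is an assumed example value, not a fixed constant.
    return (c + c0 - 1) // c0

print(c1_of(3))   # an RGB input (C=3) occupies one C0 block
print(c1_of(16))  # exactly one block
print(c1_of(17))  # one extra channel rounds up to two blocks
```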
Steps of NHWC-to-NC1HWC0 conversion:
- Tile the NHWC data into C1 pieces of NHWC0 along the C dimension.
- Arrange the C1 pieces of NHWC0 contiguously in memory, obtaining NC1HWC0.
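The two steps above can be sketched with NumPy reshapes. This is a minimal model of the conversion, assuming zero-padding of the channel dimension and an example C0 of 16; it is not the actual AIPP or data-movement implementation:

```python
import numpy as np

def nhwc_to_nc1hwc0(x: np.ndarray, c0: int = 16) -> np.ndarray:
    """Sketch of NHWC -> NC1HWC0: pad C up to a multiple of C0,
    tile it into C1 pieces of C0, and move the piece index ahead of H/W."""
    n, h, w, c = x.shape
    c1 = (c + c0 - 1) // c0
    pad = c1 * c0 - c
    x = np.pad(x, ((0, 0), (0, 0), (0, 0), (0, pad)))  # zero-pad channels
    x = x.reshape(n, h, w, c1, c0)     # tile C into C1 pieces of C0
    return x.transpose(0, 3, 1, 2, 4)  # -> (N, C1, H, W, C0)

x = np.ones((1, 4, 4, 3), dtype=np.float32)  # hypothetical RGB feature map
y = nhwc_to_nc1hwc0(x)
print(y.shape)  # (1, 1, 4, 4, 16)
```

For an RGB input (C=3) with C0=16, C1 is 1 and the remaining 13 channel slots are padding.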
Applications of NHWC-to-NC1HWC0 conversion:
- Converting RGB images at the input layer into the NC1HWC0 format by using AIPP.
- Rearranging intermediate-layer feature maps output in NC1HWC0 format during data movement.