Format
Format describes the physical layout of data in memory and defines the dimensionality used to interpret it, such as 1D, 2D, 3D, 4D, or 5D.
NCHW and NHWC
- N: batch size, for example, the number of images.
- H: height of the feature map, that is, the number of pixels in the vertical direction.
- W: width of the feature map, that is, the number of pixels in the horizontal direction.
- C: channels. For example, an RGB image has 3 channels.
Because memory is linear, data can be stored only with the dimensions in a fixed order. Different deep learning frameworks store feature maps with different layouts. For example, Caffe uses the layout [Batch, Channels, Height, Width], that is, NCHW, while TensorFlow uses the layout [Batch, Height, Width, Channels], that is, NHWC.
As shown in Figure 1, for an RGB image, the pixel values of each channel are clustered in sequence as RRRRRRGGGGGGBBBBBB with the NCHW layout. However, with the NHWC layout, the pixel values are interleaved as RGBRGBRGBRGBRGBRGB.
Although the stored data is the same, data access patterns vary with the storage order, so compute performance can differ even for the same operation.
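As a quick check, the two layouts can be reproduced with NumPy. This is a sketch with an assumed 1 x 3 x 2 x 2 tensor whose channel values stand in for R, G, and B:

```python
import numpy as np

# Hypothetical 1x3x2x2 example: 3 channels of 2x2 pixels,
# with channel values R=0, G=1, B=2 for every pixel.
nchw = np.zeros((1, 3, 2, 2), dtype=np.int64)
for c in range(3):
    nchw[0, c] = c

# Flattening NCHW clusters each channel: RRRR GGGG BBBB
print(nchw.ravel())  # [0 0 0 0 1 1 1 1 2 2 2 2]

# Transposing to NHWC interleaves channels per pixel: RGB RGB ...
nhwc = nchw.transpose(0, 2, 3, 1)
print(nhwc.ravel())  # [0 1 2 0 1 2 0 1 2 0 1 2]
```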
NC1HWC0
To improve the data access efficiency of General Matrix Multiply (GEMM) data blocks, tensor data on the Ascend AI Processor is stored in NC1HWC0, a 5D format. C0, closely tied to the microarchitecture, equals the size of the Cube Unit in the AI Core.
C1 = (C + C0 - 1)/C0, rounded down when the division is not exact. In other words, C1 is the number of C0-sized tiles needed to cover C, that is, the ceiling of C/C0.
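A minimal sketch of this formula, assuming C0 = 16 (a typical Cube Unit size for fp16 data; the actual value depends on the microarchitecture and data type):

```python
# C1 is the number of C0-sized channel tiles: ceiling division of C by C0,
# computed with integer arithmetic as (C + C0 - 1) // C0.
C0 = 16  # assumed Cube Unit size, e.g. for fp16 data

for C in (3, 16, 32, 33):
    C1 = (C + C0 - 1) // C0
    print(f"C={C} -> C1={C1}")  # C1 = 1, 1, 2, 3
```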
Steps of NHWC/NCHW -> NC1HWC0 conversion: Tile the data along the C dimension into C1 pieces of shape NHWC0 (from NHWC) or NC0HW (from NCHW), then arrange them in memory as NC1HWC0, as shown in the following figure.

- Formula for NHWC -> NC1HWC0 conversion:
Tensor.reshape( [N, H, W, C1, C0]).transpose( [0, 3, 1, 2, 4] )
- Formula for NCHW -> NC1HWC0 conversion:
Tensor.reshape( [N, C1, C0, H, W]).transpose( [0, 1, 3, 4, 2] )
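The two formulas above can be sketched in NumPy as follows. This assumes C has already been padded to C1 x C0; the tensor dimensions are arbitrary example values:

```python
import numpy as np

N, H, W, C0 = 1, 2, 2, 16  # example dimensions; C0 = 16 assumed
C1 = 2
C = C1 * C0

# NHWC -> NC1HWC0: split C into (C1, C0), then move C1 before H
nhwc = np.arange(N * H * W * C).reshape(N, H, W, C)
nc1hwc0 = nhwc.reshape(N, H, W, C1, C0).transpose(0, 3, 1, 2, 4)
print(nc1hwc0.shape)  # (1, 2, 2, 2, 16)

# NCHW -> NC1HWC0: split C into (C1, C0), then move C0 after W
nchw = nhwc.transpose(0, 3, 1, 2)
nc1hwc0_b = nchw.reshape(N, C1, C0, H, W).transpose(0, 1, 3, 4, 2)

# Both paths produce the same 5D layout
assert (nc1hwc0 == nc1hwc0_b).all()
```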
FRACTAL_NZ
FRACTAL_NZ is a fractal format for storing data such as feature maps. For example, the Cube Unit outputs matrices in the NW1H1H0W0 format. The matrix is divided into (H1 x W1) fractals arranged in column-major order, which resembles an N-shaped layout; each fractal consists of (H0 x W0) elements in row-major order, resembling a z-shaped layout. Therefore, the NW1H1H0W0 format is referred to as the Nz format. (H0 x W0) indicates the size of a fractal, as shown in the following figure.

ND-to-FRACTAL_NZ conversion:
(..., N, H, W) -> pad -> (..., N, H1 x H0, W1 x W0) -> reshape -> (..., N, H1, H0, W1, W0) -> transpose -> (..., N, W1, H1, H0, W0)
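The conversion chain above can be sketched in NumPy for a single 2D matrix (leading dimensions omitted); the fractal size H0 = W0 = 16 is an assumption:

```python
import numpy as np

H0 = W0 = 16   # assumed fractal size
H, W = 20, 40  # example matrix dimensions

x = np.arange(H * W, dtype=np.float32).reshape(H, W)

# pad: round H and W up to multiples of the fractal size
H1 = (H + H0 - 1) // H0
W1 = (W + W0 - 1) // W0
padded = np.pad(x, ((0, H1 * H0 - H), (0, W1 * W0 - W)))

# reshape (H1*H0, W1*W0) -> (H1, H0, W1, W0), then
# transpose -> (W1, H1, H0, W0)
nz = padded.reshape(H1, H0, W1, W0).transpose(2, 0, 1, 3)
print(nz.shape)  # (3, 2, 16, 16)
```

Each nz[w1, h1] slice is one (H0 x W0) fractal taken from the padded matrix.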
FRACTAL_Z
FRACTAL_Z is a format for convolution weights, converted from the filter matrix. The weights are transferred to the Cube Unit in the 4D format (C1HW, N1, N0, C0).
The data is tiled into two layers, as shown in the following figure.

The first layer of data, related to the cube size, is stored contiguously in column-major order (n format). The second layer, related to the matrix size, is stored contiguously in row-major order (Z format).
For example, HWCN = (2, 2, 32, 32) can be reshaped into FRACTAL_Z (C1HW, N1, N0, C0) = (8, 2, 16, 16).
HWCN-to-FRACTAL_Z conversion:
Tensor.padding([ [0,0], [0,0], [0,(C0-C%C0)%C0], [0,(N0-N%N0)%N0] ]).reshape( [H, W, C1, C0, N1, N0]).transpose( [2, 0, 1, 4, 5, 3] ).reshape( [C1*H*W, N1, N0, C0])
NCHW-to-FRACTAL_Z conversion:
Tensor.padding([ [0,(N0-N%N0)%N0], [0,(C0-C%C0)%C0], [0,0], [0,0] ]).reshape( [N1, N0, C1, C0, H, W]).transpose( [2, 4, 5, 0, 1, 3] ).reshape( [C1*H*W, N1, N0, C0])
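The HWCN -> FRACTAL_Z formula can be sketched in NumPy, reproducing the (2, 2, 32, 32) -> (8, 2, 16, 16) example above; C0 = N0 = 16 is an assumption:

```python
import numpy as np

H, W, C, N = 2, 2, 32, 32  # example HWCN filter dimensions
C0 = N0 = 16               # assumed cube sizes

x = np.random.rand(H, W, C, N).astype(np.float32)

# Pad C and N up to multiples of C0 and N0 (a no-op for this example)
xp = np.pad(x, ((0, 0), (0, 0),
                (0, (C0 - C % C0) % C0),
                (0, (N0 - N % N0) % N0)))
C1 = xp.shape[2] // C0
N1 = xp.shape[3] // N0

# reshape -> (H, W, C1, C0, N1, N0), transpose -> (C1, H, W, N1, N0, C0),
# then merge the first three axes into C1*H*W
fz = (xp.reshape(H, W, C1, C0, N1, N0)
        .transpose(2, 0, 1, 4, 5, 3)
        .reshape(C1 * H * W, N1, N0, C0))
print(fz.shape)  # (8, 2, 16, 16)
```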
