pooling2d

Description

Pools tensor_in by sliding a window over the feature map and reducing the data under each window position according to the selected pooling mode.

The pooling mode can be MAX, AVG, GMP, or GAP.

MAX: max pooling. Outputs the maximum value of each patch of the feature map.
AVG: avg pooling. Outputs the average value of each patch of the feature map.
GMP: global max pooling, a special case of max pooling in which the window size equals the feature map size.
GAP: global avg pooling, a special case of avg pooling in which the window size equals the feature map size.
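
The four modes can be sketched in plain Python on a single 2D channel. This is only an illustrative reference under simplifying assumptions (no padding, no dilation, one channel stored as a list of rows); the helper name pool2d_reference is hypothetical and is not part of the TBE DSL.

```python
def pool2d_reference(feature_map, window, stride, pooling_mode):
    """Pool one 2D channel (a list of rows) without padding.

    window and stride use the (width, height) order described for pooling2d.
    """
    in_h, in_w = len(feature_map), len(feature_map[0])
    if pooling_mode in ("GMP", "GAP"):
        # Global modes: the window covers the whole feature map.
        window, stride = (in_w, in_h), (1, 1)
    win_w, win_h = window
    stride_w, stride_h = stride
    use_max = pooling_mode in ("MAX", "GMP")

    out = []
    for top in range(0, in_h - win_h + 1, stride_h):
        row = []
        for left in range(0, in_w - win_w + 1, stride_w):
            # Collect every element under the current window position.
            patch = [feature_map[y][x]
                     for y in range(top, top + win_h)
                     for x in range(left, left + win_w)]
            row.append(max(patch) if use_max else sum(patch) / len(patch))
        out.append(row)
    return out


fm = [[1, 2, 3, 4],
      [5, 6, 7, 8],
      [9, 10, 11, 12],
      [13, 14, 15, 16]]
max_out = pool2d_reference(fm, (2, 2), (2, 2), "MAX")  # [[6, 8], [14, 16]]
avg_out = pool2d_reference(fm, (2, 2), (2, 2), "AVG")  # [[3.5, 5.5], [11.5, 13.5]]
gmp_out = pool2d_reference(fm, (2, 2), (2, 2), "GMP")  # [[16]]
gap_out = pool2d_reference(fm, (2, 2), (2, 2), "GAP")  # [[8.5]]
```

For GMP and GAP the window argument is ignored in this sketch, because the window is forced to the feature map size.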

When pooling_mode = MAX and padding_mode = SAME, tensor_in is pooled as shown in the example figure (not reproduced here), where:

input_w: width of tensor_in
input_h: height of tensor_in
kernel_w: width of window
kernel_h: height of window
pad_top: top padding along the H dimension of tensor_in, which is 1 in this example.
pad_bottom: bottom padding along the H dimension of tensor_in, which is 1 in this example.
pad_left: left padding along the W dimension of tensor_in, which is 1 in this example.
pad_right: right padding along the W dimension of tensor_in, which is 1 in this example.
stride_w: width of stride
stride_h: height of stride

Prototype

pooling2d(tensor_in, window, stride, pooling_mode, padding_mode="SAME", pad = (0,0,0,0), dilation = (1,1), data_mode=1, ceil_mode=0, fusion_params=None, impl_mode="high_performance")

Parameters

tensor_in: a tvm.tensor of type float16, for the feature map. Has a 5D format of NC1HWC0. The last dimension C0 must be 16.
window: a list or tuple for the size of the sliding window.
window is a list or tuple of two positive integers within the range of [1, 32768].

window[0] indicates the width of the window and window[1] indicates the height of the window.
stride: a list or tuple for the strides of the sliding window.
stride is a list or tuple of two positive integers. Both the width stride and the height stride must be within the range of [1, 63].

stride[0] indicates the width stride of the window over the feature map and stride[1] indicates the height stride of the window over the feature map.
pooling_mode: pooling mode selected from MAX, AVG, GMP, and GAP.
- MAX: max pooling. Outputs the maximum value of each patch of the feature map.
- AVG: avg pooling. Outputs the average value of each patch of the feature map.
- GMP: global max pooling, a special case of max pooling that returns the maximum element of the feature map. The window size must equal the feature map size.
- GAP: global avg pooling, a special case of avg pooling that returns the average value of the feature map elements. The window size must equal the feature map size.
padding_mode: padding mode, either VALID (padding disabled) or SAME (padding enabled).
- In VALID mode:
  When the last window position along the W or H direction covers only part of the feature map, the leftover data that does not fill a complete window is discarded. That is, this data is not involved in the computation.
  
  MAX, AVG, GMP, and GAP all support VALID mode.
- In SAME mode:
  When the last window position along the W or H direction covers only part of the feature map, the feature map is padded with zeros so that a complete window is covered. That is, all the data in the feature map is involved in the computation.
  
  MAX and AVG support SAME mode, while GMP and GAP do not.
pad: (optional) a list or tuple for the padding sizes, provided for compatibility with Caffe pooling.
pad is a list or tuple of four integers, each greater than or equal to 0.

pad[0], pad[1], pad[2], and pad[3] indicate the number of padding lines at the top, bottom, left, and right, respectively. Defaults to (0, 0, 0, 0).
dilation: (optional) a list or tuple for the dilation factors.
dilation is a list or tuple of two positive integers within the range of [1, 255].

dilation[0] and dilation[1] indicate the dilation factors of the window in the height and width directions, respectively. Defaults to (1, 1).
data_mode: (optional) template type. 0: CAFFE_DATA_MODE; 1 (default): TENSORFLOW_DATA_MODE.
ceil_mode: (optional) equivalent of round_mode in Caffe. 0 (default): ceiling; 1: floor.
fusion_params: reserved.
impl_mode: (optional) specifies whether precision or performance takes priority.
- high_precision: Precision takes priority. Performance is lower because the compute process is more complex.
- high_performance: Performance takes priority. Precision is lower.
Defaults to high_performance.
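
The value ranges listed above can be checked before calling the API. The following is a minimal sketch that encodes only the ranges stated in this section; validate_pooling2d_args is a hypothetical helper, not a TBE function, and the real API may perform additional or different checks.

```python
def validate_pooling2d_args(window, stride, pooling_mode, padding_mode="SAME",
                            pad=(0, 0, 0, 0), dilation=(1, 1)):
    """Raise ValueError if an argument is outside the documented range."""

    def check_ints(name, values, length, low, high=None):
        if not isinstance(values, (list, tuple)) or len(values) != length:
            raise ValueError(f"{name} must be a list or tuple of {length} integers")
        for value in values:
            too_big = high is not None and value > high
            if not isinstance(value, int) or value < low or too_big:
                bound = f"[{low}, {high}]" if high is not None else f">= {low}"
                raise ValueError(f"{name} values must be integers {bound}, got {value!r}")

    check_ints("window", window, 2, 1, 32768)
    check_ints("stride", stride, 2, 1, 63)
    check_ints("pad", pad, 4, 0)            # only a lower bound is documented
    check_ints("dilation", dilation, 2, 1, 255)

    if pooling_mode not in ("MAX", "AVG", "GMP", "GAP"):
        raise ValueError(f"unsupported pooling_mode: {pooling_mode!r}")
    if padding_mode not in ("VALID", "SAME"):
        raise ValueError(f"unsupported padding_mode: {padding_mode!r}")
    if pooling_mode in ("GMP", "GAP") and padding_mode != "VALID":
        raise ValueError("GMP and GAP support only padding_mode='VALID'")


# Passes silently: every value is within the documented range.
validate_pooling2d_args((3, 3), (2, 2), "AVG", "SAME")
```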

Returns

res_tensor: a tvm.tensor for the result tensor. Has a 5D format of NC1HWC0.

The shape of tensor_in is [N, C1, H, W, C0=16]; the shape of window is [F, F]; and the shape of stride is [S, S].

In VALID mode and SAME mode of MAX pooling and AVG pooling, the shape of the output tensor is computed as follows:

In VALID mode:
- The N and C dimensions remain unchanged.
- The dimensions of Hout and Wout are as follows:
  Hout = ceil((H - F + 1) / S)
  Wout = ceil((W - F + 1) / S)
In SAME mode:
- The N and C dimensions remain unchanged.
- The dimensions of Hout and Wout are as follows:
  Hout = ceil(H / S)
  Wout = ceil(W / S)
  
  H and W are the input height and width; F is the filter size; S is the stride; and ceil() rounds up to the nearest integer.

In VALID modes of GMP pooling and GAP pooling, the shape of the output tensor is computed as follows:

The N and C dimensions remain unchanged.
Hout = Wout = 1
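
As a rough sketch, the output shape can be estimated with a small helper. It assumes the TensorFlow-style formulas Hout = ceil((H - F + 1) / S) for VALID and Hout = ceil(H / S) for SAME (and likewise for Wout), and it ignores pad, dilation, and ceil_mode. The function name pooling2d_output_shape is hypothetical and not part of the TBE DSL.

```python
def ceil_div(a, b):
    """Integer division rounded up, for positive b."""
    return -(-a // b)


def pooling2d_output_shape(in_shape, window, stride, pooling_mode,
                           padding_mode="SAME"):
    """Estimate the NC1HWC0 output shape (assumed formulas, see lead-in)."""
    n, c1, in_h, in_w, c0 = in_shape
    window_w, window_h = window      # window[0] is width, window[1] is height
    stride_w, stride_h = stride      # stride[0] is width, stride[1] is height

    if pooling_mode in ("GMP", "GAP"):
        # Global pooling reduces each feature map to a single element.
        return (n, c1, 1, 1, c0)
    if padding_mode == "VALID":
        out_h = ceil_div(in_h - window_h + 1, stride_h)
        out_w = ceil_div(in_w - window_w + 1, stride_w)
    else:  # SAME
        out_h = ceil_div(in_h, stride_h)
        out_w = ceil_div(in_w, stride_w)
    return (n, c1, out_h, out_w, c0)


same_shape = pooling2d_output_shape((1, 2, 416, 416, 16), (3, 3), (2, 2), "AVG", "SAME")
valid_shape = pooling2d_output_shape((1, 2, 416, 416, 16), (3, 3), (2, 2), "AVG", "VALID")
# same_shape  == (1, 2, 208, 208, 16)
# valid_shape == (1, 2, 207, 207, 16)
```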

Restrictions

This API cannot be used in conjunction with other TBE DSL APIs.

When pooling_mode is set to MAX or AVG in VALID mode, the following condition must be met:
out_w * window_h * window_w * C0 * SIZE_OF_FP16 + out_w * C0 * SIZE_OF_FP16 < ub_size
When pooling_mode is set to AVG in SAME mode, the following condition must be met:
out_w * window_h * window_w * C0 * SIZE_OF_FP16 + out_w * C0 * SIZE_OF_FP16 + out_w * C0 * SIZE_OF_FP16 < ub_size
When pooling_mode is set to AVG, the following conditions must be met: stride_h ≤ 2 * window_h, and stride_w ≤ 2 * window_w
When pooling_mode is set to AVG, the following condition must be met: window_w * window_h < 256
When pooling_mode is set to MAX, the following conditions must be met: window_w ≤ 20, and window_h ≤ 20
When pooling_mode is set to MAX or AVG, tensor_in, pad, and window must meet the following conditions:
stride_h <= in_size_h + pad_top + pad_bottom - window_h
stride_w <= in_size_w + pad_left + pad_right - window_w

When pooling_mode is set to GAP or GMP, the following conditions must be met: window_h = in_size_h, and window_w = in_size_w
When pooling_mode is set to GAP or GMP, the following condition must be met: padding_mode = "VALID"

ub_size indicates the available size of Unified Buffer.
out_w indicates the width of the output tensor.
window_h indicates the height of window.
window_w indicates the width of window.
C0 indicates the size of C0 of tensor_in.
SIZE_OF_FP16 indicates the size of float16 data.
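
The restrictions above can be collected into one pre-check. The sketch below is an assumption-laden illustration: check_pooling2d_restrictions is a hypothetical helper, SIZE_OF_FP16 is taken as 2 bytes, and ub_size must be supplied by the caller because the available Unified Buffer size depends on the hardware. The value used in the example is made up.

```python
SIZE_OF_FP16 = 2  # a float16 element occupies 2 bytes


def check_pooling2d_restrictions(pooling_mode, padding_mode, in_size, window,
                                 stride, out_w, ub_size,
                                 pad=(0, 0, 0, 0), c0=16):
    """Return a list describing each violated restriction (empty if none).

    in_size is (in_size_h, in_size_w); window and stride are (width, height);
    pad is (top, bottom, left, right); ub_size is in bytes (caller-supplied).
    """
    in_h, in_w = in_size
    window_w, window_h = window
    stride_w, stride_h = stride
    pad_top, pad_bottom, pad_left, pad_right = pad
    violations = []

    if pooling_mode in ("MAX", "AVG"):
        row_bytes = out_w * c0 * SIZE_OF_FP16
        used = None  # no UB condition is listed for MAX in SAME mode
        if padding_mode == "VALID":
            used = row_bytes * window_h * window_w + row_bytes
        elif pooling_mode == "AVG":
            used = row_bytes * window_h * window_w + 2 * row_bytes
        if used is not None and used >= ub_size:
            violations.append("Unified Buffer condition not met")
        if stride_h > in_h + pad_top + pad_bottom - window_h:
            violations.append("stride_h too large for the padded input height")
        if stride_w > in_w + pad_left + pad_right - window_w:
            violations.append("stride_w too large for the padded input width")

    if pooling_mode == "AVG":
        if stride_h > 2 * window_h or stride_w > 2 * window_w:
            violations.append("AVG requires stride <= 2 * window")
        if window_w * window_h >= 256:
            violations.append("AVG requires window_w * window_h < 256")

    if pooling_mode == "MAX" and (window_w > 20 or window_h > 20):
        violations.append("MAX requires window_w <= 20 and window_h <= 20")

    if pooling_mode in ("GAP", "GMP"):
        if window_h != in_h or window_w != in_w:
            violations.append("GAP/GMP require window size == input size")
        if padding_mode != "VALID":
            violations.append("GAP/GMP require padding_mode = 'VALID'")

    return violations


# ub_size here is a made-up value for illustration, not a real device figure.
ok = check_pooling2d_restrictions("AVG", "SAME", (416, 416), (3, 3), (2, 2),
                                  out_w=208, ub_size=256 * 1024)
bad = check_pooling2d_restrictions("MAX", "VALID", (416, 416), (21, 3), (2, 2),
                                   out_w=198, ub_size=256 * 1024)
```

Returning a list rather than raising on the first failure lets a caller report every violated restriction at once.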

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Example

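from tbe import tvm
from tbe import dsl

# Feature map in NC1HWC0 format: N=1, C1=2, H=416, W=416, C0=16
shape = (1, 2, 416, 416, 16)
input_dtype = "float16"
data = tvm.placeholder(shape, name="data", dtype=input_dtype)
# 3 x 3 window, stride of 2 in both directions, average pooling with SAME padding
res = dsl.pooling2d(data, (3, 3), (2, 2), "AVG", "SAME")
# res.shape = (1, 2, 208, 208, 16)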
