vec_trans_scatter

Description

Converts data from the NCHW format to the NC1HWC0 format. If the data type is float32, int32, uint32, int16, uint16, or float16, C0 is 16. If the data type is uint8 or int8, C0 is 32.
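As a host-side reference for the layout change described above, the following NumPy sketch splits the channel axis into C1 groups of C0 channels and moves C0 to the innermost position. The helper name nchw_to_nc1hwc0 is hypothetical and is not part of TIK. The sketch assumes that C is already a multiple of C0, so it does no padding.

```python
import numpy as np

def nchw_to_nc1hwc0(x, c0):
    """Reference layout conversion; assumes C is a multiple of c0 (no padding)."""
    n, c, h, w = x.shape
    assert c % c0 == 0, "this sketch does not pad the channel axis"
    c1 = c // c0
    # Split C into (C1, C0), then move C0 to the innermost position.
    return x.reshape(n, c1, c0, h, w).transpose(0, 1, 3, 4, 2)

# float16 data, so C0 is 16; 32 channels give C1 = 2.
x = np.arange(1 * 32 * 2 * 2, dtype=np.float16).reshape(1, 32, 2, 2)
y = nchw_to_nc1hwc0(x, c0=16)
print(y.shape)
```

Element y[n, c1, h, w, c0] holds x[n, c1 * C0 + c0, h, w], which is a convenient check when comparing against device output.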

Prototype

vec_trans_scatter (dst_high_half, src_high_half, dst_list, src_list, repeat_times, dst_rep_stride, src_rep_stride)

Parameters

Table 1 Parameter description

Parameter

Input/Output

Description

dst_high_half

Input

A bool specifying whether the data of dst_list[*] is stored in the upper or lower half of the block. This parameter applies only when the data type is int8 or uint8.

Selected from:

  • True: upper half
  • False: lower half

src_high_half

Input

A bool specifying whether the data of src_list[*] is read from the upper or lower half of the block. This parameter applies only when the data type is int8 or uint8.

Selected from:

  • True: upper half
  • False: lower half

dst_list

Output

A list for the destination Vector operand sequence. Each element marks the start of a destination operand. The scope of the tensor is the Unified Buffer.

The supported data types are as follows:

Atlas 200/300/500 Inference Product: Tensor of type int8/uint8/int16/uint16/float16

Atlas Training Series Product: Tensor of type int8/uint8/int16/uint16/float16

src_list

Input

A list for the source Vector operand sequence. Each element marks the start of a source operand. Has the same data type as dst_list.

The scope of the tensor is the Unified Buffer.

repeat_times

Input

Repeat times (or iterations), in blocks. Must be in the range of [0, 255]. Must be a Scalar of type int16/int32/int64/uint16/uint32/uint64, an immediate of type int, or an Expr of type int16/int32/int64/uint16/uint32/uint64.

Notes:

1. When repeat_times = 1, the valid start of a destination or source operand is the start of dst_list or src_list plus dst_rep_stride or src_rep_stride.

2. When repeat_times > 1, the valid start of a destination or source operand in the first repeat is the start of dst_list or src_list. In the second repeat, dst_rep_stride or src_rep_stride is added to that start. The same rule applies to each subsequent repeat.

dst_rep_stride

Input

Start address stride (in blocks) between iterations of the destination operand. Must be in the range of [0, 65535]. Must be a Scalar of type int16/int32/int64/uint16/uint32/uint64, an immediate of type int, or an Expr of type int16/int32/int64/uint16/uint32/uint64.

src_rep_stride

Input

Start address stride (in blocks) between iterations of the source operand. Must be in the range of [0, 65535]. Must be a Scalar of type int16/int32/int64/uint16/uint32/uint64, an immediate of type int, or an Expr of type int16/int32/int64/uint16/uint32/uint64.
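To make the roles of dst_list, src_list, repeat_times, and the two strides concrete, here is a host-side NumPy model for 2-byte data types such as float16. It is a sketch inferred from the parameter descriptions and from Example 2 on this page, not the TIK implementation. The name trans_scatter_model and the use of plain element offsets in place of tensor slices are assumptions made for illustration. The repeat_times = 1 branch follows Note 1 literally.

```python
import numpy as np

ELEMS_PER_BLOCK = 16  # a 32-byte block holds 16 float16 values

def trans_scatter_model(dst, src, dst_starts, src_starts,
                        repeat_times, dst_rep_stride, src_rep_stride):
    """Host-side NumPy model for 2-byte data types (not the TIK API itself).

    dst_starts/src_starts are element offsets standing in for dst_list/src_list.
    """
    assert len(dst_starts) == 16 and len(src_starts) == 16
    for r in range(repeat_times):
        # Note 1, read literally: a single repeat still adds the stride once.
        step = 1 if repeat_times == 1 else r
        d_off = step * dst_rep_stride * ELEMS_PER_BLOCK
        s_off = step * src_rep_stride * ELEMS_PER_BLOCK
        # Gather the 16 source blocks into a 16x16 tile (np.stack copies the data).
        tile = np.stack([src[s + s_off:s + s_off + ELEMS_PER_BLOCK] for s in src_starts])
        # Destination block k receives column k of the tile.
        for k, d in enumerate(dst_starts):
            dst[d + d_off:d + d_off + ELEMS_PER_BLOCK] = tile[:, k]
    return dst

# Same shapes and arguments as Example 2: three repeats, 16 blocks apart.
src = np.arange(3 * 16 * 16, dtype=np.float16)
dst = np.zeros_like(src)
starts = [16 * i for i in range(16)]
out = trans_scatter_model(dst, src, starts, starts, 3, 16, 16).reshape(3, 16, 16)
print(out[0, 1, :4])
```

In this model each repeat transposes one 16 x 16 tile, so with the Example 2 arguments the output is the per-batch transpose of the input.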

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Restrictions

  • Generally, each element in src_list or dst_list is configured as the start of each HW plane.
  • For better performance when the data type is int8 or uint8, keep dst_high_half and src_high_half fixed during the repeats in the H and W directions, and change them only after those repeats are complete.
  • The mask value does not affect the execution of the API.
  • To save memory space, you can define a tensor that is reused by the source and destination operands (which means they have overlapping addresses). The general instruction restrictions are as follows.
    • For a single repeat (repeat_times = 1), the source operand sequence and the destination operand sequence must be exactly the same. Partial overlapping is not supported. Each block must be the same.
    • For multiple repeats (repeat_times > 1), if there is a dependency between the source operand sequence and the destination operand sequence, that is, the destination operand of the Nth iteration is the source operand of the (N+1)th iteration, address overlapping is not allowed.
  • For details about the alignment requirements of the operand address offset, see General Restrictions.
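The first restriction (each list element is the start of an HW plane) can be illustrated with a host-side NumPy sketch for float16 data with C = 16. The function model below is a hypothetical stand-in for the instruction, assuming each repeat transposes one 16 x 16 tile and that repeat_times > 1. The plane size and the stride choices are illustrative assumptions, not values taken from this page.

```python
import numpy as np

C0 = 16          # float16: 16 elements per 32-byte block
H, W = 4, 8      # hypothetical plane size; H * W must be a multiple of 16 here
plane = H * W

def model(dst, src, dst_starts, src_starts, repeat_times, dst_rep_stride, src_rep_stride):
    """NumPy stand-in for the instruction on 2-byte data (repeat_times > 1 assumed)."""
    for r in range(repeat_times):
        d_off = r * dst_rep_stride * C0
        s_off = r * src_rep_stride * C0
        tile = np.stack([src[s + s_off:s + s_off + C0] for s in src_starts])
        for k, d in enumerate(dst_starts):
            dst[d + d_off:d + d_off + C0] = tile[:, k]
    return dst

nchw = np.arange(C0 * plane, dtype=np.float16).reshape(1, C0, H, W)
src = nchw.ravel()
dst = np.zeros_like(src)

src_starts = [c * plane for c in range(C0)]  # start of each HW plane
dst_starts = [k * C0 for k in range(16)]     # 16 consecutive C0 groups
model(dst, src, dst_starts, src_starts,
      repeat_times=plane // 16,  # one repeat per 16 HW positions
      dst_rep_stride=16,         # 16 blocks: 16 positions of C0 elements each
      src_rep_stride=1)          # 1 block: the next 16 positions of every plane

nc1hwc0 = dst.reshape(1, 1, H, W, C0)
print(nc1hwc0.shape)
```

With this configuration every repeat reads 16 consecutive HW positions from each of the 16 channel planes and writes them as 16 contiguous C0 groups, which is the NC1HWC0 ordering.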

Returns

None

Example

Example 1

from tbe import tik

tik_instance = tik.Tik()
shape = (3, 32, 16)
dtype = "int8"
# Number of iterations
repeat_time = 3
# Address stride between two adjacent iterations, in 32 bytes.
dst_rep_stride = 16
src_rep_stride = 16

src_gm = tik_instance.Tensor(dtype, shape, name="src_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor(dtype, shape, name="src_ub", scope=tik.scope_ubuf)
dst0_gm = tik_instance.Tensor(dtype, shape, name="dst0_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor(dtype, shape, name="dst_gm", scope=tik.scope_gm)
dst_ub = tik_instance.Tensor(dtype, shape, name="dst_ub", scope=tik.scope_ubuf)
# Specifies whether dst/src_list[*] stores data to the upper or lower half of the block. Only int8 and uint8 are supported.
dstHighHalf = False
srcHighHalf = True

# Copy the user input to the source Unified Buffer. For details about data_move, see the corresponding section.
tik_instance.data_move(src_ub, src_gm, 0, 1, 48, 0, 0)
# vec_dup does not support int8, so dst_ub is initialized by copying dst0_gm instead. Fill dst0_gm with 0 to make the result easier to observe.
tik_instance.data_move(dst_ub, dst0_gm, 0, 1, 48, 0, 0)

dst_list = [dst_ub[32 * i] for i in range(16)]
src_list = [src_ub[32 * i] for i in range(16)]
tik_instance.vec_trans_scatter(dstHighHalf, srcHighHalf, dst_list, src_list, repeat_time, dst_rep_stride, src_rep_stride)
# Copy the compute result to the destination Global Memory. For details about data_move, see the corresponding section.
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 48, 0, 0)

tik_instance.BuildCCE(kernel_name="vec_trans_scatter", inputs=[src_gm, dst0_gm], outputs=[dst_gm])

Result example:

Input (src_gm):
[[[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]]

 [[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]]

 [[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]
  [  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15]
  [ 16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31]
  [ 32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47]
  [ 48  49  50  51  52  53  54  55  56  57  58  59  60  61  62  63]
  [ 64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79]
  [ 80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95]
  [ 96  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111]
  [112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127]]]
Output (dst_gm):
[[[ 16  48  80 112  16  48  80 112  16  48  80 112  16  48  80 112]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 17  49  81 113  17  49  81 113  17  49  81 113  17  49  81 113]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 18  50  82 114  18  50  82 114  18  50  82 114  18  50  82 114]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 19  51  83 115  19  51  83 115  19  51  83 115  19  51  83 115]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 20  52  84 116  20  52  84 116  20  52  84 116  20  52  84 116]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 21  53  85 117  21  53  85 117  21  53  85 117  21  53  85 117]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 22  54  86 118  22  54  86 118  22  54  86 118  22  54  86 118]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 23  55  87 119  23  55  87 119  23  55  87 119  23  55  87 119]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 24  56  88 120  24  56  88 120  24  56  88 120  24  56  88 120]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 25  57  89 121  25  57  89 121  25  57  89 121  25  57  89 121]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 26  58  90 122  26  58  90 122  26  58  90 122  26  58  90 122]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 27  59  91 123  27  59  91 123  27  59  91 123  27  59  91 123]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 28  60  92 124  28  60  92 124  28  60  92 124  28  60  92 124]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 29  61  93 125  29  61  93 125  29  61  93 125  29  61  93 125]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 30  62  94 126  30  62  94 126  30  62  94 126  30  62  94 126]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 31  63  95 127  31  63  95 127  31  63  95 127  31  63  95 127]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]]

 [[ 16  48  80 112  16  48  80 112  16  48  80 112  16  48  80 112]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 17  49  81 113  17  49  81 113  17  49  81 113  17  49  81 113]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 18  50  82 114  18  50  82 114  18  50  82 114  18  50  82 114]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 19  51  83 115  19  51  83 115  19  51  83 115  19  51  83 115]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 20  52  84 116  20  52  84 116  20  52  84 116  20  52  84 116]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 21  53  85 117  21  53  85 117  21  53  85 117  21  53  85 117]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 22  54  86 118  22  54  86 118  22  54  86 118  22  54  86 118]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 23  55  87 119  23  55  87 119  23  55  87 119  23  55  87 119]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 24  56  88 120  24  56  88 120  24  56  88 120  24  56  88 120]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 25  57  89 121  25  57  89 121  25  57  89 121  25  57  89 121]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 26  58  90 122  26  58  90 122  26  58  90 122  26  58  90 122]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 27  59  91 123  27  59  91 123  27  59  91 123  27  59  91 123]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 28  60  92 124  28  60  92 124  28  60  92 124  28  60  92 124]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 29  61  93 125  29  61  93 125  29  61  93 125  29  61  93 125]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 30  62  94 126  30  62  94 126  30  62  94 126  30  62  94 126]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 31  63  95 127  31  63  95 127  31  63  95 127  31  63  95 127]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]]

 [[ 16  48  80 112  16  48  80 112  16  48  80 112  16  48  80 112]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 17  49  81 113  17  49  81 113  17  49  81 113  17  49  81 113]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 18  50  82 114  18  50  82 114  18  50  82 114  18  50  82 114]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 19  51  83 115  19  51  83 115  19  51  83 115  19  51  83 115]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 20  52  84 116  20  52  84 116  20  52  84 116  20  52  84 116]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 21  53  85 117  21  53  85 117  21  53  85 117  21  53  85 117]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 22  54  86 118  22  54  86 118  22  54  86 118  22  54  86 118]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 23  55  87 119  23  55  87 119  23  55  87 119  23  55  87 119]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 24  56  88 120  24  56  88 120  24  56  88 120  24  56  88 120]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 25  57  89 121  25  57  89 121  25  57  89 121  25  57  89 121]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 26  58  90 122  26  58  90 122  26  58  90 122  26  58  90 122]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 27  59  91 123  27  59  91 123  27  59  91 123  27  59  91 123]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 28  60  92 124  28  60  92 124  28  60  92 124  28  60  92 124]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 29  61  93 125  29  61  93 125  29  61  93 125  29  61  93 125]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 30  62  94 126  30  62  94 126  30  62  94 126  30  62  94 126]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
  [ 31  63  95 127  31  63  95 127  31  63  95 127  31  63  95 127]
  [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]]]
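The Example 1 result can be reproduced on the host with a small NumPy model of the int8 case. The model is an interpretation inferred from the input and output shown above, not a definition of the hardware behavior: it assumes that the upper half of a 32-element block is elements 16 to 31, that each repeat transposes a 16 x 16 tile of half blocks, and that repeat_times > 1. The function name and the element offsets used in place of tensor slices are hypothetical.

```python
import numpy as np

BLOCK = 32  # a 32-byte block holds 32 int8 values
HALF = 16

def trans_scatter_b8_model(dst_high_half, src_high_half, dst, src,
                           dst_starts, src_starts,
                           repeat_times, dst_rep_stride, src_rep_stride):
    """Host-side model of the int8/uint8 case (repeat_times > 1 assumed)."""
    d_half = HALF if dst_high_half else 0
    s_half = HALF if src_high_half else 0
    for r in range(repeat_times):
        d_off = r * dst_rep_stride * BLOCK + d_half
        s_off = r * src_rep_stride * BLOCK + s_half
        # One selected half block (16 values) from each of the 16 source blocks.
        tile = np.stack([src[s + s_off:s + s_off + HALF] for s in src_starts])
        # The selected half of destination block k receives column k of the tile.
        for k, d in enumerate(dst_starts):
            dst[d + d_off:d + d_off + HALF] = tile[:, k]
    return dst

# Same data as Example 1: the values 0..127 repeated, zero-initialized destination.
src = np.tile(np.arange(128, dtype=np.int8), 12)
dst = np.zeros_like(src)
starts = [32 * i for i in range(16)]
out = trans_scatter_b8_model(False, True, dst, src, starts, starts, 3, 16, 16)
out = out.reshape(3, 32, 16)
print(out[0, 0])
print(out[0, 2])
```

Odd rows stay 0 because dstHighHalf is False, so only the lower half of each destination block is written.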

Example 2

from tbe import tik

tik_instance = tik.Tik(disable_debug=False)
shape = (3, 16, 16)
dtype = "float16"
# Number of iterations.
repeat_time = 3
# Address stride between two adjacent iterations, in 32 bytes.
dst_rep_stride = 16
src_rep_stride = 16

src_gm = tik_instance.Tensor(dtype, shape, name="src_gm", scope=tik.scope_gm)
src_ub = tik_instance.Tensor(dtype, shape, name="src_ub", scope=tik.scope_ubuf)
dst0_gm = tik_instance.Tensor(dtype, shape, name="dst0_gm", scope=tik.scope_gm)
dst_gm = tik_instance.Tensor(dtype, shape, name="dst_gm", scope=tik.scope_gm)
dst_ub = tik_instance.Tensor(dtype, shape, name="dst_ub", scope=tik.scope_ubuf)
# Upper/lower half selection for dst/src_list[*]. It applies only to int8 and uint8 data, so both flags are set to False for float16.
dstHighHalf = False
srcHighHalf = False

# Copy the user input to the source Unified Buffer. For details about data_move, see the corresponding section.
tik_instance.data_move(src_ub, src_gm, 0, 1, 48, 0, 0)
dst_list = [dst_ub[16 * i] for i in range(16)]
src_list = [src_ub[16 * i] for i in range(16)]
tik_instance.vec_trans_scatter(dstHighHalf, srcHighHalf, dst_list, src_list, repeat_time, dst_rep_stride, src_rep_stride)
# Copy the compute result to the destination Global Memory. For details about data_move, see the corresponding section.
tik_instance.data_move(dst_gm, dst_ub, 0, 1, 48, 0, 0)

tik_instance.BuildCCE(kernel_name="vec_trans_scatter", inputs=[src_gm], outputs=[dst_gm])

Result example:

Input (src_gm):
[[[  0.   1.   2.   3.   4.   5.   6.   7.   8.   9.  10.  11.  12.  13.
    14.  15.]
  [ 16.  17.  18.  19.  20.  21.  22.  23.  24.  25.  26.  27.  28.  29.
    30.  31.]
  [ 32.  33.  34.  35.  36.  37.  38.  39.  40.  41.  42.  43.  44.  45.
    46.  47.]
  [ 48.  49.  50.  51.  52.  53.  54.  55.  56.  57.  58.  59.  60.  61.
    62.  63.]
  [ 64.  65.  66.  67.  68.  69.  70.  71.  72.  73.  74.  75.  76.  77.
    78.  79.]
  [ 80.  81.  82.  83.  84.  85.  86.  87.  88.  89.  90.  91.  92.  93.
    94.  95.]
  [ 96.  97.  98.  99. 100. 101. 102. 103. 104. 105. 106. 107. 108. 109.
   110. 111.]
  [112. 113. 114. 115. 116. 117. 118. 119. 120. 121. 122. 123. 124. 125.
   126. 127.]
  [128. 129. 130. 131. 132. 133. 134. 135. 136. 137. 138. 139. 140. 141.
   142. 143.]
  [144. 145. 146. 147. 148. 149. 150. 151. 152. 153. 154. 155. 156. 157.
   158. 159.]
  [160. 161. 162. 163. 164. 165. 166. 167. 168. 169. 170. 171. 172. 173.
   174. 175.]
  [176. 177. 178. 179. 180. 181. 182. 183. 184. 185. 186. 187. 188. 189.
   190. 191.]
  [192. 193. 194. 195. 196. 197. 198. 199. 200. 201. 202. 203. 204. 205.
   206. 207.]
  [208. 209. 210. 211. 212. 213. 214. 215. 216. 217. 218. 219. 220. 221.
   222. 223.]
  [224. 225. 226. 227. 228. 229. 230. 231. 232. 233. 234. 235. 236. 237.
   238. 239.]
  [240. 241. 242. 243. 244. 245. 246. 247. 248. 249. 250. 251. 252. 253.
   254. 255.]]

 [[256. 257. 258. 259. 260. 261. 262. 263. 264. 265. 266. 267. 268. 269.
   270. 271.]
  [272. 273. 274. 275. 276. 277. 278. 279. 280. 281. 282. 283. 284. 285.
   286. 287.]
  [288. 289. 290. 291. 292. 293. 294. 295. 296. 297. 298. 299. 300. 301.
   302. 303.]
  [304. 305. 306. 307. 308. 309. 310. 311. 312. 313. 314. 315. 316. 317.
   318. 319.]
  [320. 321. 322. 323. 324. 325. 326. 327. 328. 329. 330. 331. 332. 333.
   334. 335.]
  [336. 337. 338. 339. 340. 341. 342. 343. 344. 345. 346. 347. 348. 349.
   350. 351.]
  [352. 353. 354. 355. 356. 357. 358. 359. 360. 361. 362. 363. 364. 365.
   366. 367.]
  [368. 369. 370. 371. 372. 373. 374. 375. 376. 377. 378. 379. 380. 381.
   382. 383.]
  [384. 385. 386. 387. 388. 389. 390. 391. 392. 393. 394. 395. 396. 397.
   398. 399.]
  [400. 401. 402. 403. 404. 405. 406. 407. 408. 409. 410. 411. 412. 413.
   414. 415.]
  [416. 417. 418. 419. 420. 421. 422. 423. 424. 425. 426. 427. 428. 429.
   430. 431.]
  [432. 433. 434. 435. 436. 437. 438. 439. 440. 441. 442. 443. 444. 445.
   446. 447.]
  [448. 449. 450. 451. 452. 453. 454. 455. 456. 457. 458. 459. 460. 461.
   462. 463.]
  [464. 465. 466. 467. 468. 469. 470. 471. 472. 473. 474. 475. 476. 477.
   478. 479.]
  [480. 481. 482. 483. 484. 485. 486. 487. 488. 489. 490. 491. 492. 493.
   494. 495.]
  [496. 497. 498. 499. 500. 501. 502. 503. 504. 505. 506. 507. 508. 509.
   510. 511.]]

 [[512. 513. 514. 515. 516. 517. 518. 519. 520. 521. 522. 523. 524. 525.
   526. 527.]
  [528. 529. 530. 531. 532. 533. 534. 535. 536. 537. 538. 539. 540. 541.
   542. 543.]
  [544. 545. 546. 547. 548. 549. 550. 551. 552. 553. 554. 555. 556. 557.
   558. 559.]
  [560. 561. 562. 563. 564. 565. 566. 567. 568. 569. 570. 571. 572. 573.
   574. 575.]
  [576. 577. 578. 579. 580. 581. 582. 583. 584. 585. 586. 587. 588. 589.
   590. 591.]
  [592. 593. 594. 595. 596. 597. 598. 599. 600. 601. 602. 603. 604. 605.
   606. 607.]
  [608. 609. 610. 611. 612. 613. 614. 615. 616. 617. 618. 619. 620. 621.
   622. 623.]
  [624. 625. 626. 627. 628. 629. 630. 631. 632. 633. 634. 635. 636. 637.
   638. 639.]
  [640. 641. 642. 643. 644. 645. 646. 647. 648. 649. 650. 651. 652. 653.
   654. 655.]
  [656. 657. 658. 659. 660. 661. 662. 663. 664. 665. 666. 667. 668. 669.
   670. 671.]
  [672. 673. 674. 675. 676. 677. 678. 679. 680. 681. 682. 683. 684. 685.
   686. 687.]
  [688. 689. 690. 691. 692. 693. 694. 695. 696. 697. 698. 699. 700. 701.
   702. 703.]
  [704. 705. 706. 707. 708. 709. 710. 711. 712. 713. 714. 715. 716. 717.
   718. 719.]
  [720. 721. 722. 723. 724. 725. 726. 727. 728. 729. 730. 731. 732. 733.
   734. 735.]
  [736. 737. 738. 739. 740. 741. 742. 743. 744. 745. 746. 747. 748. 749.
   750. 751.]
  [752. 753. 754. 755. 756. 757. 758. 759. 760. 761. 762. 763. 764. 765.
   766. 767.]]]
Output (dst_gm):
[[[  0.  16.  32.  48.  64.  80.  96. 112. 128. 144. 160. 176. 192. 208.
   224. 240.]
  [  1.  17.  33.  49.  65.  81.  97. 113. 129. 145. 161. 177. 193. 209.
   225. 241.]
  [  2.  18.  34.  50.  66.  82.  98. 114. 130. 146. 162. 178. 194. 210.
   226. 242.]
  [  3.  19.  35.  51.  67.  83.  99. 115. 131. 147. 163. 179. 195. 211.
   227. 243.]
  [  4.  20.  36.  52.  68.  84. 100. 116. 132. 148. 164. 180. 196. 212.
   228. 244.]
  [  5.  21.  37.  53.  69.  85. 101. 117. 133. 149. 165. 181. 197. 213.
   229. 245.]
  [  6.  22.  38.  54.  70.  86. 102. 118. 134. 150. 166. 182. 198. 214.
   230. 246.]
  [  7.  23.  39.  55.  71.  87. 103. 119. 135. 151. 167. 183. 199. 215.
   231. 247.]
  [  8.  24.  40.  56.  72.  88. 104. 120. 136. 152. 168. 184. 200. 216.
   232. 248.]
  [  9.  25.  41.  57.  73.  89. 105. 121. 137. 153. 169. 185. 201. 217.
   233. 249.]
  [ 10.  26.  42.  58.  74.  90. 106. 122. 138. 154. 170. 186. 202. 218.
   234. 250.]
  [ 11.  27.  43.  59.  75.  91. 107. 123. 139. 155. 171. 187. 203. 219.
   235. 251.]
  [ 12.  28.  44.  60.  76.  92. 108. 124. 140. 156. 172. 188. 204. 220.
   236. 252.]
  [ 13.  29.  45.  61.  77.  93. 109. 125. 141. 157. 173. 189. 205. 221.
   237. 253.]
  [ 14.  30.  46.  62.  78.  94. 110. 126. 142. 158. 174. 190. 206. 222.
   238. 254.]
  [ 15.  31.  47.  63.  79.  95. 111. 127. 143. 159. 175. 191. 207. 223.
   239. 255.]]

 [[256. 272. 288. 304. 320. 336. 352. 368. 384. 400. 416. 432. 448. 464.
   480. 496.]
  [257. 273. 289. 305. 321. 337. 353. 369. 385. 401. 417. 433. 449. 465.
   481. 497.]
  [258. 274. 290. 306. 322. 338. 354. 370. 386. 402. 418. 434. 450. 466.
   482. 498.]
  [259. 275. 291. 307. 323. 339. 355. 371. 387. 403. 419. 435. 451. 467.
   483. 499.]
  [260. 276. 292. 308. 324. 340. 356. 372. 388. 404. 420. 436. 452. 468.
   484. 500.]
  [261. 277. 293. 309. 325. 341. 357. 373. 389. 405. 421. 437. 453. 469.
   485. 501.]
  [262. 278. 294. 310. 326. 342. 358. 374. 390. 406. 422. 438. 454. 470.
   486. 502.]
  [263. 279. 295. 311. 327. 343. 359. 375. 391. 407. 423. 439. 455. 471.
   487. 503.]
  [264. 280. 296. 312. 328. 344. 360. 376. 392. 408. 424. 440. 456. 472.
   488. 504.]
  [265. 281. 297. 313. 329. 345. 361. 377. 393. 409. 425. 441. 457. 473.
   489. 505.]
  [266. 282. 298. 314. 330. 346. 362. 378. 394. 410. 426. 442. 458. 474.
   490. 506.]
  [267. 283. 299. 315. 331. 347. 363. 379. 395. 411. 427. 443. 459. 475.
   491. 507.]
  [268. 284. 300. 316. 332. 348. 364. 380. 396. 412. 428. 444. 460. 476.
   492. 508.]
  [269. 285. 301. 317. 333. 349. 365. 381. 397. 413. 429. 445. 461. 477.
   493. 509.]
  [270. 286. 302. 318. 334. 350. 366. 382. 398. 414. 430. 446. 462. 478.
   494. 510.]
  [271. 287. 303. 319. 335. 351. 367. 383. 399. 415. 431. 447. 463. 479.
   495. 511.]]

 [[512. 528. 544. 560. 576. 592. 608. 624. 640. 656. 672. 688. 704. 720.
   736. 752.]
  [513. 529. 545. 561. 577. 593. 609. 625. 641. 657. 673. 689. 705. 721.
   737. 753.]
  [514. 530. 546. 562. 578. 594. 610. 626. 642. 658. 674. 690. 706. 722.
   738. 754.]
  [515. 531. 547. 563. 579. 595. 611. 627. 643. 659. 675. 691. 707. 723.
   739. 755.]
  [516. 532. 548. 564. 580. 596. 612. 628. 644. 660. 676. 692. 708. 724.
   740. 756.]
  [517. 533. 549. 565. 581. 597. 613. 629. 645. 661. 677. 693. 709. 725.
   741. 757.]
  [518. 534. 550. 566. 582. 598. 614. 630. 646. 662. 678. 694. 710. 726.
   742. 758.]
  [519. 535. 551. 567. 583. 599. 615. 631. 647. 663. 679. 695. 711. 727.
   743. 759.]
  [520. 536. 552. 568. 584. 600. 616. 632. 648. 664. 680. 696. 712. 728.
   744. 760.]
  [521. 537. 553. 569. 585. 601. 617. 633. 649. 665. 681. 697. 713. 729.
   745. 761.]
  [522. 538. 554. 570. 586. 602. 618. 634. 650. 666. 682. 698. 714. 730.
   746. 762.]
  [523. 539. 555. 571. 587. 603. 619. 635. 651. 667. 683. 699. 715. 731.
   747. 763.]
  [524. 540. 556. 572. 588. 604. 620. 636. 652. 668. 684. 700. 716. 732.
   748. 764.]
  [525. 541. 557. 573. 589. 605. 621. 637. 653. 669. 685. 701. 717. 733.
   749. 765.]
  [526. 542. 558. 574. 590. 606. 622. 638. 654. 670. 686. 702. 718. 734.
   750. 766.]
  [527. 543. 559. 575. 591. 607. 623. 639. 655. 671. 687. 703. 719. 735.
   751. 767.]]]