SetVectorMask
Applicability
| Product | Supported/Unsupported |
|---|---|
|  | √ |
|  | √ |
|  | √ |
|  | √ |
|  | x |
|  | √ |
Function Usage
- In normal mode, the mask parameter controls the number of elements involved in computation in a single repeat. In this case, there are two modes:
  - Contiguous mode: indicates the number of contiguous elements that participate in computation. The value range is related to the operand data type. The maximum number of elements that can be processed in each repeat varies according to the data type. When the operand is 16-bit, mask ∈ [1, 128]. When the operand is 32-bit, mask ∈ [1, 64]. When the operand is 64-bit, mask ∈ [1, 32].
  - Bitwise mode: controls which elements are involved in computation by bit. If the value of a bit is 1, the element is involved in computation. If the value of a bit is 0, the element is not involved in computation. There are maskHigh (high-order mask) and maskLow (low-order mask). The parameter value range is related to the operand data type. The maximum number of elements that can be processed in each repeat varies according to the data type. When the operand is 16-bit, maskLow and maskHigh ∈ [0, 2^64 - 1] and cannot be 0 at the same time. When the operand is 32-bit, maskHigh is 0 and maskLow ∈ (0, 2^64 - 1]. When the operand is 64-bit, maskHigh is 0 and maskLow ∈ (0, 2^32 - 1].
- In counter mode, the mask parameter indicates the number of elements involved in the entire vector computation.
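The two normal-mode mask forms describe the same selection in different ways. As a minimal host-side sketch (plain C++, not Ascend C kernel code), the hypothetical helper `ContiguousToBitwise` below converts a contiguous mask value into an equivalent maskHigh/maskLow pair. It assumes that bit i of maskLow selects element i and bit i of maskHigh selects element 64 + i, and that one repeat covers 256 bytes, which is inferred from the ranges above (128 × 2 B = 64 × 4 B = 32 × 8 B). `uint16_t` stands in for a 16-bit operand type such as half.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side helper types; none of this is part of Ascend C.
struct BitwiseMask {
    uint64_t maskHigh;  // assumed to select elements 64..127
    uint64_t maskLow;   // assumed to select elements 0..63
};

// Assumption: one repeat covers 256 bytes (inferred from the documented ranges).
constexpr uint32_t kBytesPerRepeat = 256;

// Maximum contiguous mask value for an operand type of size sizeof(T).
template <typename T>
constexpr uint32_t MaxElementsPerRepeat() {
    return static_cast<uint32_t>(kBytesPerRepeat / sizeof(T));
}

// Returns a value with the lowest n bits set, for n in [0, 64].
constexpr uint64_t LowBits(uint32_t n) {
    return n >= 64 ? ~0ULL : ((1ULL << n) - 1ULL);
}

// Converts a contiguous mask (number of leading elements) to a bitwise mask.
template <typename T>
BitwiseMask ContiguousToBitwise(uint32_t len) {
    assert(len >= 1 && len <= MaxElementsPerRepeat<T>());
    const uint32_t lowCount = len < 64 ? len : 64;  // elements 0..63
    const uint32_t highCount = len - lowCount;      // elements 64..127
    return BitwiseMask{LowBits(highCount), LowBits(lowCount)};
}
```

Under these assumptions, a contiguous mask of 70 on a 16-bit operand corresponds to maskLow = 0xffffffffffffffff and maskHigh = 0x3f, and a contiguous mask of 32 on a 64-bit operand corresponds to maskHigh = 0 and maskLow = 0xffffffff, which matches the upper bound stated for bitwise mode.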
Prototype
- Applies to the bitwise mask mode in normal mode, and to counter mode.
    template <typename T, MaskMode mode = MaskMode::NORMAL>
    __aicore__ static inline void SetVectorMask(const uint64_t maskHigh, const uint64_t maskLow)
- Applies to the contiguous mask mode in normal mode, and to counter mode.
    template <typename T, MaskMode mode = MaskMode::NORMAL>
    __aicore__ static inline void SetVectorMask(int32_t len)
Parameters
| Parameter | Description |
|---|---|
| T | Data type of the vector computation operand. |
| mode | Mask mode, of the MaskMode type. The values used in this document are MaskMode::NORMAL (the default) and MaskMode::COUNTER. |

| Parameter | Input/Output | Description |
|---|---|---|
| maskHigh | Input | Normal mode: corresponds to the bitwise mask mode in normal mode and controls the elements that participate in computation by bit. High-order mask values are input. Counter mode: Set this parameter to 0. This parameter does not take effect. |
| maskLow | Input | Normal mode: corresponds to the bitwise mask mode in normal mode and controls the elements that participate in computation by bit. Low-order mask values are input. Counter mode: indicates the number of elements involved in the entire vector computation. |
| len | Input | Normal mode: corresponds to the contiguous mask mode in normal mode, indicating the number of contiguous elements that participate in computation in a single repeat. Counter mode: indicates the number of elements involved in the entire vector computation. |
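
The way each argument is interpreted depends on the mask mode. The sketch below is a host-side model of the parameter descriptions above, not the Ascend C implementation: `MaskModeSketch` and `SelectedElements` are hypothetical names introduced only for illustration. It returns the number of elements a given call would select, per repeat in normal mode and for the whole computation in counter mode.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Local stand-in for the mask mode; the real MaskMode type comes from the
// Ascend C headers and is not reproduced here.
enum class MaskModeSketch { NORMAL, COUNTER };

// Models the (maskHigh, maskLow) overload.
// NORMAL: each set bit selects one element in a single repeat.
// COUNTER: maskHigh does not take effect; maskLow is the total element count.
inline uint64_t SelectedElements(MaskModeSketch mode, uint64_t maskHigh, uint64_t maskLow) {
    if (mode == MaskModeSketch::COUNTER) {
        return maskLow;
    }
    return std::bitset<64>(maskHigh).count() + std::bitset<64>(maskLow).count();
}

// Models the (len) overload.
// NORMAL: len contiguous elements per repeat.
// COUNTER: len elements in the entire vector computation.
inline uint64_t SelectedElements(MaskModeSketch mode, int32_t len) {
    (void)mode;  // the count is len in both modes; only its scope differs
    assert(len > 0);
    return static_cast<uint64_t>(len);
}
```

In this model, passing all-ones for both masks in normal mode selects 128 elements per repeat, while passing (0, 300) in counter mode selects 300 elements in total.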
Returns
None
Constraints
This API takes effect only when the isSetMask template parameter of the vector computation API is set to false. After using this API, call ResetMask to restore the mask to its default value.
Examples
This API can be used together with SetMaskCount and SetMaskNorm. Set the mask mode before setting the mask.
- Calling examples (normal mode)
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    AscendC::LocalTensor<half> src1Local;

    // Normal mode
    AscendC::SetMaskNorm();
    AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(0xffffffffffffffff, 0xffffffffffffffff); // Bitwise mode
    // AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(128); // Contiguous mode

    // When calling several vector computation APIs in a row, set the mask once here in normal mode.
    // The mask then does not need to be set again inside each API call, which improves performance.
    // dstBlkStride, src0BlkStride, src1BlkStride = 1: data is read and written contiguously within a single repeat.
    // dstRepStride, src0RepStride, src1RepStride = 8: data is read and written contiguously between adjacent repeats.
    AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::ResetMask();
- Calling examples (counter mode)
    // The counter mode is used together with the API for high-dimensional tensor sharding computation.
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    AscendC::LocalTensor<half> src1Local;
    int32_t len = 128; // Number of elements involved in computation
    AscendC::SetMaskCount();
    AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(len);
    AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::SetMaskNorm();
    AscendC::ResetMask();

    // The counter mode is used together with the API for computing the first n data elements of the tensor.
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    half num = 2;
    AscendC::SetMaskCount();
    AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(128); // The number of elements involved in computation is 128.
    AscendC::Adds<half, false>(dstLocal, src0Local, num, 1);
    AscendC::Muls<half, false>(dstLocal, src0Local, num, 1);
    AscendC::SetMaskNorm();
    AscendC::ResetMask();
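
To see what counter mode saves, it can help to work out what normal mode would need for the same element count. The host-side sketch below (plain C++, not Ascend C kernel code) splits a total element count into full repeats plus a tail contiguous mask. `NormalModePlan` and `PlanNormalMode` are hypothetical names, and the 256-byte repeat size is an assumption inferred from the per-type mask ranges in Function Usage. In counter mode the total count is passed to SetVectorMask directly, so this split is presumably not written by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: how a normal-mode caller might cover N elements.
struct NormalModePlan {
    uint32_t fullRepeats;  // repeats that use the maximum contiguous mask
    uint32_t tailMask;     // contiguous mask for one extra repeat, 0 if none
};

// Assumption: one repeat covers 256 bytes, so 128 / 64 / 32 elements
// for 16-bit / 32-bit / 64-bit operands.
constexpr uint32_t kRepeatBytes = 256;

template <typename T>
NormalModePlan PlanNormalMode(uint32_t totalElements) {
    const uint32_t perRepeat = static_cast<uint32_t>(kRepeatBytes / sizeof(T));
    return NormalModePlan{totalElements / perRepeat, totalElements % perRepeat};
}
```

For example, 300 elements of a 16-bit type split into 2 full repeats of 128 elements plus a tail with a contiguous mask of 44, whereas a counter-mode call would pass 300 as the mask value.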