SetVectorMask
Function Usage
- In normal mode, the mask parameter is used to control the number of elements involved in computation in a single iteration. In this case, the following two modes are available. For details, see Mask Operations.
  - Contiguous mode: mask is the number of contiguous elements that participate in computation. Its upper bound is the maximum number of elements that can be processed in each iteration, which depends on the operand data type. When the operand is 16-bit, mask ∈ [1, 128]. When the operand is 32-bit, mask ∈ [1, 64]. When the operand is 64-bit, mask ∈ [1, 32].
  - Bitwise mode: selects the elements that participate in computation bit by bit. If a bit is set to 1, the corresponding element participates in the computation. If a bit is set to 0, the corresponding element is masked out. The mask is passed as two parameters, maskHigh and maskLow, whose value ranges depend on the operand data type because the maximum number of elements per iteration does. When the operand is 16-bit, maskLow and maskHigh ∈ [0, 2^64 - 1] and cannot both be 0. When the operand is 32-bit, maskHigh is 0 and maskLow ∈ (0, 2^64 - 1]. When the operand is 64-bit, maskHigh is 0 and maskLow ∈ (0, 2^32 - 1].
- In counter mode, the mask parameter indicates the number of elements involved in the entire vector computation.
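
The range rules above can be checked mechanically. The following host-side sketch encodes them as plain C++ predicates. All names in it (`MaxContiguousMask`, `IsValidContiguousMask`, `IsValidBitwiseMask`, `BitwiseMask`, `ContiguousToBitwise`) are hypothetical helpers for illustration and are not Ascend C APIs. The contiguous-to-bitwise conversion additionally assumes that bit i of maskLow selects element i and bit i of maskHigh selects element 64 + i; the source does not spell out this layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers; none of these are Ascend C APIs.

// Upper bound of the contiguous mask for an operand of elemBytes bytes,
// taken from the ranges listed above. Returns 0 for widths the text does not cover.
constexpr std::int32_t MaxContiguousMask(std::size_t elemBytes) {
    return elemBytes == 2 ? 128 : elemBytes == 4 ? 64 : elemBytes == 8 ? 32 : 0;
}

// Contiguous mode: mask must lie in [1, MaxContiguousMask].
constexpr bool IsValidContiguousMask(std::size_t elemBytes, std::int32_t mask) {
    return mask >= 1 && mask <= MaxContiguousMask(elemBytes);
}

// Bitwise mode: encodes the three per-width rules listed above.
constexpr bool IsValidBitwiseMask(std::size_t elemBytes, std::uint64_t maskHigh,
                                  std::uint64_t maskLow) {
    return elemBytes == 2 ? (maskHigh != 0 || maskLow != 0)
         : elemBytes == 4 ? (maskHigh == 0 && maskLow != 0)
         : elemBytes == 8 ? (maskHigh == 0 && maskLow != 0 && maskLow <= 0xFFFFFFFFull)
         : false;
}

struct BitwiseMask {
    std::uint64_t high;
    std::uint64_t low;
};

// Returns a value with the lowest n bits set; n is clamped to [0, 64].
constexpr std::uint64_t LowBits(std::int32_t n) {
    return n <= 0 ? 0ull : n >= 64 ? ~0ull : ((1ull << n) - 1);
}

// ASSUMPTION: bit i of low is element i, bit i of high is element 64 + i.
// Under that layout, a contiguous mask of len equals the lowest len bits set.
inline BitwiseMask ContiguousToBitwise(std::int32_t len) {
    assert(len >= 1 && len <= 128);
    return BitwiseMask{LowBits(len - 64), LowBits(len)};
}
```

Under the stated bit-layout assumption, `ContiguousToBitwise(128)` yields two all-ones words, which is consistent with the normal-mode calling example in this document, where the all-ones bitwise mask and the contiguous value 128 are shown as alternatives for `half`.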
Prototype
- Applies to the bitwise mask mode in normal mode and counter mode.

      template <typename T, MaskMode mode = MaskMode::NORMAL>
      __aicore__ static inline void SetVectorMask(const uint64_t maskHigh, const uint64_t maskLow)

- Applies to the contiguous mask mode in normal mode and counter mode.

      template <typename T, MaskMode mode = MaskMode::NORMAL>
      __aicore__ static inline void SetVectorMask(int32_t len)

Parameters
| Parameter | Description |
|---|---|
| T | Data type of the vector computation operand. |
| mode | Mask mode, of the MaskMode type. The values that appear in this document are MaskMode::NORMAL (the default) and MaskMode::COUNTER. |

| Parameter | Input/Output | Description |
|---|---|---|
| maskHigh | Input | Normal mode: high-order part of the bitwise mask, which controls by bit the elements that participate in computation. Counter mode: this parameter does not take effect; set it to 0. |
| maskLow | Input | Normal mode: low-order part of the bitwise mask, which controls by bit the elements that participate in computation. Counter mode: the number of elements involved in the entire vector computation. |
| len | Input | Normal mode: contiguous mask, that is, the number of contiguous elements that participate in computation in a single iteration. Counter mode: the number of elements involved in the entire vector computation. |
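
The difference between the two meanings of len (per iteration in normal mode, whole computation in counter mode) can be pictured with a small host-side model. This is a sketch under an assumption the source does not state: that a counter-mode element count is consumed as some number of full iterations followed by at most one partial iteration. The names `CounterPlan`, `PlanCounterMode`, and `TotalIterations` are hypothetical and are not Ascend C APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of how a total element count maps onto iterations.
struct CounterPlan {
    std::int32_t fullIterations;  // iterations that process elemsPerIter elements
    std::int32_t tailElems;       // elements in a final partial iteration (0 if none)
};

// len: total number of elements (the counter-mode meaning of the parameter).
// elemsPerIter: maximum elements per iteration, e.g. 128 for 16-bit operands.
inline CounterPlan PlanCounterMode(std::int32_t len, std::int32_t elemsPerIter) {
    assert(len >= 0 && elemsPerIter > 0);
    return CounterPlan{len / elemsPerIter, len % elemsPerIter};
}

// Total number of iterations, counting the partial one if present.
inline std::int32_t TotalIterations(const CounterPlan& plan) {
    return plan.fullIterations + (plan.tailElems != 0 ? 1 : 0);
}
```

For example, with 16-bit operands (128 elements per iteration), a count of 300 maps to two full iterations plus a tail of 44 elements in this model. In normal mode the caller would have to express that split explicitly through the per-iteration mask and the repeat count.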
Returns
None
Availability
Constraints
This API takes effect only when the isSetMask template parameter of the vector computation API is set to false. After using this API, you need to use ResetMask to restore mask to the default value.
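
One way to avoid forgetting the ResetMask call is a scope guard. The sketch below shows the idea on the host with stand-in functions in a `mock` namespace. These stand-ins only record call names and are not the real AscendC functions. `ScopedVectorMask` and `RunMaskedSection` are hypothetical names, and the source does not say whether such a guard is suitable for kernel code, so treat this as an illustration of the ordering requirement only.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace mock {
// Stand-ins that only log call names; NOT the real AscendC functions.
inline std::vector<std::string>& Log() {
    static std::vector<std::string> log;
    return log;
}
inline void SetVectorMask(std::uint64_t /*maskHigh*/, std::uint64_t /*maskLow*/) {
    Log().push_back("SetVectorMask");
}
inline void ResetMask() { Log().push_back("ResetMask"); }
}  // namespace mock

// Hypothetical guard: sets the mask on construction and restores the
// default on destruction, so every exit path of the scope resets the mask.
class ScopedVectorMask {
public:
    ScopedVectorMask(std::uint64_t maskHigh, std::uint64_t maskLow) {
        mock::SetVectorMask(maskHigh, maskLow);
    }
    ~ScopedVectorMask() { mock::ResetMask(); }
    ScopedVectorMask(const ScopedVectorMask&) = delete;
    ScopedVectorMask& operator=(const ScopedVectorMask&) = delete;
};

inline void RunMaskedSection() {
    ScopedVectorMask guard(~0ull, ~0ull);
    assert(mock::Log().back() == "SetVectorMask");
    // Placeholders for vector computation calls made with isSetMask = false.
    mock::Log().push_back("Add");
    mock::Log().push_back("Sub");
}  // guard is destroyed here, which logs "ResetMask"
```

After `RunMaskedSection()` returns, the recorded order is SetVectorMask, Add, Sub, ResetMask, which is the ordering the constraint above asks for.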
Example
This API can be used together with SetMaskCount and SetMaskNorm. Set the mask mode before setting mask.
- Calling examples (normal mode)

      AscendC::LocalTensor<half> dstLocal;
      AscendC::LocalTensor<half> src0Local;
      AscendC::LocalTensor<half> src1Local;

      // Normal mode
      AscendC::SetMaskNorm();
      AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(0xffffffffffffffff, 0xffffffffffffffff); // Bitwise mode
      // AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(128); // Contiguous mode

      // When calling vector computation APIs several times, set the mask once here in normal mode.
      // The mask then does not need to be set again in each API call, which improves performance.
      // dstBlkStride, src0BlkStride, src1BlkStride = 1: data is read and written contiguously within a single iteration.
      // dstRepStride, src0RepStride, src1RepStride = 8: data is read and written contiguously between adjacent iterations.
      AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::ResetMask();

- Calling examples (counter mode)

      // Example 1: counter mode used together with the API for high-dimensional tensor sharding computation.
      AscendC::LocalTensor<half> dstLocal;
      AscendC::LocalTensor<half> src0Local;
      AscendC::LocalTensor<half> src1Local;
      int32_t len = 128; // Number of elements involved in computation
      AscendC::SetMaskCount();
      AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(len);
      AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
      AscendC::SetMaskNorm();
      AscendC::ResetMask();

      // Example 2 (independent of example 1): counter mode used together with the API for computing the first n tensor elements.
      AscendC::LocalTensor<half> dstLocal;
      AscendC::LocalTensor<half> src0Local;
      half num = 2;
      AscendC::SetMaskCount();
      AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(128); // The number of elements involved in computation is 128.
      AscendC::Adds<half, false>(dstLocal, src0Local, num, 1);
      AscendC::Muls<half, false>(dstLocal, src0Local, num, 1);
      AscendC::SetMaskNorm();
      AscendC::ResetMask();