SetVectorMask

Applicability

Product

Supported/Unsupported

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference products

Atlas inference product's AI Core

Atlas inference product's Vector Core

x

Atlas training products

Function Usage

Sets the mask used during vector computation. Before calling this API, call SetMaskCount or SetMaskNorm to select the mask mode. The mask has a different meaning in each mode:
  • In normal mode, the mask parameter is used to control the number of elements involved in computation in a single repeat. In this case, there are two modes:
    • Contiguous mode: indicates the number of contiguous elements that participate in computation. The value range is related to the operand data type. The maximum number of elements that can be processed in each repeat varies according to the data type. When the operand is 16-bit, mask ∈ [1, 128]. When the operand is 32-bit, mask ∈ [1, 64]. When the operand is 64-bit, mask ∈ [1, 32].
    • Bitwise mode: each bit controls whether the corresponding element is involved in computation. A bit value of 1 means the element is involved; a bit value of 0 means it is not. The mask is split into maskHigh (high-order mask) and maskLow (low-order mask). The value range depends on the operand data type, because the maximum number of elements that can be processed in each repeat varies with the data type. When the operand is 16-bit, maskLow and maskHigh ∈ [0, 2^64 - 1] and cannot both be 0. When the operand is 32-bit, maskHigh is 0 and maskLow ∈ (0, 2^64 - 1]. When the operand is 64-bit, maskHigh is 0 and maskLow ∈ (0, 2^32 - 1].
  • In counter mode, the mask parameter indicates the number of elements involved in the entire vector computation.
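
The ranges above can be checked mechanically. The following is a minimal C++ sketch, not part of Ascend C: the helper names (MaxElementsPerRepeat, IsValidContiguousMask, IsValidBitwiseMask, CheckMaskRanges) are hypothetical, standard fixed-width integer types stand in for 16-, 32-, and 64-bit operands, and the 256-byte-per-repeat figure is inferred from the ranges listed above (128 × 2 bytes = 64 × 4 bytes = 32 × 8 bytes) rather than stated in this section.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an Ascend C API): maximum number of elements one
// repeat can process, assuming 256 bytes per repeat as inferred from the
// documented ranges (128 / 64 / 32 elements for 16- / 32- / 64-bit operands).
template <typename T>
constexpr int32_t MaxElementsPerRepeat() {
    static_assert(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8,
                  "only 16-, 32-, and 64-bit operands are covered here");
    return static_cast<int32_t>(256 / sizeof(T));
}

// Contiguous mode: len must lie in [1, MaxElementsPerRepeat<T>()].
template <typename T>
constexpr bool IsValidContiguousMask(int32_t len) {
    return len >= 1 && len <= MaxElementsPerRepeat<T>();
}

// Bitwise mode:
//   16-bit: maskHigh and maskLow must not both be 0.
//   32-bit: maskHigh == 0 and maskLow != 0.
//   64-bit: maskHigh == 0 and 0 < maskLow <= 2^32 - 1.
template <typename T>
constexpr bool IsValidBitwiseMask(uint64_t maskHigh, uint64_t maskLow) {
    static_assert(sizeof(T) == 2 || sizeof(T) == 4 || sizeof(T) == 8,
                  "only 16-, 32-, and 64-bit operands are covered here");
    return sizeof(T) == 2 ? (maskHigh != 0 || maskLow != 0)
         : sizeof(T) == 4 ? (maskHigh == 0 && maskLow != 0)
                          : (maskHigh == 0 && maskLow != 0 &&
                             maskLow <= 0xFFFFFFFFULL);
}

// Small self-check that exercises the documented boundaries.
inline void CheckMaskRanges() {
    assert(IsValidContiguousMask<int16_t>(128));          // 16-bit upper bound
    assert(!IsValidContiguousMask<int32_t>(65));          // 32-bit limit is 64
    assert(!IsValidContiguousMask<int64_t>(0));           // lower bound is 1
    assert(!IsValidBitwiseMask<int16_t>(0, 0));           // both halves zero
    assert(IsValidBitwiseMask<int32_t>(0, UINT64_MAX));   // full low mask
    assert(!IsValidBitwiseMask<int64_t>(0, 1ULL << 32));  // exceeds 2^32 - 1
}
```

A check of this kind could run on the host before a kernel launch, so that an out-of-range mask is caught before it reaches SetVectorMask.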

Prototype

  • Applies to the bitwise mask mode in normal mode and to counter mode.
    template <typename T, MaskMode mode = MaskMode::NORMAL>
    __aicore__ static inline void SetVectorMask(const uint64_t maskHigh, const uint64_t maskLow)
    
  • Applies to the contiguous mask mode in normal mode and counter mode.
    template <typename T, MaskMode mode = MaskMode::NORMAL>
    __aicore__ static inline void SetVectorMask(int32_t len)
    

Parameters

Table 1 Parameters in the template

Parameter

Description

T

Data type of the vector computation operand.

mode

Mask mode, of the MaskMode type. The definition is as follows:
enum class MaskMode : uint8_t {
    NORMAL = 0, // Normal mode
    COUNTER // Counter mode
};

Table 2 Parameters

Parameter

Input/Output

Description

maskHigh

Input

Normal mode: used for the bitwise mask mode, where each bit controls whether the corresponding element participates in computation. Pass the high-order mask value.

Counter mode: set this parameter to 0. It has no effect.

maskLow

Input

Normal mode: used for the bitwise mask mode, where each bit controls whether the corresponding element participates in computation. Pass the low-order mask value.

Counter mode: indicates the number of elements involved in the entire vector computation.

len

Input

Normal mode: corresponds to the contiguous mask mode in normal mode, indicating the number of contiguous elements that participate in computation in a single repeat.

Counter mode: indicates the number of elements involved in the entire vector computation.
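
In normal mode, a contiguous len and a bitwise maskHigh/maskLow pair can describe the same selection of elements. The following C++ sketch illustrates that correspondence for 16-bit operands. It assumes that bit i of maskLow selects element i and bit i of maskHigh selects element 64 + i; this bit ordering is an assumption and is not spelled out in this section. MaskPair, LowBits, ContiguousToBitwise, and CheckContiguousToBitwise are hypothetical names, not Ascend C APIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two halves of a bitwise mask.
struct MaskPair {
    uint64_t high;  // assumed to cover elements 64..127
    uint64_t low;   // assumed to cover elements 0..63
};

// Returns a value with the lowest n bits set, clamping n to [0, 64].
constexpr uint64_t LowBits(int32_t n) {
    return n <= 0 ? 0 : (n >= 64 ? UINT64_MAX : ((uint64_t{1} << n) - 1));
}

// Maps a contiguous mask length (1..128 for 16-bit operands) to the bitwise
// pair that selects the same leading elements under the assumed bit order.
constexpr MaskPair ContiguousToBitwise(int32_t len) {
    return MaskPair{LowBits(len - 64), LowBits(len)};
}

// Small self-check covering the boundaries of both halves.
inline void CheckContiguousToBitwise() {
    assert(ContiguousToBitwise(1).high == 0);
    assert(ContiguousToBitwise(1).low == 1);
    assert(ContiguousToBitwise(64).high == 0);
    assert(ContiguousToBitwise(64).low == UINT64_MAX);
    assert(ContiguousToBitwise(100).high == 0xFFFFFFFFFULL);  // 36 bits set
    assert(ContiguousToBitwise(128).high == UINT64_MAX);
    assert(ContiguousToBitwise(128).low == UINT64_MAX);
}
```

Under this assumption, a contiguous mask of 128 and a bitwise mask of (0xffffffffffffffff, 0xffffffffffffffff) select the same 128 half-precision elements.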

Returns

None

Constraints

This API takes effect only when the isSetMask template parameter of the vector computation API is set to false. After using this API, call ResetMask to restore the mask to its default value.

Examples

Use this API together with SetMaskCount or SetMaskNorm: set the mask mode first, then set the mask.

  • Calling examples (normal mode)
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    AscendC::LocalTensor<half> src1Local;
    
    // Normal mode
    AscendC::SetMaskNorm();
    AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(0xffffffffffffffff, 0xffffffffffffffff);  // Bitwise mode
    
    // AscendC::SetVectorMask<half, AscendC::MaskMode::NORMAL>(128);  // Contiguous mode
    // When several vector computation APIs are called in a row, set the mask once here in normal mode.
    // The APIs then do not need to set the mask again on each call, which improves performance.
    // dstBlkStride, src0BlkStride, src1BlkStride = 1: data is read and written contiguously within a single repeat.
    // dstRepStride, src0RepStride, src1RepStride = 8: data is read and written contiguously between adjacent repeats.
    AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::ResetMask();
    
  • Calling examples (counter mode)
    // Counter mode used together with the APIs for high-dimensional tensor sharding computation.
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    AscendC::LocalTensor<half> src1Local;
    int32_t len = 128;  // Number of elements involved in computation
    AscendC::SetMaskCount();
    AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(len);
    AscendC::Add<half, false>(dstLocal, src0Local, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Sub<half, false>(src0Local, dstLocal, src1Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::Mul<half, false>(src1Local, dstLocal, src0Local, AscendC::MASK_PLACEHOLDER, 1, { 1, 1, 1, 8, 8, 8 });
    AscendC::SetMaskNorm();
    AscendC::ResetMask();
    
    // Counter mode used together with the APIs that compute the first n data elements of a tensor (a separate snippet).
    AscendC::LocalTensor<half> dstLocal;
    AscendC::LocalTensor<half> src0Local;
    half num = 2; 
    AscendC::SetMaskCount();
    AscendC::SetVectorMask<half, AscendC::MaskMode::COUNTER>(128); // The number of elements involved in computation is 128.
    AscendC::Adds<half, false>(dstLocal, src0Local, num, 1);
    AscendC::Muls<half, false>(dstLocal, src0Local, num, 1);
    AscendC::SetMaskNorm();
    AscendC::ResetMask();
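
Counter mode takes the total element count directly. For comparison, the C++ sketch below shows the bookkeeping a caller might otherwise do by hand in normal mode to cover the same count: a number of full repeats plus a contiguous tail mask. NormalModePlan and PlanNormalMode are hypothetical names, standard integer types stand in for the operand types, and the 256-byte-per-repeat figure is inferred from the contiguous mask ranges rather than stated in this section. The sketch is not a description of how the hardware implements counter mode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: how a total element count splits into repeats.
struct NormalModePlan {
    int32_t fullRepeats;  // repeats that use the maximum contiguous mask
    int32_t tailMask;     // contiguous mask for a final partial repeat, 0 if none
};

// Splits totalElements into full repeats and a tail, assuming each repeat
// processes 256 / sizeof(T) elements (128 for 16-bit, 64 for 32-bit operands).
template <typename T>
constexpr NormalModePlan PlanNormalMode(int32_t totalElements) {
    return NormalModePlan{
        totalElements / static_cast<int32_t>(256 / sizeof(T)),
        totalElements % static_cast<int32_t>(256 / sizeof(T))};
}

// Small self-check with 16-bit and 32-bit stand-in types.
inline void CheckPlanNormalMode() {
    assert(PlanNormalMode<int16_t>(128).fullRepeats == 1);  // exactly one repeat
    assert(PlanNormalMode<int16_t>(128).tailMask == 0);
    assert(PlanNormalMode<int16_t>(300).fullRepeats == 2);  // 2 * 128 = 256
    assert(PlanNormalMode<int16_t>(300).tailMask == 44);    // 300 - 256
    assert(PlanNormalMode<int32_t>(100).fullRepeats == 1);  // 1 * 64
    assert(PlanNormalMode<int32_t>(100).tailMask == 36);    // 100 - 64
}
```

In counter mode, the single call SetVectorMask<half, AscendC::MaskMode::COUNTER>(len) replaces this split, which is why the examples above pass the element count directly.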