CheckLocalMemoryIA(ISASI)

Supported Products

Product

Supported (√/x)

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference products

Atlas inference product's AI Core

Atlas inference product's Vector Core

x

Atlas training products

x

Function Usage

Checks the UB read/write behavior within the specified range. If the read/write behavior is within the specified range, an EXCEPTION error is reported. If the read/write behavior is not within the specified range, no error is reported.

Prototype

1
__aicore__ inline void CheckLocalMemoryIA(const CheckLocalMemoryIAParam& checkParams)

Parameters

Table 1 Parameters

Parameter

Input/Output

Meaning

checkParams

Input

Checks the UB access behavior. It is of the CheckLocalMemoryIAParam type.

For details, see ${INSTALL_DIR}/include/ascendc/basic_api/interface/kernel_struct_mm.h. Replace ${INSTALL_DIR} with the actual CANN component directory.

For details about the parameter description, see Table 2.

Table 2 Parameters in the CheckLocalMemoryIAParam structure

Parameter

Meaning

enableBit

Configures the exception register. enableBit ∈ [0,3]. The default value is 0.

  • 0: exception register 0.
  • 1: exception register 1.
  • 2: exception register 2.
  • 3: exception register 3.

startAddr

Start address of CheckLocalMemoryIAParam, 32-byte aligned. Value range: startAddr ∈ [0, 65535]. The default value is 0. For example, startAddr may be obtained by using LocalTensor.GetPhyAddr()/32.

endAddr

End address of the check, which is 32-byte aligned. The value range is [0, 65535]. The default value is 0.

isScalarRead

Checks scalar read access.

  • false: disabled. The default value is false.
  • true: enabled.

isScalarWrite

Checks scalar write access.

  • false: disabled. The default value is false.
  • true: enabled.

isVectorRead

Checks vector read access.

  • false: disabled. The default value is false.
  • true: enabled.

isVectorWrite

Checks vector write access.

  • false: disabled. The default value is false.
  • true: enabled.

isMteRead

Checks Mte read access.

  • false: disabled. The default value is false.
  • true: enabled.

isMteWrite

Checks Mte write access.

  • false: disabled. The default value is false.
  • true: enabled.

isEnable

Enables or disables the exception register configured by the enableBit parameter.

  • false: disabled. The default value is false.
  • true: enabled

reserved

Reserved. This parameter is reserved for future functions. You can use the default value.

Constraints

  • The unit of startAddr/endAddr is 32 bytes. The check range does not contain startAddr but contains endAddr, that is, (startAddr, endAddr].
  • This API needs to be reset each time it is called (by setting isEnable to false).
  • For details about the operand address alignment requirements, see General Address Alignment Restrictions.

Example

This example checks whether vector write access is within the set range of (startAddr, endAddr]. In the current example, if the vector write is within the specified range, an error (ACL_ERROR_RT_VECTOR_CORE_EXCEPTION) is reported.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
#include "kernel_operator.h"

class KernelAdd {
public:
    __aicore__ inline KernelAdd() {}
    __aicore__ inline void Init(__gm__ uint8_t* src0Gm, __gm__ uint8_t* src1Gm, __gm__ uint8_t* dstGm)
    {
        src0Global.SetGlobalBuffer((__gm__ half*)src0Gm);
        src1Global.SetGlobalBuffer((__gm__ half*)src1Gm);
        dstGlobal.SetGlobalBuffer((__gm__ half*)dstGm);
        pipe.InitBuffer(inQueueSrc0, 1, 512 * sizeof(half));
        pipe.InitBuffer(inQueueSrc1, 1, 512 * sizeof(half));
        pipe.InitBuffer(outQueueDst, 1, 512 * sizeof(half));
    }
    __aicore__ inline void Process()
    {
        CopyIn();
        Compute();
        CopyOut();
    }

private:
    __aicore__ inline void CopyIn()
    {
        AscendC::LocalTensor<half> src0Local = inQueueSrc0.AllocTensor<half>();
        AscendC::LocalTensor<half> src1Local = inQueueSrc1.AllocTensor<half>();
        AscendC::DataCopy(src0Local, src0Global, 512);
        AscendC::DataCopy(src1Local, src1Global, 512);
        inQueueSrc0.EnQue(src0Local);
        inQueueSrc1.EnQue(src1Local);
    }
    __aicore__ inline void Compute()
    {
        AscendC::LocalTensor<half> src0Local = inQueueSrc0.DeQue<half>();
        AscendC::LocalTensor<half> src1Local = inQueueSrc1.DeQue<half>();
        AscendC::LocalTensor<half> dstLocal = outQueueDst.AllocTensor<half>();
        AscendC::CheckLocalMemoryIA({ 0, (uint32_t)(dstLocal.GetPhyAddr() / 32),
            (uint32_t)((dstLocal.GetPhyAddr() + 512 * sizeof(half)) / 32), false, false, false, true, false, false,
            true });
        AscendC::Add(dstLocal, src0Local, src1Local, 512);

        outQueueDst.EnQue<half>(dstLocal);
        inQueueSrc0.FreeTensor(src0Local);
        inQueueSrc1.FreeTensor(src1Local);
    }
    __aicore__ inline void CopyOut()
    {
        AscendC::LocalTensor<half> dstLocal = outQueueDst.DeQue<half>();
        AscendC::DataCopy(dstGlobal, dstLocal, 512);
        outQueueDst.FreeTensor(dstLocal);
    }

private:
    AscendC::TPipe pipe;
    AscendC::TQue<AscendC::TPosition::VECIN, 1> inQueueSrc0, inQueueSrc1;
    AscendC::TQue<AscendC::TPosition::VECOUT, 1> outQueueDst;
    AscendC::GlobalTensor<half> src0Global, src1Global, dstGlobal;
};

extern "C" __global__ __aicore__ void add_simple_kernel(__gm__ uint8_t* src0Gm, __gm__ uint8_t* src1Gm, __gm__ uint8_t* dstGm)
{
    KernelAdd op;
    op.Init(src0Gm, src1Gm, dstGm);
    op.Process();
}