CheckLocalMemoryIA(ISASI)

Product Support

Product	Supported
Atlas A3 training products/Atlas A3 inference products	√
Atlas A2 training products/Atlas A2 inference products	√
Atlas 200I/500 A2 inference products	√
Atlas inference product's AI Core	√
Atlas inference product's Vector Core	x
Atlas training products	x

Function

Checks the UB read and write operations within the specified range. If the UB read and write operations are within the specified range, an EXCEPTION error is reported. If the UB read and write operations are not within the specified range, no error is reported.

Prototype

__aicore__ inline void CheckLocalMemoryIA(const CheckLocalMemoryIAParam& checkParams)

Parameters

**Table 1** Parameters
Parameter	Input/Output	Description
checkParams	Input	UB access check configuration. The parameter type is CheckLocalMemoryIAParam. For details about the definition, see ${INSTALL_DIR}/include/ascendc/basic_api/interface/kernel_struct_mm.h. Replace ${INSTALL_DIR} with the file storage path after the CANN software is installed. For details about the parameter description, see Table 2.

**Table 2** Parameters in the CheckLocalMemoryIAParam structure
Parameter	Description
enableBit	Configures the exception register. enableBit ∈ [0,3]. The default value is 0. 0: exception register 0. 1: exception register 1. 2: exception register 2. 3: exception register 3.
startAddr	Start address of CheckLocalMemoryIAParam, 32-byte aligned. Value range: startAddr ∈ [0, 65535]. The default value is 0. For example, startAddr may be obtained by using LocalTensor.GetPhyAddr()/32.
endAddr	End address of CheckLocalMemoryIAParam, 32-byte aligned. Value range: endAddr ∈ [0, 65535]. The default value is 0.
isScalarRead	Checks scalar read access. false: disabled. The default value is false. true: enabled.
isScalarWrite	Checks scalar write access. false: disabled. The default value is false. true: enabled.
isVectorRead	Checks vector read access. false: disabled. The default value is false. true: enabled.
isVectorWrite	Checks vector write access. false: disabled. The default value is false. true: enabled.
isMteRead	Checks Mte read access. false: disabled. The default value is false. true: enabled.
isMteWrite	Checks Mte write access. false: disabled. The default value is false. true: enabled.
isEnable	Enables or disables the exception register configured by the enableBit parameter. false: disabled. The default value is false. true: enabled
reserved	Reserved parameter for future use. You can use the default value.

Restrictions

The unit of startAddr/endAddr is 32 bytes. The check range does not contain startAddr but contains endAddr, that is, (startAddr, endAddr].
This API needs to be reset each time it is called (by setting isEnable to false).
For details about the operand address alignment requirements, see General Address Alignment Restrictions.

Example

This example checks whether vector write access is within the set range of (startAddr, endAddr]. In the current example, if the vector write is within the specified range, an error (ACL_ERROR_RT_VECTOR_CORE_EXCEPTION) is reported.

#include "kernel_operator.h"

class KernelAdd {
public:
    __aicore__ inline KernelAdd() {}
    __aicore__ inline void Init(__gm__ uint8_t* src0Gm, __gm__ uint8_t* src1Gm, __gm__ uint8_t* dstGm)
    {
        src0Global.SetGlobalBuffer((__gm__ half*)src0Gm);
        src1Global.SetGlobalBuffer((__gm__ half*)src1Gm);
        dstGlobal.SetGlobalBuffer((__gm__ half*)dstGm);
        pipe.InitBuffer(inQueueSrc0, 1, 512 * sizeof(half));
        pipe.InitBuffer(inQueueSrc1, 1, 512 * sizeof(half));
        pipe.InitBuffer(outQueueDst, 1, 512 * sizeof(half));
    }
    __aicore__ inline void Process()
    {
        CopyIn();
        Compute();
        CopyOut();
    }

private:
    __aicore__ inline void CopyIn()
    {
        AscendC::LocalTensor<half> src0Local = inQueueSrc0.AllocTensor<half>();
        AscendC::LocalTensor<half> src1Local = inQueueSrc1.AllocTensor<half>();
        AscendC::DataCopy(src0Local, src0Global, 512);
        AscendC::DataCopy(src1Local, src1Global, 512);
        inQueueSrc0.EnQue(src0Local);
        inQueueSrc1.EnQue(src1Local);
    }
    __aicore__ inline void Compute()
    {
        AscendC::LocalTensor<half> src0Local = inQueueSrc0.DeQue<half>();
        AscendC::LocalTensor<half> src1Local = inQueueSrc1.DeQue<half>();
        AscendC::LocalTensor<half> dstLocal = outQueueDst.AllocTensor<half>();
        AscendC::CheckLocalMemoryIA({ 0, (uint32_t)(dstLocal.GetPhyAddr() / 32),
            (uint32_t)((dstLocal.GetPhyAddr() + 512 * sizeof(half)) / 32), false, false, false, true, false, false,
            true });
        AscendC::Add(dstLocal, src0Local, src1Local, 512);

        outQueueDst.EnQue<half>(dstLocal);
        inQueueSrc0.FreeTensor(src0Local);
        inQueueSrc1.FreeTensor(src1Local);
    }
    __aicore__ inline void CopyOut()
    {
        AscendC::LocalTensor<half> dstLocal = outQueueDst.DeQue<half>();
        AscendC::DataCopy(dstGlobal, dstLocal, 512);
        outQueueDst.FreeTensor(dstLocal);
    }

private:
    AscendC::TPipe pipe;
    AscendC::TQue<AscendC::TPosition::VECIN, 1> inQueueSrc0, inQueueSrc1;
    AscendC::TQue<AscendC::TPosition::VECOUT, 1> outQueueDst;
    AscendC::GlobalTensor<half> src0Global, src1Global, dstGlobal;
};

extern "C" __global__ __aicore__ void add_simple_kernel(__gm__ uint8_t* src0Gm, __gm__ uint8_t* src1Gm, __gm__ uint8_t* dstGm)
{
    KernelAdd op;
    op.Init(src0Gm, src1Gm, dstGm);
    op.Process();
}

Parent topic: Exception Detection