ReadSpmBuffer

Applicability

Product | Supported/Unsupported
--- | ---
Atlas A3 training products/Atlas A3 inference products | √
Atlas A2 training products/Atlas A2 inference products | √
Atlas 200I/500 A2 inference products | x
AI Core of Atlas inference products | √
Vector Core of Atlas inference products | x
Atlas training products | √

Function Usage

Reads data from the SPM buffer back to local memory.

Prototype

  • Applicable to continuous and discontinuous data readback:
    template <typename T>
    __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readBuffer, const DataCopyParams& copyParams, int32_t readOffset = 0)
    
  • Applicable to continuous temporary data readback:
    template <typename T>
    __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readBuffer, const int32_t readSize, int32_t readOffset = 0)
    

Parameters

Table 1 Parameters

Parameter | Input/Output | Meaning
--- | --- | ---
readBuffer | Input | Local memory that receives the data read back.
copyParams | Input | Transfer parameters, of the DataCopyParams type. For the structure definition of DataCopyParams, see Table 2.
readSize | Input | Number of elements to read.
readOffset | Input | Offset into the SPM buffer, in bytes.
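Table 1 mixes two units: readSize counts elements, whereas readOffset counts bytes. The following is a minimal host-side sketch of that conversion, in plain C++ rather than Ascend C. `BytesForElements` and `Half` are hypothetical names introduced only for illustration; `Half` is assumed to be a 2-byte stand-in for the device-side half type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2-byte stand-in for the device-side half type (assumption).
using Half = std::uint16_t;

// Bytes occupied by `count` elements of type T. This turns an element index
// into a byte-based readOffset, or a readSize into the number of bytes read.
template <typename T>
inline std::int32_t BytesForElements(std::int32_t count) {
    assert(count >= 0 && "element counts are non-negative");
    return count * static_cast<std::int32_t>(sizeof(T));
}
```

For example, `BytesForElements<Half>(16)` is 32, so starting the readback at the 16th half element corresponds to a readOffset of 32, and a readSize of 64 half elements covers `BytesForElements<Half>(64)`, that is 128 bytes.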

Table 2 Parameters in the DataCopyParams structure

Field | Meaning
--- | ---
blockCount | Number of contiguous data blocks to transfer. Type: uint16_t. Value range: blockCount ∈ [1, 4095].
blockLen | Length of each contiguous data block to transfer, in units of DataBlock (32 bytes). Type: uint16_t. Value range: blockLen ∈ [1, 65535]. Exception: when dst is located in C2PIPE2GM, the unit is 128 bytes; when dst is located in C2, the unit is 64 bytes. In both cases the value is the length of each contiguous data block of the source operand.
srcGap | Interval between adjacent contiguous data blocks of the source operand, measured from the end of one data block to the beginning of the next, in units of DataBlock (32 bytes). Type: uint16_t; the value must be within the range of this type. In the L1 Buffer -> Fixpipe Buffer scenario, the interval is instead measured from the beginning of one data block to the beginning of the next; the unit and type are unchanged.
dstGap | Interval between adjacent contiguous data blocks of the destination operand, measured from the end of one data block to the beginning of the next, in units of DataBlock (32 bytes). Type: uint16_t; the value must be within the range of this type. Exception: when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes. In the L1 Buffer -> Fixpipe Buffer scenario, the interval between adjacent contiguous data blocks of the destination operand is instead measured from the beginning of one data block to the beginning of the next, in units of DataBlock (32 bytes).
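To make the units in Table 2 concrete, the sketch below computes how many bytes a given parameter set transfers and how many bytes it spans on each side. It is host-side C++, not Ascend C: `CopyParamsSketch`, `PayloadBytes`, and `SpanBytes` are hypothetical names, and the arithmetic assumes the general case only (32-byte DataBlock, gaps measured from the end of one block to the beginning of the next), not the C2PIPE2GM, C2, or L1 Buffer -> Fixpipe Buffer exceptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side mirror of the four DataCopyParams fields.
// This is NOT the Ascend C type; it only makes the arithmetic concrete.
struct CopyParamsSketch {
    std::uint16_t blockCount;  // number of contiguous chunks
    std::uint16_t blockLen;    // chunk length, in 32-byte DataBlocks
    std::uint16_t srcGap;      // end-to-beginning gap on the source side, in DataBlocks
    std::uint16_t dstGap;      // end-to-beginning gap on the destination side, in DataBlocks
};

constexpr std::uint64_t kDataBlockBytes = 32;

// Payload bytes actually transferred: blockCount chunks of blockLen DataBlocks.
inline std::uint64_t PayloadBytes(const CopyParamsSketch& p) {
    return static_cast<std::uint64_t>(p.blockCount) * p.blockLen * kDataBlockBytes;
}

// Bytes spanned on one side, from the first byte of the first chunk to the
// last byte of the last chunk: all chunks plus the (blockCount - 1) gaps
// that lie between them.
inline std::uint64_t SpanBytes(const CopyParamsSketch& p, std::uint16_t gap) {
    assert(p.blockCount >= 1 && p.blockLen >= 1);
    const std::uint64_t chunks = static_cast<std::uint64_t>(p.blockCount) * p.blockLen;
    const std::uint64_t gaps = static_cast<std::uint64_t>(p.blockCount - 1) * gap;
    return (chunks + gaps) * kDataBlockBytes;
}
```

Under these assumptions, `{1, 2, 0, 0}` transfers 64 bytes and spans 64 bytes on both sides, while `{4, 2, 1, 0}` transfers 256 bytes that span 352 bytes of the source (`SpanBytes(p, p.srcGap)`) and 256 bytes of the destination (`SpanBytes(p, p.dstGap)`). Comparing the destination span with the size of the local tensor is one way to sanity-check a parameter set before issuing the readback.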

Restrictions

None

Return Value Description

None

Example

Readback with DataCopyParams (contiguous or non-contiguous data):

AscendC::TPipe pipe;
AscendC::TQue<AscendC::TPosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 32; // Number of elements. T is assumed to be half, so the UB buffer is 32 x sizeof(half) = 64 bytes.
int offset = 32; // Start reading 32 bytes into the SPM buffer.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> readLocal = inQueueSrcVecIn.AllocTensor<half>();
AscendC::DataCopyParams copyParams{1, 2, 0, 0}; // Move one data chunk of two 32-byte data blocks (64 bytes), with no source or destination gap.
pipe.ReadSpmBuffer(readLocal, copyParams, offset);

Readback of contiguous data by element count:

AscendC::TPipe pipe;
AscendC::TQue<AscendC::TPosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 64; // Number of elements. The UB buffer is 64 x sizeof(half) = 128 bytes.
int offset = 32; // Start reading 32 bytes into the SPM buffer.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> readLocal = inQueueSrcVecIn.AllocTensor<half>();
pipe.ReadSpmBuffer(readLocal, dataSize, offset); // readSize is given in elements, readOffset in bytes.
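
The data movement requested by the DataCopyParams example can be pictured with a small host-side model. This is only a sketch under stated assumptions, not the device implementation: `ModelReadback` is a hypothetical function, the SPM buffer is modeled as a byte vector, and gaps follow the general end-to-beginning definition with a 32-byte DataBlock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kBlockBytes = 32;  // one DataBlock

// Hypothetical host-side model of a gap-aware readback; not an Ascend C API.
// Copies blockCount chunks of blockLen DataBlocks out of `spm`, starting
// readOffsetBytes into it, and returns the destination image. Destination
// bytes that fall inside a gap are left as 0.
std::vector<std::uint8_t> ModelReadback(const std::vector<std::uint8_t>& spm,
                                        std::size_t readOffsetBytes,
                                        std::uint16_t blockCount, std::uint16_t blockLen,
                                        std::uint16_t srcGap, std::uint16_t dstGap) {
    assert(blockCount >= 1 && blockLen >= 1);
    const std::size_t chunk = blockLen * kBlockBytes;
    const std::size_t srcStride = chunk + srcGap * kBlockBytes;  // beginning-to-beginning distance
    const std::size_t dstStride = chunk + dstGap * kBlockBytes;
    std::vector<std::uint8_t> dst((blockCount - 1u) * dstStride + chunk, 0);
    for (std::size_t i = 0; i < blockCount; ++i) {
        const std::size_t src = readOffsetBytes + i * srcStride;
        assert(src + chunk <= spm.size() && "read would run past the modeled SPM buffer");
        std::copy_n(spm.begin() + static_cast<std::ptrdiff_t>(src), chunk,
                    dst.begin() + static_cast<std::ptrdiff_t>(i * dstStride));
    }
    return dst;
}
```

With a 128-byte model buffer whose byte i holds the value i, `ModelReadback(spm, 32, 1, 2, 0, 0)` mirrors the first example: it returns 64 bytes holding the values 32 through 95. Setting srcGap to 1 with two one-block chunks, `ModelReadback(spm, 0, 2, 1, 1, 0)`, skips source bytes 32 through 63 and packs bytes 0 through 31 and 64 through 95 back to back.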