ReadSpmBuffer

Function Usage

Reads data from the SPM buffer back into a local tensor.

For details about the SPM buffer and a usage example, see SPM Buffer.

Prototype

  • Applicable to continuous and discontinuous data readback:
    template <typename T>
    __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readLocal, const DataCopyParams& copyParams, int32_t readOffset = 0)
    
  • Applicable to continuous temporary data readback:
    template <typename T>
    __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readLocal, const int32_t readSize, int32_t readOffset = 0)
    

Parameters

Table 1 API parameters

Parameter | Input/Output | Meaning
readLocal | Input | Local tensor that receives the data read back from the SPM buffer.
copyParams | Input | Data movement parameters, of the DataCopyParams type. For the structure definition of DataCopyParams, see Table 2.
readSize | Input | Number of elements to read.
readOffset | Input | Offset into the SPM buffer at which the read starts, in bytes. The default value is 0.
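
Because readSize is counted in elements while readOffset is counted in bytes, the two are easy to mix up. The following is a minimal host-side sketch of the conversion, written in plain C++ rather than Ascend C. The helper names ByteOffsetOf, BytesCovered, and CheckUnits are hypothetical and not part of the API, and Half is a 2-byte stand-in for the half type.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the 2-byte half type so the sketch compiles on a host (assumption).
using Half = std::uint16_t;

// Hypothetical helper: byte offset of the element at elementIndex,
// i.e. a value that could be passed as readOffset.
template <typename T>
constexpr std::int32_t ByteOffsetOf(std::int32_t elementIndex) {
    return elementIndex * static_cast<std::int32_t>(sizeof(T));
}

// Hypothetical helper: number of bytes covered when readSize elements of type T are read.
template <typename T>
constexpr std::int32_t BytesCovered(std::int32_t readSize) {
    return readSize * static_cast<std::int32_t>(sizeof(T));
}

// Hypothetical self-check: element 16 of a half buffer starts at byte 32,
// and 64 half elements cover 128 bytes.
inline void CheckUnits() {
    assert(ByteOffsetOf<Half>(16) == 32);
    assert(BytesCovered<Half>(64) == 128);
}
```

Under these assumptions, a readOffset of 32 with T = half starts the read at the 17th element stored in the SPM buffer.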

Table 2 Parameters in the DataCopyParams structure

Parameter | Meaning
blockCount | Number of consecutive data chunks transferred by the instruction. The value range is [1, 4095].
blockLen | Length of each consecutive data chunk, in data blocks (32 bytes). The value range is [1, 65535]. In particular, when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes.
srcStride | Interval between adjacent consecutive data chunks of the source operand, measured from the tail of one chunk to the head of the next, in data blocks (32 bytes). The data type is uint16_t, and the value of srcStride cannot exceed the value range of this data type.
dstStride | Interval between adjacent consecutive data chunks of the destination operand, measured from the tail of one chunk to the head of the next, in data blocks (32 bytes). The data type is uint16_t, and the value of dstStride cannot exceed the value range of this data type. In particular, when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes.
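
To make the stride definitions in Table 2 concrete, the sketch below computes where each chunk would start in the source and in the destination. It is plain host-side C++ and only models the arithmetic; it is not the Ascend C implementation. CopyParams, ChunkSpan, and LayoutChunks are hypothetical names, the 32-byte unit is assumed throughout (the 64-byte and 128-byte cases are not modeled), and offsets are relative to the start of each operand.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t kBlockBytes = 32;  // one data block, as stated in Table 2

// Hypothetical stand-in that mirrors the four DataCopyParams fields.
struct CopyParams {
    std::uint16_t blockCount;
    std::uint16_t blockLen;
    std::uint16_t srcStride;
    std::uint16_t dstStride;
};

// Byte position and size of one chunk, relative to the start of each operand.
struct ChunkSpan {
    std::uint64_t srcOffset;
    std::uint64_t dstOffset;
    std::uint64_t length;
};

// A stride is the gap from the tail of one chunk to the head of the next,
// so consecutive chunk heads are (blockLen + stride) data blocks apart.
std::vector<ChunkSpan> LayoutChunks(const CopyParams& p) {
    std::vector<ChunkSpan> chunks;
    for (std::uint64_t i = 0; i < p.blockCount; ++i) {
        const std::uint64_t src = i * (p.blockLen + p.srcStride) * kBlockBytes;
        const std::uint64_t dst = i * (p.blockLen + p.dstStride) * kBlockBytes;
        chunks.push_back({src, dst, p.blockLen * kBlockBytes});
    }
    return chunks;
}
```

For instance, LayoutChunks({2, 1, 1, 0}) describes two 32-byte chunks that sit 64 bytes apart in the source and are packed back to back in the destination, while LayoutChunks({1, 2, 0, 0}) describes a single 64-byte chunk.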

Availability

Atlas Training Series Product

Precautions

None

Returns

None

Example

Readback using copyParams (continuous or discontinuous data):

AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 32; // Number of elements. T is assumed to be half, so the UB allocation below is 32 * sizeof(half) = 64 bytes.
int offset = 32; // Start reading 32 bytes into the SPM buffer.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>(); // Readback target
AscendC::DataCopyParams copyParams{1, 2, 0, 0}; // One continuous chunk of 2 data blocks (32 bytes each, 64 bytes in total), with no stride.
pipe.ReadSpmBuffer(writeLocal, copyParams, offset);

Readback using readSize (continuous data):

AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 64; // Number of elements. The UB allocation below is 64 * sizeof(half) = 128 bytes.
int offset = 32; // Start reading 32 bytes into the SPM buffer.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>(); // Readback target
pipe.ReadSpmBuffer(writeLocal, dataSize, offset); // Read dataSize elements.