ReadSpmBuffer
Function Usage
Reads data back from the SPM buffer into local memory.
For details about the SPM buffer and its usage example, see SPM Buffer.
Prototype
- Applicable to continuous and discontinuous data readback:
template <typename T> __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readLocal, const DataCopyParams& copyParams, int32_t readOffset = 0)
- Applicable to continuous temporary data readback:
template <typename T> __aicore__ inline void ReadSpmBuffer(const LocalTensor<T>& readLocal, const int32_t readSize, int32_t readOffset = 0)
Parameters
| Parameter | Input/Output | Meaning |
|---|---|---|
| readLocal | Input | Local tensor that receives the data read back. |
| copyParams | Input | Data copy parameters, of the DataCopyParams type. For details about the structure definition of DataCopyParams, see Table 2. |
| readSize | Input | Number of elements to read. |
| readOffset | Input | Offset into the SPM buffer, in bytes. |
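
Note that readSize counts elements while readOffset counts bytes. The following host-side sketch shows how the two combine into a byte range of the SPM buffer. It is plain C++ for illustration only and is not part of the Ascend C API: `ByteRange` and `SpmReadRange` are hypothetical names, and `uint16_t` stands in for the 2-byte half type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: half-open byte range [begin, end) inside the SPM buffer.
struct ByteRange {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical helper: byte range covered by a readback of `readSize` elements
// of type T that starts `readOffset` bytes into the SPM buffer.
template <typename T>
inline ByteRange SpmReadRange(int32_t readSize, int32_t readOffset = 0) {
    assert(readSize >= 0 && readOffset >= 0);
    const std::size_t begin = static_cast<std::size_t>(readOffset);
    const std::size_t bytes = static_cast<std::size_t>(readSize) * sizeof(T);
    return ByteRange{begin, begin + bytes};
}
```

For example, `SpmReadRange<uint16_t>(64, 32)` covers bytes 32 up to (but not including) 160, because 64 two-byte elements occupy 128 bytes.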
| Parameter | Meaning |
|---|---|
| blockCount | Specifies the number of data chunks to be consecutively transmitted in the command. The value range is [1, 4095]. |
| blockLen | The length of each data chunk to be consecutively transmitted. The unit is data block (32 bytes). The value range is [1, 65535]. Particularly, when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes. |
| srcStride | Interval between adjacent consecutive data chunks of the source operand (the interval between the tail of the previous data chunk and the header of the subsequent data chunk). The unit is data block (32 bytes). The data type is uint16_t. The value of srcStride cannot exceed the value range of this data type. |
| dstStride | Interval between adjacent consecutive data chunks of the destination operand (the interval between the tail of the previous data chunk and the header of the subsequent data chunk). The unit is data block (32 bytes). The data type is uint16_t. The value of dstStride cannot exceed the value range of this data type. Particularly, when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes. |
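
Because each stride is measured from the tail of one chunk to the head of the next, chunk i starts at i * (blockLen + stride) data blocks. The sketch below models that addressing on the host with ordinary byte vectors. It is an illustration under stated assumptions, not the device implementation: `CopyParamsModel`, `IsValid`, `ChunkStartBytes`, and `StridedCopy` are hypothetical names, and the model assumes the default 32-byte unit (it ignores the C2PIPE2GM and C2 cases).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-side mirror of the four DataCopyParams fields.
struct CopyParamsModel {
    uint16_t blockCount;  // number of chunks, valid range [1, 4095]
    uint16_t blockLen;    // chunk length in data blocks, valid range [1, 65535]
    uint16_t srcStride;   // tail-to-head gap between source chunks, in data blocks
    uint16_t dstStride;   // tail-to-head gap between destination chunks, in data blocks
};

constexpr std::size_t kBlockBytes = 32;  // size of one data block

// Checks the ranges listed in the table. The upper bound of blockLen (65535)
// is already enforced by the uint16_t type.
inline bool IsValid(const CopyParamsModel& p) {
    return p.blockCount >= 1 && p.blockCount <= 4095 && p.blockLen >= 1;
}

// Byte offset at which chunk `index` starts for a given tail-to-head `stride`.
inline std::size_t ChunkStartBytes(const CopyParamsModel& p, uint16_t stride,
                                   std::size_t index) {
    return index * (static_cast<std::size_t>(p.blockLen) + stride) * kBlockBytes;
}

// Copies the chunks described by `p` from `src` to `dst` on the host.
inline void StridedCopy(const std::vector<uint8_t>& src, std::vector<uint8_t>& dst,
                        const CopyParamsModel& p) {
    assert(IsValid(p));
    const std::size_t chunkBytes = p.blockLen * kBlockBytes;
    for (std::size_t i = 0; i < p.blockCount; ++i) {
        const std::size_t srcOff = ChunkStartBytes(p, p.srcStride, i);
        const std::size_t dstOff = ChunkStartBytes(p, p.dstStride, i);
        assert(srcOff + chunkBytes <= src.size());
        assert(dstOff + chunkBytes <= dst.size());
        std::memcpy(dst.data() + dstOff, src.data() + srcOff, chunkBytes);
    }
}
```

With `CopyParamsModel{2, 1, 1, 0}`, the model reads source bytes 0 to 31 and 64 to 95 (skipping one data block in between) and packs them back to back into the first 64 bytes of the destination.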
Availability
Precautions
None
Returns
None
Example
- Continuous and discontinuous data readback (DataCopyParams prototype):
AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
// T is assumed to be half. Allocate 32 * sizeof(half) bytes from the UB.
int dataSize = 32;
// Byte offset into the SPM buffer at which the readback starts.
int offset = 32;
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>();
// Move one continuous data chunk that is two data blocks long (one data block is 32 bytes).
AscendC::DataCopyParams copyParams{1, 2, 0, 0};
pipe.ReadSpmBuffer(writeLocal, copyParams, offset);
- Continuous data readback (readSize prototype):
AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
// Allocate 64 * sizeof(half) bytes from the UB.
int dataSize = 64;
// Byte offset into the SPM buffer at which the readback starts.
int offset = 32;
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>();
// readSize is given in elements, so pass the element count rather than a byte count.
pipe.ReadSpmBuffer(writeLocal, dataSize, offset);
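
The DataCopyParams example reads 32 half elements with a blockLen of 2. The arithmetic behind that value can be sketched in plain host-side C++ as follows. `DataBlocksFor` and `ContiguousBlockLen` are hypothetical helpers, not Ascend C functions, and `uint16_t` again stands in for the 2-byte half type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kDataBlockBytes = 32;  // one data block is 32 bytes

// Hypothetical helper: number of 32-byte data blocks needed to hold
// `elementCount` elements of type T, rounded up to a whole block.
template <typename T>
constexpr std::size_t DataBlocksFor(std::size_t elementCount) {
    return (elementCount * sizeof(T) + kDataBlockBytes - 1) / kDataBlockBytes;
}

// Hypothetical helper: blockLen for one contiguous chunk of `elementCount`
// elements, checked against the documented blockLen range [1, 65535].
template <typename T>
inline uint16_t ContiguousBlockLen(std::size_t elementCount) {
    const std::size_t blocks = DataBlocksFor<T>(elementCount);
    assert(blocks >= 1 && blocks <= 65535);
    return static_cast<uint16_t>(blocks);
}

// 32 two-byte elements occupy 64 bytes, which is two data blocks.
static_assert(DataBlocksFor<uint16_t>(32) == 2, "expected two data blocks");
```

Under these assumptions, `ContiguousBlockLen<uint16_t>(32)` yields 2, matching the `{1, 2, 0, 0}` parameters in the example.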