WriteSpmBuffer
Function Usage
Copies data that must be spilled (temporarily stored) from a local tensor to the SPM buffer.
For details about the SPM buffer and its usage example, see SPM Buffer.
Prototype
- Applicable to continuous and discontinuous temporary data storage:
template <typename T> __aicore__ inline void WriteSpmBuffer(const LocalTensor<T>& writeLocal, const DataCopyParams& copyParams, int32_t writeOffset = 0)
- Applicable to continuous temporary data storage:
template <typename T> __aicore__ inline void WriteSpmBuffer(const LocalTensor<T>& writeLocal, const int32_t writeSize, int32_t writeOffset = 0)
Parameters
| Parameter | Input/Output | Meaning |
|---|---|---|
| writeLocal | Input | Local tensor holding the data to be spilled and temporarily stored. |
| copyParams | Input | Data movement parameters, of type DataCopyParams. For details about the structure definition of DataCopyParams, see Table 2. |
| writeSize | Input | Number of elements to copy. |
| writeOffset | Input | Offset in the SPM buffer at which the data is written. The unit is byte. |
Table 2 DataCopyParams structure parameters

| Parameter | Meaning |
|---|---|
| blockCount | Number of consecutive data chunks transferred by the instruction. The value range is [1, 4095]. |
| blockLen | Length of each consecutive data chunk. The unit is data block (32 bytes). The value range is [1, 65535]. Exception: when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes. |
| srcStride | Interval between adjacent consecutive data chunks of the source operand (from the tail of the previous data chunk to the head of the next data chunk). The unit is data block (32 bytes). The data type is uint16_t, and the value of srcStride cannot exceed the value range of this data type. |
| dstStride | Interval between adjacent consecutive data chunks of the destination operand (from the tail of the previous data chunk to the head of the next data chunk). The unit is data block (32 bytes). The data type is uint16_t, and the value of dstStride cannot exceed the value range of this data type. Exception: when dstLocal is located in C2PIPE2GM, the unit is 128 bytes; when dstLocal is located in C2, the unit is 64 bytes. |
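The arithmetic implied by the DataCopyParams fields can be sketched on the host side as follows. This is a minimal illustration, not Ascend C code: `CopyParamsSketch`, `PayloadBytes`, and `SpanBytes` are hypothetical names introduced here, and the sketch assumes the default 32-byte data block (it does not model the C2PIPE2GM or C2 cases, where the unit differs).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side mirror of the four DataCopyParams fields described
// above. It is NOT the AscendC::DataCopyParams type itself.
struct CopyParamsSketch {
    uint16_t blockCount;  // number of chunks, expected in [1, 4095]
    uint16_t blockLen;    // chunk length in 32-byte data blocks, [1, 65535]
    uint16_t srcStride;   // gap between source chunks, in data blocks
    uint16_t dstStride;   // gap between destination chunks, in data blocks
};

// Assumption: the default data block size of 32 bytes applies.
constexpr uint64_t kBlockBytes = 32;

// Bytes of payload actually moved: blockCount chunks of blockLen blocks each.
inline uint64_t PayloadBytes(const CopyParamsSketch& p) {
    return static_cast<uint64_t>(p.blockCount) * p.blockLen * kBlockBytes;
}

// Bytes spanned from the head of the first chunk to the tail of the last one.
// The stride is a tail-to-head gap, so only (blockCount - 1) gaps are counted.
inline uint64_t SpanBytes(const CopyParamsSketch& p, uint16_t stride) {
    assert(p.blockCount >= 1 && p.blockCount <= 4095 && p.blockLen >= 1);
    const uint64_t chunkBlocks = static_cast<uint64_t>(p.blockCount) * p.blockLen;
    const uint64_t gapBlocks = static_cast<uint64_t>(p.blockCount - 1) * stride;
    return (chunkBlocks + gapBlocks) * kBlockBytes;
}
```

For instance, under these assumptions `{1, 2, 0, 0}` moves a single 64-byte chunk, while `{4, 2, 1, 3}` moves 256 bytes of payload that span 352 bytes on the source side and 544 bytes on the destination side.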
Availability
Precautions
- When the data is temporarily stored in L1, ensure that writeSize and writeOffset are 32-byte aligned.
- The size of the copied data cannot exceed the size of the initialized SPM buffer. Otherwise, errors such as out-of-bounds access may occur.
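The two precautions above can be expressed as a host-side pre-check, sketched below. `SpmWriteLooksValid` is a hypothetical helper, not part of Ascend C. It assumes that the alignment requirement on writeSize refers to the byte length `writeSize * sizeof(T)`, and that the write must fit between writeOffset and the end of the initialized SPM buffer; both readings are assumptions, since the text above does not spell them out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: alignment granularity taken from the precaution above.
constexpr int64_t kAlignBytes = 32;

// Hypothetical pre-check for the continuous-storage overload.
//   writeSize      : number of elements to copy
//   elemBytes      : sizeof(T) for the tensor element type
//   writeOffset    : byte offset into the SPM buffer
//   spmBufferBytes : size of the initialized SPM buffer, in bytes
inline bool SpmWriteLooksValid(int32_t writeSize, std::size_t elemBytes,
                               int32_t writeOffset, int64_t spmBufferBytes) {
    assert(elemBytes > 0);
    if (writeSize <= 0 || writeOffset < 0) {
        return false;
    }
    const int64_t bytes =
        static_cast<int64_t>(writeSize) * static_cast<int64_t>(elemBytes);
    // Both the byte length and the offset must be multiples of 32 bytes.
    if (bytes % kAlignBytes != 0 || writeOffset % kAlignBytes != 0) {
        return false;
    }
    // The written range must end inside the SPM buffer.
    return static_cast<int64_t>(writeOffset) + bytes <= spmBufferBytes;
}
```

With 32 half-precision elements (2 bytes each) and a 32-byte offset, the write covers bytes 32 to 95, so it passes for a 1024-byte buffer and fails for a 64-byte one.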
Returns
None
Example
```cpp
// Overload that takes DataCopyParams (continuous or discontinuous storage).
AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 32; // Assume that T is half: allocate 32 * sizeof(half) = 64 bytes from the UB.
int offset = 32;   // Byte offset into the SPM buffer at which the data is written.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>();
// Move one contiguous chunk from the UB: blockCount = 1, blockLen = 2 data blocks of 32 bytes each.
AscendC::DataCopyParams copyParams{1, 2, 0, 0};
pipe.WriteSpmBuffer(writeLocal, copyParams, offset);
```
```cpp
// Overload that takes an element count (continuous storage only).
AscendC::TPipe pipe;
AscendC::TQue<AscendC::QuePosition::VECIN, 1> inQueueSrcVecIn;
int dataSize = 32; // Assume that T is half: allocate 32 * sizeof(half) = 64 bytes from the UB.
int offset = 32;   // Byte offset into the SPM buffer at which the data is written.
pipe.InitBuffer(inQueueSrcVecIn, 1, dataSize * sizeof(half));
AscendC::LocalTensor<half> writeLocal = inQueueSrcVecIn.AllocTensor<half>();
pipe.WriteSpmBuffer(writeLocal, dataSize, offset);
```