aclrtMemcpyBatchAsync

Note: This feature is for trial use and may be changed in later versions. It is not available in commercial products.

Applicability

Product

Supported

Atlas A3 training products/Atlas A3 inference products

Atlas A2 training products/Atlas A2 inference products

Atlas 200I/500 A2 inference products

Atlas inference products

Atlas training products

Description

Copies multiple memory blocks between the host and the device in a single batch call.

The host memory passed to this API can be page-locked memory (for example, memory allocated by the aclrtMallocHost API) or non-page-locked memory (for example, memory allocated by the malloc API). If the host memory is non-page-locked, this API returns only after the memory copy task is complete. If the host memory is page-locked, this API is asynchronous: a successful call indicates only that the task has been delivered, not that it has been executed. In that case, call a synchronization API (for example, aclrtSynchronizeStream) afterward to ensure that the memory copy task is complete.

Prototype

aclError aclrtMemcpyBatchAsync(void **dsts, size_t *destMaxs, void **srcs, size_t *sizes, size_t numBatches, aclrtMemcpyBatchAttr *attrs, size_t *attrsIndexes, size_t numAttrs, size_t *failIndex, aclrtStream stream)

Parameters

Parameter

Input/Output

Description

dsts

Input

Destination memory address array.

destMaxs

Input

Array of maximum destination lengths. Each entry is the maximum length, in bytes, of the corresponding destination memory segment.

srcs

Input

Source memory address array.

sizes

Input

Array of copy sizes. Each entry is the number of bytes to copy for the corresponding memory segment.

numBatches

Input

Length of the dsts, srcs, and sizes arrays.

attrs

Input

Memory copy attribute array.

attrsIndexes

Input

Array of start indexes for the memory copy attributes. Each entry specifies the first copy operation to which the corresponding attrs entry applies: attrs[k] applies to copy operations attrsIndexes[k] through attrsIndexes[k+1] - 1, and the last entry, attrs[numAttrs-1], applies to copy operations attrsIndexes[numAttrs-1] through numBatches - 1.

numAttrs

Input

Length of the attrs and attrsIndexes arrays.

failIndex

Output

Index of the copy operation that caused the error. Only the memory attribute and copy direction of each item are verified. If the error is not related to a specific copy operation, the value is SIZE_MAX.

stream

Input

Stream for executing the memory copy task.
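
The range rule described for attrsIndexes can be sketched as a small lookup in plain C. This is only an illustration of the documented mapping, not part of the AscendCL API: the helper name attr_index_for_copy and its use of SIZE_MAX as a "no attribute" result are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an AscendCL function).
 * Returns the index k such that attrs[k] applies to copy operation i,
 * following the documented rule: attrs[k] covers copy operations
 * attrsIndexes[k] .. attrsIndexes[k+1] - 1, and the last entry covers
 * attrsIndexes[numAttrs-1] .. numBatches - 1.
 * Returns SIZE_MAX if i is out of range or no entry covers it. */
size_t attr_index_for_copy(const size_t *attrsIndexes, size_t numAttrs,
                           size_t numBatches, size_t i)
{
    if (attrsIndexes == NULL || numAttrs == 0 || i >= numBatches) {
        return SIZE_MAX;
    }
    if (i < attrsIndexes[0]) {
        return SIZE_MAX; /* only possible if attrsIndexes[0] != 0 */
    }
    size_t k = 0;
    /* Advance while the next range starts at or before i. */
    while (k + 1 < numAttrs && attrsIndexes[k + 1] <= i) {
        k++;
    }
    return k;
}
```

With attrsIndexes = {0, 6} and numBatches = 10, copy operations 0 through 5 resolve to attrs[0] and copy operations 6 through 9 resolve to attrs[1].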

Returns

Returns 0 on success; any other value indicates failure. For details, see aclError.
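
As a minimal sketch of how a caller might interpret the return value together with failIndex, the following plain C helper classifies the three documented outcomes. The enum, the function name classify_batch_result, and the use of int for the return code are assumptions made for this illustration; none of them is defined by AscendCL.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcome categories for this example only. */
typedef enum {
    BATCH_OK,            /* return code 0: task delivered (or already complete) */
    BATCH_ITEM_ERROR,    /* failure tied to the copy item at failIndex */
    BATCH_GENERAL_ERROR  /* failure not tied to a copy item (failIndex == SIZE_MAX) */
} BatchOutcome;

/* ret is the value returned by the batch copy call (assumed to be an int
 * where 0 means success); failIndex is the value written to the failIndex
 * output parameter. */
BatchOutcome classify_batch_result(int ret, size_t failIndex)
{
    if (ret == 0) {
        return BATCH_OK;
    }
    if (failIndex == SIZE_MAX) {
        return BATCH_GENERAL_ERROR;
    }
    return BATCH_ITEM_ERROR;
}
```

For BATCH_ITEM_ERROR, failIndex identifies the entry in dsts, srcs, and sizes whose memory attribute or copy direction failed verification.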

Restrictions

  • Copy operations within a batch are unordered; the entries in the arrays are not necessarily copied in sequence.
  • This API copies the data specified in srcs to the memory region specified in dsts. The size of each copy operation is specified by sizes. The dsts, srcs, and sizes arrays must have the same length specified by numBatches.
  • Each copy operation in the batch must be associated with an attribute set from the attrs array, and one attrs entry can apply to multiple consecutive copy operations. attrsIndexes[k] specifies the index of the first copy operation that uses attrs[k]. The attrs and attrsIndexes arrays must both have the length specified by numAttrs. For example, if the batch contains 10 copy operations listed in dsts, srcs, and sizes, of which the first six use one attribute set and the last four use another, then numAttrs is 2, attrsIndexes is {0,6}, and attrs contains the two attribute sets. The first entry of attrsIndexes must be 0, each subsequent entry must be greater than the previous one, and the last entry must be less than numBatches. In addition, numAttrs must be less than or equal to numBatches.
  • The direction of batch memory copy can only be from host to device or from device to host.
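
The attrsIndexes constraints listed above can be checked on the caller side before the batch is submitted. The following is a sketch in plain C under stated assumptions: the function name batch_attr_indexes_valid is hypothetical, and rejecting numAttrs == 0 is an assumption derived from the requirement that every copy operation be associated with an attribute set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pre-check (not an AscendCL function) for the documented rules:
 *   - numAttrs <= numBatches
 *   - attrsIndexes[0] == 0
 *   - each entry is greater than the previous one
 *   - the last entry is less than numBatches
 * Assumption: numAttrs == 0 is treated as invalid. */
bool batch_attr_indexes_valid(const size_t *attrsIndexes, size_t numAttrs,
                              size_t numBatches)
{
    if (attrsIndexes == NULL || numAttrs == 0 || numAttrs > numBatches) {
        return false;
    }
    if (attrsIndexes[0] != 0) {
        return false;
    }
    for (size_t k = 1; k < numAttrs; k++) {
        if (attrsIndexes[k] <= attrsIndexes[k - 1]) {
            return false; /* entries must be strictly increasing */
        }
    }
    return attrsIndexes[numAttrs - 1] < numBatches;
}
```

For the 10-operation example above, attrsIndexes = {0, 6} with numAttrs = 2 passes this check, whereas {1, 6}, {0, 6, 6}, and {0, 10} are all rejected.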