aclSetOutputTensorAddr

Function Usage

Updates the device memory address recorded in an output aclTensor. Call this function when aclOpExecutor reuse has been enabled through aclSetAclOpExecutorRepeatable and the device memory address of an output has changed.

Prototype

aclnnStatus aclSetOutputTensorAddr(aclOpExecutor *executor, const size_t index, aclTensor *tensor, void *addr)

Parameters

| Parameter | Input/Output | Description |
| --- | --- | --- |
| executor | Input | aclOpExecutor that has been set to the reusable state. |
| index | Input | Index of the output aclTensor to be updated, starting from 0. |
| tensor | Input | Pointer to the aclTensor to be updated. |
| addr | Input | New device memory address to record in the specified aclTensor. |

Returns

Returns 0 on success. Any other value indicates failure.
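Because a non-zero return value signals failure, callers will usually want to check it after every update. The following is a minimal sketch of such a check, not part of the aclnn API. `StatusCode` and `CheckStatus` are hypothetical names, and `StatusCode` stands in for aclnnStatus on the assumption that the status is a 32-bit integer, so that the sketch compiles without the CANN headers.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-in for aclnnStatus (assumption: an integer status where 0 means success).
using StatusCode = std::int32_t;

// Hypothetical helper: throws std::runtime_error when a call returned a non-zero status.
inline void CheckStatus(StatusCode status, const std::string &what) {
    if (status != 0) {
        throw std::runtime_error(what + " failed with status " + std::to_string(status));
    }
}
```

With the real headers available, a call could then be wrapped as `CheckStatus(aclSetOutputTensorAddr(executor, 0, output, addr), "aclSetOutputTensorAddr");`.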

Examples

// Create the input and output aclTensor and aclTensorList.
std::vector<int64_t> shape = {1, 2, 3};
aclTensor *tensor1 = aclCreateTensor(shape.data(), shape.size(), aclDataType::ACL_FLOAT,
    nullptr, 0, aclFormat::ACL_FORMAT_ND, shape.data(), shape.size(), nullptr);
aclTensor *tensor2 = aclCreateTensor(shape.data(), shape.size(), aclDataType::ACL_FLOAT,
    nullptr, 0, aclFormat::ACL_FORMAT_ND, shape.data(), shape.size(), nullptr);
aclTensor *tensor3 = aclCreateTensor(shape.data(), shape.size(), aclDataType::ACL_FLOAT,
    nullptr, 0, aclFormat::ACL_FORMAT_ND, shape.data(), shape.size(), nullptr);
aclTensor *tensor4 = aclCreateTensor(shape.data(), shape.size(), aclDataType::ACL_FLOAT,
    nullptr, 0, aclFormat::ACL_FORMAT_ND, shape.data(), shape.size(), nullptr);
aclTensor *output = aclCreateTensor(shape.data(), shape.size(), aclDataType::ACL_FLOAT,
    nullptr, 0, aclFormat::ACL_FORMAT_ND, shape.data(), shape.size(), nullptr);
aclTensor *list[] = {tensor3, tensor4};
auto tensorList = aclCreateTensorList(list, 2);
uint64_t workspace_size = 0;
aclOpExecutor *executor;
// The AddCustom operator has two inputs (aclTensor) and two outputs (aclTensor and aclTensorList).
// Call the first-phase API.
aclnnAddCustomGetWorkspaceSize(tensor1, tensor2, output, tensorList, &workspace_size, &executor);
// Set the executor to be reusable.
aclSetAclOpExecutorRepeatable(executor);
void *addr = nullptr;  // Must point to the new output device memory before the calls below.
aclSetOutputTensorAddr(executor, 0, output, addr);   // Update the device address of the output aclTensor.
aclSetOutputTensorAddr(executor, 1, tensor3, addr);  // Update the device address of the first aclTensor in the output tensor list.
aclSetOutputTensorAddr(executor, 2, tensor4, addr);  // Update the device address of the second aclTensor in the output tensor list.
.......
// Call the second-phase API.
aclnnAddCustom(workspace, workspace_size, executor, stream);
// Clear the executor.
aclDestroyAclOpExecutor(executor);
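
The comments in the example suggest that the index counts output tensors in declaration order, with the members of an aclTensorList flattened in place: the standalone output is index 0 and the two list members are indexes 1 and 2. Assuming that numbering holds in general, the index can be derived as sketched below. `FlatOutputIndex` is a hypothetical helper for illustration only and is not part of the aclnn API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper. tensorCounts[i] is the number of aclTensor objects that the
// i-th output contributes: 1 for a plain aclTensor, N for an aclTensorList of N tensors.
// Returns the flattened index of element `elementPos` within output `outputPos`.
inline std::size_t FlatOutputIndex(const std::vector<std::size_t> &tensorCounts,
                                   std::size_t outputPos, std::size_t elementPos) {
    if (outputPos >= tensorCounts.size() || elementPos >= tensorCounts[outputPos]) {
        throw std::out_of_range("output position or element position out of range");
    }
    std::size_t index = 0;
    // Sum the tensors contributed by all outputs that come before outputPos.
    for (std::size_t i = 0; i < outputPos; ++i) {
        index += tensorCounts[i];
    }
    return index + elementPos;
}
```

For the AddCustom example, the counts would be `{1, 2}`: one plain output tensor followed by a list of two tensors.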