Tensor Operation
Function Description
A Vision SDK tensor operation takes input tensors that already hold values and an output tensor whose memory has been allocated. The tensor operation API performs the computation and writes the result to the output tensor.
For details about related APIs, see TensorOperations.
API Calling Process
Before calling the tensor operation APIs, create the input and output tensors, allocate the memory, and assign a value to the input tensor.
The input and output data types must be the same.
For the four basic arithmetic operations (addition, subtraction, multiplication, and division) and for bitwise operations, the shapes of the input and output tensors must be the same. For the tensor transposition, rotation, channel splitting, channel merging, cropping, and extension APIs, the shapes of the input and output tensors must comply with the specifications of the corresponding operation.
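The constraints above can be checked on the host before an operator is called. The following is a minimal sketch, not part of the SDK: `DType`, `TensorDesc`, and `CanRunElementwise` are hypothetical names introduced only for illustration, and the check covers only the element-wise case (arithmetic and bitwise operations), where data types and shapes must all match.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical host-side description of a tensor. This is NOT an SDK type;
// it only carries the two properties that the rules above refer to.
enum class DType { UINT8, FLOAT16, FLOAT32 };

struct TensorDesc {
    std::vector<uint32_t> shape;
    DType dtype;
};

// Returns true when two inputs and one output satisfy the rules for
// element-wise arithmetic and bitwise operations:
// identical data types and identical shapes.
bool CanRunElementwise(const TensorDesc& in1, const TensorDesc& in2, const TensorDesc& out)
{
    const bool sameType = in1.dtype == in2.dtype && in1.dtype == out.dtype;
    const bool sameShape = in1.shape == in2.shape && in1.shape == out.shape;
    return sameType && sameShape;
}
```

A check of this kind fails early on the host instead of relying on the error code returned by the operator. Operations such as transposition or cropping would need their own shape rules, which this sketch does not model.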
For details about the API functions, see TensorOperations.
The following uses Add as an example to describe the tensor operation process. The key steps are as follows:
- Perform global initialization by calling MxInit().
- Initialize the input and output tensors and allocate their memory.
  - For the input tensor, create the tensor data, then pass the data, the tensor shape, and the data type for initialization. You can also specify the device where the tensor is located.
  - For the output tensor, pass the tensor shape and the data type for initialization, and allocate memory for the tensor by calling Tensor.Malloc(). You can also specify the device where the tensor is located.
  - For a tensor whose device is not specified during initialization, call the ToDevice(int DeviceId) method after initialization to place the tensor on the device where it will be computed.
- Select the synchronous or asynchronous calling mode of the operator API based on the service requirements for tensor computing.
  - Synchronous execution
    No stream is created. Pass the input and output tensors to the Add method to obtain the result of tensor addition.
  - Asynchronous execution
    - Create a stream. For details, see Asynchronous Invocation.
    - Pass the input and output tensors, the created stream, and other parameters to the Add method to obtain the tensor addition result.
    - Synchronize the stream before reading the result, and destroy the stream when it is no longer needed.
- Deinitialize the initialized global resources by calling MxDeInit().
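The ordering of the steps above matters: global initialization comes first, tensor work happens in between, and deinitialization comes last. The sketch below illustrates that ordering with a small scope guard. It is an illustration under stated assumptions, not SDK code: `ScopedInit` and `RunProcessSkeleton` are hypothetical names, and the two lambdas stand in for MxInit() and MxDeInit(), which are not called here. The assumption that tensor objects should go out of scope before deinitialization is inferred from the brace scope used in the sample code, not from an explicit statement.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs `init` on construction and `deinit` on
// destruction, so deinitialization cannot be skipped on an early return.
class ScopedInit {
public:
    ScopedInit(const std::function<void()>& init, std::function<void()> deinit)
        : deinit_(std::move(deinit))
    {
        init();
    }
    ~ScopedInit() { deinit_(); }
    ScopedInit(const ScopedInit&) = delete;
    ScopedInit& operator=(const ScopedInit&) = delete;

private:
    std::function<void()> deinit_;
};

// Records the order in which the steps of the process run.
// The lambdas are stand-ins for the real global init/deinit calls.
std::vector<std::string> RunProcessSkeleton()
{
    std::vector<std::string> log;
    {
        ScopedInit sdk([&log] { log.push_back("init"); },
                       [&log] { log.push_back("deinit"); });
        {
            // Tensors would be created and used inside this inner scope,
            // so they are destroyed before the guard runs `deinit`.
            log.push_back("create tensors");
            log.push_back("call operator");
        }
    }
    return log;
}
```

Calling `RunProcessSkeleton()` yields the steps in the order init, create tensors, call operator, deinit.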
Sample Code
The following sample code shows the key steps of using the tensor Add API. It is for reference only and cannot be copied directly for compilation or running.
- Synchronous calling
    // Initialization
    MxBase::MxInit();
    {
        // 1. Tensor initialization
        // 1.1 Create the data of the input tensor.
        uint8_t input1[1][3][16][16];
        uint8_t input2[1][3][16][16];
        for (int i = 0; i < 1; i++) {
            for (int j = 0; j < 3; j++) {
                for (int k = 0; k < 16; k++) {
                    for (int l = 0; l < 16; l++) {
                        input1[i][j][k][l] = 8;
                        input2[i][j][k][l] = 2;
                    }
                }
            }
        }
        // 1.2 Specify the tensor shape.
        std::vector<uint32_t> shape{1, 3, 16, 16};
        // 1.3 Create the input and output tensor objects.
        MxBase::Tensor tensor1(&input1[0][0][0][0], shape, MxBase::TensorDType::UINT8);
        MxBase::Tensor tensor2(&input2[0][0][0][0], shape, MxBase::TensorDType::UINT8);
        MxBase::Tensor tensor3(shape, MxBase::TensorDType::UINT8);
        tensor3.Malloc();
        tensor1.ToDevice(device_id);
        tensor2.ToDevice(device_id);
        tensor3.ToDevice(device_id);
        // 2. Call the operator API. tensor3 is the operator computation output result.
        APP_ERROR ret = MxBase::Add(tensor1, tensor2, tensor3);
    }
    // Deinitialization
    MxBase::MxDeInit();
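To sanity-check the output of the synchronous sample, the expected values can be computed on the host. The sketch below is a plain C++ reference and does not use the SDK: `ElementCount` and `ReferenceAdd` are hypothetical helper names. With the sample's inputs (every element 8 and 2), each of the 1 x 3 x 16 x 16 = 768 output elements is expected to be 10. How the SDK operator treats sums that exceed the uint8_t range is not stated here; this reference wraps modulo 256, which is an assumption that only matters for inputs that overflow.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of elements described by a shape, for example {1, 3, 16, 16}.
std::size_t ElementCount(const std::vector<uint32_t>& shape)
{
    std::size_t count = 1;
    for (uint32_t dim : shape) {
        count *= dim;
    }
    return count;
}

// Host-side element-wise addition used only as a reference.
// Returns an empty vector if the inputs differ in size.
// Assumption: overflow wraps modulo 256 (the sample values never overflow).
std::vector<uint8_t> ReferenceAdd(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b)
{
    if (a.size() != b.size()) {
        return {};
    }
    std::vector<uint8_t> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<uint8_t>(a[i] + b[i]);
    }
    return out;
}
```

The reference result can then be compared element by element with the data copied back from the output tensor.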
- Asynchronous calling
    // Initialization
    MxBase::MxInit();
    {
        // 1. Create a stream and register its thread.
        // 1.1 Create a stream.
        MxBase::AscendStream stream(device_id);
        // 1.2 Register the stream thread.
        stream.CreateAscendStream();
        // 2. Tensor initialization
        // 2.1 Create the data of the input tensor.
        uint8_t input1[1][3][16][16];
        uint8_t input2[1][3][16][16];
        for (int i = 0; i < 1; i++) {
            for (int j = 0; j < 3; j++) {
                for (int k = 0; k < 16; k++) {
                    for (int l = 0; l < 16; l++) {
                        input1[i][j][k][l] = 8;
                        input2[i][j][k][l] = 2;
                    }
                }
            }
        }
        // 2.2 Specify the tensor shape.
        std::vector<uint32_t> shape{1, 3, 16, 16};
        // 2.3 Create the input and output tensor objects.
        MxBase::Tensor tensor1(&input1[0][0][0][0], shape, MxBase::TensorDType::UINT8);
        MxBase::Tensor tensor2(&input2[0][0][0][0], shape, MxBase::TensorDType::UINT8);
        MxBase::Tensor tensor3(shape, MxBase::TensorDType::UINT8);
        tensor3.Malloc();
        tensor1.ToDevice(device_id);
        tensor2.ToDevice(device_id);
        tensor3.ToDevice(device_id);
        // 3. Call the operator API. tensor3 is the operator computation output result.
        APP_ERROR ret = MxBase::Add(tensor1, tensor2, tensor3, stream);
        // 4. Synchronize the stream and obtain the computing result.
        stream.Synchronize();
        // 5. Destroy the stream.
        stream.DestroyAscendStream();
    }
    // Deinitialization
    MxBase::MxDeInit();
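The important point of the asynchronous sample is that the output is guaranteed to be complete only after the stream has been synchronized. The toy model below illustrates that contract in plain C++. It is not the SDK's AscendStream: `ToyStream` and `AddAsync` are hypothetical names, and the toy defers all work until `Synchronize()` is called, whereas a real device stream may start executing earlier. Only the rule "read the result after synchronization" carries over.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Toy stand-in for a stream: tasks are queued and run on Synchronize().
class ToyStream {
public:
    void Enqueue(std::function<void()> task) { tasks_.push_back(std::move(task)); }

    // Runs every queued task in submission order, then empties the queue.
    void Synchronize()
    {
        for (auto& task : tasks_) {
            task();
        }
        tasks_.clear();
    }

    std::size_t Pending() const { return tasks_.size(); }

private:
    std::vector<std::function<void()>> tasks_;
};

// Queues an element-wise add and returns immediately.
// `a`, `b`, and `out` must have the same size and must outlive the call to
// Synchronize(); `out` is meaningful only after Synchronize() has returned.
void AddAsync(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b,
              std::vector<uint8_t>& out, ToyStream& stream)
{
    stream.Enqueue([&a, &b, &out] {
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] = static_cast<uint8_t>(a[i] + b[i]);
        }
    });
}
```

In this model, `out` still holds its initial contents right after `AddAsync` returns and holds the sums only after `stream.Synchronize()`, which mirrors step 4 of the sample.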