Matrix-Matrix Multiplication

Obtaining the Sample

Refer to gemm to obtain the sample.

Description

This sample implements the matrix-matrix multiplication C = αAB + βC, where A, B, and C are all 16 x 16 matrices (m = 16, n = 16, and k = 16), so the result is also a 16 x 16 matrix.
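For checking results by hand, the formula C = αAB + βC can be written as a plain CPU loop. The following is a minimal sketch, not the sample's code: the function name ReferenceGemm, the float element type, and the row-major layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for C = alpha * A * B + beta * C.
// A is m x k, B is k x n, and C is m x n. All matrices are stored row-major
// in flat vectors; the layout and the float type are assumptions of this sketch.
void ReferenceGemm(std::size_t m, std::size_t n, std::size_t k,
                   float alpha, const std::vector<float>& a,
                   const std::vector<float>& b,
                   float beta, std::vector<float>& c) {
    assert(a.size() == m * k);
    assert(b.size() == k * n);
    assert(c.size() == m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Dot product of row i of A with column j of B.
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            // Scale the product and add the scaled previous value of C.
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
}
```

With m = n = k = 16, every element of A set to 1, every element of B set to 2, every element of C set to 3, and α = β = 1, each output element is 16 x 2 + 3 = 35.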

Figure 1 Sample diagram

Principles

The following table lists the key functions involved in this sample.

Table 1 Key functions

Initialization

  • aclInit: initializes AscendCL.
  • aclFinalize: deinitializes AscendCL.

Device Management

  • aclrtSetDevice: sets the compute device.
  • aclrtGetRunMode: obtains the run mode of the software stack. The internal processing varies depending on the run mode.
  • aclrtResetDevice: resets the compute device and cleans up all resources associated with the device.

Stream Management

  • aclrtCreateStream: creates a stream.
  • aclrtDestroyStream: destroys a stream.
  • aclrtSynchronizeStream: waits for stream tasks to complete.

Memory Management

  • aclrtMallocHost: allocates host memory.
  • aclrtFreeHost: frees host memory.
  • aclrtMalloc: allocates device memory.
  • aclrtFree: frees device memory.

Data Transfer

aclrtMemcpy (used when the app runs on the host):

  • Transfers the input matrix data from the host to the device.
  • Transfers the computation result from the device to the host.

Data transfer is not required if your app runs in the board environment.
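The transfer rule above can be modeled on a plain CPU without the AscendCL headers. The following is a minimal sketch under stated assumptions: RunMode and StageInput are hypothetical stand-ins (the real run mode comes from aclrtGetRunMode and the real copy is done by aclrtMemcpy), and a vector assignment stands in for the host-to-device copy.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the run mode reported by aclrtGetRunMode.
// The real API uses its own type from the AscendCL headers.
enum class RunMode { kHost, kDevice };

// Returns the buffer the operator would read.
// Host mode: the data is copied into deviceBuffer (the vector assignment
// stands in for an aclrtMemcpy host-to-device transfer).
// Device mode: no copy is made and the source buffer is used directly.
const float* StageInput(RunMode mode, const std::vector<float>& hostData,
                        std::vector<float>& deviceBuffer) {
    if (mode == RunMode::kHost) {
        deviceBuffer = hostData;
        assert(deviceBuffer.size() == hostData.size());
        return deviceBuffer.data();
    }
    return hostData.data();
}
```

The same decision applies in the opposite direction for the result: it is copied back only when the app runs on the host.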

Single-Operator Execution

  • aclblasGemmEx: implements matrix-matrix multiplication. You can specify the data types of the elements in the matrices. The matrix multiplication operator GEMM has been encapsulated in the aclblasGemmEx API.
  • Ascend Tensor Compiler (ATC): compiles the description of the built-in matrix multiplication operator GEMM (the input and output tensor descriptions and the operator attributes) into an .om offline model adapted to the Ascend AI Processor. The offline model is used to verify the operator execution result.

Directory Structure

The sample directory is organized as follows:

├── inc                                 
│   ├── common.h                    //Header file that declares common functions (such as the file reading function)
│   ├── gemm_runner.h               //Header file that declares the functions related to matrix multiplication
                 
├── run
│   ├── out  
│   │   ├──test_data
│   │   │   ├── config                           
│   │   │   │     ├── acl.json           //Configuration file for system initialization
│   │   │   │     ├── gemm.json          //Description information file of the matrix multiplication operator
│   │   │   ├── data                           
│   │   │   │     ├── generate_data.py   //Script for generating the data of matrix A and matrix B

├── src
│   ├── CMakeLists.txt             //Build script
│   ├── common.cpp                 //Implementation file of common functions (such as the file reading function)
│   ├── gemm_main.cpp              //Implementation file of the main function
│   ├── gemm_runner.cpp            //Implementation file for executing functions related to matrix multiplication
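
The tree mentions a file reading function among the common functions. A minimal sketch of what such a helper could look like follows. The name ReadBinaryFile and its signature are hypothetical and are not taken from common.h; the helper returns raw bytes so that it makes no assumption about the element type written by generate_data.py.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads a whole binary file into `out` as raw bytes.
// Returns false if the file cannot be opened or read completely.
bool ReadBinaryFile(const std::string& path, std::vector<char>& out) {
    assert(!path.empty() && "path must not be empty");
    // Open at the end so that tellg() reports the file size.
    std::ifstream file(path, std::ios::binary | std::ios::ate);
    if (!file) {
        return false;
    }
    const std::streamsize bytes = file.tellg();
    if (bytes < 0) {
        return false;
    }
    out.resize(static_cast<std::size_t>(bytes));
    file.seekg(0, std::ios::beg);
    if (bytes > 0) {
        file.read(out.data(), bytes);
    }
    return file.good();
}
```

A caller would then interpret the bytes according to the data type of the matrix, for example by copying them into a buffer of the matching element type before passing the data on.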

App Build and Run (Ascend EP Mode)

Refer to gemm to obtain the sample, then follow the README file in the sample to build and run the app.

App Build and Run (Ascend RC Mode)

Refer to gemm to obtain the sample, then follow the README file in the sample to build and run the app.