Heterogeneous Compilation Procedure
The BiSheng Compiler supports heterogeneous compilation for AI Core. To compile a heterogeneous program that contains AI Core code, perform the following steps:
- Obtain the path of the BiSheng Compiler program (bisheng) from the CANN release package, and set environment variables. Replace ${INSTALL_DIR} with the CANN component directory. For example, if the installation is performed by the root user, the default file storage path is /usr/local/Ascend/cann.
    # Set environment variables of the compiler. For example:
    $ export PATH=${INSTALL_DIR}/compiler/ccec_compiler/bin:$PATH
- Develop code. The basic procedure for writing an AI Core heterogeneous program is as follows:
  - Include the required Runtime management header files.
  - Create the device and stream by calling the Runtime API.
  - Allocate device memory, and copy the input data to the device memory as required.
  - Write the kernel function for the device.
  - Launch the kernel function on the device using the <<<>>> kernel launch syntax.
  - Copy the output data to the host memory as required, and release the device memory.
- Obtain the dynamic link library and the header file location of Runtime. In this example, compilation dependencies and their locations are as follows.
    Compilation Dependency         Location                 Shorthand
    Runtime dynamic link library   ${INSTALL_DIR}/lib64/    $RT_LIB
    Runtime header file            ${INSTALL_DIR}/include   $RT_INC
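The shorthand names in the table can be defined as shell variables before compiling. The following is a minimal sketch, assuming that INSTALL_DIR is either already set or falls back to the root-user example path mentioned earlier (/usr/local/Ascend/cann); adjust it to match the actual installation.

```shell
#!/bin/sh
# Assumption: INSTALL_DIR is the CANN component directory. If it is not set,
# fall back to the root-user example path; change this for other installations.
INSTALL_DIR="${INSTALL_DIR:-/usr/local/Ascend/cann}"

# Shorthand variables matching the table of compilation dependencies.
RT_LIB="${INSTALL_DIR}/lib64/"
RT_INC="${INSTALL_DIR}/include"
export RT_LIB RT_INC

# Print the resolved locations so they can be checked by eye.
echo "RT_LIB=${RT_LIB}"
echo "RT_INC=${RT_INC}"
```

Exporting the variables keeps them available to the compilation and execution commands when these are run from the same shell session.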
- Write and run the compilation command to generate the executable file in one step. In the following example, the development environment and the runtime environment are the same.
    // Simplified for demonstration only; not intended to be run as is.
    // Runtime header; assumed to be acl/acl.h, resolved through -I$RT_INC.
    #include "acl/acl.h"

    #ifdef ASCENDC_CPU_DEBUG
    #define __aicore__
    #else
    #define __aicore__ [aicore]
    #endif

    #define BLOCKS 5

    // Device kernel: writes 1 to the first element of buf.
    __global__ __aicore__ void foo(__gm__ int* buf) {
        *buf = 1;
    }

    int main(int argc, char* argv[]) {
        aclrtStream stream;
        aclrtCreateStream(&stream);
        int a[100];
        // nullptr - No L2 usage here
        foo<<<BLOCKS, nullptr, stream>>>(a);
        return 0;
    }
The compilation command is as follows:
    # Function: Compile the host and device code together, and generate the executable file axpy.
    # Compilation command:
    $ bisheng --npu-arch=dav-2201 -O2 axpy.cce -o axpy -I$RT_INC -L$RT_LIB
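Before running the compilation command, it can help to confirm that the compiler is on PATH and that the two dependency directories exist. The following is a minimal sketch of such a check; the helper names have_tool, have_dir, and preflight are hypothetical and are not part of CANN or the BiSheng Compiler.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: succeeds if the given path is an existing directory.
have_dir() {
    [ -d "$1" ]
}

# Hypothetical check: reports every missing prerequisite instead of stopping
# at the first one, and returns nonzero if anything is missing.
preflight() {
    status=0
    have_tool bisheng || { echo "bisheng not found on PATH" >&2; status=1; }
    have_dir "${RT_INC:-}" || { echo "RT_INC is not a directory: ${RT_INC:-unset}" >&2; status=1; }
    have_dir "${RT_LIB:-}" || { echo "RT_LIB is not a directory: ${RT_LIB:-unset}" >&2; status=1; }
    return "$status"
}
```

One possible use is to chain the check with the compilation command, for example: preflight && bisheng --npu-arch=dav-2201 -O2 axpy.cce -o axpy -I$RT_INC -L$RT_LIB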
- Run the executable file.
    # Execute:
    $ export LD_LIBRARY_PATH=$RT_LIB:$LD_LIBRARY_PATH
    $ ./axpy