Heterogeneous Compilation

Compilation Options

Common compilation options are described as follows:

| Option | Required (Yes/No) | Description |
| --- | --- | --- |
| -help | No | Displays help information. |
| -o <file> | No | Specifies the name and location of the output file. |
| -O | No | Specifies the compiler optimization level. Currently, -O2 and -O0 are supported. |
| -fPIC | No | Instructs the compiler to generate position-independent code. |
| --shared | No | Builds a dynamic link library. |
| --cce-auto-sync | No | Enables automatic synchronization. The execution units in the AI Core run asynchronously and in parallel, so reads and writes on a LocalTensor may have data dependencies. With this option, the compiler inserts the synchronization automatically, so you do not need to insert it manually through API calls. For details, see Automatic Synchronization. |
| --cce-auto-sync-log | No | Writes the synchronization insertion information to <file> when --cce-auto-sync-log is set to <file>. |
| --std=c++17 | No | Sets the C++ standard. Ascend C operators use the C++17 standard, so this option is required when compiling them. |
| -xcce | No | Specifies the language of the user code. Add this option when a file contains device code but its extension is not .cce. This option is required when compiling Ascend C operators. |
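
As an illustration of how the common options combine, the following sketch assembles a command line that builds device code stored in a .cpp file into a dynamic link library. It only prints the command and does not run the compiler. The function name build_shared_cmd and the file names add_kernel.cpp and libadd_kernel.so are hypothetical, AscendXXXYY is a placeholder for the real processor model, and placing -xcce before the source file follows the usual clang-style -x convention, which is an assumption here.

```shell
#!/bin/sh
# Print (do not run) a bisheng command that builds a dynamic link library.
# $1: source file, $2: output library, $3: SoC version (placeholder by default).
build_shared_cmd() {
    src=$1
    out=$2
    soc=${3:-AscendXXXYY}
    # -xcce is needed because the source file does not use the .cce extension.
    # --std=c++17 matches the standard used by Ascend C operators.
    printf 'bisheng --cce-soc-version=%s --cce-soc-core-type=VecCore --std=c++17 -O2 -fPIC --shared -xcce %s -o %s\n' \
        "$soc" "$src" "$out"
}

# Hypothetical file names, used only for illustration.
build_shared_cmd add_kernel.cpp libadd_kernel.so
```

Copy the printed line into a terminal on a machine with the BiSheng Compiler installed once the placeholder model has been replaced.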

Product options are described as follows:

| Option | Required (Yes/No) | Description |
| --- | --- | --- |
| --cce-soc-version | Yes | Ascend AI Processor model. The binary file for the specified model is generated. If you are unsure of the model, run the npu-smi info command on the server where the AI processor is installed, and add the prefix Ascend to the reported Name. For example, if the value of Name is xxxyy, set the option to Ascendxxxyy. |
| --cce-soc-core-type | Yes | Core type for which the binary file is generated. Values: VecCore, CubeCore, and AICore. |

  • VecCore: compiles the Vector code.
  • CubeCore: compiles the Cube code.
  • AICore: compiles both the Vector and Cube code.
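
The naming rule for --cce-soc-version and the allowed values of --cce-soc-core-type can be captured in two small helper functions. This is a minimal sketch: the function names soc_version_from_name and check_core_type are hypothetical, and the Name value must still be read manually from the npu-smi info output.

```shell
#!/bin/sh
# Turn the Name value reported by `npu-smi info` into a --cce-soc-version value
# by adding the Ascend prefix.
soc_version_from_name() {
    printf 'Ascend%s\n' "$1"
}

# Accept only the three documented core types.
check_core_type() {
    case $1 in
        VecCore|CubeCore|AICore) return 0 ;;
        *) echo "unsupported core type: $1" >&2; return 1 ;;
    esac
}

# Example with the placeholder name used in the table above.
soc_version_from_name xxxyy
```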

Heterogeneous Compilation

The BiSheng Compiler supports heterogeneous compilation. To compile heterogeneous programs, perform the following steps:

  1. Obtain the path of the BiSheng Compiler program (bisheng) from the CANN release package, and add it to the PATH environment variable.
    # Set the compiler environment variable, where ${INSTALL_DIR} is the CANN installation directory. For example:
    $ export PATH=${INSTALL_DIR}/compiler/ccec_compiler/bin:$PATH
  2. Develop the code. A heterogeneous program is typically written as follows:
    1. Include the required AscendCL runtime management header files.
    2. Set up the device and create a stream by calling the Runtime API.
    3. Allocate device memory, and copy the input data to the device memory as required.
    4. Write the device kernel function.
    5. Launch the device kernel function with the <<<...>>> kernel launch syntax.
    6. Copy the output data back to the host memory as required, and release the device memory.
  3. Obtain the locations of the Runtime dynamic link library and header files. In this example, the compilation dependencies and their locations are as follows.

    | Compilation Dependency | Location | Referred To As |
    | --- | --- | --- |
    | Runtime dynamic link library | ${INSTALL_DIR}/lib64/ | $RT_LIB |
    | Runtime header file | ${INSTALL_DIR}/include | $RT_INC |

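  4. Write and run the compilation command to generate the executable file in one step. In the following example, the development environment and the runtime environment are the same.
    // Simplified code for demonstration only. It does not perform a meaningful computation.
    #ifdef ASCENDC_CPU_DEBUG
    #define __aicore__
    #else
    #define __aicore__ [aicore]
    #endif
    
    // Device kernel: writes 1 to the first element of the buffer.
    __global__ __aicore__ void foo(__gm__ int* buf)
    {
        *buf = 1;
    }
    
    int main(int argc, char* argv[])
    {
        // For brevity, a host array is passed directly. A real program passes
        // device memory allocated through the Runtime API, as described in step 2.
        int a[100];
        // Launch configuration: <<<blockDim, l2ctrl, stream>>>.
        // The first nullptr means that no L2 control is used here.
        foo<<<5, nullptr, nullptr>>>(a);
    
        return 0;
    }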
    

    The compilation command is as follows:

    # Compile the host and device code together and generate the executable file axpy.
    $ bisheng --cce-soc-version=AscendXXXYY --cce-soc-core-type=VecCore -O2 axpy.cce -o axpy -I$RT_INC -L$RT_LIB
  5. Run the executable file.
    # Add the Runtime library directory to the library search path, then run the program.
    $ export LD_LIBRARY_PATH=$RT_LIB:$LD_LIBRARY_PATH
    $ ./axpy
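
Steps 1, 3, 4, and 5 above can be collected into one script. The following is a minimal sketch under stated assumptions, not an official build script: the helper names run and build_and_run and the DRY_RUN switch are hypothetical, the default installation path is only an assumed typical location, and AscendXXXYY remains a placeholder. By default the script prints the commands instead of executing them, so it can be inspected on a machine without CANN installed.

```shell
#!/bin/sh
# Prints the commands by default; set DRY_RUN=0 to execute them
# on a machine where CANN is installed.
DRY_RUN=${DRY_RUN:-1}

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "$*"; else "$@"; fi
}

# $1: CANN installation directory, $2: value for --cce-soc-version.
build_and_run() {
    install_dir=$1
    soc_version=$2
    RT_LIB=$install_dir/lib64      # Runtime dynamic link library
    RT_INC=$install_dir/include    # Runtime header files

    # Step 1: make bisheng visible. Step 5: make the Runtime library loadable.
    PATH=$install_dir/compiler/ccec_compiler/bin:$PATH
    LD_LIBRARY_PATH=$RT_LIB:${LD_LIBRARY_PATH:-}
    export PATH LD_LIBRARY_PATH

    # Step 4: compile host and device code together.
    run bisheng --cce-soc-version="$soc_version" --cce-soc-core-type=VecCore \
        -O2 axpy.cce -o axpy -I"$RT_INC" -L"$RT_LIB" || return 1

    # Step 5: run the executable.
    run ./axpy
}

# Assumed default installation path; replace it with the actual one.
build_and_run "${INSTALL_DIR:-/usr/local/Ascend/ascend-toolkit/latest}" AscendXXXYY
```

Keeping the dry-run mode as the default makes it easy to check the expanded paths before anything is compiled.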