Introduction
The operator compiler (op_compiler) is a command-line tool provided by Ascend CANN that compiles operators and generates operator binary files. The tool has the following functions and restrictions:
Detailed functions:
- Static-shape operator compilation: When operators are called through single-operator API execution and the operator shape is fixed or seldom changes, you can use this tool to compile and install a static kernel package to improve operator calling performance. To optimize operator calling performance further, you can enable the tuning mode.
This document describes only how to use op_compiler. For details about the scenario samples, see Using the Static Kernel to Improve the Model Execution Performance.
- Debug build compilation: When locating operator problems, you can use this tool to compile a debug build of the operator binary file to reproduce and locate the problem.
Restrictions:
- Static-shape operator compilation
  - Product models supported by the default compilation mode: Atlas Training Series Product
  - The restrictions on the tuning mode are as follows:
    - Product model support: Atlas Training Series Product: not supported
    - Different users are not allowed to use the same device for tuning at the same time.
    - Before tuning, disable the profiling function to avoid affecting the tuning result. For details about how to disable profiling, see Performance Tuning Tool User Guide.
- Debug build compilation
  - Supported product models: Atlas Training Series Product
op_compiler Architecture
Figure 1 shows the architecture of op_compiler.
- When running this tool, you need to pass it the operator information file or the operator kernel_name (kernel_name is passed only in debugging scenarios), along with other options.
- If an operator information file is passed, op_compiler obtains and parses the file.
- op_compiler checks whether the debug build compilation option is enabled.
- If this option is enabled, the debug build is compiled. The debug build applies to static- or dynamic-shape operators.
- If this option is not enabled, op_compiler checks whether the tuning and compilation options are enabled. If they are not enabled, the default static-shape operator compilation process is executed. If they are enabled, the tuning and compilation process is executed.
- The compilation tool generates kernel files and packs them into an operator kernel package.
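The order of the checks above can be sketched in a few lines of Python. This is only an illustration of the decision flow described in this section, not the tool's implementation: the names Mode and select_mode and the two boolean parameters are hypothetical and do not correspond to actual op_compiler options.

```python
from enum import Enum


class Mode(Enum):
    """Hypothetical labels for the three compilation processes."""
    DEBUG = "debug build compilation"
    TUNING = "tuning and compilation"
    STATIC = "default static-shape operator compilation"


def select_mode(debug_enabled: bool, tuning_enabled: bool) -> Mode:
    """Pick a compilation process in the order described above."""
    # The debug build option is checked first and takes precedence.
    if debug_enabled:
        return Mode.DEBUG
    # Tuning is considered only when the debug build option is off.
    if tuning_enabled:
        return Mode.TUNING
    # Neither option is enabled: fall back to the default process.
    return Mode.STATIC
```

In this sketch, enabling both flags still selects the debug build, because the debug check happens before the tuning check.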
Static Compilation
In static compilation, shape sizes are specified at compile time rather than at run time. During static compilation, op_compiler obtains shape information from the input operator information file and compiles an operator binary file for each shape. The following figure shows the compilation principle.

Static compilation has the following advantages:
- The sizes of all tensors are determined before compilation, which improves memory utilization.
- During compilation, optimization strategies can be applied based on the actual shape size.
- AI processors perform better at parallel instruction execution than at logic computation. Frequent scalar operations may interrupt parallel instruction execution and degrade performance. Because scalar computation can be completed during static compilation, performance improves.
- The operation data size is fixed, so the compiler does not insert extra synchronization instructions. Instructions can therefore be executed in parallel, which improves execution performance.
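To make the point about scalar computation concrete, the following Python sketch shows how values that depend only on the shape can be computed once, ahead of execution, when the shape is known. It is a conceptual illustration under stated assumptions: the function plan_static_kernels, the tile size of 1024 elements, and the plan fields are all hypothetical, and reusing a plan for a repeated shape is an assumption of this sketch rather than documented op_compiler behavior.

```python
from math import prod

TILE_ELEMS = 1024  # hypothetical number of elements processed per tile


def plan_static_kernels(shapes, tile_elems=TILE_ELEMS):
    """Return one plan per distinct shape with scalar values precomputed."""
    plans = {}
    for shape in shapes:
        key = tuple(shape)
        if key in plans:
            # Assumption of this sketch: a repeated shape reuses its plan.
            continue
        total = prod(key)
        plans[key] = {
            # These scalars are fixed once the shape is known, so they
            # do not need to be recomputed each time the kernel runs.
            "total_elems": total,
            "full_tiles": total // tile_elems,
            "tail_elems": total % tile_elems,
        }
    return plans


plans = plan_static_kernels([(3, 1000), (8, 1024), (3, 1000)])
```

Here the three input shapes yield two plans, one per distinct shape. With a dynamic shape, the same divisions would have to run as scalar operations at execution time.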
Tuning and Compilation
To further improve operator performance, you can use the tuning mode of op_compiler.
Debug Build Compilation
If a problem such as an AI Core error or memory overwriting occurs while the operator binary is in use, you can compile a debug build that contains the related debugging information to analyze and locate the problem.
