Overview

AscendNPU Intermediate Representation (IR) is built on Multi-Level Intermediate Representation (MLIR). It is used to compile Ascend-affinity operators and provides comprehensive expression capabilities for Ascend hardware. It uses compilation optimizations to maximize the computing efficiency of the Ascend AI Processor and enables deep performance tuning through ecosystem integration.

AscendNPU IR provides multi-level abstract APIs. The high-level APIs simplify operator development: they hide the details of Ascend compute, data-transfer, and synchronization instructions, automatically detect the hardware architecture during compilation optimization, and map hardware-independent expressions to the underlying instructions. The low-level control APIs, in turn, allow fine-grained performance tuning, including explicit on-chip memory management, synchronization insertion, and ping-pong buffer optimization.
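To make the ping-pong buffer idea concrete, the following is a minimal, hardware-independent sketch in plain Python. It does not use any AscendNPU IR API: the function `ping_pong_process`, its parameters, and the two list-based "buffers" are hypothetical names chosen for illustration only. The sketch alternates between two staging buffers so that the next tile is loaded while the current one is being consumed; on real hardware these two steps would overlap, whereas here they simply run one after the other.

```python
from typing import Callable, List, Sequence


def ping_pong_process(
    tiles: Sequence[List[float]],
    compute: Callable[[List[float]], float],
) -> List[float]:
    """Process tiles using two alternating staging buffers (ping and pong).

    Illustrative only: the "load" and "compute" steps run sequentially here,
    while real hardware would overlap loading tile i+1 with computing tile i.
    """
    buffers: List[List[float]] = [[], []]  # stand-ins for two on-chip buffers
    results: List[float] = []
    if not tiles:
        return results

    buffers[0] = list(tiles[0])  # prologue: load the first tile into "ping"
    for i in range(len(tiles)):
        current = i % 2  # index of the buffer holding the tile to compute on
        if i + 1 < len(tiles):
            # Prefetch the next tile into the other buffer.
            buffers[1 - current] = list(tiles[i + 1])
        results.append(compute(buffers[current]))
    return results


# Hypothetical usage: reduce each tile to its sum.
tile_sums = ping_pong_process([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], sum)
```

The point of the pattern is that the buffer being filled is never the one being read, which is what allows data transfer and computation to proceed concurrently once explicit synchronization is in place.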

AscendNPU IR exposes open APIs through the open-source community, so that ecosystem frameworks can integrate with it flexibly and target the Ascend AI Processor efficiently.

Precautions

AscendNPU IR is supported by the following products:

Atlas 800T A2 training server

Atlas 800I A2 inference server