CANN Commercial Version
Compute Architecture for Neural Networks (CANN) is the heterogeneous computing architecture that Ascend provides for AI scenarios. On the upper layer it supports AI frameworks such as MindSpore, PyTorch, and TensorFlow; on the lower layer it serves AI processors and programming. This makes it a key platform for improving the computing efficiency of Ascend AI processors. It also provides multi-layer programming APIs for different application scenarios, so that users can quickly build AI applications and services on the Ascend platform.
- Release Notes
Describes the version mapping between CANN and the firmware and drivers, as well as the feature changes between versions.
- Quick Start
Describes how to quickly install CANN and run a simple sample to see how an operator is built and executed.
- Ascend Product Models
Provides the names of Ascend products.
Environment Setup
Application Development
- Application Development
Describes how to develop AI applications using the C/C++ and Python APIs to implement functions such as object recognition and image classification. For details about the APIs, see Application Development APIs.
- ISP Image Tuning
Describes how to tune algorithms and functions related to image signal processing (ISP). For details about the APIs, see ISP Image Tuning APIs.
Operator Development
- Ascend C Operator Development
Describes how to develop operators based on the Ascend C operator programming language. For details about the APIs, see Ascend C Operator Development APIs.
- TBE & AI CPU Operator Development
Describes how to develop custom TBE and AI CPU operators using the TBE and AI CPU APIs. For details about the APIs, see TBE & AI CPU Operator Development APIs.
- BiSheng Compiler
Describes how to use the BiSheng compiler to compile operator code into binary executable files and dynamic libraries.
- CCE Intrinsic Development
Describes heterogeneous programming and multi-pipeline parallel programming based on CCE Intrinsic. For details about the APIs, see CCE Intrinsic Development APIs.
- AscendNPU IR
Describes the intermediate representation (IR) built on MLIR, which is used to build Ascend-affinity operators.
Graph Development
Communication Library
- Huawei Collective Communication Library (HCCL)
Provides a high-performance collective communication library for Ascend AI processors. It supports data-parallel and model-parallel collective communication in single-server multi-device and multi-server multi-device scenarios, as well as custom development of communication operators. For details about the APIs, see HCCL APIs.
- HIXL Unilateral Communication Library
Describes how to use the unilateral (one-sided) communication library APIs to transmit data between clusters and to build a separated-deployment framework for foundation model inference. For details about the APIs, see HIXL Unilateral Communication Library APIs.
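To make the term "collective communication" concrete, the following is a minimal pure-Python sketch of what a sum all-reduce computes: every participating device contributes a buffer, and every device ends up with the element-wise sum. It does not use or imitate the HCCL APIs; the function name `all_reduce_sum` and the list-of-lists representation of per-device buffers are assumptions made purely for illustration.

```python
from typing import List


def all_reduce_sum(rank_buffers: List[List[float]]) -> List[List[float]]:
    """Simulate a sum all-reduce over in-memory buffers (hypothetical helper).

    rank_buffers[i] is the buffer contributed by simulated device i.
    The return value holds one buffer per device, each equal to the
    element-wise sum of all contributed buffers.
    """
    if not rank_buffers:
        return []
    length = len(rank_buffers[0])
    if any(len(buf) != length for buf in rank_buffers):
        raise ValueError("all devices must contribute buffers of equal length")
    # Reduce step: element-wise sum across devices.
    total = [sum(values) for values in zip(*rank_buffers)]
    # Broadcast step: each device receives its own copy of the result.
    return [list(total) for _ in rank_buffers]


if __name__ == "__main__":
    # Three simulated devices, each holding a two-element gradient.
    gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(all_reduce_sum(gradients))
```

In data-parallel training, this is the pattern used to combine gradients so that every device applies the same update. A real collective communication library performs the same reduction over device memory and interconnects rather than Python lists.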
Domain-Specific Acceleration Library
- ATB Acceleration Library
Describes how to use the Ascend Transformer Boost (ATB) acceleration library to improve the efficiency of Transformer model training and inference. For details about the APIs, see ATB Acceleration Library APIs.
- SiP Acceleration Library
Describes how to use high-performance operators in the signal processing domain. For details about the APIs, see SiP Acceleration Library APIs.
- LLM DataDist Development
Describes how to use the LLM DataDist APIs to deploy foundation model inference in separated mode, which improves inference throughput. For details about the APIs, see LLM DataDist APIs.
APIs
- Application Development APIs
Provides C/C++ and Python APIs for system configuration, runtime management, single-operator execution, model execution, and media data preprocessing.
- ISP Image Tuning APIs
Provides the ISP image tuning APIs.
- Ascend C Operator Development APIs
Provides basic and high-level Ascend C APIs.
- TBE & AI CPU Operator Development APIs
Provides APIs required for TBE & AI CPU operator development.
- CCE Intrinsic Development APIs
Provides Ascend hardware APIs that extend the C language. The CCE Intrinsic APIs give fine-grained control over memory allocation, data synchronization, and double buffering.
- GE APIs
Provides GE APIs for constructing graphs that run directly on the Ascend platform.
- DataFlow Graph Construction APIs
Provides DataFlow C++ and Python APIs for constructing, modifying, compiling, and executing computational graphs, as well as UDF APIs for writing custom processing functions through FuncProcessPoint and GraphProcessPoint.
- HCCL APIs
Provides communication operator APIs and communicator management APIs to implement distributed capabilities. HCCL also provides communication operator development APIs so that developers can write custom communication operators.
- HIXL Unilateral Communication Library APIs
Provides C++ and Python APIs for simple, reliable, and efficient point-to-point data transmission in cluster scenarios.
- Operator Library APIs
Provides a wide range of deeply optimized, hardware-affinity high-performance operators.
- ATB Acceleration Library APIs
Provides the APIs required for using the ATB acceleration library, including public class definitions such as the Operation class, single-operator classes, and graph operator classes.
- SiP Acceleration Library APIs
Provides APIs required for using the SiP acceleration library, including high-performance operators related to signal processing.
- LLM DataDist APIs
Provides APIs to manage KV data in a cluster, supporting separate deployment of full graphs and incremental graphs.
- AOE APIs
Provides APIs for automatic tuning, allowing you to query previously generated repository files through the APIs and obtain the tiling result.
- Basic Data Structures and APIs
Describes the basic data structures and APIs on which operator development and graph development depend.
- Open Code Basic Function Support APIs
Describes APIs on which the CANN open code depends, including the error reporting APIs and log APIs.
Development Tools
- Development Tool Quick Start
Provides quick start guides for the PyTorch training development tools, foundation model inference development tools, and operator development tools.
- Operator Development Tools
Describes how to use operator development tools (such as msKPP, msOpGen, msOpST, msSanitizer, msDebug, and msProf).
- Operator Compilation Tool
Compiles operators to generate operator binary files.
- ATC Tool
Converts a network model into an offline model (.om) supported by Ascend AI processors.
- AOE Tool
Performs automatic tuning to make full use of hardware resources and improve network performance.
- Analysis and Migration Tool
Migrates PyTorch training scripts to Ascend NPUs with one click.
- Accuracy Debugging Tool
Allows you to compare accuracy and locate model accuracy problems.
- Profiling Tool
Collects and analyzes profiling data in the training and inference phases.
- HCCL Performance Tester
Tests the functional correctness and performance of HCCL.
- AMCT Tool
Compresses models using techniques such as quantization and tensor decomposition.
- Memory Leak Detection Tool (msLeaks)
Locates memory problems during model training and inference.
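As an illustration of the quantization mentioned in the AMCT Tool entry above, the following is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is not the AMCT implementation or its API: the function names `quantize_int8` and `dequantize_int8`, and the choice of a max-absolute-value scale, are assumptions made for illustration only.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] with one shared scale (hypothetical helper)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Choose the scale so that the largest magnitude maps to 127.
    # An all-zero (or empty) tensor falls back to a scale of 1.0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [-4.0, -0.3, 0.0, 1.7, 4.0]
    q, scale = quantize_int8(weights)
    restored = dequantize_int8(q, scale)
    # The round-trip error of each element is at most half a quantization step.
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Storing each weight as an 8-bit integer plus one shared floating-point scale is what reduces model size. The cost is the rounding error visible in the round trip, which is why quantized models are usually checked for accuracy afterwards.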
References
- Troubleshooting
Describes how to quickly locate and rectify faults.
- RPing Function Development
Describes RPing, an RDMA-based network detection technology that sends detection packets, records network latency, and collects statistics on sent and received packets.
- Log Reference
Describes the log format and how to view logs and set log levels.
- Environment Variables
Describes the environment variables that can be used to build AI applications and services based on CANN.
- Graph Fusion and UB Fusion Patterns
Describes some of the built-in graph fusion and UB fusion patterns for Ascend AI processors. Graph fusion and UB fusion are key methods for improving the performance of the entire network.
- Communication Matrix
Describes the open ports, along with the transport layer protocol, authentication mode, and function of each port.