CANN Commercial Version

Compute Architecture for Neural Networks (CANN) is the Ascend heterogeneous compute architecture designed for AI scenarios. On the upper layer, it supports multiple AI frameworks, such as MindSpore, PyTorch, and TensorFlow. On the lower layer, it serves AI processors and their programming. This makes it a key platform for improving the computing efficiency of Ascend AI processors. It also provides hierarchical programming APIs for diverse application scenarios, so that users can quickly build AI applications and services on the Ascend platform.

  • Release Notes

    Describes the version mapping between CANN and the firmware and drivers, as well as the feature changes in each version.

  • Quick Start

    Describes how to quickly install CANN and run a simple sample that walks through building and running an operator.

  • Ascend Product Models

    Provides the names of Ascend products.

Environment Setup

  • Software Installation

    Describes how to install, upgrade, and uninstall CANN on different operating systems and in different service scenarios.

Application Development

Operator Development

Graph Development

Communication Library

  • Huawei Collective Communication Library (HCCL)

    Provides a high-performance collective communication library based on Ascend AI processors. It supports data-parallel and model-parallel collective communication in single-server multi-device and multi-server multi-device scenarios, as well as custom development of communication operators. For details about the APIs, see HCCL APIs.

  • HIXL Unilateral Communication Library

    Describes how to use the unilateral communication library APIs to transmit data between clusters and build a separated framework for foundation model inference. For details about the related APIs, see HIXL Unilateral Communication Library APIs.

Domain-Specific Acceleration Library

APIs

  • Application Development APIs

    Provides C/C++ and Python APIs for system configuration, runtime management, single-operator execution, model execution, and media data preprocessing.

  • ISP Image Tuning APIs

    Provides image tuning APIs for the image signal processor (ISP).

  • Ascend C Operator Development APIs

    Provides basic and high-level Ascend C APIs.

  • TBE & AI CPU Operator Development APIs

    Provides APIs required for TBE & AI CPU operator development.

  • CCE Intrinsic Development APIs

    Provides Ascend hardware APIs built as extensions to the C language. The CCE Intrinsic APIs give fine-grained control over memory allocation, data synchronization, and double buffering.

  • GE APIs

    Provides Graph Engine (GE) APIs for constructing graphs that run directly on the Ascend platform.

  • DataFlow Graph Construction APIs

    Provides DataFlow C++ and Python APIs for constructing, modifying, compiling, and executing computational graphs, and UDF APIs for writing custom processing functions through FuncProcessPoint and GraphProcessPoint.

  • HCCL APIs

    Provides communication operator APIs and communicator management APIs to implement distributed capabilities. In addition, HCCL provides communication operator development APIs for developers to customize communication operators.

  • HIXL Unilateral Communication Library APIs

    Provides C++ and Python APIs for simple, reliable, and efficient point-to-point data transmission in cluster scenarios.

  • Operator Library APIs

    Provides a wide range of high-performance operators that are deeply optimized for the hardware.

  • ATB Acceleration Library APIs

    Provides the APIs required for using the ATB acceleration library, including public class definitions such as the Operation class, the single-operator class, and the graph operator class.

  • SiP Acceleration Library APIs

    Provides APIs required for using the SiP acceleration library, including high-performance operators related to signal processing.

  • LLM DataDist APIs

    Provides APIs to manage KV data in a cluster, supporting separate deployment of full graphs and incremental graphs.

  • AOE APIs

    Provides automatic tuning APIs for querying previously generated repository files and obtaining the tiling result.

  • Basic Data Structures and APIs

    Describes the basic data structures and APIs on which operator development and graph development depend.

  • Open Code Basic Function Support APIs

    Describes APIs on which the CANN open code depends, including the error reporting APIs and log APIs.

Development Tools

  • Development Tool Quick Start

    Provides quick-start guidance for the development tools used in PyTorch training, foundation model inference, and operator development.

  • Operator Development Tools

    Describes how to use operator development tools (such as msKPP, msOpGen, msOpST, msSanitizer, msDebug, and msProf).

  • Operator Compilation Tool

    Compiles operators to generate operator binary files.

  • ATC Tool

    Converts a network model into an offline model (.om) supported by Ascend AI processors.

  • AOE Tool

    Implements automatic optimization to make full use of hardware resources and improve network performance.

  • Analysis and Migration Tool

    Migrates PyTorch training scripts to the Ascend NPU with one click.

  • Accuracy Debugging Tool

    Allows you to compare accuracy and locate model accuracy problems.

  • Profiling Tool

    Collects and analyzes profiling data in the training and inference phases.

  • HCCL Performance Tester

    Tests the correctness and performance of HCCL.

  • AMCT Tool

    Compresses models using techniques that include quantization and tensor decomposition.

  • Memory Leak Detection Tool (msLeaks)

    Locates memory problems during model training and inference.

References

  • Troubleshooting

    Describes how to quickly locate and rectify faults.

  • RPing Function Development

    Describes the RDMA-based network detection technology RPing, which is used to send detection packets, record network latency, and collect statistics on packet sending and receiving.

  • Log Reference

    Describes the log content format, and how to view logs and set log levels.

  • Environment Variables

    Describes the environment variables that can be used to build AI applications and services based on CANN.

  • Graph Fusion and UB Fusion Patterns

    Provides some built-in graph fusion and UB fusion patterns of Ascend AI processors. Graph fusion and UB fusion are key methods for improving the performance of the entire network.

  • Communication Matrix

    Describes the open ports and, for each port, the transport layer protocol, authentication mode, and function.