HCCL Overview
Huawei Collective Communication Library (HCCL) is a high-performance collective communication library based on Ascend hardware. It provides a high-performance and high-reliability communication solution for computing clusters.
Key Functions
- Provides high-performance collective communication and point-to-point communication in single-server and multi-server environments.
- Supports collective communication primitives such as AllReduce, Broadcast, AllGather, ReduceScatter, and AlltoAll, as well as the point-to-point primitives Send and Receive.
- Supports communication algorithms such as Ring, Mesh, and Recursive Halving-Doubling (RHD).
- Supports high-speed communication links such as HCCS, RoCE, and PCIe.
- Supports two execution modes: single-operator and graph.
- Supports custom development of communication operators. Currently, only Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products are supported. For Atlas A2 training products/Atlas A2 inference products, only the Atlas 800I A2 inference server, Atlas 300I A2 inference card, and A200I A2 Box heterogeneous components are supported.
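To make the primitives and algorithms listed above more concrete, the following is a minimal pure-Python simulation of a sum AllReduce performed with the Ring algorithm. It is only an illustrative sketch, not HCCL code: the function name `ring_allreduce`, the list-based "rank buffers", and the requirement that the buffer length be divisible by the rank count are all assumptions made for this example. The two phases correspond to a ReduceScatter followed by an AllGather.

```python
from typing import List


def ring_allreduce(rank_data: List[List[float]]) -> List[List[float]]:
    """Simulate a sum AllReduce over n ranks arranged in a ring.

    rank_data[r] is the input buffer of rank r. The return value holds the
    output buffer of every rank; all of them contain the element-wise sum.
    (Hypothetical helper for illustration, not an HCCL API.)
    """
    n = len(rank_data)
    length = len(rank_data[0])
    if any(len(buf) != length for buf in rank_data):
        raise ValueError("all ranks must hold buffers of the same length")
    if length % n != 0:
        # Simplifying assumption of this sketch: equal-sized chunks.
        raise ValueError("buffer length must be divisible by the rank count")

    chunk = length // n
    bufs = [list(buf) for buf in rank_data]  # work on copies

    def seg(c: int) -> slice:
        """Slice covering chunk index c."""
        return slice(c * chunk, (c + 1) * chunk)

    # Phase 1 (ReduceScatter): in step s, rank r sends chunk (r - s) mod n
    # to its right neighbor, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all outgoing payloads first so the sends of one step
        # behave as if they happened simultaneously.
        msgs = [(r, (r - step) % n) for r in range(n)]
        payloads = [bufs[r][seg(c)] for r, c in msgs]
        for (src, c), payload in zip(msgs, payloads):
            dst = (src + 1) % n
            bufs[dst][seg(c)] = [a + b for a, b in zip(bufs[dst][seg(c)], payload)]

    # Now rank r owns the fully reduced chunk (r + 1) mod n.

    # Phase 2 (AllGather): in step s, rank r forwards chunk (r + 1 - s) mod n
    # to its right neighbor, which overwrites its copy with the reduced data.
    for step in range(n - 1):
        msgs = [(r, (r + 1 - step) % n) for r in range(n)]
        payloads = [bufs[r][seg(c)] for r, c in msgs]
        for (src, c), payload in zip(msgs, payloads):
            dst = (src + 1) % n
            bufs[dst][seg(c)] = payload

    return bufs


if __name__ == "__main__":
    inputs = [[float(r * 10 + i) for i in range(8)] for r in range(4)]
    for rank, out in enumerate(ring_allreduce(inputs)):
        print(rank, out)
```

Each rank sends and receives only one chunk per step, so the per-link traffic stays roughly constant as the number of ranks grows, which is the usual motivation for ring-style algorithms.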
Software Architecture
HCCL is a core component of CANN and provides a high-performance and high-reliability communication solution for NPU clusters. HCCL supports multiple AI frameworks and implements efficient interconnection between multiple Ascend AI processors, as shown in the following figure.

- HCCL: includes built-in communication operators and extended communication operators, and provides external communication operator APIs.
  - Built-in communication operators: basic communication operators provided by HCCL, including collective communication operators and point-to-point communication operators.
  - Extended communication operators: communication operators customized using the APIs provided by the HCOMM library.
- HCOMM library: uses a layered, decoupled design that divides communication capabilities into control plane and data plane capabilities.
  - Control plane: provides topology information query and communication resource management functions.
  - Data plane: provides data movement and computing functions such as local operations, inter-operator synchronization, and communication operations.
The control plane provides communication resources, and the data plane provides the methods for operating on those resources. These APIs let communication operator developers focus on their own operator logic without dealing with the low-level implementation details of the chip.
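The division of labor described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the HCOMM API: the class names `ControlPlane` and `DataPlane`, their methods, and the `custom_reduce` operator are all hypothetical and exist only to show how an extended operator might be composed from resources (control plane) and operations on those resources (data plane).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Channel = Tuple[int, int]  # (source rank, destination rank)


@dataclass
class ControlPlane:
    """Hypothetical control plane: topology query and resource management."""

    rank_count: int
    _queues: Dict[Channel, List[List[float]]] = field(default_factory=dict)

    def ranks(self) -> List[int]:
        """Topology query: list the ranks in the communication domain."""
        return list(range(self.rank_count))

    def open_channel(self, src: int, dst: int) -> Channel:
        """Resource management: create (or reuse) a channel from src to dst."""
        self._queues.setdefault((src, dst), [])
        return (src, dst)

    def queue(self, channel: Channel) -> List[List[float]]:
        """Return the in-memory message queue that backs a channel."""
        return self._queues[channel]


class DataPlane:
    """Hypothetical data plane: operations performed on allocated resources."""

    def __init__(self, control: ControlPlane) -> None:
        self._control = control

    def send(self, channel: Channel, payload: List[float]) -> None:
        """Communication operation: enqueue a copy of the payload."""
        self._control.queue(channel).append(list(payload))

    def recv(self, channel: Channel) -> List[float]:
        """Communication operation: dequeue the oldest message."""
        return self._control.queue(channel).pop(0)

    @staticmethod
    def local_reduce(dst: List[float], src: List[float]) -> None:
        """Local operation: element-wise sum of src into dst."""
        for i, value in enumerate(src):
            dst[i] += value


def custom_reduce(control: ControlPlane, data: DataPlane,
                  buffers: List[List[float]], root: int) -> None:
    """Illustrative extended operator: sum every rank's buffer into the root."""
    for rank in control.ranks():
        if rank == root:
            continue
        channel = control.open_channel(rank, root)
        data.send(channel, buffers[rank])
        data.local_reduce(buffers[root], data.recv(channel))


if __name__ == "__main__":
    control = ControlPlane(rank_count=3)
    data = DataPlane(control)
    buffers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    custom_reduce(control, data, buffers, root=0)
    print(buffers[0])
```

In this sketch the operator never touches the queues directly. It asks the control plane for a channel and then uses only data plane calls, which mirrors the decoupling the text describes.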
Supported Products
- Atlas A3 training products/Atlas A3 inference products
- Atlas A2 training products/Atlas A2 inference products
- Atlas training products
- Atlas inference products