Introduction
The unilateral communication library provides reliable and efficient point-to-point data transfer for cluster scenarios through simple APIs, bridging multiple AI applications and transmission links. Typical use cases include large language model (LLM) Prefill-Decode (PD) disaggregation, parameter switching in RL post-training, and model parameter caching.
This document is the development guide for the unilateral communication library. It shows developers how to use the library's APIs to implement inter-cluster data transmission and build a disaggregated framework for LLM inference.
| Manual | Introduction |
|---|---|
| Unilateral Communication Library Development Guide (C++) | Describes the Huawei Xfer Library (HIXL) APIs for C++, covering link management, memory management, and data transmission. In distributed memory pool scenarios, HIXL provides a pure transmission capability based on local and remote addresses; D2D, D2H, and H2D transmission are supported. Also describes how to perform link management and KV cache management via the C++-based LLM-DataDist APIs. This scenario supports unilateral link establishment, and Decode and Prompt can pull and push the KV cache in both directions. |
| Unilateral Communication Library Development Guide (Python) | Describes how to perform link management and KV cache management via the Python-based LLM-DataDist APIs in CacheManager mode. This scenario supports both unilateral and bilateral link establishment; in bilateral mode, all LLM-DataDist instances involved in communication initiate link establishment simultaneously. Decode and Prompt can pull and push the KV cache in both directions. D2D, D2H, and H2D transmission are supported. |