Overview

The cluster scheduling components build on Kubernetes, a widely used cluster scheduling system, to support Ascend AI processors (NPUs). They provide NPU resource management, scheduling optimization, and collective communication for distributed training. This reduces the work that deep learning platform vendors must spend on developing underlying resource scheduling software, and lets partners quickly build deep learning platforms based on MindCluster.

This document describes how to use the cluster scheduling components. Before installing and using them, make sure you understand the features of each component, then install the components required by the features you need.

Usage Process

The following figure shows the process of installing and using cluster scheduling components.

Table 1 Usage process

Procedure | Description
Select features. | The cluster scheduling components support multiple features for training jobs and inference jobs. Each feature requires its own set of components, and the configuration of these components varies by feature. Select the features you need; multiple features can be used at the same time.
Install the related components. | After selecting a feature, install the components it requires. Components can be installed manually or by using tools.
Refer to examples. | The examples walk through the entire process of using the features of the cluster scheduling components, for both training jobs and inference jobs. They cover the frameworks, models, and script adaptation operations supported by the cluster scheduling components, to help you understand and use each component.

General Disclaimer

  • This document may include third-party information covering products, services, software, components, and data. Huawei does not control or assume any liability for any third-party content, including but not limited to its accuracy, compatibility, reliability, availability, legality, appropriateness, performance, non-infringement, or update status, unless expressly stated otherwise in this document. Huawei does not provide any guarantee or authorization for the third-party content mentioned or referenced in this document.
  • If you need a third-party license, obtain it in an authorized or legal way, unless otherwise specified in this document.