Before You Start

The MindCluster cluster scheduling components support containerized deployment, rescheduling upon faults, and elastic scaling of MindIE Motor by creating inference jobs of the acjob type.

This section covers only the feature principles and configuration examples. The YAML examples provided here cannot be used directly to deploy MindIE jobs. For details about how to deploy MindIE Motor, see the MindIE Motor Development Guide.

Prerequisites

Before deploying MindIE Motor, ensure that the following components have been installed. If any of them is missing, install it as described in Installation and Deployment.
  • Volcano
  • Ascend Device Plugin
  • Ascend Docker Runtime
  • Ascend Operator (with enableGangScheduling set to true)
  • ClusterD
  • NodeD
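
The prerequisite check above can be scripted. The following is a minimal sketch, not an official tool: it assumes that kubectl is available on the management node and that each component's pods can be recognized by a fragment of the pod name. The name fragments in REQUIRED_COMPONENTS and both helper functions are hypothetical and must be adjusted to the actual installation. Ascend Docker Runtime is left out because this sketch assumes it is installed on the nodes rather than run as a pod.

```python
import json
import subprocess

# Hypothetical pod-name fragments for each component (assumptions, not
# taken from the product documentation). Adjust to your installation.
REQUIRED_COMPONENTS = {
    "Volcano": "volcano",
    "Ascend Device Plugin": "device-plugin",
    "Ascend Operator": "ascend-operator",
    "ClusterD": "clusterd",
    "NodeD": "noded",
}


def missing_components(pods, required=REQUIRED_COMPONENTS):
    """Return the components that have no pod in the Running phase.

    `pods` is a list of (pod_name, phase) tuples.
    """
    running = [name.lower() for name, phase in pods if phase == "Running"]
    return [
        component
        for component, fragment in required.items()
        if not any(fragment in name for name in running)
    ]


def list_pods():
    """Collect (name, phase) for all pods using `kubectl get pods -A -o json`."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-A", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [
        (item["metadata"]["name"], item["status"].get("phase", ""))
        for item in json.loads(out)["items"]
    ]
```

Calling missing_components(list_pods()) on a node with cluster access returns the names of components for which no running pod was found. An empty list only suggests that the pods are running; it does not verify settings such as enableGangScheduling.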

Supported Products

  • Atlas 800I A2 inference server
  • Atlas 800I A3 SuperPoD Server

Instructions

The MindCluster cluster scheduling components can be used for containerized deployment, rescheduling upon faults, and elastic scaling of MindIE Motor in either of the following ways. This section focuses on the CLI method.

  • CLI: Deploy a job through its YAML file.
  • Platform integration: Integrate the cluster scheduling components into an existing third-party AI platform, or use an AI platform built on the cluster scheduling components.