Before You Start
The MindCluster cluster scheduling components support containerized deployment, rescheduling upon faults, and elastic scaling of MindIE Motor by creating inference jobs of the acjob type.
This section describes only the feature principles and provides configuration examples. The YAML examples cannot be used directly to deploy MindIE jobs. For details about how to deploy MindIE Motor, see the MindIE Motor Development Guide.
Prerequisites
Before deploying MindIE Motor, ensure that the following components have been installed. If they are not installed, install them by referring to Installation and Deployment.
- Volcano
- Ascend Device Plugin
- Ascend Docker Runtime
- Ascend Operator (enableGangScheduling = true)
- ClusterD
- NodeD
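The component check above can be sketched as a small script. This is a minimal sketch, not an official verification tool: it scans a pod listing in the tabular format printed by `kubectl get pods --all-namespaces` and reports which components have no pod in the Running state. The name fragments in `REQUIRED_FRAGMENTS`, the namespaces, and the pod names in `SAMPLE_LISTING` are assumptions made for illustration; the actual names depend on how the components were installed. Ascend Docker Runtime is left out because it is assumed here to be installed on the node rather than run as a pod, and the `enableGangScheduling` setting of Ascend Operator is not checked.

```python
# Hypothetical name fragments used to recognize each component's pods.
# Adjust them to match the pod names in your own cluster.
REQUIRED_FRAGMENTS = {
    "Volcano": "volcano",
    "Ascend Device Plugin": "device-plugin",
    "Ascend Operator": "ascend-operator",
    "ClusterD": "clusterd",
    "NodeD": "noded",
}


def missing_components(pod_listing, required=REQUIRED_FRAGMENTS):
    """Return the components that have no Running pod in the listing."""
    # Keep only the lines that report a Running pod (the header line is dropped too).
    running = [
        line.lower()
        for line in pod_listing.splitlines()
        if "running" in line.lower()
    ]
    return [
        name
        for name, fragment in required.items()
        if not any(fragment in line for line in running)
    ]


# Illustrative listing only; namespaces and pod names are made up.
SAMPLE_LISTING = """\
NAMESPACE        NAME                          READY   STATUS    RESTARTS   AGE
volcano-system   volcano-scheduler-abc         1/1     Running   0          2d
kube-system      ascend-device-plugin-xyz      1/1     Running   0          2d
mindx-dl         ascend-operator-manager-123   1/1     Running   0          2d
mindx-dl         clusterd-456                  0/1     Pending   0          1m
"""

if __name__ == "__main__":
    print("Missing or not running:", missing_components(SAMPLE_LISTING))
```

In practice, the listing would come from the output of `kubectl get pods --all-namespaces` on the management node instead of the sample string. An empty result only means that a matching pod is running; it does not confirm that the component is configured correctly.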
Supported Products
- Atlas 800I A2 inference server
- Atlas 800I A3 SuperPoD Server
Instructions
MindCluster cluster scheduling components support containerized deployment, rescheduling upon faults, and elastic scaling of MindIE Motor in the following ways. This section focuses on the CLI method.
- CLI: Deploy a job through its YAML file.
- Platform integration: Integrate the cluster scheduling components into an existing third-party AI platform, or use an AI platform developed based on the cluster scheduling components.