Feature Introduction
This section describes the features supported in the installation and deployment scenarios.
NPU Device Management
Supports NPU device discovery and status monitoring based on the Kubernetes device plug-in mechanism.
NPU Scheduling Optimization
Selects the NPUs for a job based on their physical topology to maximize performance.
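As a rough illustration of topology-based selection, the sketch below picks NPUs so that a request spans as few interconnect groups as possible. The grouping model, the function name pick_npus, and the group_of mapping are assumptions made for this example; the source does not describe the actual scheduling algorithm.

```python
def pick_npus(free, group_of, count):
    """Return `count` free NPU ids spanning as few interconnect groups as possible.

    free:     iterable of free NPU ids
    group_of: hypothetical mapping of NPU id -> interconnect group id
    Returns None when fewer than `count` NPUs are free.
    """
    free = sorted(free)
    if count > len(free):
        return None

    # Bucket the free NPUs by their (assumed) interconnect group.
    by_group = {}
    for npu in free:
        by_group.setdefault(group_of[npu], []).append(npu)

    # Best fit: the smallest single group that can hold the whole request,
    # which leaves larger groups intact for larger jobs.
    fitting = [g for g in by_group.values() if len(g) >= count]
    if fitting:
        return min(fitting, key=len)[:count]

    # No single group fits: fill from the largest groups first so the
    # request crosses as few group boundaries as possible.
    chosen = []
    for group in sorted(by_group.values(), key=len, reverse=True):
        chosen.extend(group[: count - len(chosen)])
        if len(chosen) == count:
            break
    return chosen


# Example: 8 NPUs in two groups of 4, with NPUs 0, 3, and 7 already in use.
group_of = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}
free = {1, 2, 4, 5, 6}
print(pick_npus(free, group_of, 2))
print(pick_npus(free, group_of, 3))
```

In this sketch a two-NPU request is placed in the group with fewer free devices, so the larger group stays available for a three-NPU request.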
Resumable Training
When an NPU or a server becomes faulty, the training job is automatically rescheduled to a healthy NPU or node, where training continues.
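The sketch below shows one common way a rescheduled job can continue rather than restart: the training loop saves a checkpoint after each step and loads the latest one on startup. Checkpoint-based resumption, the JSON file format, and all function names here are assumptions for illustration; the source does not state how the job state is preserved.

```python
import json
import os


def save_checkpoint(path, step, state):
    """Write the checkpoint atomically so a crash never leaves a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return (step, state) from the checkpoint, or a fresh start if none exists."""
    if not os.path.exists(path):
        return 0, 0.0
    with open(path) as f:
        data = json.load(f)
    return data["step"], data["state"]


def train(path, total_steps, fail_at=None):
    """Toy training loop; `fail_at` simulates a device fault at that step."""
    step, state = load_checkpoint(path)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated NPU fault")
        state += 1.0  # stand-in for one real training step
        step += 1
        save_checkpoint(path, step, state)
    return step, state
```

If the first run fails at step 3, a second call to train with the same checkpoint path (as would happen after rescheduling to a healthy node with shared storage) picks up at step 3 instead of step 0.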