Introduction

MindX DL cluster scheduling is built on Kubernetes, a widely used cluster scheduling system, and adds support for Ascend AI Processors (referred to as NPUs in this document). It manages and checks NPU resources, tunes and schedules NPUs, and generates collective communication configurations for distributed training. This reduces the effort that deep learning platform vendors spend developing underlying resource scheduling software, and it lets partners quickly build deep learning platforms based on MindX DL.