Provides the description, feature principles, and usage of cluster scheduling components, including the installation and deployment of each component, integration/adaptation examples, API reference, and principles of some scheduling solutions.
Adds new sections, including "Best Practices of SGLang Inference Jobs", "Best Practices of vLLM Inference Jobs", and "Appliance Feature Guide".
Deletes the section "Usage" > "Resumable Training Feature Guide" > "Using Resumable Training on a Platform".
For other changes, see MindCluster Cluster Scheduling User Guide.
MindCluster ToolBox User Guide
Provides guidance on bandwidth tests, computing power tests, power consumption tests, log collection, and software package signature verification.
Adds the DSA stress test command. For other changes, see MindCluster ToolBox User Guide.
MindCluster Fault Diagnosis User Guide
Provides guidance for log collection, log cleaning and dumping, and fault diagnosis.
Adds LCN and BMC log cleaning and analysis, inference model-level and instance-level analysis, and cleaning/diagnosis SDK APIs. For other changes, see MindCluster Fault Diagnosis User Guide.