Training
The components installed in training scenarios correspond to those in installation and deployment scenarios.
Installation Scenario |
Component |
Job Type |
Operations to Be Performed in a Deployment Job |
Description |
|---|---|---|---|---|
Full deployment scenario |
Ascend Docker Runtime Ascend Device Plugin Volcano HCCL-Controller NodeD NPU-Exporter Resilience-Controller Elastic-Agent |
|
Familiarize yourself with the basic process described in NPU Training Job. Then, try the subsequent sections, for example, delivering NPU training jobs using the CLI or by programming. Finally, integrate advanced features. |
For details about the functions provided in each scenario, see the description during installation and deployment. |
Cluster scheduling scenario |
Ascend Docker Runtime Ascend Device Plugin Volcano (Optional) HCCL-Controller (Optional) NodeD (Optional) NPU-Exporter |
|||
Device management scenario |
Ascend Docker Runtime Ascend Device Plugin (The startup parameter volcanoType is set to false.) (Optional) NPU-Exporter |
|