Use on the CLI (Other Schedulers)
Using other schedulers on the CLI follows the same procedure as using Volcano; only the required job YAML files differ. Prepare and use the corresponding YAML files by referring to Use on the CLI (Volcano).
Procedure
- Download YAML files from the cluster scheduling code repository.
- Upload the YAML file to any directory on the management node and modify the file content as required.
Table 2 Parameters in the YAML file

Parameter: image
Value: -
Description: Inference image name. Change it based on your actual requirements. (It is the name of the image created in the image preparation section.)

Parameter: replicas
Value: Integer
Description: Number of job replicas. Generally, the value is 1.

Parameter: requests, limits
Value:
- Inference server (equipped with Atlas 300I inference cards): huawei.com/Ascend310: number of processors
- Atlas inference product in non-mixed insertion mode: huawei.com/Ascend310P: number of processors
- Atlas inference product in mixed insertion mode:
  - huawei.com/Ascend310P-V: number of processors
  - huawei.com/Ascend310P-VPro: number of processors
  - huawei.com/Ascend310P-IPro: number of processors
- Atlas 800I A2 inference server/A200I A2 Box heterogeneous component/Atlas 800I A3 SuperPoD Server: huawei.com/Ascend910: number of processors
Example: huawei.com/Ascend310: 1
Description: Type and number of requested NPUs. Change them as required. For requests and limits, the processor name and quantity must be the same.

Parameter: (Optional) host-arch
Value:
- ARM environment: huawei-arm
- x86_64 environment: huawei-x86
Description: Architecture of the node where an inference job is executed. Set this parameter as required. The Atlas 200I SoC A1 core board supports only huawei-arm.

Parameter: servertype
Value: soc
Description: Server type.
- To schedule jobs to the Atlas 200I SoC A1 core board, add this parameter and mount the directory by referring to the infer-310p-1usoc.yaml file.
- This parameter is not required for other types of nodes.
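The rule that requests and limits must carry the same processor name and quantity can be checked before a job is submitted. The following is a minimal Python sketch, not part of the cluster scheduling code repository: the helper names npu_resources and check_npu_resources are hypothetical, and the sketch assumes the resources section of a container has already been loaded into a plain dictionary.

```python
# Hypothetical helpers (not from the cluster scheduling code repository).
# They operate on the "resources" section of one container, represented
# as a plain dict such as {"requests": {...}, "limits": {...}}.

NPU_PREFIX = "huawei.com/"  # prefix shared by the NPU resource names in Table 2


def npu_resources(resource_name, count):
    """Build a resources section whose requests and limits are identical."""
    if not resource_name.startswith(NPU_PREFIX):
        raise ValueError("expected a resource name starting with " + NPU_PREFIX)
    if not isinstance(count, int) or count < 1:
        raise ValueError("count must be a positive integer")
    return {
        "requests": {resource_name: count},
        "limits": {resource_name: count},
    }


def check_npu_resources(resources):
    """Return a list of problems; an empty list means requests and limits agree."""

    def npu_only(section):
        # Keep only the NPU entries; CPU and memory settings are ignored here.
        entries = resources.get(section) or {}
        return {k: v for k, v in entries.items() if k.startswith(NPU_PREFIX)}

    requests = npu_only("requests")
    limits = npu_only("limits")
    problems = []
    if not requests:
        problems.append("no NPU resource found under requests")
    if requests != limits:
        problems.append(
            "requests %r and limits %r differ" % (requests, limits)
        )
    return problems


if __name__ == "__main__":
    good = npu_resources("huawei.com/Ascend310P", 1)
    bad = {
        "requests": {"huawei.com/Ascend310P": 1},
        "limits": {"huawei.com/Ascend310P": 2},
    }
    print(check_npu_resources(good))
    print(check_npu_resources(bad))
```

Building both sections from one name and one count avoids the mismatch in the first place; the check is useful when the YAML file was edited by hand.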
- Select a YAML example as required and modify the file as follows.
Table 3 Operation examples

Feature: Full NPU scheduling
Operation reference:
- Creating a Single-Processor Job on Atlas Inference Products (Non-Atlas 200I SoC A1 Core Board)
- Creating a Single-Processor Job on the Atlas 200I SoC A1 Core Board
- Creating a Single-Processor Job on the Atlas 800I A2 Inference Server

Feature: Static vNPU scheduling
Operation reference:
- Creating a Single-Processor Job on Atlas Inference Products (Non-Atlas 200I SoC A1 Core Board)
- The following uses infer.yaml as an example to describe how to create a single-processor inference job in non-mixed insertion mode of the Atlas inference product (non-Atlas 200I SoC A1 core board).
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: resnetinfer1-1
    spec:
      template:
        spec:
          nodeSelector:
            host-arch: huawei-arm   # (Optional) Set it as required.
          affinity:                 # The job is not scheduled to the Atlas 200I SoC A1 core board.
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                - matchExpressions:
                  - key: servertype
                    operator: NotIn
                    values:
                    - soc
          containers:
          - image: ubuntu-infer:v1
    ...
            resources:
              requests:
                huawei.com/Ascend310P: 1
              limits:
                huawei.com/Ascend310P: 1
    ...

- The following uses infer-310p-1usoc.yaml as an example to describe how to create a single-processor inference job on the Atlas 200I SoC A1 core board (non-mixed insertion mode).
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: resnetinfer1-1-1usoc
    spec:
      template:
        spec:
          nodeSelector:
            host-arch: huawei-arm   # (Optional) Set it as required.
            servertype: soc         # The job is scheduled only to the Atlas 200I SoC A1 core board.
          containers:
          - image: ubuntu-infer:v1
    ...
            resources:
              requests:
                huawei.com/Ascend310P: 1
              limits:
                huawei.com/Ascend310P: 1
    ...
The directories and files that must be mounted on Atlas 200I SoC A1 core board nodes differ from those on other types of nodes. If a job requires Atlas inference products and the cluster contains Atlas 200I SoC A1 core board nodes that you do not want the job scheduled to, add the affinity field to the example YAML file to avoid inference failure. This field prevents jobs from being scheduled to nodes with the servertype=soc label.
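The effect of the nodeSelector and affinity fields in the two examples above can be illustrated with a small Python model of label matching. This is a simplified sketch under stated assumptions, not the scheduler's implementation: the function name matches_node and the sample node labels are hypothetical, only the In and NotIn operators of matchExpressions are modelled, and resource availability, taints, and all other scheduling factors are ignored.

```python
# Simplified, hypothetical model of Kubernetes node label matching.
# Only nodeSelector and required node affinity (In / NotIn) are covered.

def _expression_ok(expr, labels):
    key, values = expr["key"], expr.get("values", [])
    if expr["operator"] == "In":
        return key in labels and labels[key] in values
    if expr["operator"] == "NotIn":
        # A node without the label also satisfies NotIn.
        return key not in labels or labels[key] not in values
    raise ValueError("operator not modelled: %s" % expr["operator"])


def matches_node(pod_spec, node_labels):
    """Return True if the node labels satisfy the pod's node constraints."""
    # nodeSelector: every listed label must be present with the same value.
    for key, value in pod_spec.get("nodeSelector", {}).items():
        if node_labels.get(key) != value:
            return False
    terms = (
        pod_spec.get("affinity", {})
        .get("nodeAffinity", {})
        .get("requiredDuringSchedulingIgnoredDuringExecution", {})
        .get("nodeSelectorTerms", [])
    )
    if not terms:
        return True
    # Terms are ORed; the expressions inside one term are ANDed.
    for term in terms:
        exprs = term.get("matchExpressions", [])
        if exprs and all(_expression_ok(e, node_labels) for e in exprs):
            return True
    return False


# Pod constraints taken from the infer.yaml example (excludes servertype=soc).
INFER_POD_SPEC = {
    "nodeSelector": {"host-arch": "huawei-arm"},
    "affinity": {"nodeAffinity": {"requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [{"matchExpressions": [
            {"key": "servertype", "operator": "NotIn", "values": ["soc"]},
        ]}],
    }}},
}

# Pod constraints taken from the infer-310p-1usoc.yaml example.
SOC_POD_SPEC = {"nodeSelector": {"host-arch": "huawei-arm", "servertype": "soc"}}

if __name__ == "__main__":
    soc_node = {"host-arch": "huawei-arm", "servertype": "soc"}  # sample labels
    card_node = {"host-arch": "huawei-arm"}                      # sample labels
    for name, spec in (("infer.yaml", INFER_POD_SPEC),
                       ("infer-310p-1usoc.yaml", SOC_POD_SPEC)):
        print(name, "soc node:", matches_node(spec, soc_node),
              "other node:", matches_node(spec, card_node))
```

In this model the infer.yaml constraints accept a node that has no servertype label and reject a node labelled servertype=soc, while the infer-310p-1usoc.yaml constraints do the opposite.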
- Refer to this configuration when using the full NPU scheduling feature. The following uses infer.yaml as an example to describe how to create a single-processor inference job on the Atlas 800I A2 inference server.
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: resnetinfer1-1
    spec:
      template:
        spec:
          nodeSelector:
            host-arch: huawei-arm   # (Optional) Set it as required.
    ...
          containers:
          - image: ubuntu-infer:v1
    ...
            resources:
              requests:
                huawei.com/Ascend910: 1
              limits:
                huawei.com/Ascend910: 1
    ...

- The following uses infer.yaml as an example to describe how to create a single-processor inference job using vNPUs of the Atlas inference product (non-Atlas 200I SoC A1 core board).
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: resnetinfer1-1
    spec:
      template:
        spec:
          nodeSelector:
            host-arch: huawei-arm   # (Optional) Set it as required.
          containers:
          - image: ubuntu-infer:v1
    ...
            resources:
              requests:
                huawei.com/Ascend310P-2c: 1
              limits:
                huawei.com/Ascend310P-2c: 1
    ...