Use on the CLI (Other Schedulers)

Using other schedulers on the CLI follows the same process as using Volcano; only the required job YAML files differ. Prepare and use the corresponding YAML files by referring to Use on the CLI (Volcano).

Procedure

  1. Download YAML files from the cluster scheduling code repository.
    Table 1 YAML files of different hardware models

    | Job Type | Hardware Model | YAML File Name | How to Obtain |
    |---|---|---|---|
    | Jobs of Kubernetes or other schedulers | Atlas 200I SoC A1 core board | infer-310p-1usoc.yaml | Click here. |
    | Jobs of Kubernetes or other schedulers | Inference nodes of other types | infer.yaml | Click here. |

  2. Upload the YAML file to any directory on the management node and modify the file content as required.
    Table 2 Parameters in the YAML file

    image
      Value: -
      Description: Inference image name. Change it based on your actual requirements. (It is the name of the image created in the image preparation section.)

    replicas
      Value: Integer
      Description: Number of job replicas. Generally, the value is 1.

    requests and limits
      Value:
      • Inference server (equipped with Atlas 300I inference cards): huawei.com/Ascend310: number of processors
      • Atlas inference product in non-mixed insertion mode: huawei.com/Ascend310P: number of processors
      • Atlas inference product in mixed insertion mode:
        • huawei.com/Ascend310P-V: number of processors
        • huawei.com/Ascend310P-VPro: number of processors
        • huawei.com/Ascend310P-IPro: number of processors
      • Atlas 800I A2 inference server/A200I A2 Box heterogeneous component/Atlas 800I A3 SuperPoD Server: huawei.com/Ascend910: number of processors
      Example: huawei.com/Ascend310: 1
      Description: Type and number of requested NPUs. Change them as required. For requests and limits, the processor name and quantity must be the same.

    (Optional) host-arch
      Value:
      • ARM environment: huawei-arm
      • x86_64 environment: huawei-x86
      Description: Architecture of the node where an inference job is executed. Set this parameter as required. The Atlas 200I SoC A1 core board supports only huawei-arm.

    servertype
      Value: soc
      Description: Server type.
      • To schedule jobs to the Atlas 200I SoC A1 core board, add this parameter and mount the directory by referring to the infer-310p-1usoc.yaml file.
      • This parameter is not required for other types of nodes.
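    As a minimal sketch of how the image, requests, and limits parameters in Table 2 fit together, the following container fragment requests one processor in mixed insertion mode. The container name infer-container is a hypothetical placeholder, and huawei.com/Ascend310P-V is only one of the three mixed insertion resource names listed in the table; substitute the name that matches the cards on your nodes.

    ```yaml
    containers:
    - name: infer-container            # Hypothetical container name, for illustration only.
      image: ubuntu-infer:v1           # Image name as used in the examples of this section; change it as required.
      resources:
        requests:
          huawei.com/Ascend310P-V: 1   # Assumed resource type; Ascend310P-VPro and Ascend310P-IPro are the other options.
        limits:
          huawei.com/Ascend310P-V: 1   # Must match requests in both processor name and quantity.
    ```

    Keeping the same key and value under requests and limits follows the rule stated in the table description.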
  3. Select a YAML example as required and modify the file as follows.
    • The following uses infer.yaml as an example to describe how to create a single-processor inference job in non-mixed insertion mode of the Atlas inference product (non-Atlas 200I SoC A1 core board).
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: resnetinfer1-1
      spec:
        template:
          spec:
            nodeSelector:
              host-arch: huawei-arm    # (Optional) Set it as required.
            affinity:        # The job is not scheduled to the Atlas 200I SoC A1 core board.
              nodeAffinity:
                requiredDuringSchedulingIgnoredDuringExecution:
                  nodeSelectorTerms:
                    - matchExpressions:
                        - key: servertype
                          operator: NotIn
                          values:
                            - soc
            containers:
            - image: ubuntu-infer:v1
      ...
              resources:
                requests:
                  huawei.com/Ascend310P: 1
                limits:
                  huawei.com/Ascend310P: 1
      ...
    • The following uses infer-310p-1usoc.yaml as an example to describe how to create a single-processor inference job on the Atlas 200I SoC A1 core board (non-mixed insertion mode).
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: resnetinfer1-1-1usoc
      spec:
        template:
          spec:
            nodeSelector:
              host-arch: huawei-arm    # (Optional) Set it as required.
              servertype: soc               # The job is scheduled only to the Atlas 200I SoC A1 core board.
            containers:
            - image: ubuntu-infer:v1
      ...
              resources:
                requests:
                  huawei.com/Ascend310P: 1
                limits:
                  huawei.com/Ascend310P: 1
      ...

      The directories and files mounted on Atlas 200I SoC A1 core board nodes differ from those mounted on other types of nodes. To avoid inference failures, if a job requires Atlas inference products and the cluster contains Atlas 200I SoC A1 core board nodes that you do not want the job scheduled to, add the affinity field to the example YAML file. This prevents jobs from being scheduled to nodes with the servertype=soc label.

    • Refer to this configuration when using the full NPU scheduling feature. The following uses infer.yaml as an example to describe how to create a single-processor inference job on the Atlas 800I A2 inference server.
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: resnetinfer1-1
      spec:
        template:
          spec:
            nodeSelector:
              host-arch: huawei-arm   # (Optional) Set it as required.
      ...
            containers:
            - image: ubuntu-infer:v1
      ...
              resources:
                requests:
                  huawei.com/Ascend910: 1
                limits:
                  huawei.com/Ascend910: 1
      ...
    • The following uses infer.yaml as an example to describe how to create a single-processor inference job using vNPUs of the Atlas inference product (non-Atlas 200I SoC A1 core board).
      apiVersion: batch/v1
      kind: Job
      metadata:
        name: resnetinfer1-1
      spec:
        template:
          spec:
            nodeSelector:
              host-arch: huawei-arm    # (Optional) Set it as required.
            containers:
            - image: ubuntu-infer:v1
      ...
              resources:
                requests:
                  huawei.com/Ascend310P-2c: 1
                limits:
                  huawei.com/Ascend310P-2c: 1
      ...