NPU Inference Job
NPU inference jobs in "Typical Scenarios" are classified into the following types:
- Using Volcano as the scheduler: See Basic Process of NPU Inference Jobs Using Volcano as the Scheduler.
- Not using Volcano as the scheduler: See Basic Process of NPU Inference Jobs Not Using Volcano as the Scheduler.
Basic Process of NPU Inference Jobs Using Volcano as the Scheduler
Create an inference job of the Deployment type.
- Deployment resource example
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: infer
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: infers
    template:
      metadata:
        labels:
          app: infers
      spec:
        schedulerName: volcano
        nodeSelector:
          host-arch: huawei-arm
        containers:
        - image: infer:latest
          imagePullPolicy: IfNotPresent
          name: infer
          command: xxxx
          resources:
            requests:
              huawei.com/Ascend310: 1
            limits:
              huawei.com/Ascend310: 1
          volumeMounts:
          - name: ascend-driver
            mountPath: /usr/local/Ascend/driver
        volumes:
        - name: ascend-driver
          hostPath:
            path: /usr/local/Ascend/driver
- Generally, the value of replicas is 1.
- schedulerName must be set to volcano.
- By default, nodeSelector accepts only the key-value pairs configured in the YAML file used to start Volcano, and the host-arch label is required. To add a user-defined selector, see Volcano Scheduling Configuration.
- Set the NPU resource name and quantity in requests and limits to match your environment. View the node details in the Kubernetes cluster to determine which NPU resource types a node provides. The available types include whole devices (for example, Ascend310 or Ascend310P) and NPUs created through virtual instances.
- Currently, only one container in a pod can use NPUs.
- Mount driver-related directories. If neither of the following conditions is met, you must mount the driver-related directories manually.
- When the startup parameter useAscendDocker of the Ascend Device Plugin is set to true and the Ascend Docker Runtime has been installed and takes effect, the driver-related directories installed in /usr/local/Ascend are automatically mounted.
- When the startup parameter useAscendDocker of the Ascend Device Plugin is set to false, the driver-related directories installed in /usr/local/Ascend are automatically mounted.
- Mount the model code paths and add any other required settings, such as environment variables.
- Set the container startup command in the command field of the YAML file.
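The last two notes (model code mounts, environment variables, and the startup command) can be sketched as additions to the pod template of the Deployment example above. This is only an illustrative fragment, not a definitive configuration: the model-code volume, the host path /data/infer/model, the container path /workspace/model, the MODEL_DIR variable, and the infer.py script are all hypothetical names that do not come from the original example. Replace them with the values used in your environment.

```yaml
# Pod template spec only (spec.template.spec of the Deployment).
# Everything marked "hypothetical" is an assumption made for illustration.
spec:
  schedulerName: volcano
  nodeSelector:
    host-arch: huawei-arm
  containers:
  - name: infer
    image: infer:latest
    imagePullPolicy: IfNotPresent
    # Container startup command; the script path is hypothetical.
    command: ["/bin/bash", "-c", "python3 /workspace/model/infer.py"]
    env:
    - name: MODEL_DIR              # hypothetical variable read by the inference script
      value: /workspace/model
    resources:
      requests:
        huawei.com/Ascend310: 1
      limits:
        huawei.com/Ascend310: 1
    volumeMounts:
    - name: ascend-driver
      mountPath: /usr/local/Ascend/driver
    - name: model-code             # hypothetical volume holding model code
      mountPath: /workspace/model
  volumes:
  - name: ascend-driver
    hostPath:
      path: /usr/local/Ascend/driver
  - name: model-code
    hostPath:
      path: /data/infer/model      # hypothetical directory on the host
```

Writing command as a list avoids ambiguity about how the string is split into arguments. The ascend-driver mount is kept here as in the original example. Whether it is needed depends on the useAscendDocker conditions described above.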