Specifying the Processor Scheduling Policy for an Inference Job

When Volcano is the scheduler for inference jobs, you can specify the scheduling policy of the processor. To do so, set the scheduler and the policy label in the job YAML file. Table 1 describes the parameters and the values to use for each scheduling type.
Table 1 Parameters in the YAML file

Parameter: npu-310-strategy
Value:
  • card: scheduling by device. A single request can ask for at most 4 chips, and all chips of a request are scheduled to the same device.
  • chip: scheduling by chip. The number of requested chips cannot exceed the maximum value supported by a single node.
Description: -

Parameter: schedulerName
Value: volcano
Description: Before switching the scheduler, release all previously scheduled jobs.

The following uses infer-deploy.yaml as an example to describe how to set parameters.

apiVersion: apps/v1
kind: Deployment
...
spec:
...
  template:
    metadata:
      labels:
        app: infers
        npu-310-strategy: card
    spec:
      schedulerName: volcano
...
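
For comparison, the following is a minimal sketch of the same Deployment using chip-level scheduling. Only the npu-310-strategy label and schedulerName come from Table 1. The Deployment name, container name, image, and replica count are hypothetical placeholders, and the resource name huawei.com/Ascend310 is an assumption about how the device plugin in your cluster exposes the chips, so check it against your environment before use.

```yaml
# Sketch only: chip-level scheduling variant of infer-deploy.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: infers-chip                  # hypothetical name
spec:
  replicas: 1                        # hypothetical value
  selector:
    matchLabels:
      app: infers
  template:
    metadata:
      labels:
        app: infers
        npu-310-strategy: chip       # schedule by chip instead of by device
    spec:
      schedulerName: volcano         # required for the policy label to take effect
      containers:
      - name: infer                  # hypothetical container name
        image: infer-image:latest    # hypothetical image
        resources:
          requests:
            huawei.com/Ascend310: 2  # assumed resource name; requested chip count
          limits:
            huawei.com/Ascend310: 2  # extended resources need requests equal to limits
```

With card, the requested chip count would have to stay at 4 or below so that all chips fit on one device. With chip, the limit is the number of chips available on a single node.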