Inference Server (with Atlas 300I Duo Inference Cards)
Inference servers equipped with Atlas 300I Duo inference cards support affinity scheduling. An Atlas 800 inference server (model 3000) holds a maximum of four Atlas 300I Duo inference cards, and each card has two Ascend AI processors, for up to eight processors per server. When you submit a job YAML on such a server, use duo to specify the Atlas 300I Duo inference card, npu-310-strategy to specify the scheduling mode, and distributed to specify the scheduling policy. For details about the parameters, see Table 1.
| Parameter | Default Value | Value Description |
|---|---|---|
| duo | false | |
| npu-310-strategy | chip | |
| distributed | false | |
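
As a rough illustration of how the three parameters might appear in a job YAML, the fragment below sets each one to its default value from Table 1. Placing them under metadata.labels is an assumption made for this sketch and is not confirmed by this section; check the job template for your release before relying on it. The values are quoted where needed because Kubernetes label values must be strings.

```yaml
# Hypothetical fragment of a job YAML (not a complete manifest).
# Assumption: the parameters are set as labels on the job.
metadata:
  labels:
    duo: "false"              # default value from Table 1
    npu-310-strategy: chip    # default value from Table 1
    distributed: "false"      # default value from Table 1
```

Because these are the defaults, omitting all three keys should be equivalent to this fragment; set a key explicitly only when the job needs a non-default value.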