Resource Allocation Constraints

Based on the service model design, an inference job must meet the following requirements:

  • The number of Ascend AI processors allocated to an inference job cannot be greater than the total number of Ascend AI processors on a node.
  • If an inference job is allocated one or two Ascend AI processors, it must be scheduled onto a single Atlas 300I Duo inference card.
  • When distributed inference is enabled, all replicas of a job must be deployed on the same node, and the total number of Ascend AI processors allocated to the job cannot be greater than the total number of Ascend AI processors on that node.
  • Resource allocation must also comply with the other constraints imposed by the open source Volcano scheduler.
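
The first three requirements above can be checked before a job is submitted. The following is a minimal sketch in Python, not part of any Ascend or Volcano API: the InferenceJob class, its field names, and both helper functions are hypothetical and exist only for illustration. It also assumes that, for distributed inference, the "total number of allocated Ascend AI processors" is the per-replica count multiplied by the number of replicas. Volcano's own constraints are not modeled here.

```python
from dataclasses import dataclass

# Hypothetical threshold taken from the second requirement: jobs with at
# most this many processors must fit on one Atlas 300I Duo card.
SINGLE_CARD_MAX_PROCESSORS = 2


@dataclass
class InferenceJob:
    """Hypothetical description of an inference job's resource request."""
    processors_per_replica: int
    replicas: int = 1
    distributed: bool = False


def must_fit_on_one_card(job: InferenceJob) -> bool:
    """True if the job's processors must all come from a single card."""
    return job.processors_per_replica <= SINGLE_CARD_MAX_PROCESSORS


def check_allocation(job: InferenceJob, node_processors: int) -> list[str]:
    """Return the violated requirements; an empty list means none found."""
    violations = []
    # Requirement 1: a job cannot request more processors than a node has.
    if job.processors_per_replica > node_processors:
        violations.append("allocation exceeds node total")
    # Requirement 3 (assumed interpretation): with distributed inference,
    # all replicas share one node, so their combined request must fit.
    if job.distributed:
        total = job.processors_per_replica * job.replicas
        if total > node_processors:
            violations.append("replica total exceeds node total")
    return violations
```

For example, on a node with 8 processors, a distributed job with 3 replicas of 4 processors each would be rejected by this check because the replicas together need 12 processors, while a non-distributed job requesting 2 processors passes and is flagged by must_fit_on_one_card as needing a single card.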