Resource Allocation Constraints

Based on the service model design, an inference job must meet the following requirements:

  • The number of Ascend AI Processors allocated to an inference job cannot exceed the total number of Ascend AI Processors on a node.
  • If the number of Ascend AI Processors allocated to an inference job is less than or equal to 4, the inference job must be scheduled to a single Atlas 300I inference card, that is, all of its processors must be on the same card.
  • Resource allocation must also comply with the other constraints of the open source Volcano scheduler.
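
The first two requirements above can be sketched as a simple pre-check. This is only an illustrative sketch, not the scheduler's actual implementation: the function names, the per-card list of free processors, and the rejection messages are all hypothetical, and the rule that a job must request at least one processor is an added assumption. The remaining Volcano constraints are not modeled here.

```python
from typing import List, Optional

# From the requirement above: a job with <= 4 processors goes on one card.
SINGLE_CARD_LIMIT = 4


def pick_card(requested: int, free_per_card: List[int]) -> Optional[int]:
    """Return the index of the first card that can host the whole job, or None."""
    for index, free in enumerate(free_per_card):
        if free >= requested:
            return index
    return None


def check_allocation(requested: int, node_total: int,
                     free_per_card: List[int]) -> str:
    """Check a request against the node-total and single-card requirements.

    free_per_card is a hypothetical view of the node: the number of
    unallocated processors on each Atlas 300I card.
    """
    # Assumption (not stated in the text): a job requests at least one processor.
    if requested < 1:
        return "rejected: at least one processor must be requested"
    # Requirement 1: cannot exceed the processors on the node.
    if requested > node_total:
        return "rejected: request exceeds the processors on the node"
    # Requirement 2: small jobs must fit entirely on one card.
    if requested <= SINGLE_CARD_LIMIT:
        card = pick_card(requested, free_per_card)
        if card is None:
            return "rejected: no single card has enough free processors"
        return f"ok: schedule on card {card}"
    # The text states no single-card requirement for larger jobs.
    return "ok: no single-card requirement applies"


if __name__ == "__main__":
    print(check_allocation(2, 8, [1, 3]))
    print(check_allocation(4, 8, [3, 3]))
```

In this sketch a request for 4 processors is rejected when the free processors are split 3 and 3 across two cards, even though 6 are free on the node, because the job may not span cards.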