Computing Power Allocation Precautions for the Atlas 300I Pro Inference Card

  • When deploying a pod application, you can allocate NPU resources to its containers. If the NPU resources are divided into vNPUs, each container can be allocated only one vNPU. If more than one vNPU is allocated to a container, the deployment fails and the following information is displayed:
    FAILED:container [container-0] vNPU=[2] exceed max value:[1],each container can only allocate [1] vNPU

    The error indicates that two vNPUs were allocated to container-0, which exceeds the limit of one vNPU per container.

  • If the NPU resources are not divided into vNPUs, multiple NPUs can be allocated to one container.
  • Currently, NPU computing power allocation supports only static allocation: NPU resources must be allocated before AtlasEdge is started. If NPU resources are allocated or deallocated while AtlasEdge is running, AtlasEdge cannot identify the resources.
  • Currently, inference cards of different types cannot be used together.
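
The per-container limit described in the precautions above can be checked before deployment. The following Python snippet is a minimal sketch of such a check, not AtlasEdge code: `ContainerRequest`, `check_allocation`, and the `divided_into_vnpus` flag are hypothetical names introduced for illustration, and the message text only imitates the format of the error shown above.

```python
from dataclasses import dataclass

# Limit taken from the error message in this section: one vNPU per container.
MAX_VNPU_PER_CONTAINER = 1


@dataclass
class ContainerRequest:
    """Hypothetical description of one container's NPU request."""
    name: str
    npu_count: int  # number of NPUs, or vNPUs when the card is divided


def check_allocation(containers, divided_into_vnpus):
    """Return a list of error strings; an empty list means the request passes.

    Assumption: the caller already knows whether the NPU resources
    have been divided into vNPUs.
    """
    errors = []
    if not divided_into_vnpus:
        # Undivided NPUs: several may be allocated to one container,
        # so this check has nothing to reject.
        return errors
    for c in containers:
        if c.npu_count > MAX_VNPU_PER_CONTAINER:
            errors.append(
                f"FAILED:container [{c.name}] vNPU=[{c.npu_count}] "
                f"exceed max value:[{MAX_VNPU_PER_CONTAINER}],"
                f"each container can only allocate [{MAX_VNPU_PER_CONTAINER}] vNPU"
            )
    return errors


if __name__ == "__main__":
    pod = [ContainerRequest("container-0", 2)]
    for message in check_allocation(pod, divided_into_vnpus=True):
        print(message)
```

With divided resources, a request for two vNPUs in container-0 is rejected. The same request with undivided resources passes, because multiple NPUs can then be allocated to one container.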