Checking the Running Status
Procedure
- Run the following command to check the pod running status:
kubectl get pod --all-namespaces
NAMESPACE   NAME                   READY   STATUS    RESTARTS   AGE
...
default     resnetinfer1-2-scpr5   1/1     Running   0          8s
...
- Run the following command to view details about the node running the inference job:
kubectl describe node <hostname>
Example:
kubectl describe node ubuntu
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource              Requests     Limits
  --------              --------     ------
  cpu                   4 (2%)       3500m (1%)
  memory                2140Mi (0%)  4040Mi (0%)
  ephemeral-storage     0 (0%)       0 (0%)
  huawei.com/Ascend310  1            1
Events:
  Type    Reason    Age   From                Message
  ----    ------    ----  ----                -------
  Normal  Starting  36m   kube-proxy, ubuntu  Starting kube-proxy.
...
In the command output, locate huawei.com/Ascend310 under Allocated resources. This value increases once the inference job is running, and the size of the increase is the number of NPUs that the inference job uses.
If Atlas inference products are used, Ascend310 in the preceding example is displayed as Ascend310P.
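The two checks above can also be scripted. The following is a minimal sketch, not part of the product tooling: pod_status and allocated_npus are hypothetical helper names introduced here for illustration, and both assume the column layout shown in the example outputs above. They only filter text piped in from kubectl and do not call any additional kubectl options.

```shell
#!/bin/sh
# Hypothetical helper: print "NAME STATUS" for every pod whose name starts
# with the given prefix. Reads `kubectl get pod --all-namespaces` on stdin.
# Assumes the columns NAMESPACE NAME READY STATUS RESTARTS AGE.
pod_status() {
    # $1: pod name prefix, for example resnetinfer1-2
    awk -v prefix="$1" 'NR > 1 && index($2, prefix) == 1 { print $2, $4 }'
}

# Hypothetical helper: print the Requests value of one resource from the
# "Allocated resources" block. Reads `kubectl describe node` on stdin.
allocated_npus() {
    # $1: exact resource name, for example huawei.com/Ascend310
    awk -v res="$1" '
        /^Allocated resources:/ { in_block = 1; next }  # block starts
        /^Events:/              { in_block = 0 }        # block ends
        in_block && $1 == res   { print $2 }            # Requests column
    '
}

# Example usage (host name and pod prefix taken from the examples above):
#   kubectl get pod --all-namespaces | pod_status resnetinfer1-2
#   kubectl describe node ubuntu | allocated_npus huawei.com/Ascend310
```

To obtain the number of NPUs used by a job, run allocated_npus once before the job starts and once while it is running, then subtract the first value from the second. On Atlas inference products, pass the resource name with Ascend310P in place of Ascend310, because the helper matches the resource name exactly.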