Checking the Running Status

Procedure

Run the following command to check the pod running status:

kubectl get pod --all-namespaces

NAMESPACE        NAME                                       READY   STATUS    RESTARTS   AGE
...
default          resnetinfer1-2-scpr5                      1/1     Running   0          8s
...
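On a cluster with many pods, it can help to list only the pods that are not yet running. The following is a minimal sketch, not part of the product: pods_not_running is a hypothetical helper name, and it assumes the six-column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE).

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): reads the output of
# `kubectl get pod --all-namespaces` on stdin and prints
# "<namespace>/<name> <status>" for every pod whose STATUS is not Running.
# Assumes STATUS is the fourth column, as in the sample output above.
pods_not_running() {
    awk '$1 != "NAMESPACE" && NF >= 6 && $4 != "Running" { print $1 "/" $2, $4 }'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   kubectl get pod --all-namespaces | pods_not_running
```

Empty output means that every listed pod is in the Running state. Pods that have finished (for example, with the Completed status) are also printed, because the filter only compares the STATUS column with Running.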

Run the following command to view details about the node running the inference job:

kubectl describe node <hostname>

Example:

kubectl describe node ubuntu

...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource              Requests     Limits
  --------              --------     ------
  cpu                   4 (2%)       3500m (1%)
  memory                2140Mi (0%)  4040Mi (0%)
  ephemeral-storage     0 (0%)       0 (0%)
  huawei.com/Ascend310  1            1
Events:
  Type    Reason    Age   From                Message
  ----    ------    ----  ----                -------
  Normal  Starting  36m   kube-proxy, ubuntu  Starting kube-proxy.
...

In the command output, find huawei.com/Ascend310 under Allocated resources. Its value increases after the inference job is executed, and the amount of the increase is the number of NPUs used by the inference job. For example, if the value was 0 before the job was executed, the value 1 in the preceding output means that the job uses one NPU.
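The comparison can be scripted by reading the value before and after the job is submitted. The following is a minimal sketch under stated assumptions: npu_allocated is a hypothetical helper name, and it relies on the Allocated resources layout shown in the preceding example (resource name in the first column, Requests in the second).

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl): reads the output of
# `kubectl describe node <hostname>` on stdin and prints the Requests value
# of one resource listed under "Allocated resources".
# $1 is the resource name; it defaults to huawei.com/Ascend310.
npu_allocated() {
    awk -v res="${1:-huawei.com/Ascend310}" '
        /^Allocated resources:/ { in_alloc = 1; next }  # enter the section
        /^[^ \t]/               { in_alloc = 0 }        # any unindented line ends it
        in_alloc && $1 == res   { print $2; exit }      # second column is Requests
    '
}

# Usage against a live cluster (requires kubectl and cluster access):
#   before=$(kubectl describe node ubuntu | npu_allocated)
#   ... submit the inference job ...
#   after=$(kubectl describe node ubuntu | npu_allocated)
#   echo "NPUs used by the job: $((after - before))"
```

The helper prints nothing if the resource is not listed, so an empty value should be treated as a lookup failure and not as zero. Pass a different resource name as the first argument if the node reports one.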

If Atlas inference products are used, huawei.com/Ascend310 in the preceding example is displayed as huawei.com/Ascend310P.
