kubectl get pod --all-namespaces
NAMESPACE      NAME                                    READY   STATUS    RESTARTS   AGE
...
default        hccl-controller-688c7cb8c6-r4zkm        1/1     Running   0          3d21h
default        resnetinfer1-2-scpr5                    1/1     Running   0          8s
...
kube-system    ascend-device-plugin2-daemonset-8g2hb   1/1     Running   1          4d16h
npu-exporter   npu-exporter-jwq5l                      1/1     Running   0          9h
...
kubectl describe node <hostname>
For example:
kubectl describe node ubuntu
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource              Requests      Limits
  --------              --------      ------
  cpu                   4 (2%)        3500m (1%)
  memory                2140Mi (0%)   4040Mi (0%)
  ephemeral-storage     0 (0%)        0 (0%)
  huawei.com/Ascend310  1             1
Events:
  Type    Reason    Age   From                Message
  ----    ------    ----  ----                -------
  Normal  Starting  36m   kube-proxy, ubuntu  Starting kube-proxy.
...
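In the output, locate huawei.com/Ascend310 under Allocated resources. Once an inference job is running, the numbers after huawei.com/Ascend310 increase by the number of NPU chips that the job uses.
If you are using Atlas inference series products, Ascend310 in the output above is displayed as Ascend310P.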
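The check described above can also be scripted. The following is only a minimal sketch: the helper name npu_allocated is hypothetical (it is not part of kubectl or of the product), and it assumes the describe output has the layout shown in the example, with the Allocated resources heading at the start of a line and its rows indented by spaces. It prints the resource name followed by its Requests and Limits values, and it matches both Ascend310 and Ascend310P.

```shell
# Hypothetical helper (not a kubectl feature): reads `kubectl describe node`
# output on stdin and prints "<resource> <requests> <limits>" for the
# huawei.com/Ascend310* row inside the "Allocated resources" section.
npu_allocated() {
  awk '
    /^Allocated resources:/ { in_alloc = 1; next }  # section starts here
    in_alloc && !/^ /       { in_alloc = 0 }        # an unindented line ends it
    in_alloc && $1 ~ /^huawei\.com\/Ascend310/ { print $1, $2, $3 }
  '
}

# Usage, with the node name "ubuntu" from the example above:
#   kubectl describe node ubuntu | npu_allocated
```

Running it once before submitting the inference job and once after makes the increase easy to compare. Restricting the match to the Allocated resources section matters because the same resource name may also appear in other sections of the node description.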
kubectl logs -f resnetinfer1-2-scpr5