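After deploying the cluster scheduling components, run kubectl get pods --all-namespaces -o wide to check the status of each component. A Pod is found to be stuck in the ContainerCreating state. The following uses Ascend Device Plugin as an example.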
root@ubuntu:/home# kubectl get pods --all-namespaces -o wide
NAMESPACE   NAME                                    READY   STATUS              RESTARTS   AGE   IP       NODE     NOMINATED NODE   READINESS GATES
default     Ascend Device Plugin-6bc9bccc4c-n6c7w   0/1     ContainerCreating   0          10m   <none>   ubuntu   <none>           <none>
...
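On a cluster with many Pods, the affected rows can be hard to spot in the full listing. The following is a minimal sketch of a filter that keeps the header row plus every row mentioning ContainerCreating. The function name filter_container_creating is hypothetical and is not part of kubectl or of the cluster scheduling components.

```shell
# Hypothetical helper: keep the header row and any row that contains the
# string "ContainerCreating". Reads `kubectl get pods` output on stdin.
filter_container_creating() {
    awk 'NR == 1 || /ContainerCreating/'
}

# Usage (requires a configured kubectl):
#   kubectl get pods --all-namespaces -o wide | filter_container_creating
```

The filter matches on the text of the whole row rather than on a fixed column index, so it behaves the same with or without the --all-namespaces option.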
Run the following command to view the Pod details.
kubectl describe pod -n namespace podname
For example:
kubectl describe pod -n default Ascend Device Plugin-6bc9bccc4c-n6c7w
The following information is displayed:
...
QoS Class:       Guaranteed
Node-Selectors:  masterselector=dls-master-node
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason       Age               From               Message
  ----     ------       ----              ----               -------
  Normal   Scheduled    16s               default-scheduler  Successfully assigned default/Ascend Device Plugin-6bc9bccc4c-n6c7w to ubuntu
  Warning  FailedMount  8s (x5 over 15s)  kubelet, ubuntu    MountVolume.SetUp failed for volume "device-ascenddeviceplugin" : hostPath type check failed: /var/log/mindx-dl/ascend-device-plugin is not a directory
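The log directory of the corresponding component (/var/log/mindx-dl/ascend-device-plugin in this example) does not exist on the node, so the hostPath volume cannot be mounted.
For details about how to create it, see the section "Creating Log Directories".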
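As an illustration of the fix, the following sketch extracts every path that kubelet reported as "is not a directory" from the kubectl describe pod output and creates it with mkdir -p. The function names missing_hostpath_dirs and create_dirs are hypothetical helpers, not part of any product tooling. The sketch assumes it is run on the node the Pod was scheduled to (ubuntu in this example) with permission to write under /var/log. It does not set any directory owner or permissions; if the "Creating Log Directories" section specifies them, that section takes precedence.

```shell
# Hypothetical helper: print each hostPath that kubelet reported as
# "is not a directory". Reads `kubectl describe pod` output on stdin.
missing_hostpath_dirs() {
    sed -n 's/.*hostPath type check failed: \(.*\) is not a directory.*/\1/p' | sort -u
}

# Hypothetical helper: create every directory named on stdin, one per line.
# Writing under /var/log typically requires root.
create_dirs() {
    while IFS= read -r dir; do
        if [ -n "$dir" ]; then
            mkdir -p "$dir"
        fi
    done
}

# Usage (requires a configured kubectl; replace the placeholders):
#   kubectl describe pod -n namespace podname | missing_hostpath_dirs | create_dirs
```

Once the directory exists, the FailedMount warning should no longer appear for that volume, and the Pod status can be checked again with kubectl get pods --all-namespaces -o wide.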