NodeD

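  1. Run the following command to check the NodeD Pods in the K8s cluster. Each Pod must have a STATUS of Running and a READY value of 1/1. If NodeD is installed on multiple nodes in the cluster, check each Pod individually.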

    kubectl get pods -n mindx-dl -o wide

    Example output:

    root@ubuntu:~# kubectl get pods -n mindx-dl -o wide
    NAME                               READY   STATUS    RESTARTS   AGE    IP                NODE         NOMINATED NODE   READINESS GATES
    ...
    noded-bnmwt                        1/1     Running   10         40d    192.168.41.28     ubuntu       <none>           <none>
    ...

  2. Run the following command to view the NodeD component logs.

    kubectl logs -n mindx-dl {NodeD Pod name}

    If the following messages keep appearing in the log, the component is running normally.

    ...
    [INFO]     2022/10/25 17:58:03.358409 1       pkg/heartbeat.go:104    set node heartbeat time 1666691883  # 1666691883 is a timestamp; the actual value will differ
    [INFO]     2022/10/25 17:58:03.364407 1       pkg/heartbeat.go:132    patch node annotations success
    ...

    Example output:

    root@ubuntu-173:~# kubectl logs -n mindx-dl noded-bnmwt
    [INFO]     2022/10/24 18:07:52.740417 1       hwlog@v0.0.0/api.go:91    noded.log's logger init success
    [INFO]     2022/10/24 18:07:52.740996 1       main.go:46    noded starting and the version is v5.0.RC1_linux-x86_64
    [WARN]     2022/10/24 18:07:52.741129 1       K8stool@v0.0.0/self_K8s_client.go:153    Neither --kubeconfig nor --master was specified.Using the inClusterConfig.  This might not work.
    [INFO]     2022/10/24 18:07:52.763703 1       pkg/heartbeat.go:104    set node heartbeat time 1666606072
    [INFO]     2022/10/24 18:07:52.763821 1       pkg/heartbeat.go:114    set node heartbeat interval 5
    [INFO]     2022/10/24 18:07:52.783396 1       pkg/heartbeat.go:132    patch node annotations success
    [INFO]     2022/10/24 18:07:57.784028 1       pkg/heartbeat.go:104    set node heartbeat time 1666606077
    [INFO]     2022/10/24 18:07:57.789906 1       pkg/heartbeat.go:132    patch node annotations success
    ...
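
When NodeD runs on many nodes, the two manual checks above can be scripted. The following is a minimal sketch, not part of the product: the function names `check_noded_pods` and `count_heartbeats` are hypothetical, and it assumes that NodeD Pod names start with `noded-` and that the heartbeat log message reads `patch node annotations success`, as in the example output above. Both functions read from standard input, so they can be tried on saved output before being pointed at a live cluster.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, and the "noded-" name prefix
# and log message text are taken from the example output in this section.

# Reads `kubectl get pods -n mindx-dl -o wide` output on stdin.
# Prints one line per NodeD Pod and returns non-zero if no NodeD Pod is
# found or if any NodeD Pod is not READY 1/1 with STATUS Running.
check_noded_pods() {
  awk '
    $1 ~ /^noded-/ {
      found = 1
      if ($2 == "1/1" && $3 == "Running") {
        print "OK " $1
      } else {
        print "FAIL " $1 " (READY=" $2 ", STATUS=" $3 ")"
        bad = 1
      }
    }
    END { if (!found || bad) exit 1 }
  '
}

# Reads NodeD log text on stdin and prints the number of lines that
# report a successful heartbeat annotation patch.
count_heartbeats() {
  grep -c 'patch node annotations success'
}

# Example usage against a live cluster (replace the Pod name with your own):
#   kubectl get pods -n mindx-dl -o wide | check_noded_pods
#   kubectl logs -n mindx-dl noded-bnmwt | count_heartbeats
```

A heartbeat count that grows between two runs indicates that NodeD is still patching the node annotations. A count of 0, or a count that stops growing, suggests the logs should be inspected manually.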