Pod Remains in the Terminating State After vcjob Is Manually Deleted

Symptom

After a vcjob is deleted by running kubectl delete -f xxx.yaml, its pod remains in the Terminating state.
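To confirm the symptom, the pods stuck in this state can be listed from the kubectl output. The following is a minimal sketch rather than part of the original procedure. The function name terminating_pods is hypothetical, and the sketch assumes the default column layout of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# Hypothetical helper (not a kubectl feature): read `kubectl get pods -A`
# output on stdin and print "namespace/name" for each pod whose STATUS
# column (assumed to be the fourth field) is Terminating.
terminating_pods() {
    awk '$4 == "Terminating" { print $1 "/" $2 }'
}

# Usage sketch:
#   kubectl get pods -A | terminating_pods
```

If the pod is still listed a few minutes after the vcjob was deleted, continue with one of the methods below.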

Possible Causes

None

Method 1: Unmounting the NFS Mounting Paths of the Pod

  1. Run the following command to check the NFS mounting paths of the pod:

    mount|grep NFS share IP address

    Figure 1 Query result

    As shown in the figure, xxx.xxx.xxx.xxx:/data/k8s/run and xxx.xxx.xxx.xxx:/data/k8s/dls_data/public/dataset/resnet50 are the NFS mounting paths of the pod.

  2. Run the following command to unmount each NFS mounting path of the pod:

    umount -f NFS mounting path

  3. Run the following command to check whether the NFS mounting paths of the pod have been unmounted:

    mount|grep NFS share IP address
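
The three steps above can be sketched as a small script. This is an illustrative sketch, not a definitive implementation. The function name nfs_mount_points is hypothetical, and the sketch assumes the usual Linux mount output format, "server:/export on /mount/point type nfs4 (...)". It prints the local mount point (the third field) of each mount served from the given address, so that each one can be passed to umount -f.

```shell
#!/bin/sh
# Hypothetical helper: read `mount` output on stdin and print the local mount
# point of every entry whose source starts with "<NFS share IP address>:".
# index() does a literal prefix match, so the dots in the IP address are not
# treated as regular-expression wildcards.
nfs_mount_points() {
    awk -v prefix="$1:" 'index($1, prefix) == 1 && $2 == "on" { print $3 }'
}

# Usage sketch (NFS_IP is a placeholder for the NFS share IP address):
#   mount | nfs_mount_points "$NFS_IP" | while read -r mp; do
#       umount -f "$mp"
#   done
#   mount | nfs_mount_points "$NFS_IP"   # prints nothing once all are unmounted
```

Mount points that contain spaces are not handled by this sketch.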

Method 2: Deleting the Docker Process to Which the Pod Belongs

  1. Run the following command to query the Docker process to which the pod belongs:

    docker ps |grep pod name

  2. Run the following command to check the files occupied by the Docker process:

    ll /var/lib/docker/containers |grep Docker process ID

    The following is an example of the command result:

    root@ubuntu:/data/k8s/run# ll /var/lib/docker/containers |grep 95aeeafe2db8
    drwx------ 4 root root 4096 Jun 24 16:00 95aeeafe2db898065094dd34dbfbeca04734d5248316aa802d43a36b4d8b99df/

  3. Run the following command to delete the files occupied by the Docker process:

    rm -rf /var/lib/docker/containers/95aeeafe2db898065094dd34dbfbeca04734d5248316aa802d43a36b4d8b99df/

  4. Run the following command to query the ID of the Docker process that occupies the files:

    lsof |grep 95aeeafe2db8

    Figure 2 Query result

  5. Run the following command to stop the process:

    kill -9 PID

  6. Run the following command to check whether the process has been stopped:

    ps -ef | grep PID

    • If yes, go to 7.
    • If no, query and stop the process again. For details, see 4 and 5.

  7. Run the following command to delete the Docker container to which the pod belongs:

    docker rm 95aeeafe2db8

    After the container is deleted, wait about 1 minute and then check the pod status again.
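
The lookups in steps 1 and 4 can be sketched as two small filters. This is a hedged sketch, not a definitive implementation. The function names container_ids_for_pod and pids_holding are hypothetical. The sketch assumes that the container ID is the first column of the docker ps output and that the PID is the second column of the lsof output.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and are not part of
# Docker or lsof.

# Read `docker ps` output on stdin and print the container ID (first column)
# of every line that mentions the pod name given as $1. The header line,
# which starts with "CONTAINER", is skipped.
container_ids_for_pod() {
    awk -v pod="$1" '$1 != "CONTAINER" && index($0, pod) > 0 { print $1 }'
}

# Read `lsof` output on stdin and print the unique PIDs (second column) of
# every line that mentions the container ID given as $1. The numeric check
# skips the header line.
pids_holding() {
    awk -v id="$1" 'index($0, id) > 0 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -un
}

# Usage sketch:
#   docker ps | container_ids_for_pod "pod name"
#   lsof | pids_holding 95aeeafe2db8
```

Each PID printed by pids_holding can then be stopped with kill -9 as in step 5. Command names that contain spaces would shift the lsof columns, which this sketch does not handle.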