After Volcano Is Manually Installed, the Pod Status Is CrashLoopBackOff

Symptom

After Volcano is manually installed and started, its pods stay in the CrashLoopBackOff state.

The following is an example.

View the logs of the faulty Volcano pod.
  • The error message "permission denied" is displayed.

  • Waiting for streamwatcher.go times out.
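The pod status and logs can be inspected with kubectl. The following is a minimal sketch, assuming that Volcano is deployed in the volcano-system namespace (set NS if your installation uses a different one). The helper function names are hypothetical and exist only for illustration.

```shell
# Assumption: Volcano runs in the volcano-system namespace; override NS if not.
NS="${NS:-volcano-system}"

# List the Volcano pods and their status (look for CrashLoopBackOff).
volcano_pods() {
    kubectl get pods -n "$NS" -o wide
}

# Print the logs of one pod. --previous shows the output of the last
# terminated container, which is usually what a crashing pod needs.
# $1 is a pod name taken from the output of volcano_pods.
volcano_pod_logs() {
    kubectl logs -n "$NS" "$1" --previous
}
```

Run volcano_pods first, then pass the name of a pod in the CrashLoopBackOff state to volcano_pod_logs and search the output for "permission denied".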

Cause Analysis

During manual installation, the owner or permissions of the Volcano log directories were set incorrectly, so the Volcano components cannot write their log files.

Solution

  • Reset the owner, group, and permissions of the log directory for the volcano-controller component.
    chown -R hwMindX:hwMindX /var/log/mindx-dl/volcano-controller
    chmod 750 /var/log/mindx-dl/volcano-controller
    chmod 640 /var/log/mindx-dl/volcano-controller/volcano-controller.log
  • Reset the owner, group, and permissions of the log directory for the volcano-scheduler component.
    chown -R hwMindX:hwMindX /var/log/mindx-dl/volcano-scheduler
    chmod 750 /var/log/mindx-dl/volcano-scheduler
    chmod 640 /var/log/mindx-dl/volcano-scheduler/volcano-scheduler.log
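
To check the result, the owner and mode of the log directories can be printed and compared with the values used in the commands above. This is a sketch that assumes a Linux node with GNU coreutils stat; the function names show_perm and has_mode are hypothetical. The same check can also be run before the fix to confirm the cause.

```shell
# Print "owner:group mode path" for a file or directory (GNU coreutils stat).
show_perm() {
    stat -c '%U:%G %a %n' "$1"
}

# Return 0 if the path has the expected octal mode, non-zero otherwise.
has_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}

# Check both components. Expected modes come from the chmod commands above.
for c in volcano-controller volcano-scheduler; do
    d="/var/log/mindx-dl/$c"
    [ -d "$d" ] || continue   # skip components whose log directory is absent
    show_perm "$d"            # the owner and group should read hwMindX:hwMindX
    has_mode "$d" 750 || echo "unexpected mode on $d"
    has_mode "$d/$c.log" 640 || echo "unexpected mode on $d/$c.log"
done
```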

Wait for the pod to recover, or delete the faulty pod.
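
Deleting the faulty pod can be sketched as follows. This assumes that Volcano runs in the volcano-system namespace and that its pods are managed by a controller such as a Deployment, which creates a replacement after the deletion. The helper function names are hypothetical.

```shell
# Assumption: Volcano runs in the volcano-system namespace; override NS if not.
NS="${NS:-volcano-system}"

# Delete a faulty pod by name ($1). If the pod is managed by a controller
# such as a Deployment, a replacement pod is created automatically.
restart_volcano_pod() {
    kubectl delete pod -n "$NS" "$1"
}

# Watch the pods until the replacement reaches Running (press Ctrl+C to stop).
watch_volcano_pods() {
    kubectl get pods -n "$NS" -w
}
```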