Volcano Works Abnormally, and "Failed to get plugin" Is Displayed in the Log
Symptom
The volcano-scheduler-xxxx pod is in the Running state, but scheduling does not work properly. The volcano-scheduler log contains the following error:
E1026 10:55:44.995088 1 framework.go:38] Failed to get plugin volcano-npu_v{version}_linux-aarch64
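To confirm which plugin name the scheduler failed to load, the name can be pulled out of the saved log text. The following is a minimal sketch, assuming the log has already been saved as text and that each error line has the form shown above; the helper name failed_plugins is hypothetical and is not part of Volcano.

```python
import re

# Captures the plugin name that follows the error text shown in the log.
LOG_PATTERN = re.compile(r"Failed to get plugin (\S+)")


def failed_plugins(log_text):
    """Return the distinct plugin names reported as failed, in first-seen order."""
    seen = []
    for name in LOG_PATTERN.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Sample line copied from the symptom description.
sample = (
    "E1026 10:55:44.995088 1 framework.go:38] "
    "Failed to get plugin volcano-npu_v{version}_linux-aarch64"
)
print(failed_plugins(sample))
```

The extracted name is the value to compare against the plugin name in the scheduler configuration and the .so file name in the image.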
Cause Analysis
The name of the scheduling plugin to be used is specified in the Volcano startup YAML file.
...
# Source: volcano/templates/scheduler.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: volcano-npu_v{version}_linux-aarch64 # Name of the scheduling plugin
    - plugins:
      - name: drf
      - name: predicates
...
When the image is built, the scheduling plugin .so file in the current directory is copied into the container for volcano-scheduler to load.
FROM alpine:latest
COPY vc-scheduler /vc-scheduler
COPY volcano-npu_*.so plugins/
...
If the name of the scheduling plugin .so file copied into the container differs from the plugin name configured in the YAML file, volcano-scheduler cannot load the plugin and the "Failed to get plugin" error is logged.
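The name comparison described above can be checked before rebuilding the image. The following is a minimal sketch under stated assumptions: the plugin file is expected to be named <plugin name>.so, only plugin names starting with the prefix volcano-npu are checked (the other plugins in the configuration are not shipped as .so files in this example), and the version v1.0 in the sample data is made up for illustration. The helper names configured_plugins and missing_plugins are hypothetical.

```python
import re
from pathlib import Path

# Matches "- name: <plugin>" entries, ignoring any trailing "# comment".
PLUGIN_LINE = re.compile(r"^\s*-\s*name:\s*([^\s#]+)", re.MULTILINE)


def configured_plugins(conf_text, prefix="volcano-npu"):
    """Return plugin names in the scheduler config that start with prefix."""
    return [n for n in PLUGIN_LINE.findall(conf_text) if n.startswith(prefix)]


def missing_plugins(conf_text, so_files, prefix="volcano-npu"):
    """Return configured plugin names that have no '<name>.so' in so_files."""
    available = {Path(f).stem for f in so_files if f.endswith(".so")}
    return [n for n in configured_plugins(conf_text, prefix) if n not in available]


# Illustrative data only: the version string v1.0 is an assumption.
conf = """
tiers:
- plugins:
  - name: priority
  - name: volcano-npu_v1.0_linux-aarch64 # Name of the scheduling plugin
"""
print(missing_plugins(conf, ["plugins/volcano-npu_v1.0_linux-aarch64.so"]))
print(missing_plugins(conf, ["plugins/volcano-npu_v2.0_linux-aarch64.so"]))
```

An empty list means every checked plugin name has a matching .so file. A non-empty list names the plugins that would trigger the error.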
Solution
- Rebuild the volcano-scheduler image using a YAML file and a scheduling plugin .so file whose plugin names match.
- Uninstall Volcano and reinstall it.