Volcano Component Works Abnormally, and "Failed to get plugin volcano-npu_xxx_linux-aarch64" Is Displayed in the Log
Symptom
The Volcano pod volcano-scheduler-xxxx is in the Running state, but scheduling does not work properly. The volcano-scheduler log contains the following error:
E1026 10:55:44.995088 1 framework.go:38] Failed to get plugin volcano-npu_v3.0.RC2_linux-aarch64
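When the scheduler log is long, the affected plug-in names can be extracted with a short script. The following Python sketch assumes the log lines follow the format shown above. The function name missing_plugins is illustrative and is not part of Volcano.

```python
import re

# Matches the error line shown above and captures the plug-in name that follows it.
_PATTERN = re.compile(r"Failed to get plugin (\S+)")


def missing_plugins(log_text):
    """Return the distinct plug-in names reported as missing, in order of first appearance."""
    seen = []
    for match in _PATTERN.finditer(log_text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Sample input: the log line quoted in this topic.
sample = (
    "E1026 10:55:44.995088 1 framework.go:38] "
    "Failed to get plugin volcano-npu_v3.0.RC2_linux-aarch64"
)
print(missing_plugins(sample))
```

The log text itself can be obtained with kubectl logs for the volcano-scheduler pod, for example in the volcano-system namespace used in the configuration in this topic.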
Causes
The name of the scheduling plug-in to be used is specified in the Volcano startup YAML file.
...
# Source: volcano/templates/scheduler.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: volcano-npu_v3.0.RC2_linux-aarch64    # Name of the scheduling plug-in
    - plugins:
      - name: drf
      - name: predicates
...
During image creation, the .so file of the scheduling plug-in in the current directory is copied to the container for volcano-scheduler to use.
FROM alpine:latest
COPY vc-scheduler /vc-scheduler
COPY volcano-npu_*.so plugins/
...
If the name of the scheduling plug-in .so file copied to the container does not match the plug-in name configured in the YAML file, volcano-scheduler cannot load the plug-in and the "Failed to get plugin" error is reported.
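The mismatch can be checked before the image is built. The following Python sketch assumes that a plug-in named X in the configuration must correspond to a file named X.so in the build directory, which is inferred from the example in this topic rather than confirmed against the Volcano source. The function names and the RC1 file name used in the demonstration are hypothetical.

```python
import re
import tempfile
from pathlib import Path


def configured_npu_plugins(conf_text):
    """Collect plug-in names beginning with 'volcano-npu_' from the scheduler configuration text."""
    return re.findall(r"-\s*name:\s*(volcano-npu_\S+)", conf_text)


def unmatched_plugins(conf_text, plugin_dir):
    """Return configured plug-in names that have no <name>.so file in plugin_dir (assumed naming rule)."""
    available = {p.stem for p in Path(plugin_dir).glob("volcano-npu_*.so")}
    return [name for name in configured_npu_plugins(conf_text) if name not in available]


# Demonstration: the configuration names the RC2 plug-in,
# but the build directory only holds a (hypothetical) RC1 file.
conf = """
tiers:
- plugins:
  - name: priority
  - name: volcano-npu_v3.0.RC2_linux-aarch64    # Name of the scheduling plug-in
"""
with tempfile.TemporaryDirectory() as build_dir:
    Path(build_dir, "volcano-npu_v3.0.RC1_linux-aarch64.so").touch()
    result = unmatched_plugins(conf, build_dir)
print(result)
```

An empty list means that every configured volcano-npu plug-in has a matching .so file in the directory. Any name that is returned is a candidate for the error described in this topic.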
Solution
- Rebuild the volcano-scheduler image using a YAML file and a scheduling plug-in .so file that match each other.
- Uninstall and reinstall Volcano.