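When using Volcano v1.7.0, if the current environment does not have enough resources, querying the Pod status with the kubectl get pod --all-namespaces -o wide command fails.
In Volcano v1.7.0, Pods are not created when resources are insufficient, so there is no Pod status to query. Check the status of the PodGroup instead: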
kubectl get pg -A
NAMESPACE   NAME                                                  STATUS    MINMEMBER   RUNNINGS   AGE
vcjob       mindx-xxx-16-p-4bf232e4-bd48-438d-9089-02bfef354fce   Inqueue   1                      5m32s
vcjob       mindx-xxx-2-p-8bf7f0f6-8a7e-4621-a0d0-cafa56785914    Pending   1                      5m15s
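When many jobs are queued, it can help to filter the listing down to the PodGroups that are still Pending. The following is a minimal sketch, not part of Volcano: pending_podgroups is a hypothetical helper name, and it assumes the column order shown above (NAMESPACE, NAME, STATUS as the first three columns, which is what kubectl get pg -A prints here).

```shell
# Hypothetical helper (not a Volcano or kubectl command): reads the output of
# `kubectl get pg -A` on stdin and prints "<namespace> <podgroup-name>" for
# every PodGroup whose STATUS column is Pending.
# Assumes NAMESPACE, NAME and STATUS are the first three columns.
pending_podgroups() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 == "Pending" { print $1, $2 }'
}

# Usage (requires access to the cluster):
#   kubectl get pg -A | pending_podgroups
```

Each printed pair can then be passed to kubectl describe pg to see why that PodGroup is not being scheduled.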
kubectl describe pg -n <namespace> <podgroup-name>
Replace <namespace> and <podgroup-name> with the actual namespace and PodGroup name. For example:
kubectl describe pg -n vcjob mindx-xxx-2-p-8bf7f0f6-8a7e-4621-a0d0-cafa56785914
Name:         mindx-xxx-2-p-8bf7f0f6-8a7e-4621-a0d0-cafa56785914
Namespace:    vcjob
Labels:       fault-scheduling=force
              ring-controller.atlas=ascend-{xxx}b
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"batch.volcano.sh/v1alpha1","kind":"Job","metadata":{"annotations":{},"labels":{"fault-scheduling":"force","ring-controller....
API Version:  scheduling.volcano.sh/v1beta1
Kind:         PodGroup
Metadata:
  Creation Timestamp:  2023-07-05T09:00:02Z
  Generation:          7
  Owner References:
    API Version:           batch.volcano.sh/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Job
    Name:                  mindx-xxx-2-p
    UID:                   8bf7f0f6-8a7e-4621-a0d0-cafa56785914
  Resource Version:        17544644
  Self Link:               /apis/scheduling.volcano.sh/v1beta1/namespaces/vcjob/podgroups/mindx-xxx-2-p-8bf7f0f6-8a7e-4621-a0d0-cafa56785914
  UID:                     277cc974-5eec-455f-a860-25d7d19e8335
Spec:
  Min Member:  1
  Min Resources:
    count/pods:                     1
    huawei.com/Ascend910:           2
    Pods:                           1
    requests.huawei.com/Ascend910:  2
  Min Task Member:
    Default - Test:  1
  Queue:             default
Status:
  Conditions:
    Last Transition Time:  2023-07-05T09:05:46Z
    Message:               1/0 tasks in gang unschedulable: pod group is not ready, 1 minAvailable
    Reason:                NotEnoughResources
    Status:                True
    Transition ID:         33585c5e-d3ad-4bc4-be0c-c09bea59520e
    Type:                  Unschedulable
  Phase:                   Pending
Events:
  Type     Reason         Age                     From     Message
  ----     ------         ----                    ----     -------
  Warning  Unschedulable  6m22s (x12 over 6m34s)  volcano  0/0 tasks in gang unschedulable: pod group is not ready, 1 minAvailable
  Normal   Unschedulable  93s (x280 over 6m34s)   volcano  queue resource quota insufficient    # the queue's resource quota is insufficient
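The describe output is long, and the lines that explain the failure are the Reason and Message fields under Status > Conditions. The sketch below pulls out just those lines. It is an illustration only: podgroup_conditions is a hypothetical helper name, and it assumes the "Reason:" / "Message:" field labels shown in the output above.

```shell
# Hypothetical helper (not a Volcano or kubectl command): reads the output of
# `kubectl describe pg ...` on stdin and prints only the "Message:" and
# "Reason:" lines from the Conditions section, with leading indentation removed.
# Rows of the Events table are not matched, because their first field is the
# event type (for example "Warning"), not a "Reason:" or "Message:" label.
podgroup_conditions() {
  awk '$1 == "Reason:" || $1 == "Message:" { sub(/^[ \t]+/, ""); print }'
}

# Usage (requires access to the cluster):
#   kubectl describe pg -n vcjob mindx-xxx-2-p-8bf7f0f6-8a7e-4621-a0d0-cafa56785914 | podgroup_conditions
```

For the PodGroup above, the extracted Reason is NotEnoughResources, which matches the "queue resource quota insufficient" event.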