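When uninstalling Ascend Docker Runtime, users must run the uninstall in step 2 twice, once for each container engine, and specify the corresponding installation path each time with the --install-path parameter.
When uninstalling Ascend Docker Runtime, users only need to run the uninstall in step 2 once. After it completes, manually restore the other engine's daemon.json file to its content before Ascend Docker Runtime was installed.
To keep one of the container engines, reinstall Ascend Docker Runtime for the corresponding scenario after the uninstall.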
kubectl edit cm -n cluster-system pingmesh-config
cd <path to run package>
./Ascend-docker-runtime_{version}_linux-{arch}.run --uninstall
./Ascend-docker-runtime_{version}_linux-{arch}.run --uninstall --install-scene=containerd
Uncompressing ascend-docker-runtime 100%
...
[INFO] Ascend Docker Runtime uninstall success
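The two uninstall commands above differ only in the --install-scene flag. As a minimal sketch, a hypothetical helper named uninstall_args (not part of the run package) could pick the arguments from the engine name. It assumes the plain --uninstall form is the one used for Docker, and it only prints the arguments; the .run file name still has to be supplied by hand.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with Ascend Docker Runtime.
# Prints the uninstall arguments for the given container engine.
# Assumption: the plain "--uninstall" form is the Docker case.
uninstall_args() {
    case "$1" in
        docker)     echo "--uninstall" ;;
        containerd) echo "--uninstall --install-scene=containerd" ;;
        *)          echo "unsupported engine: $1" >&2; return 1 ;;
    esac
}

# Show the arguments for containerd without running the package.
uninstall_args containerd
```

The printed arguments can then be appended to the run package invocation for the chosen engine.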
cd /usr/local/Ascend/Ascend-Docker-Runtime/script
uninstall.sh docker docker <path to daemon.json>
uninstall.sh containerd containerd <path to config.toml>
Output similar to the following indicates that the uninstall succeeded.
[INFO]: You will recover Docker's daemon
...
[INFO] uninstall.sh exec success
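When the uninstall is scripted, the success line in the sample output can be checked automatically. The following is a minimal sketch with a hypothetical function name, uninstall_succeeded. It assumes the success message matches the sample text exactly, which may vary between versions.

```shell
#!/bin/sh
# Hypothetical check, not part of uninstall.sh.
# Reads captured output on stdin and succeeds only if the success
# line from the sample output is present.
uninstall_succeeded() {
    grep -q "uninstall.sh exec success"
}

# Example with a canned line standing in for real script output.
printf '%s\n' "[INFO] uninstall.sh exec success" | uninstall_succeeded && echo "uninstall ok"
```

In practice the real script output would be piped into the function in place of the canned line.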
systemctl daemon-reload && systemctl restart docker
systemctl daemon-reload && systemctl restart containerd
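The cluster scheduling components can be uninstalled, after which users can reinstall the latest version. Uninstall the components one by one and delete the corresponding namespaces, log directories, and configuration files. Choose the uninstall method that matches how the components were installed.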
cd /home/ascend-device-plugin
kubectl delete -f device-plugin-volcano-v{version}.yaml
serviceaccount "ascend-device-plugin-sa-910" deleted
clusterrole.rbac.authorization.k8s.io "pods-node-ascend-device-plugin-role-910" deleted
clusterrolebinding.rbac.authorization.k8s.io "pods-node-ascend-device-plugin-rolebinding-910" deleted
deployment.apps "ascend-device-plugin-daemonset-910" deleted
When Ascend Device Plugin is used together with Volcano, it creates a ConfigMap. Run the following command to delete it.
kubectl delete cm mindx-dl-deviceinfo-<node-name> -n kube-system
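The ConfigMap name contains the node name, so the delete command has to be repeated for each node. A minimal sketch of generating those commands follows. The function name deviceinfo_delete_cmds and the node names worker-1 and worker-2 are placeholders for illustration, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print one delete command per node name given.
# Nothing is executed here; review the output before running it.
deviceinfo_delete_cmds() {
    for node in "$@"; do
        echo "kubectl delete cm mindx-dl-deviceinfo-${node} -n kube-system"
    done
}

# Placeholder node names; substitute the real ones from the cluster.
deviceinfo_delete_cmds worker-1 worker-2
```

Printing first keeps the step reviewable; the output can be piped to sh once the node list has been checked.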
systemctl stop npu-exporter.service
systemctl disable npu-exporter.service
chattr -i /etc/systemd/system/npu-exporter.service
rm -f /etc/systemd/system/npu-exporter.service
systemctl daemon-reload
systemctl reset-failed
chattr -i /usr/local/bin/npu-exporter
rm -f /usr/local/bin/npu-exporter
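The NPU Exporter cleanup above removes two files, the systemd unit and the binary. A minimal sketch of a follow-up check is shown below. The function npu_exporter_leftovers and its optional root-prefix argument are hypothetical and exist only to make the check easy to try against a scratch directory.

```shell
#!/bin/sh
# Hypothetical check: print any npu-exporter files that still exist.
# The optional first argument is a root prefix; leave it empty to
# inspect the real system paths.
npu_exporter_leftovers() {
    root="${1:-}"
    for f in /etc/systemd/system/npu-exporter.service /usr/local/bin/npu-exporter; do
        if [ -e "${root}${f}" ]; then
            echo "${root}${f}"
        fi
    done
}

# Lists whichever of the two files is still present, if any.
npu_exporter_leftovers
```

If a path is still listed, the corresponding chattr -i and rm -f pair did not take effect and can be repeated.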
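Run the following command to delete the namespace that was created when the cluster scheduling components were installed. Deleting a namespace deletes all resources in it, so confirm before running the command.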
kubectl delete ns mindx-dl
namespace "mindx-dl" deleted
rm -rf /var/log/mindx-dl/clusterd
rm -rf /etc/mindx-dl/resilience-controller