Software Packages That Can Be Installed and Upgraded

Table 1 lists software packages that can be installed using MindCluster Ascend Deployer. You can run the bash install.sh --help command to view values supported by --install=<package_name>.

Precautions

  • During the installation, run the date -s command to calibrate the time in the operating environment to the correct UTC time.
  • When you install or upgrade a specified software package, MindCluster Ascend Deployer supports only software versions from the last year.
  • MindCluster Ascend Deployer installs only the basic libraries required for TensorFlow and PyTorch to run properly. Code for complex inference services or model training may depend on service-specific libraries, which you must install yourself.
  • If the GCC version is earlier than 7.3.0, MindCluster Ascend Deployer automatically installs GCC 7.3.0.
  • MindCluster Ascend Deployer requires many dependencies. Install or upgrade the software packages that you upload yourself only after the OS dependencies downloaded by this tool have been installed.
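
The time and GCC precautions can be checked before you start. The following is a minimal sketch, not part of the tool: the helper name version_lt is hypothetical, the sketch assumes GNU sort (for the -V option), and the timestamp in the comment is only a placeholder.

```shell
# Returns success (0) if version $1 is strictly lower than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Show the current system time in UTC so that it can be compared with a trusted clock.
date -u '+%Y-%m-%d %H:%M:%S UTC'
# To correct it (as root), a command of this form can be used (placeholder time):
#   date -u -s "2024-01-01 12:00:00"

# Report whether the installed GCC is older than the 7.3.0 threshold.
if command -v gcc >/dev/null 2>&1; then
    # -dumpfullversion exists in GCC 7 and later; older releases only know -dumpversion.
    gcc_version=$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion)
    if version_lt "$gcc_version" "7.3.0"; then
        echo "GCC $gcc_version is earlier than 7.3.0; GCC 7.3.0 will be installed by the tool"
    else
        echo "GCC $gcc_version meets the 7.3.0 minimum"
    fi
else
    echo "gcc not found"
fi
```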

Software Packages That Can Be Installed and Upgraded

Table 1 Software packages that can be installed and upgraded

Scenario Type | Package Type | Required Parameter | Description

Installation | System component | sys_pkg | In UOS, sys_pkg cannot be specified for installation.

Installation | Python | python

  • Ensure that Python has been installed before installing Python libraries, such as TensorFlow, MindSpore, and PyTorch.
  • By default, Python 3.7.5 is downloaded and installed using MindCluster Ascend Deployer. This document uses Python 3.7.5 as an example.

    If you want to select another Python version (you are not advised to change the default configuration), you can set the environment variable ASCEND_PYTHON_VERSION (for example, running export ASCEND_PYTHON_VERSION=Python-3.7.0) or modify the Python configuration option in the ascend-deployer/ascend_deployer/downloader/config.ini file to specify the Python version to be installed. The available versions are 3.7.0 to 3.7.11, 3.8.0 to 3.8.11, 3.9.0 to 3.9.9, 3.10.0 to 3.10.12, and 3.11.4.
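
If you do override the default, it can help to validate the version against the ranges listed above before exporting the variable. The following is a minimal sketch: the function name is_supported_python_version is hypothetical, and the ranges are copied from this description.

```shell
# Returns success (0) if an X.Y.Z version falls in the documented ranges:
# 3.7.0-3.7.11, 3.8.0-3.8.11, 3.9.0-3.9.9, 3.10.0-3.10.12, and 3.11.4.
is_supported_python_version() {
    _minor=${1%.*}     # for example, 3.10
    _patch=${1##*.}    # for example, 12
    # Reject an empty or non-numeric patch component.
    case "$_patch" in
        ''|*[!0-9]*) return 1 ;;
    esac
    case "$_minor" in
        3.7|3.8) [ "$_patch" -le 11 ] ;;
        3.9)     [ "$_patch" -le 9 ] ;;
        3.10)    [ "$_patch" -le 12 ] ;;
        3.11)    [ "$_patch" -eq 4 ] ;;
        *)       return 1 ;;
    esac
}

version="3.7.0"
if is_supported_python_version "$version"; then
    # The variable value uses the Python-X.Y.Z form shown in the example above.
    export ASCEND_PYTHON_VERSION="Python-$version"
    echo "ASCEND_PYTHON_VERSION=$ASCEND_PYTHON_VERSION"
else
    echo "Python $version is not in the supported list" >&2
fi
# Afterwards, run: bash install.sh --install=python
```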

Installation | Distributing software packages | copy_pkgs

copy_pkgs forcibly distributes all software packages in the resources folder to the devices on which they are to be installed.

Example: bash install.sh --install=copy_pkgs

Installation and upgrade | NPU driver and firmware | npu (driver, firmware)

  • If the driver version installed using MindCluster Ascend Deployer does not match the system kernel version, you can refer to "Installing Dependencies Required for Compiling Driver Source Code" in CANN Software Installation Guide to manually install the desired driver.
  • The device health status is obtained before the NPU is installed. If the device is faulty, the installation stops.

Upgrade | MCU firmware | mcu

  • Before upgrading a firmware package of MCU 3.3.4 or later, upgrade the npu-smi tool to 22.0.3 or later. Otherwise, the upgrade will fail.
  • After the new MCU version takes effect, the active and standby areas of the MCU are synchronized. If you need to upgrade the MCU again, wait five minutes before doing so.
  • If the version after the upgrade is not the target version or the upgrade fails, perform the upgrade again. If the upgrade still fails, record the fault information and the operations you have performed, and contact Huawei technical support.
  • During the MCU upgrade and within two minutes after the upgrade takes effect, do not perform any operation on the MCU.

Installation and upgrade | CANN | nnae, nnrt, toolkit, kernels

  • For CANN versions earlier than 8.5.0, the kernels component is installed by default into NNAE of the same version. If NNAE is not installed, kernels is installed into Toolkit. If neither NNAE nor Toolkit is installed, kernels is installed into NNRT. If none of NNAE, Toolkit, and NNRT is installed, the kernels installation is skipped by default. The installation path (using Toolkit as an example) is Software_package_installation_path/ascend-toolkit/latest/opp/built-in/op_impl/ai_core/tbe/kernel. During Toolkit installation, the HCCL performance tester is automatically compiled and installed. The installation path is Software_package_installation_path/ascend-toolkit/latest/tools/hccl_test.
  • For CANN 8.5.0 and later versions, the CANN software package contains only the Toolkit and ops packages. In the upgrade scenario, uninstall NNAE and NNRT of the earlier version after the upgrade. For example, run the bash /usr/local/Ascend/nnae/latest/script/uninstall.sh command. During ops installation, the HCCL performance tester is automatically compiled and installed. The installation path is Software_package_installation_path/cann/tools/hccl_test.
  • After CANN 8.5.0 or a later version is installed, if a rollback is required, manually uninstall the CANN software package of the new version and deploy the old version again.
  • When CANN 8.5.0 or later is installed, only NNRT is supported in some scenarios. If only NNRT is selected during the installation, only the nnrt command can be used for subsequent upgrade, and NNAE or Toolkit upgrade is not supported.
  • When CANN 8.5.0 or later is installed, if the installation command contains toolkit or nnae, the CANN software package is fully installed.
  • NNRT cannot be upgraded from a version earlier than CANN 8.5.0 to CANN 8.5.0 or later. To upgrade it, you need to uninstall NNRT of a version earlier than CANN 8.5.0 and then install NNRT of CANN 8.5.0 or later.
  • For CANN 8.5.0 and later versions, you need to install Toolkit+ops or NNRT+ops to use the functions.

For details, see "Installing CANN" in CANN Software Installation Guide (community edition).
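
For CANN versions earlier than 8.5.0, the fallback order for kernels (NNAE, then Toolkit, then NNRT, otherwise skipped) can be previewed on a device. The following is a simplified sketch: the function name kernels_target is hypothetical, the nnrt/latest directory name is an assumption modeled on the nnae/latest and ascend-toolkit/latest paths in this description, and the sketch does not check that the component version matches the kernels version.

```shell
# Prints the component that kernels would be installed into, given the
# software package installation path as $1, or "skip" if none is present.
kernels_target() {
    # Order taken from the description: NNAE first, then Toolkit, then NNRT.
    for _component in nnae ascend-toolkit nnrt; do
        if [ -d "$1/$_component/latest" ]; then
            echo "$_component"
            return 0
        fi
    done
    echo "skip"
}

# /usr/local/Ascend is the installation path used in the uninstall example above.
echo "kernels target: $(kernels_target /usr/local/Ascend)"
```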

Installation and upgrade | MindCluster performance test | toolbox | -

Installation and upgrade | MindCluster cluster scheduling | ascend-device-plugin, ascend-docker-runtime, noded, npu-exporter, volcano, ascend-operator, resilience-controller, clusterd

  • They can be installed only in the Kubernetes and Docker scenarios.
  • When installing MindCluster, ensure that more than 30% of the disk space of the Docker container, file system, or root directory remains available after an estimated 18 GB (for the MindCluster image and the training and inference images) is added to the used space.
  • If Kubernetes has been installed and deployed on the device, check whether the Kubernetes version is between 1.19.16 and 1.28.x and whether the Docker version is 18.09.x or later. (cri-dockerd needs to be installed for Kubernetes 1.24 and later.) If the version does not meet the requirement, the installation fails.
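
The version and disk space requirements above can be expressed as a few checks. The following is a minimal sketch with hypothetical helper names; it assumes GNU sort (for the -V option), and the disk space check reflects one reading of the 30% rule, with sizes given in GB.

```shell
# Success if version $1 is lower than or equal to version $2.
version_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Success if the Kubernetes version is between 1.19.16 and 1.28.x.
k8s_version_ok() {
    version_le "1.19.16" "$1" && ! version_le "1.29.0" "$1"
}

# Success if the Docker version is 18.09.x or later.
docker_version_ok() {
    version_le "18.09" "$1"
}

# Success if cri-dockerd is needed (Kubernetes 1.24 and later).
needs_cri_dockerd() {
    version_le "1.24" "$1"
}

# Success if more than 30% of the total space stays free after 18 GB
# is added to the used space. Arguments: total GB, used GB.
disk_space_ok() {
    [ $(( ($1 - $2 - 18) * 100 )) -gt $(( $1 * 30 )) ]
}

# Example values only. On a real node they could be taken from
# `kubelet --version`, `docker version --format '{{.Server.Version}}'`, and `df`.
k8s_version="1.28.2"
docker_version="20.10.7"
k8s_version_ok "$k8s_version" && echo "Kubernetes $k8s_version is in range"
docker_version_ok "$docker_version" && echo "Docker $docker_version is new enough"
needs_cri_dockerd "$k8s_version" && echo "cri-dockerd is required"
disk_space_ok 500 200 && echo "disk space is sufficient"
```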

Installation and upgrade | MindCluster fault diagnosis | fault-diag | Only Python 3.7, 3.9, 3.10, and 3.11.4 are supported.

Installation | MindCluster cluster scheduling (MindIO) | mindio | Only the root user can be used for installation.

Installation | AI framework | tensorflow, pytorch, mindspore

When MindCluster Ascend Deployer is used for deployment, MindSpore, TensorFlow, and Torch-npu cannot be downloaded and installed at the same time. Only one AI framework can be downloaded and installed. Select an AI framework based on the actual service scenario.
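
A package list can be checked for this restriction before it is passed to install.sh. The following is a minimal sketch: the function name count_ai_frameworks is hypothetical, and the comma-separated form of the --install value is an assumption that you should confirm with bash install.sh --help.

```shell
# Prints how many AI frameworks appear in a comma-separated package list.
count_ai_frameworks() {
    _count=0
    _old_ifs=$IFS
    IFS=,
    for _pkg in $1; do
        case "$_pkg" in
            tensorflow|pytorch|mindspore) _count=$((_count + 1)) ;;
        esac
    done
    IFS=$_old_ifs
    echo "$_count"
}

# Example package list (assumed comma-separated syntax).
packages="npu,toolkit,pytorch"
if [ "$(count_ai_frameworks "$packages")" -le 1 ]; then
    echo "ok: bash install.sh --install=$packages"
else
    echo "error: select only one of tensorflow, pytorch, and mindspore" >&2
fi
```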

Installation | Container image tool | docker_images | -

Installation | MindIE image | mindie-image | -

Installation | MindIE image | deepseek_pd | For DeepSeek P/D instance deployment.

Installation | MindIE Server | deepseek_cntr | For DeepSeek deployment in a Docker environment.