Parameters for install.sh

Select the appropriate parameters to install the software.

Command:

  • Method 1: bash install.sh [options] (the tool was obtained by downloading and decompressing the ZIP package)
  • Method 2: ascend-deployer [options] (MindCluster Ascend Deployer was installed via pip)
  • Method 3: bash large_scale_install.sh [options] (the tool was obtained by downloading and decompressing the ZIP package)
  • Method 4: large-scale-deployer [options] (MindCluster Ascend Deployer was installed via pip)

Table 1 describes the parameters. Run bash install.sh --help to view the available options.

Table 1 Parameters

Parameter

Description

--help, -h

Displays help information.

--check

Checks the environment. The check covers connectivity to the device where the software is to be installed, compatibility between that device and the tool, and the software package to be installed.

This parameter must be used together with --install=<package_name>, --install-scene=<scene_name>, or --upgrade=<package_name>.

When --check is used, --skip_check cannot be used.

--check_mode

Selects the check mode: fast or full. Errors are reported in a unified format, and a check result file is generated. For details, see Check Result File Example.

Values: fast (fast check, which exits immediately when an exception occurs) and full (full check). The default value is full.

This parameter must be used together with --install=<package_name>, --install-scene=<scene_name>, --upgrade=<package_name>, or --check.

When --check_mode is used, --skip_check cannot be used.

--skip_check

Skips the installation check.

The check covers the user, configuration, dependencies, compatibility, and card health status.

This parameter can be used together with --install=<package_name>, --install-scene=<scene_name>, or --upgrade=<package_name>.

When --skip_check is used, --check and --check_mode cannot be used.

--clean

Deletes the resources directory and resources_{arch}.tar from the home directory on the device where the software is to be installed.

--nocopy

Does not copy resources during batch installation. In large-scale deployment, this parameter does not affect resource copy.

This parameter must be used together with --install=<package_name>, --install-scene=<scene_name>, or --upgrade=<package_name>.

Example: bash install.sh --install=python --nocopy

--only_copy

Only copies resources during batch installation. In large-scale deployment, this parameter does not affect resource copy.

The resources of the specified software are copied, but the software itself is not installed. This parameter and --nocopy are mutually exclusive.

This parameter must be used together with --install=<package_name>, --install-scene=<scene_name>, or --upgrade=<package_name>.

Example: bash install.sh --install=python --only_copy

--force_upgrade_npu

Forcibly upgrades the NPUs when not all NPUs are abnormal.

--verbose

Prints the installation status of each task in detail.

This parameter must be used together with --install=<package_name> or --install-scene=<scene_name>.

Example: bash install.sh --install=python --verbose, which prints Python installation details.

--stdout_callback=<callback_name>

Sets the output format of the command. Run ansible-doc -t callback -l to list the available callback names.

--install=<package_name>

Installs the specified software. For details about software packages to be installed, see Software Packages That Can Be Installed and Upgraded.

--upgrade=<package_name>

Upgrades the specified software.

Values: npu (driver and firmware), mcu, nnae, nnrt, toolkit, kernels, toolbox, fault-diag, ascend-device-plugin, ascend-docker-runtime, noded, npu-exporter, volcano, ascend-operator, resilience-controller, and clusterd

--install-scene=<scene_name>

Specifies the installation scenario. For details about the installation scenarios, see Supported Installation and Upgrade Scenarios.

--patch=<package_name>

Patches the specified software.

Values: nnae, nnrt, and toolkit

--patch-rollback=<package_name>

Rolls back the patch of the specified software.

Values: nnae, nnrt, and toolkit

--test=<target>

Checks whether the specified component version works properly.

Values: all, driver, firmware, mcu, mindspore, nnae, nnrt, pytorch, tensorflow, toolbox, toolkit, ascend-device-plugin, ascend-docker-runtime, noded, npu-exporter, volcano, ascend-operator, resilience-controller, clusterd, mindie_image, and fault-diag

--hccn

Configures the HCCN network. This parameter is not supported in large-scale deployment scenarios.

--hccn --check

Checks the HCCN network.

--retry=<target>

Retries deployment. The sub-cluster configuration file generated by the previous run is automatically used to redeploy the cluster.

This parameter applies only to the installation and deployment of ultra-large-scale clusters. The default value is fast.

Values:

  • full: full redeployment. The software package is uploaded to each server again, which takes a long time. This option cannot be used together with --nocopy and must be used in multi-instance scenarios.
  • fast: fast redeployment. The software package is not uploaded again. Instead, the existing --nocopy behavior is applied. This option cannot be used together with --only_copy.
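
The combination rules stated in Table 1 can be checked before a command is run. The following Python sketch is only an illustration of those rules: the function name find_option_conflicts is hypothetical, it is not part of install.sh, and it does not reflect how the tool validates its options internally. It also assumes that options with values are written in the --name=value form.

```python
# Hypothetical pre-flight helper; it only encodes the rules stated in Table 1.
ACTIONS = {"--install", "--install-scene", "--upgrade"}


def find_option_conflicts(argv):
    """Return rule violations (as strings) for a list of install.sh options."""
    # Option names without their values, e.g. "--install=python" -> "--install".
    names = {arg.split("=", 1)[0] for arg in argv}
    # Values of the options written as --name=value.
    values = dict(arg.split("=", 1) for arg in argv if "=" in arg)
    has_action = bool(names & ACTIONS)
    problems = []

    if "--skip_check" in names and names & {"--check", "--check_mode"}:
        problems.append("--skip_check cannot be used with --check or --check_mode")

    # "--hccn --check" is listed as its own table entry, so it is exempt here.
    if "--check" in names and not has_action and "--hccn" not in names:
        problems.append("--check requires --install, --install-scene, or --upgrade")

    if "--check_mode" in names and not (has_action or "--check" in names):
        problems.append(
            "--check_mode requires --install, --install-scene, --upgrade, or --check"
        )

    if {"--nocopy", "--only_copy"} <= names:
        problems.append("--nocopy and --only_copy are mutually exclusive")

    for flag in ("--nocopy", "--only_copy"):
        if flag in names and not has_action:
            problems.append(f"{flag} requires --install, --install-scene, or --upgrade")

    if "--verbose" in names and not names & {"--install", "--install-scene"}:
        problems.append("--verbose requires --install or --install-scene")

    retry = values.get("--retry")
    if retry == "full" and "--nocopy" in names:
        problems.append("--retry=full cannot be used with --nocopy")
    if retry == "fast" and "--only_copy" in names:
        problems.append("--retry=fast cannot be used with --only_copy")

    return problems


# Usage: print every violated rule for a planned command line.
for problem in find_option_conflicts(["--install=python", "--check", "--skip_check"]):
    print(problem)
```

An empty list means that none of the rules encoded above is violated; it does not guarantee that the tool accepts the command.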

Check Result File Example

When the --check_mode parameter is used, a check_res_output.json file that records error information is generated in the ~/.ascend_deployer/deploy_info/ directory. Example:

{
  "CheckList": [
    {
      "check_item": "check_card",
      "desc_en": "Check NPU card compatibility",
      "tip_en": ""
    },
    {
      "check_item": "check_k8s_version",
      "desc_en": "Judgment: 1. kubelet, kubectl, and kubeadm all exist 2. kubelet --version == kubeadm version == kubectl version 3. kubelet version < 1.29 4. kubelet version >= 1.19.16.",
      "tip_en": "Execute the version query command to confirm whether the component has been installed, whether the version number is the same, and whether the version is within the supported range.",
      "help_url": ""
    }
  ],
  "HostCheckResList": {
    "xx.xx.xx.x1": [
      {
        "check_item": "check_card",
        "status": "failed",
        "error_msg": "Check card failed: [ASCEND] A300i-pro has no support for MTOS_22.03LTS-SP4_aarch64 on this device"
      }
    ]
  }
}

Table 2 Parameters in the check result file

Parameter

Description

check_item

Check item

desc_en

Description of the check item (English)

desc_zh

Description of the check item (Chinese)

tip_en

Tip (English)

tip_zh

Tip (Chinese)

help_url

Document link

error_msg

Error message

status

Check result

  • success
  • failed
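
The fields in Table 2 can be used to list the failed checks for each host. The following Python sketch is only an illustration and makes several assumptions: CheckList is a list of objects keyed by check_item, HostCheckResList maps each host to a list of results, and the file path is the one given in the text above. The function name summarize_failures and the SAMPLE data are hypothetical and are not part of MindCluster Ascend Deployer.

```python
import json
from pathlib import Path

# Path taken from the text above; adjust it if your deployment differs.
RESULT_FILE = Path.home() / ".ascend_deployer" / "deploy_info" / "check_res_output.json"

# Sample data that mirrors the fields of the documented example.
SAMPLE = {
    "CheckList": [
        {"check_item": "check_card", "desc_en": "Check NPU card compatibility", "tip_en": ""},
    ],
    "HostCheckResList": {
        "xx.xx.xx.x1": [
            {
                "check_item": "check_card",
                "status": "failed",
                "error_msg": "Check card failed: [ASCEND] A300i-pro has no support "
                             "for MTOS_22.03LTS-SP4_aarch64 on this device",
            },
            # Illustrative entry, not from the documented example.
            {"check_item": "check_k8s_version", "status": "success"},
        ]
    },
}


def summarize_failures(report):
    """Return (host, check_item, error_msg, tip_en) tuples for failed checks."""
    # Index the check metadata by check_item so tips can be attached to failures.
    meta = {
        entry["check_item"]: entry
        for entry in report.get("CheckList", [])
        if isinstance(entry, dict) and "check_item" in entry
    }
    failures = []
    for host, results in report.get("HostCheckResList", {}).items():
        for result in results:
            if result.get("status") == "failed":
                item = result.get("check_item", "")
                tip = meta.get(item, {}).get("tip_en", "")
                failures.append((host, item, result.get("error_msg", ""), tip))
    return failures


if __name__ == "__main__":
    # Read the real result file when it exists; otherwise fall back to the sample.
    if RESULT_FILE.exists():
        report = json.loads(RESULT_FILE.read_text(encoding="utf-8"))
    else:
        report = SAMPLE
    for host, item, message, tip in summarize_failures(report):
        print(f"{host}: {item}: {message}")
        if tip:
            print(f"  tip: {tip}")
```

Entries whose status is success are skipped, so the output contains only the hosts and check items that need attention.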