Configuring Batch Installation

Complete the following configuration before you install the NPU driver and firmware, CANN packages, AI frameworks, MindCluster components (performance test, fault diagnosis, and cluster scheduling), or MindIE images, and before you configure HCCN parameters. There are two configuration modes.

This section applies only to batch installation. Skip it for single-server installation.

Editing inventory_file

  1. Log in to the MindCluster Ascend Deployer executor.
  2. Configure the IP addresses and usernames of target devices for batch installation on the MindCluster Ascend Deployer executor.

    Go to the ascend-deployer/ascend_deployer directory and edit inventory_file (if the pip installation mode is used, edit /root/ascend-deployer/inventory_file instead). Under [worker], comment out or delete the line localhost ansible_connection='local' ansible_ssh_user='root'. Set the parameters described in Table 1. Then save the file and exit (for example, run the :wq command in vi).

    Table 1 Parameter description

    Parameter         Required  Description
    ----------------  --------  ------------------------------------------------------------
    IP                Yes       Server IP address.
    ansible_ssh_user  Yes       Account for logging in to the remote server over SSH.
                                The account must be root.
    ansible_ssh_pass  No        Password for logging in to the remote server over SSH.
                                If SSH key-based authentication is configured and root
                                login is allowed, you do not need to set this parameter.
    npu_num           No        Number of NPUs. Set this parameter to check whether the
                                number of identified NPUs matches the planned number.
    davinci           No        Applies only when MindIE is installed. Da Vinci devices
                                mapped to the container. One or more Da Vinci devices can
                                be mounted. Run the ll /dev/ | grep davinci command to
                                query the names and number of Da Vinci devices. If this
                                parameter is not set, all Da Vinci devices are mounted by
                                default.

  3. Configure global variables in the [all:vars] section.

    Parameter         Required  Description
    ----------------  --------  ------------------------------------------------------------
    WEIGHTS_PATH      No        Required when MindIE is installed. Directory where model
                                weights are stored. It must be the actual file path on the
                                node.

Example:
[worker]
xx.xxx.xx.xx1 ansible_ssh_user="root" ansible_ssh_pass="xxxxxxx"       # Use the actual IP address of the target device.
xx.xxx.xx.xx2 ansible_ssh_user="root" ansible_ssh_pass="xxxxxxx" davinci=0,1,2,3       # Use the actual IP address of the target device.

[all:vars]
WEIGHTS_PATH="/home/weights"             # Replace it with the directory where model weights are located.
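
Before starting a batch installation, it can help to sanity-check the [worker] entries against the parameter rules above. The following Python sketch is illustrative only: parse_worker_line and check_worker are hypothetical helper names, not part of MindCluster Ascend Deployer, and the parsing covers only the simple key=value form shown in the example rather than the full Ansible inventory syntax. The digit checks on npu_num and davinci are assumptions based on the example values.

```python
import shlex


def parse_worker_line(line):
    """Split one [worker] host line into (host, variables).

    Returns None for blank lines and comment-only lines.
    Hypothetical helper; handles only simple key=value pairs.
    """
    # comments=True drops everything after an unquoted '#'.
    tokens = shlex.split(line, comments=True)
    if not tokens:
        return None
    host = tokens[0]
    variables = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {token!r}")
        variables[key] = value
    return host, variables


def check_worker(host, variables):
    """Return a list of problems for one host, following the parameter table."""
    problems = []
    # The table states that the SSH account must be root.
    if variables.get("ansible_ssh_user") != "root":
        problems.append(f"{host}: ansible_ssh_user must be root")
    # Assumption: npu_num is a plain non-negative integer.
    if "npu_num" in variables and not variables["npu_num"].isdigit():
        problems.append(f"{host}: npu_num must be an integer")
    # Assumption: davinci is a comma-separated list of device indexes, as in the example.
    if "davinci" in variables:
        if not all(part.isdigit() for part in variables["davinci"].split(",")):
            problems.append(f"{host}: davinci must be a comma-separated list of indexes")
    return problems
```

For example, calling parse_worker_line on a line such as 192.0.2.11 ansible_ssh_user="root" davinci=0,1 (a placeholder documentation address) and passing the result to check_worker returns an empty list when the entry follows the rules.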

Both IPv4 and IPv6 addresses are supported. The address type that an SSH client such as PuTTY uses to connect to the execution device must match the address type configured in inventory_file: use IPv4 for both or IPv6 for both.
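
The address-type rule can be checked with a few lines of Python. This is a minimal sketch using the standard ipaddress module; the function names and the sample addresses (taken from the documentation ranges 192.0.2.0/24 and 2001:db8::/32) are illustrative assumptions, and the check works only on IP literals, not on host names.

```python
import ipaddress


def address_family(host):
    """Return 4 for an IPv4 literal or 6 for an IPv6 literal."""
    return ipaddress.ip_address(host).version


def same_family(client_target, inventory_hosts):
    """Check that the address used by the SSH client and all inventory hosts share one IP version."""
    family = address_family(client_target)
    return all(address_family(host) == family for host in inventory_hosts)
```

Here client_target stands for the address typed into the SSH client to reach the execution device, and inventory_hosts for the addresses listed under [worker]. A False result indicates a mix of IPv4 and IPv6 that should be resolved before installation.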