Performing Remote AOE Tuning Using NCS

Some products running in Ascend RC mode have limited memory, so models cannot be tuned with AOE locally. In this case, tune remotely instead: the NCS tool connects the development environment (host) to the operating environment (device), and the AOE tool runs in the development environment to tune the model remotely.

Installing the CANN Package

  • Setting up the environment

    A common Linux server (referred to as the host) and a server equipped with an Ascend NPU (referred to as the device) are available. The toolkit package needs to be installed on the server. The following uses CANN 8.0.0 as an example. For details about how to install the toolkit package, see the CANN Installation Guide of the corresponding version.

    Figure 1 Setting up the environment
  • Configuring environment variables
    • Host

      Load the environment variable script set_env.sh in the CANN installation path. The following uses ${install_path} as an example.

      source ${install_path}/ascend-toolkit/set_env.sh
    • Device
      export LD_LIBRARY_PATH=${install_path}/latest/tools/ncs/lib64/:${install_path}/latest/runtime/lib64/:$LD_LIBRARY_PATH
      export PATH=${install_path}/latest/tools/ncs/bin/:$PATH

      Replace install_path with the actual installation path of the CANN software.
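
After the variables are set, a quick check that the expected tools resolve on PATH can catch a wrong install_path early. The following is a minimal sketch, not part of the CANN tooling: check_tools is a hypothetical helper, and which tools to check on each side (aoe and akt on the host, ncs and akt on the device) is an assumption based on the commands used later in this section.

```shell
# Hypothetical helper: report whether each named command is found on PATH.
# Returns 0 only if every command is found.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example calls (assumed tool names, taken from the commands in this section):
#   Host:   check_tools aoe akt
#   Device: check_tools ncs akt
```

If a tool is reported as missing, recheck install_path and source or export the variables again in the same shell session.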

Configuring Key Certificates

Ensure that the host and device are in the same primary network segment and can ping each other.
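
The network requirement can be checked before any certificates are generated. The sketch below is illustrative only: ip_to_int and same_subnet are hypothetical helpers, and the prefix length (24 in the example) is an assumption that depends on your network. It needs fully numeric IPv4 addresses, so the masked placeholders used in this document (such as 10.x.x.196) must be replaced with real values first.

```shell
# Convert a dotted-quad IPv4 address to an integer.
# The function body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Return 0 if the two addresses are in the same network
# for the given prefix length (for example, 24 for a /24 network).
same_subnet() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    net_a=$(( $(ip_to_int "$1") & mask ))
    net_b=$(( $(ip_to_int "$2") & mask ))
    [ "$net_a" -eq "$net_b" ]
}

# Example (addresses and prefix length are placeholders):
#   same_subnet "$HOST_IP" "$DEVICE_IP" 24 && ping -c 3 "$DEVICE_IP"
```

Run the ping check from both sides, because the requirement is that each machine can reach the other.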

  1. Upload the following shell script to the host:
    DEVICE_IP=10.x.x.196 # your device IP
    HOST_IP=10.x.x.66 # your host IP
    PASS_PHRASE=Ncx12345 # your pass phrase
    KEY_LEN=3072 # [3072,4096]
    VALID_DAYS=365 # the cert will expire after the valid days
    COUNTRY=CN # your country name abbr.(2 letter code)
    STATE=Zhejiang # your province name
    LOCATION=Hangzhou # your city name
    ORGANIZATION=ABC # your company name
    ORGANIZATION_UNIT=DEF # your section name
    COMMON_NAME_ROOT=www.aoe.com # your domain name
    ENCRYPT_MODE=aes256 # [aes256,aes128]
    
    ########## Modify the preceding content as required. You are advised not to modify the following content. ##########
    
    # generate conf
    rm -rf host-ext.cnf  device-ext.cnf
    echo "[ ext ]" >> host-ext.cnf
    echo "subjectAltName=IP:${HOST_IP}" >> host-ext.cnf
    echo "[ ext ]" >> device-ext.cnf
    echo "subjectAltName=IP:${DEVICE_IP}" >> device-ext.cnf
    
    # generate root cert
    openssl req -x509 -newkey rsa:${KEY_LEN} -days ${VALID_DAYS} -nodes -keyout ca-key.pem -out ca-cert.pem -subj "/C=${COUNTRY}/ST=${STATE}/L=${LOCATION}/O=${ORGANIZATION}/OU=${ORGANIZATION_UNIT}/CN=${COMMON_NAME_ROOT}" -addext keyUsage=keyCertSign
    
    # generate device cert request
    openssl req -newkey rsa:${KEY_LEN} -nodes -keyout device-key.pem -out device-cert.csr -subj "/C=${COUNTRY}/ST=${STATE}/L=${LOCATION}/O=${ORGANIZATION}/OU=${ORGANIZATION_UNIT}/CN=NCS"
    # generate device cert
    openssl x509 -req -in device-cert.csr -days ${VALID_DAYS} -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial -out device-cert.pem -extensions ext -extfile device-ext.cnf
    
    # generate host cert request
    openssl req -newkey rsa:${KEY_LEN} -nodes -keyout host-key.pem -out host-cert.csr -subj "/C=${COUNTRY}/ST=${STATE}/L=${LOCATION}/O=${ORGANIZATION}/OU=${ORGANIZATION_UNIT}/CN=NCA"
    # generate host cert
    openssl x509 -req -in host-cert.csr -days ${VALID_DAYS} -CA ca-cert.pem -CAkey ca-key.pem -CAcreateserial -out host-cert.pem -extensions ext -extfile host-ext.cnf
    
    # encrypt private keys
    openssl rsa -in host-key.pem -passout pass:${PASS_PHRASE} -${ENCRYPT_MODE} -out host-key.pem
    openssl rsa -in device-key.pem -passout pass:${PASS_PHRASE} -${ENCRYPT_MODE} -out device-key.pem
  2. Set DEVICE_IP and HOST_IP in the script to the actual IP addresses, adjust the other variables as required, and run the script on the host to generate the key and certificate files.
    Table 1 Related files

    File Name        Function
    ---------------  ------------------------------------------------------------
    ca-cert.pem      Root CA, which needs to be copied to the development environment and operating environment.
    host-key.pem     Private key of the development environment, which needs to be copied to the development environment.
    host-cert.pem    Certificate of the development environment, which needs to be copied to the development environment.
    device-key.pem   Private key of the operating environment, which needs to be copied to the operating environment.
    device-cert.pem  Certificate of the operating environment, which needs to be copied to the operating environment.
    ca-key.pem       Intermediate process file, which can be ignored.
    ca-cert.srl      Intermediate process file, which can be ignored.
    host-cert.csr    Intermediate process file, which can be ignored.
    device-cert.csr  Intermediate process file, which can be ignored.
    host-ext.cnf     Intermediate process file, which can be ignored.
    device-ext.cnf   Intermediate process file, which can be ignored.

  3. Copy the device-key.pem, device-cert.pem, and ca-cert.pem files to the device, and keep host-key.pem, host-cert.pem, and ca-cert.pem on the host. Then, on each side, run the following command in the directory where the key and certificate files are stored:
    • Host
      akt --private_key host-key.pem --public_cert host-cert.pem --ca_cert ca-cert.pem
    • Device
      akt --private_key device-key.pem --public_cert device-cert.pem --ca_cert ca-cert.pem

      In the example commands, the private key file is host-key.pem (host) or device-key.pem (device), the certificate file is host-cert.pem (host) or device-cert.pem (device), and the root certificate file is ca-cert.pem.

  4. After the command is executed, the prompt "Enter Password:" is displayed. Enter the PASS_PHRASE value that the script used to encrypt the private key. The password must match the one set when the key files were generated.
    If the following information is displayed, the certificate is imported successfully:
    Load cert, password, and key successfully.
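
If the import fails, it can help to confirm that the generated certificates chain back to the root certificate. The following sketch relies only on standard OpenSSL commands and the file names from Table 1. verify_certs is a hypothetical helper, not part of the NCS or AOE tooling, and it assumes all three certificates are still in one directory, for example the directory where the generation script was run.

```shell
# Hypothetical helper: check that the host and device certificates
# were signed by the root certificate in the given directory.
verify_certs() {
    dir=${1:-.}
    for f in ca-cert.pem host-cert.pem device-cert.pem; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    # openssl verify reports the result for each certificate it is given.
    openssl verify -CAfile "$dir/ca-cert.pem" \
        "$dir/host-cert.pem" "$dir/device-cert.pem"
}

# Optional follow-up checks (OpenSSL 1.1.1 or later for -ext):
#   openssl x509 -in host-cert.pem -noout -enddate              # expiry date
#   openssl x509 -in host-cert.pem -noout -ext subjectAltName   # embedded IP
```

The subjectAltName check is useful after an IP address change, because the script embeds HOST_IP and DEVICE_IP in the certificates.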

Performing Tuning

  1. Before tuning using AOE, run the following commands on the host to configure the environment:
    export TUNE_BANK_PATH={bank_path} # Path for storing the tuning knowledge base.
    
    export TE_PARALLEL_COMPILER=32 # Accelerate AOE tuning.
  2. Start the NCS service on the device, and then run AOE on the host to perform tuning.

    Device side:

    ncs &
    Check whether the NCS service has started:
    ps -ef|grep ncs|grep -v "grep"

    If output similar to the following is displayed, the NCS service has started successfully. In this example, the NCS daemon process ID is 257435 and the NCS running process ID is 257440.

    root 257440 257435 0 07:54 pts/3 00:00:01 ncs --ip XX.XX.XX.XX --port XXXX --daemon false

    Host side:

    aoe --framework 5 --model ./model.onnx --output model --job_type 2 --ip xx.xx.xx.xx --aicore_num=1

    Table 2 describes the parameters in the command. For more tuning parameters, see AOE Instructions.

    Table 2 Parameters

    Parameter     Description
    ------------  ------------------------------------------------------------
    --model       Model to be tuned.
    --output      Name of the model that has been tuned.
    --job_type    The values 2 and 1 indicate subgraph tuning and operator tuning, respectively. Perform subgraph tuning and then operator tuning.
    --ip          IP address of the device.
    --aicore_num  Number of AI Cores.

  3. After the command is executed, the performance improvement ratio is displayed, as shown in Figure 2. In this example, performance improves by 53%.
    Figure 2 Tuning output
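
When several tuning passes are run, it can be convenient to wrap the host-side command in a small function. The sketch below is one possible wrapper, not an official script: aoe_tune and the DRY_RUN variable are hypothetical names, and the only aoe options it uses are the ones shown in the command above.

```shell
# Hypothetical wrapper around the aoe command shown in this section.
# Arguments: MODEL OUTPUT JOB_TYPE DEVICE_IP AICORE_NUM
# With DRY_RUN=1 the command line is printed instead of executed.
aoe_tune() {
    if [ "$#" -ne 5 ]; then
        echo "usage: aoe_tune MODEL OUTPUT JOB_TYPE DEVICE_IP AICORE_NUM" >&2
        return 2
    fi
    # Rebuild the positional parameters as the full aoe command line.
    set -- aoe --framework 5 --model "$1" --output "$2" \
        --job_type "$3" --ip "$4" --aicore_num="$5"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example (placeholder values; replace the IP with the device address):
#   DRY_RUN=1 aoe_tune ./model.onnx model 2 xx.xx.xx.xx 1
```

Printing the command first with DRY_RUN=1 makes it easy to confirm the job type and device IP before starting a long tuning run. Use the --job_type values from Table 2 for each pass.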