Quick Start

Introduction

The PyTorch GPU2Ascend tool quickly migrates GPU-based training scripts to NPU-based scripts, reducing the workload of developers. This sample lets developers quickly try both migration methods: automatic migration (recommended) and the PyTorch GPU2Ascend tool.

This sample uses the ResNet-50 model and the ImageNet dataset.

Prerequisites

  • You have prepared a training server equipped with Ascend 910 AI Processors and installed the corresponding driver and firmware. For details about how to install the driver and firmware, see "NPU Driver and Firmware Installation".
  • You have installed the development kit Ascend-CANN-Toolkit and ops operator package. For details, see "Installing CANN".
  • The following uses PyTorch 2.1.0 as an example. For details about how to install the PyTorch framework and torch_npu plugin, see "Installing PyTorch".
  • You have run the following commands to install the dependencies required before using PyTorch GPU2Ascend for migration. If you are not the root user, append --user to each installation command, for example, pip3 install pandas --user.
    pip3 install pandas         # (Mandatory) The pandas version must be 1.2.4 or later.
    pip3 install libcst         # (Mandatory) This library is used to parse Python files.
    pip3 install prettytable    # (Mandatory) This dependency is used to display data in tables.
    pip3 install jedi           # (Mandatory) This dependency is used for cross-file parsing.
  • You have downloaded the main.py training script of the ResNet-50 model and saved it to a custom path (for example, /home/user).
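
The dependency requirements above can be checked before migration with a short script. The following is a minimal sketch, not part of the PyTorch GPU2Ascend tool: the names REQUIRED, version_tuple, and check_dependencies are hypothetical helpers introduced for illustration, and the only version constraint taken from this guide is pandas 1.2.4 or later.

```python
from importlib import metadata

# Packages listed in the prerequisites; only pandas has a minimum version there.
REQUIRED = {"pandas": "1.2.4", "libcst": None, "prettytable": None, "jedi": None}


def version_tuple(version: str) -> tuple:
    """Convert '1.2.4' to (1, 2, 4), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_dependencies(required=REQUIRED) -> dict:
    """Return a status string per package: 'missing', 'too old (...)' or 'ok (...)'."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if minimum and version_tuple(installed) < version_tuple(minimum):
            report[name] = f"too old ({installed} < {minimum})"
        else:
            report[name] = f"ok ({installed})"
    return report


for package, status in check_dependencies().items():
    print(f"{package}: {status}")
```

Any package reported as missing or too old can then be installed with the pip3 commands listed above.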

Automatic Migration (Recommended)

You only need to import the library code into the training script. The script can then run directly on the Ascend NPU platform with minimal modifications.

  1. Add the import statements for automatic migration to the training script main.py, after the existing imports.
    from torch.utils.data import Subset
    import torch_npu
    from torch_npu.contrib import transfer_to_npu
    .....
  2. Switch to the directory where the migrated training script is stored (/home/user is used as an example) and run the following commands to train on the virtual (dummy) dataset. The migrated training script should run properly on the NPU.
    If iteration logs start to be printed, the training function has been migrated successfully.
    cd /home/user
    python main.py -a resnet50 --gpu 1 --epochs 1 --dummy  # --gpu 1 selects the device with ID 1, --epochs 1 trains for one epoch, and --dummy uses the virtual dataset.
  3. If the weights are saved automatically after training, the migration is successful.
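
To see what the two import statements in step 1 do, it can help to run a few CUDA-style calls outside the training script. The following is a minimal sketch under stated assumptions: it assumes that transfer_to_npu redirects CUDA-style calls such as torch.cuda.is_available() and Tensor.cuda() to the NPU, and the helper names migration_imports_available and run_smoke_check are hypothetical. On a machine without torch or torch_npu the check is skipped.

```python
import importlib.util


def migration_imports_available() -> bool:
    """Return True if both torch and torch_npu can be found in this environment."""
    return all(
        importlib.util.find_spec(name) is not None for name in ("torch", "torch_npu")
    )


def run_smoke_check() -> str:
    """Run unchanged CUDA-style code after adding the two migration imports."""
    if not migration_imports_available():
        return "torch or torch_npu is not installed; device check skipped"

    import torch
    import torch_npu  # noqa: F401  (loads the NPU plugin)
    from torch_npu.contrib import transfer_to_npu  # noqa: F401  (same import as in step 1)

    # Assumption: after the import above, this CUDA-style query reports NPU availability.
    if not torch.cuda.is_available():
        return "no accelerator is visible to PyTorch"

    # CUDA-style tensor placement, left exactly as it would appear in a GPU script.
    x = torch.ones(2, 2).cuda()
    y = (x + x).cpu()
    return f"computed on {x.device}, sum = {y.sum().item()}"


print(run_smoke_check())
```

If the reported device is an NPU device, the GPU-style calls are being redirected as expected, which is the same mechanism main.py relies on.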

Using the PyTorch GPU2Ascend Tool for Migration

  1. Go to the path where the migration tool is located.
    cd ${INSTALL_DIR}/cann/tools/ms_fmk_transplt/  # Replace ${INSTALL_DIR} with the CANN component directory. For example, if the installation is performed by the root user, the default file storage path is /usr/local/Ascend/cann.
  2. Execute the script migration task. For details, see the configuration information in Table 1.
    ./pytorch_gpu2npu.sh -i /home/user -o /home/out -v 2.1.0  # /home/user indicates the path of the original script, /home/out indicates the output path of the script migration result, and 2.1.0 indicates the PyTorch framework version of the original script.
  3. Switch to the directory where the migrated training script is stored (/home/user is used as an example) and run the following commands to train on the virtual (dummy) dataset. The migrated training script should run properly on the NPU.
    If iteration logs start to be printed, the training function has been migrated successfully.
    cd /home/user
    python main.py -a resnet50 --gpu 1 --epochs 1 --dummy  # --gpu 1 selects the device with ID 1, --epochs 1 trains for one epoch, and --dummy uses the virtual dataset.
  4. After the script migration is complete, go to the output path of the script migration result to view the result files.

    Migration analysis is started during script migration. By default, the torch_apis and affinity_apis analysis modes are used. For details about the corresponding result files, see "Analysis Report Overview".

  5. If the weights are saved automatically after training, the migration is successful.
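
The tool-based procedure above can also be driven from Python, for example when several script directories need to be migrated. The following is a minimal sketch: the -i, -o, and -v options and the tools/ms_fmk_transplt location are taken from the steps above, while the helper names build_migration_command and list_result_files and the install directory placeholder are hypothetical. The sketch only builds and prints the command; it makes no assumption about the names of the result files.

```python
import shlex
from pathlib import Path


def build_migration_command(install_dir, script_dir, output_dir, torch_version):
    """Build the pytorch_gpu2npu.sh command line used in step 2."""
    script = Path(install_dir) / "cann" / "tools" / "ms_fmk_transplt" / "pytorch_gpu2npu.sh"
    return [str(script), "-i", str(script_dir), "-o", str(output_dir), "-v", torch_version]


def list_result_files(output_dir):
    """Return all files under the migration output path, relative to that path."""
    root = Path(output_dir)
    if not root.is_dir():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


# "/path/to/install_dir" is a placeholder for ${INSTALL_DIR}; replace it before use.
command = build_migration_command("/path/to/install_dir", "/home/user", "/home/out", "2.1.0")
print(shlex.join(command))

# After the command has been run (for example with subprocess.run(command, check=True)),
# the result files in the output path can be listed for review.
for name in list_result_files("/home/out"):
    print(name)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a path contains spaces.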