Quick Start

Introduction

The PyTorch GPU2Ascend tool quickly migrates GPU-based training scripts to NPU-based scripts, reducing the workload of developers. This sample lets developers try the tool and see how efficient the migration is.

This sample uses the ResNet-50 model and the ImageNet dataset.

Prerequisites

  • Prepare a training server equipped with Ascend 910 AI Processors and install the corresponding driver and firmware. For details, see Installing the NPU Driver and Firmware.
  • Install the Ascend-CANN-Toolkit. For details, see Installing the Toolkit Development Kit.
  • This sample uses PyTorch 2.1.0 as an example. For details about how to install the PyTorch framework and the torch_npu plugin, see Procedure.
  • Before using PyTorch GPU2Ascend for migration, run the following commands to install the required dependencies. If you install them as a non-root user, append --user to each installation command.
    pip3 install pandas         # (Mandatory) The pandas version must be 1.2.4 or later.
    pip3 install libcst         # (Mandatory) This semantic analysis library is used to parse Python files.
    pip3 install prettytable    # (Mandatory) This dependency is used to display data as tables.
    pip3 install jedi           # (Optional) This dependency is used for cross-file parsing. You are advised to install it.
  • Download the main.py file and save the obtained ResNet-50 model to a user-defined path (for example, /home/user).
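
The dependency requirements above can be checked before migration with a short script. The following is a minimal sketch, not part of the tool: the helper names (parse_version, check_package, check_all) are hypothetical, and it only assumes that the packages were installed with pip3 under the names listed above.

```python
from importlib import metadata

# Package name -> minimum version (None means any version is accepted).
# The pandas minimum of 1.2.4 comes from the prerequisites above.
REQUIRED = {"pandas": (1, 2, 4), "libcst": None, "prettytable": None}
OPTIONAL = {"jedi": None}


def parse_version(text):
    """Turn '1.2.4' or '2.1.0rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_package(name, minimum=None):
    """Return a short status string for one installed (or missing) package."""
    try:
        installed = metadata.version(name)
    except metadata.PackageNotFoundError:
        return "missing"
    if minimum is not None and parse_version(installed) < minimum:
        return f"too old ({installed})"
    return f"ok ({installed})"


def check_all():
    """Check every mandatory and optional dependency."""
    return {
        name: check_package(name, minimum)
        for name, minimum in {**REQUIRED, **OPTIONAL}.items()
    }


if __name__ == "__main__":
    for name, status in check_all().items():
        print(f"{name}: {status}")
```

A "missing" status for jedi is acceptable because that dependency is optional; any other "missing" or "too old" status should be fixed before running the migration.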

Automatic Migration

(Recommended) With this method, you only need to import the library code into the training script. The script can then run on the Ascend NPU platform.

  1. Add the import statements for automatic migration to the main.py file of the training script.
    import torch 
    import torch_npu 
    from torch_npu.contrib import transfer_to_npu   
    .....
    
  2. Switch to the directory (/home/train/examples/imagenet) where the ported training script is stored and run the following commands to train with the virtual dataset.
    If iteration logs start to be printed, the ported training script runs properly on the NPU and the training function has been migrated successfully.
    cd /home/train/examples/imagenet
    python main.py -a resnet50 --gpu 1 --epochs 1 --dummy # --gpu 1 indicates that card 1 is used. --epochs 1 indicates that training runs for one epoch.
  3. If the weights are saved successfully, the weight saving function has been migrated successfully.
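
Before running the full training script, the import from step 1 can be tried on a tiny tensor. The following is a minimal sketch under stated assumptions: it assumes that, once transfer_to_npu is imported, device strings written for CUDA are redirected to the NPU, and the helper names (npu_migration_available, run_smoke_test) are hypothetical and not part of torch_npu. If torch or torch_npu is not installed, the check is skipped instead of failing.

```python
import importlib.util


def npu_migration_available() -> bool:
    """Return True if both torch and torch_npu are installed."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("torch", "torch_npu")
    )


def run_smoke_test() -> str:
    """Create a small tensor through GPU-style code and report where it lives."""
    if not npu_migration_available():
        return "skipped: torch or torch_npu is not installed"
    try:
        import torch
        import torch_npu  # noqa: F401  (registers the NPU backend)
        from torch_npu.contrib import transfer_to_npu  # noqa: F401

        # Written as ordinary GPU code. Assumption: after the import above,
        # the "cuda" device is mapped to the NPU.
        device = torch.device("cuda:0")
        x = torch.ones(2, 2).to(device)
        return f"tensor device: {x.device}"
    except Exception as exc:  # for example, no NPU or driver is present
        return f"failed: {exc}"


if __name__ == "__main__":
    print(run_smoke_test())
```

If the reported device is an NPU device, the same import placed at the top of main.py should let the unmodified GPU code paths run on the NPU.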

Using the PyTorch GPU2Ascend Tool for Migration

  1. Go to the path where the migration tool is located.
    cd Ascend-cann-toolkit_installation_directory/ascend-toolkit/latest/tools/ms_fmk_transplt/
  2. Run the script migration task. For details about the parameters, see the configuration information in Table 1.
    ./pytorch_gpu2npu.sh -i original_script_path -o output_path_of_the_script_migration_result -v PyTorch_framework_version_of_the_original_script

    Command example:

    ./pytorch_gpu2npu.sh -i /home/train/examples/imagenet -o /home/out -v 2.1.0
  3. Switch to the directory (/home/train/examples/imagenet) where the ported training script is stored and run the following commands to train with the virtual dataset.
    If iteration logs start to be printed, the ported training script runs properly on the NPU and the training function has been migrated successfully.
    cd /home/train/examples/imagenet
    python main.py -a resnet50 --gpu 1 --epochs 1 --dummy # --gpu 1 indicates that card 1 is used. --epochs 1 indicates that training runs for one epoch.
  4. After the script migration is complete, go to the output path of the script migration result to view the result file.

    During script migration, migration analysis is started. By default, the torch_apis and affinity_apis analysis modes are used. You can view the corresponding result files by referring to Analysis Report Overview.

  5. If the weights are saved successfully, the weight saving function has been migrated successfully.
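
When the migration is launched from another script (for example, in a batch job), the command from step 2 can be assembled in Python. The following is a minimal sketch: build_migration_command is a hypothetical helper, it uses only the -i, -o, and -v options shown above, and the tool directory keeps the installation directory placeholder, which must be replaced with the actual path.

```python
import shlex
from pathlib import PurePosixPath


def build_migration_command(tool_dir, input_path, output_path, torch_version):
    """Build the argument list for pytorch_gpu2npu.sh (-i, -o, -v only)."""
    script = PurePosixPath(tool_dir) / "pytorch_gpu2npu.sh"
    return [
        str(script),
        "-i", str(input_path),    # original script path
        "-o", str(output_path),   # output path of the migration result
        "-v", torch_version,      # PyTorch version of the original script
    ]


cmd = build_migration_command(
    # Placeholder from step 1; replace it with the real installation directory.
    "Ascend-cann-toolkit_installation_directory/ascend-toolkit/latest/tools/ms_fmk_transplt",
    "/home/train/examples/imagenet",
    "/home/out",
    "2.1.0",
)
print(shlex.join(cmd))

# To actually run the migration on the training server:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the command as a list keeps paths that contain spaces intact, and check=True makes a failed migration raise an error instead of passing silently.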