Procedure

Model training in MindStudio IDE involves the following main steps:

  1. Creating a Training Project (first-time training) or Importing a Training Project (existing training project)
  2. Setting Run Configurations
  3. Performing Training

Prerequisites

  • If you select a MindSpore project or sample, set up the operating environment by following instructions on the Installation page at the MindSpore official website.
  • If you select a TensorFlow project or sample, ensure that the framework plugin package Ascend-cann-tfplugin_xxx.run has been installed in the operating environment for model training. For details about the installation method, see the CANN Software Installation Guide.
  • If you select a PyTorch project or sample, install the PyTorch framework and the mixed precision module. For details about the installation method, see "Environment Setup" in the PyTorch Training Model Porting and Tuning Guide.
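
Before opening MindStudio IDE, it can help to confirm that the chosen framework is visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of MindStudio IDE. The import names (mindspore, tensorflow, torch) are assumptions about how each framework is exposed to Python. The check does not cover the TensorFlow plugin package or the mixed precision module.

```python
import importlib.util

# Assumed import names for each framework; adjust if your installation differs.
FRAMEWORK_MODULES = {
    "MindSpore": "mindspore",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
}


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report_frameworks() -> dict:
    """Map each framework name to whether its module was found."""
    return {name: is_installed(mod) for name, mod in FRAMEWORK_MODULES.items()}


if __name__ == "__main__":
    for name, found in report_frameworks().items():
        print(f"{name}: {'found' if found else 'not found'}")
```

Run the script with the same interpreter that the training project will use. A "not found" result only means the module is not importable from that interpreter.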

Creating a Training Project

You can use the MindSpore, TensorFlow, and PyTorch training frameworks as templates to create training projects. The MindSpore training framework provides MindSpore Insight to visualize the training process. For details about how to use MindSpore Insight, see the official MindSpore Insight documents.

  1. Navigate to the Create Ascend Training Project from Template dialog box, as shown in Figure 1.
    • On the MindStudio IDE welcome page, click New Project to access the project creation page.
    • On the MindStudio IDE project page, choose File > New > Project... from the menu bar in the upper left corner. The project creation page is displayed.
    Figure 1 Project creation page
  2. Create a training project.
    • Select a project under Templates.
      1. In the navigation tree on the left, choose Ascend Training, as shown in Figure 1.
        1. Select a CANN Version on the right pane.
        2. Select a project under Templates, for example, MindSpore Project.
          If you select a project under Templates, you need to prepare a training script.
          • If an NPU training script is used, you can directly proceed with model training.
          • If a GPU training script is used, refer to Analysis and Migration to convert it into an NPU training script and then perform model training.
      2. Click Next and configure other information about the training project. Table 1 describes the parameters.
        Table 1 Project parameters

        Parameter | Description
        ----------|------------
        Project name | Project name (user-defined). The name must start and end with a digit or letter. Only letters, digits, hyphens (-), and underscores (_) are allowed. The name contains a maximum of 64 characters.
        Project location | Default path for saving a project (user-defined). For users who use MindStudio IDE for the first time, the default value is $HOME/MindstudioProjects.
        More settings | Module name: module name, the same as the Project name. Content root: path in the root directory. Module file location: module file path. Project format: click the check box on the right of Project format to display a drop-down list. .idea (directory-based) (default option): during project creation, an .idea project directory is created to save the project information. .ipr (file-based): project configuration file used to save the project configuration information.
      3. Click Create. The training project is created.
        If there is already an active project in the window, a confirmation message is displayed.
        • Click This Window to open the newly created project in the current window.
        • Click New Window to open the newly created project in a new window.
      4. View the directory structure and main files of the training project (subject to the actual creation result).
        ├── .idea
        ├── data               // Dataset directory. You need to create it yourself.
        ├── .project           // Project information file, including the project type, project description, target device type, and CANN version.
        ├── train.py           // Training script file, empty by default. Write your training script here.
        └── MyTraining.iml
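
The project naming rules listed in Table 1 can be checked before you type a name into the dialog box. The sketch below is one possible reading of those rules and is not taken from MindStudio IDE. It assumes that "letters" means ASCII letters, and the function name is_valid_project_name is hypothetical.

```python
import re

# Start and end with a letter or digit; letters, digits, '-' and '_' in between.
# Assumption: "letters" means ASCII letters only.
_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9_-]*[A-Za-z0-9])?")
MAX_NAME_LENGTH = 64


def is_valid_project_name(name: str) -> bool:
    """Check a candidate name against the Table 1 project name rules."""
    return len(name) <= MAX_NAME_LENGTH and _NAME_RE.fullmatch(name) is not None
```

For example, is_valid_project_name("MyTraining") returns True, whereas a name that begins with a hyphen or ends with an underscore is rejected. The Name parameter of the Ascend Training run configuration follows a slightly different rule (it must start with a letter), which this sketch does not cover.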

Importing a Training Project

If a training project already exists, import it into MindStudio IDE instead of creating a new one. The procedure is as follows:

  1. Use MindStudio IDE to import the training project.
    • On the MindStudio IDE welcome page, click Open, select the project to be imported, and click OK.
    • On the MindStudio IDE project page, choose File > Open... from the menu bar in the upper left corner, or click the corresponding icon on the toolbar, and select an existing project to open it.
      • If a project has code risks, the trust window is displayed when you open the project.
        • If the project source code is trusted and secure, click Trust Project. (You can select Trust projects in <workspace_directory> to trust all projects in the directory.)
        • If the project is not trusted and you only want to view the source code, click Preview in Safe Mode to preview the project in safe mode.
        • To cancel opening the project, click Don't Open.
      • If you import an NPU training project, you can directly perform model training. If you import a GPU training project, convert its training script into an NPU training script by referring to Analysis and Migration, and then perform model training.
  2. If there is already an active project in the window, a confirmation message is displayed.
    • Click This Window to open the imported project in the current window.
    • Click New Window to open the imported project in a new window.
  3. After a project is imported, the project directory is displayed in a tree structure (subject to the actual result).

    If the imported project is not an Ascend project, you need to convert it to an Ascend project and then set run configurations. For details, see Project Conversion.

Setting Run Configurations

  1. Choose Run > Edit Configurations... on the project page or click Edit Configurations... on the menu shown in Figure 2 to access the run configuration page.
    Figure 2 Shortcut menu of the run configuration page
  2. Configure training parameters.

    Run configurations can be set in either Python mode or Ascend Training mode. (The Python mode is recommended.)

    • Python: supports project running, project debugging, and code redirection upon exceptions during training.
    • Ascend Training: supports only project running.
    • Use the Python configuration mode.
      If the source execution file in the training project is in xxx.py format, this configuration mode is recommended.
      1. Click + in the upper left corner, select Python from the drop-down list that appears, and add the run and debug configurations of the file.
      2. The configuration options are displayed on the right. Figure 3 shows a configuration example. Table 2 lists the key configuration parameters.
        Figure 3 Configuration options of the source file
        Table 2 Key debug and run parameters

        Parameter | Description
        ----------|------------
        Name | This parameter is user-defined.
        Script path | Select the path of the Python source file to be debugged.
        Parameters | Run parameters. Set them as required.
        Python interpreter | Use SDK of module: use a module-level Python SDK to parse data. For details, see Setting the Module-Level Python SDK (Ascend Project) or Setting the Module-Level Python SDK (Non-Ascend Project). Use specified interpreter: use a specified interpreter that has been configured. For details, see Python SDK Settings.
        Working directory | Working directory (defaults to the directory where the Python source file to be debugged is located).


      3. Click Apply, and then click OK to save the settings and close the debug configuration page.
    • Use the Ascend Training configuration mode.

      If the source execution file in the training project is in xxx.sh format, this configuration mode is recommended.

      1. Click + in the upper left corner, select Ascend Training from the drop-down list that appears, and add the run configurations of the file.
      2. The configuration options are displayed on the right. Figure 4 shows a configuration example. Table 3 lists the key configuration parameters.
        Figure 4 Run configuration page
        Table 3 Run configurations of the training project

        Parameter | Description
        ----------|------------
        Name | Project name (user-defined). This parameter is mandatory. The name must start with a letter and end with a letter or digit. Only letters, digits, hyphens (-), and underscores (_) are allowed. It can contain a maximum of 64 characters.
        Executable | Source execution file in a training project. This parameter is mandatory. Example: $home/AscendProjects/MyTraining/xxx
        Command Arguments | Command-line arguments for training. This parameter is optional.
        Environment Variables | Environment variables of the training project. This parameter is optional.


      3. Click Apply, and then click OK to save the settings and close the run configuration page.
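
The Parameters field (Table 2) and the Command Arguments and Environment Variables fields (Table 3) supply values to the training entry point. As a rough illustration of the script side, the sketch below shows how a train.py might read them, assuming the configured values reach the script as ordinary command-line arguments and process environment variables. The option names --epochs and --data_dir and the variable TRAIN_LOG_LEVEL are hypothetical and are not defined by MindStudio IDE.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse training options; argv=None means read from sys.argv."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--epochs", type=int, default=1, help="number of training epochs")
    parser.add_argument("--data_dir", default="data", help="dataset directory")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Hypothetical variable that could be set under Environment Variables.
    log_level = os.environ.get("TRAIN_LOG_LEVEL", "INFO")
    print(f"epochs={args.epochs} data_dir={args.data_dir} log_level={log_level}")
    return args
```

With this layout, entering "--epochs 3 --data_dir data" in the Parameters field would correspond to calling main(["--epochs", "3", "--data_dir", "data"]). A real train.py would call main() at the end of the file and then start the framework-specific training loop.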

Performing Training

  1. Choose Run > Run 'train' on the project page to perform training.
  2. View the training result.
    • If the training is successful, the real-time running information is displayed in the Run window at the bottom of the project page, as shown in Figure 5.
      Figure 5 Real-time running information
    • If the training fails, a network_analysis_timestamp.report file is generated for the training project in out/reports under the project root directory. Figure 6 shows the report content.
      Figure 6 Network analysis report

      Figure 6 shows the .report file generated in out/reports under the project root directory after a training project that uses MindSpore as the training framework fails. For training projects that use TensorFlow or PyTorch, the .report file is saved to the user-defined output path after a training failure.
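
When several failed runs have accumulated, it can be convenient to locate the most recent report from a script. The following sketch covers only the MindSpore case described above, where reports land in out/reports under the project root. The glob pattern network_analysis_*.report is an assumption derived from the file name network_analysis_timestamp.report, because the exact timestamp format is not stated here. The function name latest_report is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_report(project_root: str) -> Optional[Path]:
    """Return the most recently modified report under out/reports, or None."""
    reports_dir = Path(project_root) / "out" / "reports"
    if not reports_dir.is_dir():
        return None
    # Assumed naming pattern; the timestamp part is matched with a wildcard.
    candidates = list(reports_dir.glob("network_analysis_*.report"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

The function picks the newest file by modification time rather than by parsing the timestamp in the file name, which avoids depending on a timestamp format that is not documented in this section.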