Procedure

The main steps of MindStudio-based model training are as follows:

1. Creating a Training Project (when training for the first time) or Importing a Training Project (when using an existing training project)
2. Setting Run Configurations
3. Performing Training

Prerequisites

If you select a MindSpore project or sample, set up the operating environment by following instructions on the Installation page at the MindSpore official website.
If you select a TensorFlow project or sample, ensure that the framework plugin package Ascend-cann-tfplugin_xxx.run has been installed in the local or remote operating environment for model training. For details about the installation method, see "Installing the Development Environment > Installing Software Packages on an Ascend Device > Installing the Framework Plugin Package" in the CANN Software Installation Guide.
If you select a PyTorch project or sample, install the PyTorch framework and mixed precision module. For details about the installation method, see "Environment Setup" in the PyTorch Network Porting and Training Guide.
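
Before creating a project, it can help to confirm that the chosen framework is visible to the Python interpreter you plan to use. The following is a minimal sketch, not part of MindStudio: it only checks whether the framework's top-level Python package (`mindspore`, `tensorflow`, or `torch`) can be found. It does not verify the TensorFlow framework plugin package or the PyTorch mixed precision module, which must still be installed as described in the referenced guides. The helper name `framework_available` is hypothetical.

```python
import importlib.util

# Top-level import names of the three frameworks listed in the prerequisites.
FRAMEWORK_MODULES = {
    "MindSpore": "mindspore",
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
}


def framework_available(framework: str) -> bool:
    """Return True if the framework's Python package can be located.

    Hypothetical helper: it only looks up the package and does not import it
    or check any Ascend-specific plugin.
    """
    module_name = FRAMEWORK_MODULES[framework]
    return importlib.util.find_spec(module_name) is not None


for name in FRAMEWORK_MODULES:
    status = "found" if framework_available(name) else "missing"
    print(f"{name}: {status}")
```

Run the script with the same interpreter that the training project will use, locally or on the remote device, because each environment has its own set of installed packages.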

Creating a Training Project

You can use the MindSpore, TensorFlow, and PyTorch training frameworks as templates to create training projects. The MindSpore training framework provides MindSpore Insight to visualize the training process. For details about how to use MindSpore Insight, see the official MindSpore Insight documents.

Navigate to the Create Ascend Training Project from Template page, as shown in Figure 1.
- On the MindStudio welcome page, click New Project to access the project creation page.
- On the MindStudio project page, choose File > New > Project... from the menu bar to access the project creation page.
Figure 1 Project creation page

Create a training project.

Select a project under Templates.

In the navigation tree on the left, choose Ascend Training, as shown in Figure 1.
1. Select a CANN Version in the right pane.
2. Select a project under Templates, for example, MindSpore Project.
  A project created from a template does not include a training script, so you need to prepare one.
  
  If you use an NPU training script, you can proceed directly with model training.
  
  If you use a GPU training script, convert it into an NPU training script by referring to Analysis and Migration, and then perform model training.

Click Next and configure other information about the training project. Table 1 describes the parameters.

**Table 1** Project parameters
Parameter	Description
Project name	Project name (user-defined). The name must start and end with a digit or letter. Only letters, digits, hyphens (-), and underscores (_) are allowed. The name contains a maximum of 64 characters.
Project location	Path for saving the project (user-defined). For first-time MindStudio users, the default value is $HOME/MindstudioProjects.
More settings	Module name: module name, the same as the Project name.
	Content root: path of the content root directory.
	Module file location: module file path.
	Project format: Select the check box on the right of Project format to display a drop-down list with the following options:
	.idea (directory-based) (default option): An .idea project directory is created during project creation to save the project information.
	.ipr (file-based): A project configuration file is used to save the project configuration information.
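
The Project name rule in Table 1 can be expressed as a single regular expression. The following is a minimal sketch, assuming that "letters" and "digits" mean ASCII characters; the function name `is_valid_project_name` is hypothetical and is not a MindStudio API.

```python
import re

# First and last characters: a letter or digit.
# Characters in between: letters, digits, hyphens, or underscores.
# Total length: 1 to 64 characters (1 + up to 62 + 1).
PROJECT_NAME_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9_-]{0,62}[A-Za-z0-9])?")


def is_valid_project_name(name: str) -> bool:
    """Check a candidate name against the Table 1 naming rule (ASCII assumed)."""
    return PROJECT_NAME_RE.fullmatch(name) is not None


print(is_valid_project_name("MyTraining"))      # True
print(is_valid_project_name("my-training_01"))  # True
print(is_valid_project_name("_training"))       # False: starts with an underscore
```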

Click Create. The training project is created.
If there is already an active project in the window, a confirmation message is displayed.
- Click This Window to open the newly created project in the current window.
- Click New Window to open the newly created project in a new window.

View the directory structure and main files of the training project (subject to the actual creation result).

├── .idea
├── data                                  // Dataset directory, which you must create yourself.
├── .project                              // Project information file, including the project type, project description, target device type, and CANN version.
├── train.py                              // Training script file, which is empty by default. Write your training script here.
└── MyTraining.iml
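
Because the data directory is not generated with the project, you need to create it before training. The following is a minimal sketch of doing so from Python; the helper name `ensure_data_dir` is hypothetical, and the project path in the usage note is only an example built from the default location in Table 1.

```python
from pathlib import Path


def ensure_data_dir(project_root: str) -> Path:
    """Create <project_root>/data if it does not exist and return its path."""
    data_dir = Path(project_root) / "data"
    # exist_ok=True makes the call safe to repeat on an existing directory.
    data_dir.mkdir(parents=True, exist_ok=True)
    return data_dir
```

For example, `ensure_data_dir(str(Path.home() / "MindstudioProjects" / "MyTraining"))` creates the dataset directory for a project named MyTraining in the default project location.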

Select a project under Samples.
1. In the navigation tree on the left, choose Ascend Training, as shown in Figure 1.
  1. Select a CANN Version in the right pane.
  2. Select a project under Samples, for example, MindSpore.
2. Click Next. The corresponding code repository on Gitee is displayed.
3. On the Gitee code repository page, choose Clone or Download > Copy to copy the download link of the code package.
4. In the development environment, run the git clone URL command, where URL is the copied download link, to clone the code package. For example:
```
git clone https://gitee.com/mindspore/models.git
```
  The links for downloading the TensorFlow and PyTorch code packages are as follows:
  - TensorFlow: https://gitee.com/ascend/ModelZoo-TensorFlow
  - PyTorch: https://gitee.com/ascend/ModelZoo-PyTorch
5. Select the required model sample from the downloaded folder and import it through MindStudio. For details, see Importing a Training Project.
  
  After downloading a model sample under Samples from the Gitee code repository, you can directly proceed with model training.

Importing a Training Project

If a training project already exists, you do not need to create one. Import it directly through MindStudio as follows:

Use MindStudio to import the training project.
- On the MindStudio welcome page, click Open, select the project to be imported, and click OK.
- On the MindStudio project page, choose File > Open... from the top menu bar, or click the Open icon on the menu bar, to select an existing project and open it.
  - If a project has code risks, the trust window is displayed when you open the project.
    
    If the project source code is trusted and secure, click Trust Project. (You can select Trust projects in <workspace_directory> to trust all projects in the directory.)
    
    If the project is not trusted and you only want to view the source code, click Preview in Safe Mode to preview the project in safe mode.
    
    To cancel opening the project, click Don't Open.
  - If you import an NPU training project, you can directly perform model training. If you import a GPU training project, convert it into an NPU training script according to Analysis and Migration and then perform model training.
If there is already an active project in the window, a confirmation message is displayed.
- Click This Window to open the imported project in the current window.
- Click New Window to open the imported project in a new window.
After a project is imported, the project directory is displayed in a tree structure (subject to the actual result).

If the imported project is not an Ascend project, you need to convert it to an Ascend project and then set run configurations. For details, see Project Conversion.

Setting Run Configurations

Choose Run > Edit Configurations... on the project page or click Edit Configurations... on the menu shown in Figure 2 to access the run configuration page.
Figure 2 Shortcut menu of the run configuration page

Configure training parameters.

Run configurations can be set in the Ascend Training and Python modes. (The Python mode is recommended.)

Python: supports project running, project debugging, and code redirection upon exceptions during training.
Ascend Training: supports only project running.

Use the Python configuration mode.

If the source execution file in the training project is in xxx.py format, this configuration mode is recommended.

Click + in the upper left corner, select Python from the drop-down list that is displayed, and add the run and debug configurations of the file.

The configuration options are displayed on the right. Figure 3 shows a configuration example. Table 2 lists the key configuration parameters.

Figure 3 Configuration options of the source file

**Table 2** Key debug and run parameters
Parameter	Description
Name	Name of the configuration (user-defined).
Script path	Path of the Python source file to be run or debugged.
Parameters	Run parameters. Set them as required.
Python interpreter	Use SDK of module: Use a module-level Python SDK to parse data. For details, see Setting the Module-Level Python SDK (Ascend Project) or Setting the Module-Level Python SDK (Non-Ascend Project).
	Use specified interpreter: Use a specified interpreter that has been configured. For details, see Python SDK Settings.
Working directory	Working directory. The default value is the directory where the Python source file to be debugged is located.

Click Apply, and then click OK to save the settings and close the debug configuration page.
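
To illustrate how the Parameters and Working directory fields might relate to a training script, the following is a minimal sketch of an entry point that reads command-line options. It assumes that the Parameters field is passed to the script as command-line arguments; the option names `--data-dir` and `--epochs` are hypothetical and are not defined by MindStudio or by any of the frameworks.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse run parameters (the option names are hypothetical examples)."""
    parser = argparse.ArgumentParser(description="Illustrative training entry point")
    parser.add_argument("--data-dir", type=Path, default=Path("data"),
                        help="dataset directory; relative paths depend on the working directory")
    parser.add_argument("--epochs", type=int, default=1,
                        help="number of training epochs")
    return parser.parse_args(argv)


def describe_run(argv=None) -> str:
    """Return a one-line summary of the settings the script would train with."""
    args = parse_args(argv)
    # A relative --data-dir is resolved against the current working directory,
    # which corresponds to the Working directory field of the run configuration.
    return f"epochs={args.epochs} data_dir={args.data_dir.resolve()}"


print(describe_run(["--data-dir", "data", "--epochs", "3"]))
```

With a script like this, entering `--data-dir data --epochs 3` in Parameters would have the same effect as the explicit list passed in the last line.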

Use the Ascend Training configuration mode.

If the source execution file in the training project is in xxx.sh format, this configuration mode is recommended.

Click + in the upper left corner, select Ascend Training from the drop-down list that is displayed, and add the run configurations of the file.

The configuration options are displayed on the right. Figure 4 and Figure 5 show two configuration examples. Table 3 lists the key configuration parameters.

Run Mode is set to Remote Run:
Figure 4 Run configuration page (Remote Run)
Run Mode is set to Local Run:
Figure 5 Run configuration page (Local Run)

**Table 3** Run configurations of the training project
Parameter	Description
Name	Project name (user-defined). This parameter is mandatory. The name must start with a letter and end with a letter or digit. Only letters, digits, hyphens (-), and underscores (_) are allowed. It can contain a maximum of 64 characters.
Run Mode	Run mode, either Remote Run (default) or Local Run.
Executable	Source execution file in a training project. This parameter is mandatory. Example: **$home/AscendProjects/MyTraining/xxx**
Deployment	Run configuration settings. This parameter is mandatory when Remote Run is selected. Set this parameter according to Ascend Deployment. You can synchronize the files and folders in a specified project to a specified directory on a remote device.
Command Arguments	Command-line arguments for training. This parameter is optional.
Environment Variables	Environment variables of the training project. This parameter is optional.

Click Apply, and then click OK to save the settings and close the run configuration page.
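
The constraints in Table 3 can be summarized as a small set of checks. The following is a minimal sketch that mirrors the table fields in a hypothetical `RunConfig` class; it is not a MindStudio API and does not read or write MindStudio configuration files. It assumes ASCII letters and digits in the Name rule.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Name rule from Table 3: starts with a letter, ends with a letter or digit,
# contains only letters, digits, hyphens, and underscores, at most 64 characters.
NAME_RE = re.compile(r"[A-Za-z](?:[A-Za-z0-9_-]{0,62}[A-Za-z0-9])?")


@dataclass
class RunConfig:
    """Hypothetical in-memory mirror of the Table 3 parameters."""
    name: str
    executable: str
    run_mode: str = "Remote Run"          # default run mode
    deployment: Optional[str] = None      # required only for Remote Run
    command_arguments: str = ""           # optional
    environment_variables: Dict[str, str] = field(default_factory=dict)  # optional

    def problems(self) -> List[str]:
        """Return the Table 3 rules that this configuration violates."""
        issues = []
        if NAME_RE.fullmatch(self.name) is None:
            issues.append("Name violates the naming rule")
        if self.run_mode not in ("Remote Run", "Local Run"):
            issues.append("Run Mode must be Remote Run or Local Run")
        if not self.executable:
            issues.append("Executable is mandatory")
        if self.run_mode == "Remote Run" and not self.deployment:
            issues.append("Deployment is mandatory for Remote Run")
        return issues


print(RunConfig(name="train", executable="xxx.sh").problems())
# ['Deployment is mandatory for Remote Run']
```

Note that this Name rule is stricter than the Project name rule in Table 1: a run configuration name must start with a letter, whereas a project name may also start with a digit.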

Performing Training

Choose Run > Run 'train' on the project page or click the button shown in Figure 6 to perform training.
Figure 6 Performing training using a shortcut
View the training result.
- If the training is successful, the real-time running information is displayed in the Run window at the bottom of the project page, as shown in Figure 7.
  Figure 7 Real-time running information
- If the training fails, a network_analysis_timestamp.report file is generated for the training project in out/reports under the project root directory. Figure 8 shows the report content.
  Figure 8 Network analysis report
  
  Figure 8 shows the .report file that is generated in out/reports under the project root directory after a training project that uses MindSpore as the training framework fails. For training projects that use TensorFlow or PyTorch as the training framework, the .report files are returned to the user-defined output paths after a training failure.
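
When several failed runs have accumulated, it can be useful to locate the newest report programmatically. The following is a minimal sketch for the MindSpore case, where reports are written to out/reports. It assumes that report file names match the pattern `network_analysis_*.report`, because the exact timestamp format is not specified here; the helper name `latest_report` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def latest_report(project_root: str) -> Optional[Path]:
    """Return the most recently modified network analysis report, or None."""
    reports_dir = Path(project_root) / "out" / "reports"
    if not reports_dir.is_dir():
        return None
    # Assumed file name pattern; the timestamp part is matched by the wildcard.
    candidates = list(reports_dir.glob("network_analysis_*.report"))
    if not candidates:
        return None
    # Pick the file with the newest modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

For TensorFlow and PyTorch projects, point the search at the user-defined output path instead of out/reports.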
