Model Accuracy Analyzer Deployment

Overview

To facilitate accuracy tuning in training, the One-Click Accuracy Analyzer provides common functions for accuracy analysis of training networks.

This tool encapsulates the run parameters of TF Adapter and extends the functions of the Model Accuracy Analyzer (msaccucmp.py) in the Ascend-CANN-Toolkit, which helps locate accuracy faults quickly.

Restrictions

  • Currently, this tool supports only TensorFlow 1.15 and 2.6.5 training. For details about accuracy tuning in TensorFlow 2.6.5, see the TensorFlow 2.6.5 Model Porting Guide.
  • Overflow/underflow data collection and accuracy data dump are mutually exclusive. Enable only one of them in a given training run.
  • Set a reasonable number of epochs. Overflow/underflow data collection and accuracy data dump both generate a large number of files, so too many epochs can exhaust the disk space.
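
The last two restrictions can be checked before a training run starts. The following is a minimal sketch, not part of precision_tool: the function name check_dump_settings, its parameters, and the 10 GiB default threshold are all hypothetical and chosen only for illustration.

```python
import shutil


def check_dump_settings(dump_enabled, overflow_enabled, work_dir, min_free_gib=10.0):
    """Return a list of problems found; an empty list means the settings look safe.

    All names and the default threshold are illustrative assumptions,
    not options defined by precision_tool.
    """
    problems = []

    # Restriction: the two collection modes cannot be enabled together.
    if dump_enabled and overflow_enabled:
        problems.append(
            "accuracy data dump and overflow/underflow data collection "
            "are mutually exclusive"
        )

    # Restriction: both modes write many files, so check free space first.
    if dump_enabled or overflow_enabled:
        free_gib = shutil.disk_usage(work_dir).free / 2**30
        if free_gib < min_free_gib:
            problems.append(
                f"only {free_gib:.1f} GiB free under {work_dir}, "
                f"below the {min_free_gib:.0f} GiB threshold"
            )

    return problems
```

Calling check_dump_settings(True, True, ".") reports the mutual-exclusion problem regardless of the available disk space. A suitable threshold depends on the network size and the number of epochs, so treat the default as a placeholder.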

One-Click Accuracy Analyzer Deployment

This tool is installation-free. Download the precision_tool directory from https://gitee.com/ascend/tools and upload it to the training directory.

The directory structure is as follows:
├── resnet                            // Training working directory.
    ├── __init__.py
    ├── imagenet_main.py              // Script for training the network based on the ImageNet dataset.
    ├── imagenet_preprocessing.py     // ImageNet preprocessing module.
    ├── resnet_model.py               // ResNet model file.
    ├── resnet_run_loop.py            // Data input processing and run loop (for training, validation, and test).
    ├── cifar10_main                  // Training entry point file.
    ├── ...
    ├── precision_tool                // Directory of the One-Click Accuracy Analyzer.
        ├── cli.py
        ├── ...
  • If the CANN development and operating environments are set up on the same server, simply upload the precision_tool directory to the training directory.
  • If the CANN development and operating environments are set up on separate servers, upload the precision_tool directory both to the training directory in the CANN operating environment and to any directory in the CANN development environment.
    • CANN operating environment (where training runs on the Ascend AI Processor): used to dump accuracy data during training accuracy tuning.
    • CANN development environment (where the CANN Toolkit is installed): used to analyze accuracy during training accuracy tuning.
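
For the same-server case, the upload step amounts to a local directory copy. The sketch below assumes the precision_tool directory has already been downloaded locally and that Python 3.8 or later is available (for the dirs_exist_ok argument). The function name deploy_precision_tool and both path parameters are hypothetical. For separate servers, the same directory would instead be transferred to each server by whatever file transfer method the environment provides.

```python
import shutil
from pathlib import Path


def deploy_precision_tool(downloaded_dir, training_dir):
    """Copy a downloaded precision_tool directory into the training directory.

    Hypothetical helper for illustration; the tool itself needs no installer.
    Returns the path of the deployed directory.
    """
    src = Path(downloaded_dir)
    dst = Path(training_dir) / "precision_tool"

    # cli.py appears in the directory listing above, so use it as a sanity check.
    if not (src / "cli.py").is_file():
        raise FileNotFoundError(f"{src} does not look like precision_tool (no cli.py)")

    # dirs_exist_ok=True lets a repeated deployment overwrite an earlier copy.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, deploy_precision_tool("tools/precision_tool", "resnet") would place the tool at resnet/precision_tool, matching the directory structure shown above. Both paths in this example are placeholders.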

Typical Workflow

Figure 1 shows the workflow of using the One-Click Accuracy Analyzer precision_tool to analyze accuracy when the CANN development and operating environments are deployed on the same server.

Figure 1 CANN development and operating environments set up on the same server

Figure 2 shows the workflow of using the One-Click Accuracy Analyzer precision_tool to analyze accuracy when the CANN development and operating environments are deployed separately.

Figure 2 CANN development and operating environments set up separately