One-Click Accuracy Analyzer Deployment

Overview

One-Click Accuracy Analyzer analyzes the network accuracy from the following aspects:

This tool encapsulates the running parameters of TF Adapter and extends Model Accuracy Analyzer in CANN Toolkit, helping you locate accuracy faults quickly.

Restrictions

  • Currently, this tool supports only TensorFlow 1.15 and 2.6 training. For details about accuracy tuning in TensorFlow 1.15, see TensorFlow 1.15 Model Porting Guide.
  • Overflow/Underflow data collection is mutually exclusive with accuracy data dump.
  • Overflow/underflow data collection and accuracy data dump both generate a large number of files. Set a proper epoch number to avoid running out of disk space.
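
The last two restrictions can be checked in the training script before any data is collected. The following is a minimal sketch, not part of precision_tool: the function name check_dump_settings, its parameters, and the 10 GiB default threshold are assumptions made for illustration. Only shutil.disk_usage is a standard-library call.

```python
import shutil


def check_dump_settings(collect_overflow, dump_accuracy_data,
                        output_dir=".", min_free_gib=10.0):
    """Validate two hypothetical data-collection switches before training.

    collect_overflow / dump_accuracy_data: illustrative booleans, not real
    precision_tool options. output_dir must be an existing directory.
    Returns the free space of output_dir in GiB.
    """
    # Restriction: the two collection modes are mutually exclusive.
    if collect_overflow and dump_accuracy_data:
        raise ValueError(
            "Overflow/underflow data collection and accuracy data dump "
            "are mutually exclusive; enable only one of them."
        )

    # Restriction: both modes can generate many files, so check disk space.
    free_gib = shutil.disk_usage(output_dir).free / 2**30
    if (collect_overflow or dump_accuracy_data) and free_gib < min_free_gib:
        raise RuntimeError(
            f"Only {free_gib:.1f} GiB free in {output_dir!r}; "
            f"at least {min_free_gib} GiB is expected (assumed threshold)."
        )
    return free_gib


if __name__ == "__main__":
    free = check_dump_settings(collect_overflow=True, dump_accuracy_data=False,
                               min_free_gib=0.0)
    print(f"Free space: {free:.1f} GiB")
```

The threshold should be adapted to the model size and the epoch number actually used. The sketch only reports free space and does not estimate how much data a run will produce.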

One-Click Accuracy Analyzer Deployment

This tool is installation-free. Download the precision_tool directory from https://gitee.com/ascend/tools and upload it to the training directory. The following shows the directory structure:
├── resnet                            // Training working directory.
    ├── __init__.py
    ├── imagenet_main.py              // Script for training the network based on the ImageNet dataset.
    ├── imagenet_preprocessing.py     // ImageNet preprocessing module.
    ├── resnet_model.py               // ResNet model file.
    ├── resnet_run_loop.py            // Data input processing and run loop (for training, validation, and test).
    ├── cifar10_main                  // Training entry point file.
    ├── ...
    ├── precision_tool                // Directory of One-Click Accuracy Analyzer.
        ├── cli.py
        ├── ...
  • If the CANN development and operating environments are set up on the same server, simply upload the precision_tool directory to the training directory.
  • If the CANN development and operating environments are set up on separate servers, upload the precision_tool directory both to the training directory in the CANN operating environment and to any directory in the CANN development environment.

    CANN operating environment (where training runs on the Ascend AI Processor): used to dump accuracy data during training accuracy tuning.

    CANN development environment (where CANN Toolkit is installed): used to analyze accuracy during training accuracy tuning.
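
After uploading, it can be useful to confirm that the directory landed where the training script expects it. The following is a minimal sketch that relies only on the layout shown in the directory structure above (a precision_tool directory containing cli.py). The helper name find_precision_tool is hypothetical and is not provided by the tool itself.

```python
from pathlib import Path


def find_precision_tool(training_dir):
    """Return the precision_tool path under training_dir, or raise.

    Hypothetical helper: it only checks the layout shown in the directory
    structure above and does not import or run the tool.
    """
    tool_dir = Path(training_dir) / "precision_tool"
    if not tool_dir.is_dir():
        raise FileNotFoundError(
            f"{tool_dir} not found; upload the precision_tool directory "
            "to the training directory first."
        )
    if not (tool_dir / "cli.py").is_file():
        raise FileNotFoundError(
            f"{tool_dir} exists but cli.py is missing; "
            "the upload may be incomplete."
        )
    return tool_dir


if __name__ == "__main__":
    # "resnet" is the example training working directory from the listing.
    try:
        print("Found:", find_precision_tool("resnet"))
    except FileNotFoundError as err:
        print("Deployment check failed:", err)
```

When the development and operating environments are on separate servers, the same check can be run on each server with the directory that precision_tool was uploaded to.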

Typical Workflow

Figure 1 CANN development and operating environments set up on the same server
Figure 2 CANN development and operating environments set up separately