PyTorch Visualization for Comparison

Prerequisites

Procedure

  1. Create a configuration file for comparison.
    For example, create a compare.json file in the same directory as the training script and copy the following content into it:
    {
        "npu_path": "./dump_data_npu",
        "bench_path": "./dump_data_gpu",
        "is_print_compare_log": true
    }
    

    The paths specified by npu_path and bench_path must be in the same environment.

  2. Build the graph for comparison.
    msprobe -f pytorch graph -i ./compare.json -o ./output

    After the comparison is complete, a .vis file is generated in ./output.

  3. Start TensorBoard.
    tensorboard --logdir ./output --bind_all

    The path specified by --logdir is the ./output directory from step 2.

    After the preceding command is executed, the following log is displayed:

    TensorBoard 2.19.0 at http://ubuntu:6008/ (Press CTRL+C to quit)
    

    Open a browser on a Windows machine and go to the URL shown in the log, replacing ubuntu with the server's IP address, for example, http://192.168.1.10:6008/.

    If the access is successful, the TensorBoard page is displayed, as shown in the figure below.

    Figure 1 PyTorch Visualization for comparison

    In this example, level is set to L1 during data dumping, so no model structure data is collected and PyTorch Visualization has no data to display. The data shown in the preceding figure was collected with level set to a value other than L1.
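
The configuration file from step 1 can also be generated by a script instead of being written by hand. The following is a minimal sketch, assuming the same three keys and example paths shown in step 1. The helper name write_compare_config and its parameters are hypothetical and are not part of msprobe. The sketch also reports dump directories that are missing on the current machine, which may help catch a mistyped path before step 2 is run.

```python
import json
from pathlib import Path


def write_compare_config(npu_path, bench_path, config_file="compare.json",
                         print_log=True):
    """Write a compare.json file with the three keys shown in step 1.

    Hypothetical helper for illustration only; it is not part of msprobe.
    Returns the configuration dict and a list of missing dump directories.
    """
    config = {
        "npu_path": str(npu_path),
        "bench_path": str(bench_path),
        "is_print_compare_log": print_log,
    }
    # Collect the dump directories that do not exist on this machine.
    missing = [str(p) for p in (npu_path, bench_path) if not Path(p).is_dir()]
    # Write the configuration as indented JSON.
    Path(config_file).write_text(json.dumps(config, indent=4) + "\n",
                                 encoding="utf-8")
    return config, missing


if __name__ == "__main__":
    # Example paths taken from step 1; adjust them to the actual dump output.
    cfg, not_found = write_compare_config("./dump_data_npu", "./dump_data_gpu")
    print(json.dumps(cfg, indent=4))
    for path in not_found:
        print(f"warning: dump directory not found: {path}")
```

Run the script from the directory where the training script is located so that compare.json is created where step 2 expects it. The script only writes the file and checks the paths; the msprobe command in step 2 must still be run separately.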