Before You Start

Environment Setup

  • Configure environment variables by referring to Environment Setup.
  • To use MindStudio Insight, install the MindStudio Insight software package separately. For the download link, see "Installation and Uninstallation".
  • To use the template library for simulation, add the --simulator option to the build script so that the operator is compiled in simulator mode:
    bash scripts/build.sh --simulator 00_basic_matmul

    The template library scenario applies only to Atlas A2 training products/Atlas A2 inference products.

Restrictions

  • Keep each profile data collection within 5 minutes, and ensure that the configured memory size is greater than 20 GB (for example, docker run --memory=20g container_name).
  • Store the profile data in a directory of the current user whose path does not contain soft links. Otherwise, security problems may occur.
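
The soft link restriction above can be checked before collection starts. The following is a minimal sketch, assuming GNU coreutils realpath is available. The function name has_no_symlinks and the directory msprof_output are hypothetical and are not part of msProf.

```shell
#!/bin/bash

# Return 0 if no component of the given path is a soft link, 1 otherwise.
# The path is normalized twice: once without resolving soft links (-s) and
# once with full resolution. The two results differ only if a soft link
# appears somewhere in the path. -m allows components that do not exist yet.
has_no_symlinks() {
    local dir="$1"
    local logical physical
    logical="$(realpath -s -m -- "$dir")" || return 1
    physical="$(realpath -m -- "$dir")" || return 1
    [ "$logical" = "$physical" ]
}

# Hypothetical output directory under the current user's home directory.
out_dir="${HOME}/msprof_output"
if has_no_symlinks "$out_dir"; then
    echo "OK: $out_dir contains no soft links"
else
    echo "WARNING: $out_dir contains a soft link; choose another directory"
fi
```

The check is advisory only: it does not create the directory or change permissions.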

Configuration of msprof op

To implement cache heatmap redirection, perform the following operations:

  1. Add the -g compilation option during operator compilation. For details, see Adding -g Compilation Option.
  2. Set --aic-metrics in Table 2 to Source.

Configurations of msprof op simulator

The simulation function of the msProf tool supports only single-device scenarios. Multi-device scenarios cannot be simulated, and only device 0 can be set in the code. If the visible card number is changed, the simulation fails.

  • Before using the msProf tool to perform operator simulation tuning in --config mode, run the following command to configure environment variables:
    export LD_LIBRARY_PATH=${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib:$LD_LIBRARY_PATH

    Replace ${INSTALL_DIR} and Ascendxxxyy in the preceding command based on the actual installation path of the CANN package and the Ascend AI Processor type.

  • Add the -g compilation option to enable the operator code hot spot map and code call stack functions.
    • If the -g compilation option is added, the generated binary file contains debugging information. You are advised to restrict the access permission of user programs with debugging information to ensure that only authorized personnel can access the binary file.
    • If the functions provided by the llvm-symbolizer component are not used, do not include -g when compiling the program that is input to msProf. In this case, the msProf tool does not call the functions of the llvm-symbolizer component.
    • For an operator project created by using the msOpGen tool, edit the CMakeLists.txt file in the op_kernel directory of the operator project. For details, see Creating an Operator Project.
      add_ops_compile_options(ALL OPTIONS -g)
    • For an operator project created by referring to a complete sample, add the following code to the cmake/npu_lib.cmake file in the sample project directory:
      ascendc_compile_options(ascendc_kernels_${RUN_MODE} PRIVATE
          -g
          -O2
      )
      • This example project does not support Atlas A3 Training Series Product.
      • When downloading the code sample, run the following command to specify the branch version:
        git clone https://gitee.com/ascend/samples.git -b master
    • For Triton operators, add -g by configuring the following environment variable.
      export TRITON_DISABLE_LINE_INFO=0
  • When the msProf tool performs simulation tuning on an operator invoked from a PyTorch script, the built-in print function of Python cannot print device-side variables and values.
  • For the simulators of the Atlas A3 training products/Atlas A3 inference products and Atlas A2 training products/Atlas A2 inference products, if the simulated blockdim exceeds the number of physical cores at run time, the simulator may report an error. To resolve this issue, configure the core_ostd_num parameter in the pem_config_cloud.toml file, which is located at ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/pem_config_cloud.toml:
    [ARCH]
        cube_core_num           = 1
        vec_core_num            = 2
        core_ostd_num           = 2             # 2: early end; 1: normal mode
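
The environment variable and core_ostd_num settings described in the list above can be scripted. The following is a minimal sketch, assuming GNU grep and sed and a configuration file that contains a line of the form core_ostd_num = N. The function names prepend_lib_path and set_core_ostd_num are hypothetical helpers and are not part of the CANN package. Ascendxxxyy is the placeholder from the text and must be replaced before use.

```shell
#!/bin/bash

# Prepend a directory to LD_LIBRARY_PATH if it exists and is not already listed.
prepend_lib_path() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "Directory not found: $dir" >&2
        return 1
    fi
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
}

# Set core_ostd_num in a pem_config_cloud.toml-style file.
# Accepts only 1 (normal mode) or 2 (early end), the two values named above.
# A backup copy (<file>.bak) is written before the file is modified.
set_core_ostd_num() {
    local file="$1" value="$2"
    case "$value" in
        1|2) ;;
        *) echo "core_ostd_num must be 1 or 2, got: $value" >&2; return 1 ;;
    esac
    if ! grep -Eq '^[[:space:]]*core_ostd_num[[:space:]]*=' "$file"; then
        echo "core_ostd_num not found in $file" >&2
        return 1
    fi
    cp -- "$file" "$file.bak" || return 1
    # Replace only the numeric value; spacing and the trailing comment are kept.
    sed -i -E "s/^([[:space:]]*core_ostd_num[[:space:]]*=[[:space:]]*)[0-9]+/\1${value}/" "$file"
}

# Example usage (replace Ascendxxxyy first; 2 is the value shown in the snippet above):
#   sim_lib="${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib"
#   prepend_lib_path "$sim_lib"
#   set_core_ostd_num "$sim_lib/pem_config_cloud.toml" 2
```

Editing the file in place requires write permission to the CANN installation directory. The backup copy allows the original setting to be restored.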

Starting the Tool

Currently, msProf does not support the -O0 compilation option.
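
Because -O0 is not supported, it can help to scan the build files for that option before compiling. The following is a minimal sketch, assuming GNU grep. The function name find_O0_flags is hypothetical, and the file to scan depends on the project layout.

```shell
#!/bin/bash

# Print every line (prefixed with file name and line number) that contains
# -O0 as a whole word. Returns 0 if at least one match is found, 1 if none.
find_O0_flags() {
    grep -n -w -H -e '-O0' -- "$@"
}

# Example usage, run from a sample project directory:
#   find_O0_flags cmake/npu_lib.cmake && echo "Remove -O0 before using msProf"
```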