Before You Start
Setting Up the Environment
- Configure environment variables by referring to Environment Setup.
Restrictions
- Keep each profile data collection within 5 minutes, and make sure that the configured memory size is greater than 20 GB (for example, docker run --memory=20g container_name).
- Store the profile data in a directory of the current user whose path does not contain soft links. Otherwise, security problems may occur.
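The soft-link restriction can be checked before a collection run. The following is a minimal sketch and not part of msProf: the helper names find_symlink_components and is_safe_output_dir are hypothetical, and the check only walks the path components and reports any that are symbolic links.

```python
import os
from pathlib import Path


def find_symlink_components(path):
    """Return every component of `path` (including `path` itself) that is a soft link.

    Hypothetical helper for illustration; msProf does not provide it.
    """
    # Make the path absolute without resolving links, so links stay visible.
    absolute = Path(os.path.abspath(path))
    candidates = [absolute, *absolute.parents]
    # Report from the filesystem root downwards.
    return [str(p) for p in reversed(candidates) if p.is_symlink()]


def is_safe_output_dir(path):
    """True if `path` is an existing directory and no path component is a soft link."""
    return os.path.isdir(path) and not find_symlink_components(path)
```

Calling is_safe_output_dir on the intended output directory before starting a collection gives an early warning. It does not replace the permission checks that the tool itself may perform.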
Configurations of msprof op simulator
The simulation function of the msProf tool supports only the single-card scenario and must run on card 0. If the visible card number is changed, the simulation fails.
- Before using the msProf tool to perform operator simulation-based tuning in --config mode, run the following command to configure environment variables:
export LD_LIBRARY_PATH=${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib:$LD_LIBRARY_PATH # xxxyy specifies the processor type.
Modify the preceding environment variables based on the actual installation path of the CANN package and the Ascend AI Processor type.
- Add -g to the compilation option to enable the operator code hotspot map and code call stack functions.
- If the -g compilation option is added, the generated binary file contains debugging information. You are advised to restrict the access permission of user programs with debugging information to ensure that only authorized personnel can access the binary file.
- If you do not need the functions provided by the llvm-symbolizer component, do not add -g when compiling the program that is passed to msProf. In this case, msProf does not invoke the llvm-symbolizer component.
- For an operator project created by using the msOpGen tool, edit the CMakeLists.txt file in the op_kernel directory of the operator project. For details, see Creating an Operator Project.
add_ops_compile_options(ALL OPTIONS -g)
- For an operator project created from a complete sample, add the following code to the cmake/npu_lib.cmake file in the sample project directory:
ascendc_compile_options(ascendc_kernels_${RUN_MODE} PRIVATE -g -O2)
- When downloading the code sample, run the following command to specify the branch version:
git clone https://gitee.com/ascend/samples.git -b v0.2-8.0.0.beta1
- When the msProf tool is used to perform simulation-based tuning on an operator in a PyTorch script, the built-in Python print function cannot print device-side variables or their values.
- When using msProf to perform operator simulation-based tuning for the , change the value of flush_level in the davinci_mini.spec and davinci_vec_core.spec files to info. That is, change flush_level = "3" to flush_level = "2". The files are stored at ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/davinci_mini.spec and ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/davinci_vec_core.spec.
- CANN 8.0.RC2 and later versions support thread acceleration and L2Cache simulation enhancement of the simulator of . You can modify the configuration as follows:
- You can configure config_stars.json to speed up multiple threads of the simulator. The path of the config_stars.json file is as follows: ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/config_stars.json
{
    "stars": {
        "ffts_mode": 1
    },
    "model_top": {
        "sim_type": 0,
        "num_aic": 24,
        "num_aiv": 48
    },
    "pem": {
        "parsim": 1,
        "parsim_thd_limit": 24
    }
}
- You can configure the config.json file to enhance L2Cache simulation. The path of the config.json file is as follows: ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/config.json
{
    "L2CACHE": {
        "cache_enable": 1,
        "cache_set_size": 24,
        "cache_way_size": 16384,
        "cache_line_size": 512,
        "cache_read_latency": 241,
        "cache_write_latency": 96
    }
}
- When using msProf to perform operator simulation-based tuning for and , change the value of flush_level in the config.json file to info. That is, change flush_level = "3" to flush_level = "2". The path of the config.json file is ${INSTALL_DIR}/tools/simulator/Ascendxxxyy/lib/config.json.
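The simulator configuration edits described in this list can also be scripted. The sketch below is an illustration under stated assumptions rather than an official tool: the helper names set_flush_level and merge_config are hypothetical, the .spec files are assumed to contain a line of the form flush_level = "3" as quoted above, and the helpers work on in-memory text and dictionaries, so reading and writing the files under ${INSTALL_DIR} is left to the reader.

```python
import copy
import json
import re


def set_flush_level(spec_text, level="2"):
    """Rewrite every `flush_level = "<n>"` assignment in .spec file text.

    Assumes the assignment format quoted in this section; raises if none is found.
    """
    pattern = re.compile(r'^(\s*flush_level\s*=\s*)"[^"]*"', flags=re.MULTILINE)
    new_text, count = pattern.subn(lambda m: f'{m.group(1)}"{level}"', spec_text)
    if count == 0:
        raise ValueError("no flush_level entry found")
    return new_text


def merge_config(base, overrides):
    """Return a copy of `base` with nested `overrides` applied; other keys are kept."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Values copied from the "pem" part of the config_stars.json example in this section.
PARALLEL_SIM = {"pem": {"parsim": 1, "parsim_thd_limit": 24}}

if __name__ == "__main__":
    # Hypothetical existing configuration, used only to show the merge.
    existing = json.loads('{"stars": {"ffts_mode": 1}, "pem": {"parsim": 0}}')
    print(json.dumps(merge_config(existing, PARALLEL_SIM), indent=4))
    print(set_flush_level('flush_level = "3"\n'))
```

Merging instead of overwriting keeps any keys that already exist in the installed configuration file. Back up the original files before writing changes to them.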
Starting the Tool
- Enable onboard tuning of msProf by referring to msprof op.
- Configure simulation-based tuning by referring to Configurations of msprof op simulator, and then enable it by referring to msprof op simulator.
Currently, msProf does not support the -O0 compile option.
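The compile-option constraints stated in this section (-g is required for the code hotspot map and call stack functions, and -O0 is not supported) can be checked before building. This is a minimal sketch that assumes the options are available as a plain string; the function name check_compile_options is hypothetical and is not part of msProf or CANN.

```python
import shlex


def check_compile_options(options):
    """Return a list of warnings for a compile-option string.

    Hypothetical helper; it only encodes the two rules stated in this section.
    """
    flags = shlex.split(options)
    warnings = []
    if "-O0" in flags:
        warnings.append("-O0 is not supported by msProf")
    if "-g" not in flags:
        warnings.append("-g is missing: hotspot map and call stack are unavailable")
    return warnings
```

For example, check_compile_options("-g -O2") returns an empty list, which matches the options used in the npu_lib.cmake example above.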