Operator Code Hot Spot Map
The visualize_data.bin file generated by msprof op or msprof op simulator can be visualized using MindStudio Insight. On the GUI, you can view the mapping between operator source code and instruction sets, as well as the time consumption. This helps developers identify hot spot code distribution and analyze potential optimizations.
- To use MindStudio Insight, you need to install the MindStudio Insight software package separately. For details about the download link, see "Installation and Uninstallation".
- For details about how to import the visualize_data.bin file into MindStudio Insight, see Importing Profile Data.
- For details about the operations and fields on MindStudio Insight, see .
- If the -g compilation option is added, the generated binary file contains debugging information. You are advised to restrict the access permission of user programs with debugging information to ensure that only authorized personnel can access the binary file.
- Operator programs must be compiled with the -g option. Otherwise, msProf neither displays the hot spot map nor calls the llvm-symbolizer component to map source code to program counters.
- The msprof op operator code hot spot map function is not applicable to Atlas inference products.
- The MC2 and LCCL operators do not support generation of the operator code hot spot map.
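As a minimal sketch of the permission advice above, the following Python snippet limits a binary to its owner on a POSIX system. The helper name `restrict_to_owner` and the `0o700` mode are illustrative assumptions and not part of msProf; choose the mode that your own access policy requires.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> int:
    """Allow only the file owner to read, write, and execute the file.

    Returns the resulting permission bits so the caller can verify them.
    """
    os.chmod(path, stat.S_IRWXU)  # 0o700: no access for group or others
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # A temporary file stands in for the operator binary so the sketch runs
    # anywhere. In practice, pass the path of the binary compiled with -g
    # (hypothetical example: ./my_operator).
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        demo_path = tmp.name
    try:
        print(oct(restrict_to_owner(demo_path)))
    finally:
        os.remove(demo_path)
```

If a group of authorized users needs access, a mode such as `0o750` combined with a dedicated group may fit better than owner-only access.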
msprof op Hot Spot Map
- On the top of the page, you can switch between compute units and kernel function files.
- The left pane shows the L2 cache hit ratio, GM-related data movement, and instruction count for each line of the operator kernel function code, making it easier to identify bottlenecks.
- The right pane shows the L2 cache hit ratio, GM-related data movement, runtimes, and code associations by instruction, helping developers find out why code runs slowly.
- For details about the differences between the L2 cache hit ratio on the timeline and details pages of MindStudio Insight, see Table 1.
Table 1 MindStudio Insight L2 cache hit ratio comparison table
| Page | Data Source | Dimension |
|---|---|---|
| Timeline | Tool simulation | Code lines and instructions |
| Details | Real | Kernel |
- NA is displayed if no GM unit is involved when Process Bytes is checked.
- For detailed feature support of msprof op, see Table 2.
Table 2 msprof op feature support
| Column |  |  |  | Description |
|---|---|---|---|---|
| Source | Supported | Supported | Not supported | - |
| Address | Supported | Supported | Not supported | - |
| Pipe | Supported | Supported | Not supported | - |
| Instructions Executed | Supported | Supported | Not supported | Displays the number of times that the operator source code and instructions are executed. |
| GPR Count | Not supported | Not supported | Not supported | Displays the register usage. |
| L2Cache Hit Rate | Supported | Supported | Not supported | Simulates the L2 cache hit ratio in the code line and instruction dimensions. |
| Process Bytes | Supported | Supported | Not supported | Displays the amount of data moved related to GM. |
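To illustrate how the two panes relate, the following sketch rolls hypothetical per-instruction rows (right pane) up into per-source-line totals (left pane). The record layout, field names, and sample values are assumptions made for illustration only; they do not describe the visualize_data.bin format or how MindStudio Insight computes its values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class InstructionRow:
    """One hypothetical right-pane row; fields mirror the column names above."""
    source_line: int    # Source: kernel source line the instruction maps to
    address: str        # Address: instruction address
    executed: int       # Instructions Executed
    process_bytes: int  # Process Bytes: GM-related data movement


def aggregate_by_line(rows):
    """Sum per-instruction values into one entry per source line."""
    totals = defaultdict(
        lambda: {"instructions": 0, "executed": 0, "process_bytes": 0}
    )
    for row in rows:
        entry = totals[row.source_line]
        entry["instructions"] += 1               # instruction count for the line
        entry["executed"] += row.executed
        entry["process_bytes"] += row.process_bytes
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample data: two instructions map to line 12, one to line 15.
    sample = [
        InstructionRow(12, "0x1000", 64, 0),
        InstructionRow(12, "0x1004", 64, 4096),
        InstructionRow(15, "0x1008", 8, 512),
    ]
    for line, entry in sorted(aggregate_by_line(sample).items()):
        print(line, entry)
```

Grouping by source line is what makes a single line with many mapped instructions stand out as a candidate hot spot.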
msprof op simulator Hot Spot Map
- On the top of the page, you can switch between compute units and kernel function files.
- The left pane displays the time consumed by each line of code of the operator kernel, register usage, read and write conflicts of vector instructions on the UB Bank, Vector Unit usage, GM-related data movement, and the number of corresponding instructions, helping developers quickly locate bottlenecks.
- The right pane displays the time consumed by each instruction, register usage, GM-related data movement, read and write conflicts of vector instructions on the UB Bank, Vector Unit usage, number of execution times, and code associations, helping developers further analyze the cause of long code execution time.
- The maximum number of general-purpose registers is 32. When the number of used registers reaches 32, the simulation can be performed only after the used registers are released.
- You cannot use the TRACE_START and TRACE_STOP APIs to check the register usage of some operators.
- NA is displayed if no GM unit is involved when Process Bytes is checked.
- For detailed feature support of msprof op simulator, see Table 3.
Table 3 msprof op simulator feature support
| Column |  |  |  | Description |
|---|---|---|---|---|
| Source | Supported | Supported | Supported | - |
| Address | Supported | Supported | Supported | - |
| Pipe | Supported | Supported | Supported | - |
| Cycles | Supported | Supported | Supported | Displays the time consumed by the operator source code and instructions. |
| Instructions Executed | Supported | Supported | Supported | Displays the number of times that the operator source code and instructions are executed. |
| GPR Count | Supported | Supported | Supported | Displays the register usage. |
| UB Read Conflict | Supported | Supported | Supported | - |
| UB Write Conflict | Supported | Supported | Supported | - |
| Vector Utilization Percentage | Supported | Supported | Supported | - |
| Process Bytes | Supported | Supported | Not supported | Displays the amount of data moved related to GM. |
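When reading the Cycles column, it can help to rank source lines by their share of the total. The following sketch assumes the per-line cycle counts have already been collected into a dictionary, for example by copying them from the GUI; the function name `rank_by_cycles` and the sample numbers are hypothetical, and no export API of MindStudio Insight is implied.

```python
def rank_by_cycles(cycles_per_line, top_n=3):
    """Return (line, percent_of_total) pairs for the top_n most expensive lines."""
    total = sum(cycles_per_line.values())
    if total == 0:
        return []  # nothing to rank
    ranked = sorted(
        cycles_per_line.items(), key=lambda item: item[1], reverse=True
    )
    return [(line, 100 * cycles / total) for line, cycles in ranked[:top_n]]


if __name__ == "__main__":
    # Made-up cycle counts keyed by kernel source line number.
    sample = {10: 200, 11: 1500, 12: 300}
    for line, share in rank_by_cycles(sample, top_n=2):
        print(f"line {line}: {share:.1f}% of cycles")
```

A line that holds a large share of the cycles is a reasonable first place to check the UB conflict and Vector utilization columns.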