Introduction
Overview
MindStudio Insight is a visual tuning tool for Ascend AI developers. It supports system, operator, serving, and memory tuning, enabling developers to quickly optimize performance in training, inference, and operator development scenarios.
MindStudio Insight provides multiple tuning analysis methods, displays actual software and hardware runtime data, and analyzes performance bottlenecks from multiple dimensions. It also supports visualized performance analysis for clusters of hundreds to thousands of cards and beyond, enabling developers to complete performance tuning within days.
Advantages
- On the Timeline tab page, MindStudio Insight displays cluster profile data card by card. It automatically traverses the input path for .db files, or for all trace_view.json files (PyTorch and MindSpore scenarios) and msprof*.json files (TensorFlow and offline inference scenarios), so developers do not need to merge files manually.
- Backed by a database, MindStudio Insight supports large-scale profile data processing, analysis of 20 GB of cluster profile data, and performance tuning in foundation model scenarios.
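To preview which files under an input path match the file names mentioned above, a short script along these lines may help. It is only a sketch, not the tool's actual traversal logic: the function name `find_profile_files` and the grouping labels are hypothetical, and only the three file-name patterns come from the text.

```python
from fnmatch import fnmatchcase
from pathlib import Path

# File-name patterns taken from the description above.
# The dictionary keys are illustrative labels, not names used by MindStudio Insight.
PATTERNS = {
    "db": "*.db",
    "trace_view": "trace_view.json",
    "msprof": "msprof*.json",
}


def find_profile_files(root):
    """Recursively group files under `root` by the first pattern their name matches."""
    found = {label: [] for label in PATTERNS}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        for label, pattern in PATTERNS.items():
            if fnmatchcase(path.name, pattern):
                found[label].append(path)
                break
    return found
```

Calling `find_profile_files("/path/to/profile_output")` (a placeholder path) returns a dictionary of matching paths per pattern, which can be reviewed before the directory is loaded into the tool.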
Scenarios
- System tuning: MindStudio Insight provides the timeline view, memory usage, operator duration, and communication bottleneck analysis to help developers quickly locate model performance bottlenecks.
Table 1 Functions

| Page | Introduction | Description |
|---|---|---|
| Timeline | Displays the running status of the entire online inference and training process in the timeline view based on the scheduling process, and provides functions such as cluster timeline display and system view details viewing. | - |
| Memory | Provides a visualized display of memory information during collection. Displays the operator memory trend in an operator memory curve. | - |
| Operator | Provides operator duration statistics and analysis. | - |
| Summary | Displays the computing and communication operator duration analysis, and displays the analysis results in a bar chart, curve, and data pane. | PyTorch or MindSpore cluster scenarios are supported. |
| Communication | Displays the network link performance across the cluster and the communication performance of all nodes. By analyzing the overlapped duration between cluster communication and computation, slow hosts or nodes in the cluster training can be identified. | PyTorch or MindSpore cluster scenarios are supported. |
| RL (Reinforcement Learning) | Provides a visualized display of the pipeline diagram in each phase of the reinforcement learning process. | - |
- Operator tuning: MindStudio Insight provides the instruction pipeline view, operator source code view, and operator running load analysis view to display the key performance metrics of operators running on the Ascend AI Processors in a visualized manner, allowing developers to quickly locate software and hardware performance bottlenecks of operators and improve operator performance analysis efficiency.
Table 2 Functions

| Page | Introduction | Remarks |
|---|---|---|
| Timeline | Displays the running status of instructions on the Ascend AI Processor in a timeline view, displays the overall running status based on the scheduling process, and allows users to view instruction details and search for instructions. | - |
| Source | Displays the operator instruction heatmap, and allows developers to view the mapping between the operator source code and instruction sets as well as the time consumption. | BIN files collected by msProf are supported. |
| Details | Displays the basic operator information, compute workload analysis, and memory workload analysis, as well as the analysis results in charts and data panes. | BIN files collected by msProf are supported. |
| Cache | Displays the L2 cache access of kernel functions in user programs, helping users optimize the cache hit rate. | BIN files collected by msProf are supported. |
- Serving tuning: MindStudio Insight displays the end-to-end request execution in the timeline view, showing the duration of the request in each key phase and the status of the request. This helps users quickly identify service performance bottlenecks and adjust the tuning policy.
Table 3 Functions

| Page | Introduction | Description |
|---|---|---|
| Timeline | Displays the end-to-end request execution status in a timeline view, helping users intuitively view the duration of the request in each key phase and the current request status. | JSON files of trace data of inference service requests are supported. |
| Curve | Displays the end-to-end performance of the inference service process in a curve and data details table. | profiler.db files are supported. |
- Memory tuning: MindStudio Insight displays detailed on-device memory allocation in graphs, and marks memory allocation and usage details based on the Python call stack and custom trace tags, helping developers locate and tune memory issues.
Table 4 Functions

| Page | Overview | Description |
|---|---|---|
| Leaks | The call stack graph, line & block graph, and memory disassembly graph are used to display the memory status, allowing developers to efficiently analyze and locate memory issues. | DB memory result files collected by msLeaks are supported. |
Constraints
| File Type | Suggestion | Specification Restriction |
|---|---|---|
| .json | Keep files under 1 GB each and 20 GB in total. | Single file limit: 10 GB |
| .bin | Single file limit: 500 MB | Single file limit: 10 GB |
| .db | | |
| .csv | CSV files store data as text. Single CSV file limit: 500 MB | Single file limit: 2 GB |
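The single-file limits above can be checked before import with a small script. The following is a minimal sketch, not part of MindStudio Insight: the names `LIMITS` and `classify_size` are hypothetical, binary units (1 GB = 1024^3 bytes) are an assumption, and a file exactly at a threshold is treated as within it. The .db row has no values in the table, so .db files are reported as "unknown", and the 20 GB total suggestion for .json files is not checked.

```python
MB = 1024 ** 2  # assumption: binary units; the table does not say which convention applies
GB = 1024 ** 3

# (suggested single-file size, single-file specification limit) per extension,
# copied from the table above. The .db row is empty there, so it is left out.
LIMITS = {
    ".json": (1 * GB, 10 * GB),
    ".bin": (500 * MB, 10 * GB),
    ".csv": (500 * MB, 2 * GB),
}


def classify_size(suffix, size_bytes):
    """Return 'ok', 'above_suggestion', 'above_limit', or 'unknown' for one file."""
    limits = LIMITS.get(suffix.lower())
    if limits is None:
        return "unknown"
    suggested, hard_limit = limits
    if size_bytes > hard_limit:
        return "above_limit"
    if size_bytes > suggested:
        return "above_suggestion"
    return "ok"
```

For a file on disk, the inputs can be taken from `pathlib.Path`, for example `classify_size(p.suffix, p.stat().st_size)`.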