Introduction

Overview

MindStudio Insight is a visual performance tuning tool for Ascend AI developers. It supports system, operator, serving, and memory tuning, helping developers quickly optimize performance in training, inference, and operator development scenarios.

MindStudio Insight provides multiple tuning analysis methods and displays real software and hardware runtime data, so that performance bottlenecks can be analyzed from multiple dimensions. It supports visualized performance analysis for clusters with hundreds or thousands of cards and beyond, enabling developers to complete performance tuning within days.

Advantages

MindStudio Insight lets developers view cluster profile data on the Timeline tab page, displayed per card. It automatically traverses the input path for .db files, or for all trace_view.json files (in PyTorch and MindSpore scenarios) and msprof*.json files (in TensorFlow and offline inference scenarios), so developers do not need to merge files manually.
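The file discovery described above can be approximated outside the tool to check in advance which files an input path contains. The following is a minimal sketch, not MindStudio Insight's actual implementation: the function name `find_profile_files` and the grouping labels are hypothetical, and only the three file name patterns come from the text.

```python
from pathlib import Path

# File name patterns taken from the paragraph above.
# The dictionary keys are illustrative labels, not names used by MindStudio Insight.
PATTERNS = {
    "db": "*.db",
    "trace_view": "trace_view.json",  # PyTorch and MindSpore scenarios
    "msprof": "msprof*.json",         # TensorFlow and offline inference scenarios
}


def find_profile_files(root):
    """Recursively collect candidate profile files under root, grouped by pattern."""
    root = Path(root)
    return {label: sorted(root.rglob(pattern)) for label, pattern in PATTERNS.items()}


if __name__ == "__main__":
    # Print how many candidate files of each kind exist under the current directory.
    for label, paths in find_profile_files(".").items():
        print(f"{label}: {len(paths)} file(s)")
```

Running the sketch against a collection directory before importing gives a quick view of how many per-card files the tool will have to traverse.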

Backed by a database, MindStudio Insight can process large-scale profile data, analyze 20 GB of cluster profile data, and support performance tuning in foundation model scenarios.

Scenarios

System tuning: MindStudio Insight provides the timeline view, memory usage, operator duration, and communication bottleneck analysis to help developers quickly locate model performance bottlenecks.

**Table 1** System tuning functions

| Page | Description | Remarks |
| --- | --- | --- |
| Timeline | Displays the running status of the entire online inference and training process in a timeline view based on the scheduling process, and provides functions such as cluster timeline display and viewing of system view details. | - |
| Memory | Visualizes the memory information from the collection period and plots the operator memory trend as a curve. | - |
| Operator | Provides operator duration statistics and analysis. | - |
| Summary | Analyzes the duration of computing and communication operators, and displays the results in a bar chart, curve, and data pane. | PyTorch or MindSpore cluster scenarios are supported. |
| Communication | Displays the network link performance across the cluster and the communication performance of all nodes. By analyzing the overlapped duration between cluster communication and computation, slow hosts or nodes in cluster training can be identified. | PyTorch or MindSpore cluster scenarios are supported. |
| RL (Reinforcement Learning) | Visualizes the pipeline diagram of each phase of the reinforcement learning process. | - |
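To illustrate what "overlapped duration between cluster communication and computation" means for the Communication page, the sketch below computes it for one card from two lists of time intervals. It is an illustration under stated assumptions, not the algorithm MindStudio Insight uses: the function name `overlap_duration` is hypothetical, and each list is assumed to hold `(start, end)` pairs that are sorted by start time and do not overlap within the same list.

```python
def overlap_duration(compute, comm):
    """Return the total time during which computation and communication run concurrently.

    Both arguments are lists of (start, end) pairs, sorted by start time and
    non-overlapping within each list (an assumption of this sketch).
    """
    i = j = 0
    total = 0.0
    while i < len(compute) and j < len(comm):
        # Intersection of the two current intervals, if any.
        start = max(compute[i][0], comm[j][0])
        end = min(compute[i][1], comm[j][1])
        if end > start:
            total += end - start
        # Advance whichever interval finishes first.
        if compute[i][1] < comm[j][1]:
            i += 1
        else:
            j += 1
    return total


if __name__ == "__main__":
    compute = [(0, 10), (20, 30)]  # hypothetical computation intervals
    comm = [(5, 25)]               # hypothetical communication interval
    overlapped = overlap_duration(compute, comm)
    comm_total = sum(end - start for start, end in comm)
    print("overlapped:", overlapped)
    print("communication not hidden by computation:", comm_total - overlapped)
```

A card whose communication time is much less overlapped with computation than that of its peers is a natural candidate for closer inspection.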

Operator tuning: MindStudio Insight provides the instruction pipeline view, operator source code view, and operator running load analysis view to visualize the key performance metrics of operators running on Ascend AI Processors. These views help developers quickly locate software and hardware performance bottlenecks of operators and analyze operator performance more efficiently.

**Table 2** Operator tuning functions

| Page | Description | Remarks |
| --- | --- | --- |
| Timeline | Displays the running status of instructions on the Ascend AI Processor in a timeline view, displays the overall running status based on the scheduling process, and allows users to view instruction details and search for instructions. | - |
| Source | Displays the operator instruction heatmap, and allows developers to view the mapping between the operator source code and instruction sets as well as the time consumption. | BIN files collected by msProf are supported. |
| Details | Displays the basic operator information, compute workload analysis, and memory workload analysis, as well as the analysis results in charts and data panes. | BIN files collected by msProf are supported. |
| Cache | Displays the L2 cache access of kernel functions in user programs, helping users optimize the cache hit rate. | BIN files collected by msProf are supported. |

Serving tuning: MindStudio Insight displays the end-to-end request execution in the timeline view, showing the duration of the request in each key phase and the status of the request. This helps users quickly identify service performance bottlenecks and adjust the tuning policy.

**Table 3** Serving tuning functions

| Page | Description | Remarks |
| --- | --- | --- |
| Timeline | Displays the end-to-end request execution status in a timeline view, helping users intuitively view the duration of the request in each key phase and the current request status. | JSON files of trace data of inference service requests are supported. |
| Curve | Displays the end-to-end performance of the inference service process in a curve and data details table. | profiler.db files are supported. |

Memory tuning: MindStudio Insight displays detailed device memory allocation graphically, and annotates memory allocation and usage details based on the Python call stack and custom trace tags, helping developers locate and tune memory issues.

**Table 4** Memory tuning functions

| Page | Description | Remarks |
| --- | --- | --- |
| Leaks | Displays the memory status in a call stack graph, line & block graph, and memory disassembly graph, allowing developers to efficiently analyze and locate memory issues. | DB memory result files collected by msLeaks are supported. |

Constraints

MindStudio Insight supports the import and display of profile data files in multiple formats. For details about the file specifications and restrictions, see Table 5.

**Table 5** File specifications

| File Type | Recommendation | Hard Limit |
| --- | --- | --- |
| .json | Keep files under 1 GB each and 20 GB in total. | Single file limit: 10 GB |
| .bin | Keep files under 500 MB each. | Single file limit: 10 GB |
| .db | System tuning: Keep files under 1 GB each. Serving tuning: Keep files under 1 GB each. | System tuning: Keep files under 20 GB each. Serving tuning: Keep files under 10 GB each. |
| .csv | CSV files store data as text. Keep files under 500 MB each. | Single file limit: 2 GB |
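The per-file sizes in Table 5 can be checked before import with a small script. The following is a minimal sketch with hypothetical names (`check_size`, `LIMITS`, `DB_LIMITS`); it assumes 1 GB = 1024³ bytes, which the table does not specify, treats a file exactly at a threshold as acceptable, and does not check the 20 GB total recommended for .json files.

```python
from pathlib import Path

MB = 1024 ** 2  # assumption: binary units; Table 5 does not say which convention applies
GB = 1024 ** 3

# (recommended size, hard limit) per file in bytes, transcribed from Table 5.
LIMITS = {
    ".json": (1 * GB, 10 * GB),
    ".bin": (500 * MB, 10 * GB),
    ".csv": (500 * MB, 2 * GB),
}
# .db limits depend on the tuning scenario.
DB_LIMITS = {
    "system": (1 * GB, 20 * GB),
    "serving": (1 * GB, 10 * GB),
}


def check_size(name, size_bytes, scenario="system"):
    """Classify one file against the recommended size and hard limit for its type."""
    ext = Path(name).suffix.lower()
    if ext == ".db":
        recommended, hard = DB_LIMITS[scenario]
    elif ext in LIMITS:
        recommended, hard = LIMITS[ext]
    else:
        return "unknown type"
    if size_bytes > hard:
        return "exceeds limit"
    if size_bytes > recommended:
        return "above recommended size"
    return "ok"


if __name__ == "__main__":
    # Report every supported file under the current directory that is not "ok".
    for path in Path(".").rglob("*"):
        if path.is_file():
            status = check_size(path.name, path.stat().st_size)
            if status not in ("ok", "unknown type"):
                print(f"{path}: {status}")
```

Passing the size as an argument keeps the classification separate from file system access, so the same function can be applied to sizes reported by a remote listing.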