GUI Description

Function

The Memory tab page displays the memory information gathered during data collection. Use the memory curve to view the overall memory trend, then select and zoom in on a peak area and cross-reference it with the operator memory information to locate operators with high memory consumption.

GUI Display (Dynamic Graph Scenario)

The Memory tab page consists of the parameter configuration area (area 1), operator memory curve (area 2), and memory allocation/release details table (area 3), as shown in Figure 1.

When a dataset contains more than 3 million records, the Largest Triangle Three Buckets (LTTB) algorithm downsamples it to improve memory curve rendering performance. As a result, only the sampled points are displayed for large datasets. Zooming in to a sufficiently small range displays all data points.
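
To illustrate the downsampling idea, the following is a minimal Python sketch of the general LTTB technique. It is not the tool's actual implementation: the function name `lttb`, the `(x, y)` point format, and the `threshold` parameter are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lttb(points: List[Point], threshold: int) -> List[Point]:
    """Downsample points (sorted by x) to at most `threshold` points."""
    n = len(points)
    if threshold >= n or threshold < 3:
        return list(points)  # nothing to downsample

    sampled = [points[0]]  # the first point is always kept
    bucket_size = (n - 2) / (threshold - 2)
    prev = 0  # index of the most recently selected point

    for i in range(threshold - 2):
        # Index range of the current bucket.
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1

        # Average point of the next bucket (the last point for the final bucket).
        next_end = min(int((i + 2) * bucket_size) + 1, n)
        nxt = points[end:next_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        # Keep the point that forms the largest triangle with the previously
        # selected point and the average of the next bucket.
        ax, ay = points[prev]
        best_area, best_idx = -1.0, start
        for j in range(start, end):
            px, py = points[j]
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) / 2
            if area > best_area:
                best_area, best_idx = area, j
        sampled.append(points[best_idx])
        prev = best_idx

    sampled.append(points[-1])  # the last point is always kept
    return sampled
```

Because each bucket keeps the point that forms the largest triangle, isolated peaks in a memory curve tend to survive downsampling, which is why the overall shape stays recognizable even when only sampled points are drawn.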

Figure 1 Memory page in the dynamic graph scenario
  • Area 1: parameter configuration area.
    • Rank ID: Select a rank to view the memory information of the corresponding card. The entire page refreshes immediately after you switch.
    • Group By: Select the dimension by which memory information is displayed: Overall, Stream, or Component.
    • Host Name: This parameter is available only when the imported DB file contains the HOST_INFO table.
  • Area 2: operator memory curve.
    • The Operator Allocated curve shows how the total memory allocated to all operators changes over time, recorded each time an operator allocates or releases memory. The data covers memory allocated by PyTorch and Graph Engine (GE).
    • The Operator Reserved curve shows how the total memory reserved for all operators changes over time, recorded each time an operator allocates or releases memory. The data covers memory allocated by PyTorch and GE.
    • The Operator Activated curve shows the total memory held, including memory that has been reused by other streams but not yet released. The data covers memory allocated by streams in PyTorch. If no stream information is available, this curve is not displayed.
    • The APP Reserved curve shows the trend of memory reserved by the entire process.
    • When Group By is set to Component, the curve displays the memory usage of PyTorch operators and GE.
  • Area 3: memory allocation/release details table, which displays the memory information of each operator. The table supports sorting, pagination, and redirection. Click a column header to switch between ascending, descending, and default order. Click the copy icon in the upper right corner of the table to copy the displayed content, which you can then paste into an Excel file for analysis.
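
The relationship between the allocated-memory curve and the per-operator details can be sketched as follows. This is an illustrative example only, not the tool's implementation: the event format `(timestamp_ms, operator_name, size_delta_kb)`, the sample data, and both function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical event record: (timestamp_ms, operator_name, size_delta_kb).
# A positive delta is an allocation; a negative delta is a release.
Event = Tuple[float, str, int]

EVENTS: List[Event] = [
    (0.0, "matmul", 512),
    (1.0, "conv2d", 2048),
    (2.0, "matmul", -512),
    (3.0, "softmax", 256),
    (4.0, "conv2d", -2048),
    (5.0, "softmax", -256),
]


def build_allocated_curve(events: List[Event]) -> List[Tuple[float, int]]:
    """Return (timestamp, total allocated KB) after each event."""
    curve = []
    total = 0
    for ts, _, delta in sorted(events, key=lambda e: e[0]):
        total += delta
        curve.append((ts, total))
    return curve


def live_operators_at_peak(events: List[Event]) -> Tuple[float, int, Dict[str, int]]:
    """Return the peak time, the peak total, and the memory held per operator at the peak."""
    live: Dict[str, int] = {}
    total = 0
    peak_ts, peak_total, peak_live = 0.0, 0, {}
    for ts, op, delta in sorted(events, key=lambda e: e[0]):
        live[op] = live.get(op, 0) + delta
        if live[op] == 0:
            del live[op]  # the operator has released all of its memory
        total += delta
        if total > peak_total:
            peak_ts, peak_total, peak_live = ts, total, dict(live)
    return peak_ts, peak_total, peak_live


peak_ts, peak_total, peak_live = live_operators_at_peak(EVENTS)
largest = max(peak_live, key=peak_live.get)
print(f"peak {peak_total} KB at {peak_ts} ms, largest holder: {largest}")
```

This mirrors the workflow described in the Function section: find the peak on the curve, then inspect which operators hold memory in that time range.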

GUI Display (Static Graph Scenario)

The Memory tab page consists of the parameter configuration area (area 1), operator memory curve (area 2), and memory allocation/release details table (area 3), as shown in Figure 2.

When a dataset contains more than 3 million records, the Largest Triangle Three Buckets (LTTB) algorithm downsamples it to improve memory curve rendering performance. As a result, only the sampled points are displayed for large datasets. Zooming in to a sufficiently small range displays all data points.

Figure 2 Memory page in the static graph scenario
  • Area 1: parameter configuration area. Set Rank ID and Group By to view the memory information of different cards. The entire page refreshes immediately after you change a setting.
  • Area 2: operator memory curve, which consists of a dynamic curve and a static curve. The static curve exists only in the MindSpore data scenario.
    1. Dynamic curve:
      • The Operator Allocated curve shows how the total memory allocated to all operators changes over time, recorded each time an operator allocates or releases memory. The data covers memory allocated by PyTorch and Graph Engine (GE).
      • The Operator Reserved curve shows how the total memory reserved for all operators changes over time, recorded each time an operator allocates or releases memory. The data covers memory allocated by PyTorch and GE.
      • The Operator Activated curve shows the total memory held, including memory that has been reused by other streams but not yet released. The data covers memory allocated by streams in PyTorch.
      • The APP Reserved curve shows the trend of memory reserved by the entire process.
    2. Static curve:

      The static curve is available only in the MindSpore data scenario. Switch Graph ID to view the memory allocation of the selected card.

      • Size: size of the memory dynamically allocated by index.
      • Total Size: maximum memory size that is automatically preset.
  • Area 3: memory allocation/release details table, which displays the memory information of each operator in the static curve. The table supports sorting, pagination, and redirection. Click a column header to switch between ascending, descending, and default order. You can copy the content displayed in the table and paste it into an Excel file for analysis.
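
As an alternative to Excel, copied table content can also be analyzed with a short script. The sketch below assumes the copied content is tab-separated text with a header row, which is typical of content that pastes cleanly into a spreadsheet but is not confirmed here. The column names `Name` and `Size(KB)`, the sample rows, and the function `top_operators` are hypothetical placeholders; substitute the headers that appear in the actual table.

```python
import csv
import io
from typing import Dict, List


def top_operators(copied_text: str, size_column: str, limit: int = 3) -> List[Dict[str, str]]:
    """Parse tab-separated table text and return the rows with the largest sizes."""
    reader = csv.DictReader(io.StringIO(copied_text), delimiter="\t")
    rows = list(reader)
    # Sort numerically by the size column, largest first.
    rows.sort(key=lambda row: float(row[size_column]), reverse=True)
    return rows[:limit]


# Hypothetical sample of copied content (header row plus three operators).
SAMPLE = "Name\tSize(KB)\nmatmul\t512\nconv2d\t2048\nsoftmax\t256\n"

for row in top_operators(SAMPLE, "Size(KB)", limit=2):
    print(row["Name"], row["Size(KB)"])
```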