mstx Instrumentation Collection
The msLeaks tool uses the mstx instrumentation capability to analyze memory data and collect host-side memory data. In addition, the msLeaks tool highlights marker locations in the visualization trace to help users locate the problematic code lines.
The mstx instrumentation method for C code differs slightly from that for Python scripts. For details, see the MindStudio mstx API Reference.
After the host-side memory collection function of msLeaks is enabled, the leaks_dump_{timestamp}.csv file contains a large number of malloc/free records of the host, and the cpu_trace_{timestamp}.json file contains host-side memory information.
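As a small aid for finding these files after a run, the following sketch lists them by the name patterns given above. The helper name find_msleaks_outputs and the idea of passing an output directory are assumptions made for illustration; the section does not state where msLeaks writes its results or how the timestamp is formatted.

```python
from pathlib import Path


def find_msleaks_outputs(output_dir):
    """Return msLeaks result files found directly inside output_dir.

    Hypothetical helper: it only matches the file name patterns
    leaks_dump_{timestamp}.csv and cpu_trace_{timestamp}.json and makes
    no assumption about the timestamp format or the file contents.
    """
    root = Path(output_dir)
    return {
        "leaks_dump": sorted(root.glob("leaks_dump_*.csv")),
        "cpu_trace": sorted(root.glob("cpu_trace_*.json")),
    }


if __name__ == "__main__":
    # "." is a placeholder; substitute the directory that holds your results.
    for kind, paths in find_msleaks_outputs(".").items():
        print(kind, [p.name for p in paths])
```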
The following uses a Python script as an example to describe how to use mstx and msLeaks to analyze memory.
- Memory analysis
Mark the start and end of each step in the training or inference script, and use the fixed message "step start" to identify the start of the step. The following is an example:
    import mstx

    for epoch in range(15):
        range_id = mstx.range_start("step start", None)  # Mark the start of a step and enable the memory analysis function.
        ...
        ...
        mstx.range_end(range_id)  # Mark the end of the step.
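If a step can raise an exception, the end marker in the example above might never be reached. One possible way to guard against that is a small context manager that always closes the range. This is only a sketch: marked_range is a hypothetical helper, not part of mstx, and the two fake_* functions are stand-ins that mimic the call shapes used above so that the snippet runs without mstx installed.

```python
from contextlib import contextmanager


@contextmanager
def marked_range(message, range_start, range_end):
    """Open a range on entry and close it on exit, even if the body raises.

    range_start and range_end are passed in explicitly (an assumption made
    for this sketch) so the helper can be exercised without mstx.
    """
    range_id = range_start(message, None)  # same call shape as the example above
    try:
        yield range_id
    finally:
        range_end(range_id)


# Stand-ins that only record the calls they receive.
calls = []


def fake_range_start(message, _extra):
    calls.append(("start", message))
    return len(calls)  # any value works as an ID for this demonstration


def fake_range_end(range_id):
    calls.append(("end", range_id))


for epoch in range(2):
    with marked_range("step start", fake_range_start, fake_range_end):
        pass  # training or inference work for one step

print(calls)
```

In an environment where mstx is available, the same helper would presumably be called as marked_range("step start", mstx.range_start, mstx.range_end).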
- Host-side memory collection
Collecting host-side memory for the entire process can produce large dump files that are difficult to analyze. To limit collection to the code of interest, add mstx markers immediately before and after the relevant host code segment, and use the fixed message "report host memory info start" in the start marker. The following is an example:
    import mstx

    range_id = mstx.range_start("report host memory info start", None)  # Mark the start of collecting the memory on the host.
    ...
    ...
    mstx.range_end(range_id)  # Mark the stop of collecting the memory on the host.
- Only the memory data of a single ID can be collected.
- You can set PYTHONMALLOC=malloc in front of the command that launches the user program.
PYTHONMALLOC=malloc is a Python environment variable that disables Python's default memory allocator, so that all memory allocations are performed using malloc. This setting mainly affects small memory allocations, which the default allocator is optimized to handle.
- In the next version of MindStudio, the host-side memory collection function will not be supported.
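PYTHONMALLOC is read when the interpreter starts, so it has to be in the environment before the user program is launched. The sketch below shows one way to do that from a Python launcher script. The child command is a hypothetical stand-in for the real user program, and the msLeaks launch command is deliberately left out because this section does not show it.

```python
import os
import subprocess
import sys

# Copy the current environment and request the C library allocator.
env = dict(os.environ, PYTHONMALLOC="malloc")

# Stand-in for the user program: a child interpreter that reports the
# allocator setting it was started with.
child_code = "import os; print(os.environ['PYTHONMALLOC'])"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
seen = result.stdout.strip()
print(seen)
```

On a shell command line, the equivalent is to write PYTHONMALLOC=malloc directly in front of the command, as described in the note above.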