Using msproftx APIs to Collect and Flush Profile Data

To collect profile data from user applications and upper-layer framework applications, instrument the code with the msproftx APIs before enabling the Profiling tool.

API Overview

Table 1 API overview

API                            Description
aclprofCreateStamp             Creates an msproftx event stamp for an instantaneous event.
aclprofSetStampTraceMessage    Sets the description in an msproftx event stamp. The description is displayed in the parsed msprof_tx summary data of the Profiling tool.
aclprofMark                    Marks an instantaneous event by msproftx.
aclprofMarkEx                  Marks an instantaneous event by msproftx. As Example 2 shows, it takes the message, the message length, and a stream directly instead of an event stamp.
aclprofPush                    Records the start time of a time span by msproftx when an event occurs. It is used in pairs with aclprofPop and only within a single thread.
aclprofPop                     Records the end time of a time span by msproftx when an event occurs. It is used in pairs with aclprofPush and only within a single thread.
aclprofRangeStart              Records the start time of a time span by msproftx when an event occurs. It is used in pairs with aclprofRangeStop and can be used across threads.
aclprofRangeStop               Records the end time of a time span by msproftx when an event occurs. It is used in pairs with aclprofRangeStart and can be used across threads.
aclprofDestroyStamp            Destroys an msproftx event stamp.

If only the msproftx function is enabled, the deviceIdList parameter of the aclprofCreateConfig API must be left empty, and deviceNums must be set to 0.

For details about the APIs, see Profile Data Collection.
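
The msproftx-only rule can be sketched as a small helper that chooses the device arguments before they are passed to aclprofCreateConfig. This is only an illustration under stated assumptions: DeviceArgs, SelectDeviceArgs, and the kTaskTimeBit / kMsproftxBit placeholder values are hypothetical and stand in for the real ACL_PROF_* macros, and treating "empty" as a null pointer is also an assumption.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical placeholder bits. Real code would use ACL_PROF_TASK_TIME,
// ACL_PROF_MSPROFTX, and so on from the ACL profiling header.
constexpr uint64_t kTaskTimeBit = 1ULL << 0;
constexpr uint64_t kMsproftxBit = 1ULL << 1;

// Device arguments in the order aclprofCreateConfig takes them.
struct DeviceArgs {
    uint32_t *deviceIdList;
    uint32_t deviceNums;
};

// If msproftx is the only enabled data type, return an empty list and 0.
// Otherwise pass the caller's device IDs through unchanged.
DeviceArgs SelectDeviceArgs(uint64_t dataTypeConfig, std::vector<uint32_t> &devices) {
    if (dataTypeConfig == kMsproftxBit) {
        return {nullptr, 0};  // assumption: "empty" means a null pointer
    }
    return {devices.data(), static_cast<uint32_t>(devices.size())};
}
```

The two returned fields would then be used as the first two arguments of aclprofCreateConfig.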

API Call Examples

  • Example 1 (aclprofMark)
    // 1. Call aclInit to initialize.
    
    // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.
    
    // 3. Initialize profiling.
    // Set the data flush path.
    const char *aclProfPath = "./output";
    aclprofInit(aclProfPath, strlen(aclProfPath));
    
    // 4. Configure profiling.
    uint32_t deviceIdList[1] = {0};    // Set this parameter based on the device ID in the actual environment.
    // Create a configuration struct.
    aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
        nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
    const char *memFreq = "15";
    aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
    aclprofStart(config);
    
    aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
    // stream_ is the stream created in step 2.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);
    
    // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.
    void *stamp = aclprofCreateStamp();
    aclprofSetStampTraceMessage(stamp, "model_load_mark", strlen("model_load_mark"));
    aclprofMark(stamp);    // Mark the model loading event.
    aclprofDestroyStamp(stamp);
    
    // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.
    
    // 7. Execute your model.
    stamp = aclprofCreateStamp();
    aclprofSetStampTraceMessage(stamp, "model_exec_mark", strlen("model_exec_mark"));
    aclprofMark(stamp);    // Mark the model execution event.
    aclprofDestroyStamp(stamp);
    ret = aclmdlExecute(modelId, input, output);
    
    // 8. Process the model inference result.
    
    // 9. Destroy the model input and output descriptions, free memory, and unload the model.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
    aclprofDestroyStepInfo(stepInfo);
    
    // 10. Stop profiling and destroy the configuration and related resources.
    aclprofStop(config);
    aclprofDestroyConfig(config);
    aclprofFinalize();
    
    // 11. Destroy runtime allocations.
    
    // 12. Call aclFinalize to deinitialize.
    //......
    
  • Example 2 (aclprofMarkEx, identifying the user funcA API)
    aclrtStream stream;
    aclrtCreateStream(&stream);
    aclError markRet;
    markRet = aclprofMarkEx("funcA", strlen("funcA"), stream);
    if (markRet != ACL_ERROR_NONE) {
        printf("mark execute start failed\n");
    }
    // User service API
    funcA();
    
  • Example 3 (aclprofPush/aclprofPop, applicable to single-thread scenarios)
    // 1. Call aclInit to initialize.
    
    // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.
    
    // 3. Initialize profiling.
    // Set the data flush path.
    const char *aclProfPath = "./output";
    aclprofInit(aclProfPath, strlen(aclProfPath));
    
    // 4. Configure profiling.
    uint32_t deviceIdList[1] = {0};    // Set this parameter based on the device ID in the actual environment.
    // Create a configuration struct.
    aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
        nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
    const char *memFreq = "15";
    aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
    aclprofStart(config);
    
    aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
    // stream_ is the stream created in step 2.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);
    
    // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.
    
    // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.
    
    // 7. Execute the model (only in a single thread).
    void *stamp = aclprofCreateStamp();
    aclprofSetStampTraceMessage(stamp, "aclmdlExecute_duration", strlen("aclmdlExecute_duration"));
    aclprofPush(stamp);
    ret = aclmdlExecute(modelId, input, output);
    aclprofPop();
    aclprofDestroyStamp(stamp);
    
    // 8. Process the model inference result.
    
    // 9. Destroy the model input and output descriptions, free memory, and unload the model.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
    aclprofDestroyStepInfo(stepInfo);
    
    // 10. Stop profiling and destroy the configuration and related resources.
    aclprofStop(config);
    aclprofDestroyConfig(config);
    aclprofFinalize();
    
    // 11. Destroy runtime allocations.
    
    // 12. Call aclFinalize to deinitialize.
    //......
    
  • Example 4 (aclprofRangeStart/aclprofRangeStop, applicable to single-thread or cross-thread scenarios)
    // 1. Call aclInit to initialize.
    
    // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.
    
    // 3. Initialize profiling.
    // Set the data flush path.
    const char *aclProfPath = "./output";
    aclprofInit(aclProfPath, strlen(aclProfPath));
    
    // 4. Configure profiling.
    uint32_t deviceIdList[1] = {0};    // Set this parameter based on the device ID in the actual environment.
    // Create a configuration struct.
    aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
        nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
    const char *memFreq = "15";
    aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
    aclprofStart(config);
    
    aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
    // stream_ is the stream created in step 2.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);
    
    // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.
    
    // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.
    
    // 7. Execute the model (across threads).
    void *stamp = aclprofCreateStamp();
    aclprofSetStampTraceMessage(stamp, "aclmdlExecute_duration", strlen("aclmdlExecute_duration"));
    uint32_t rangeId = 0;    // Filled in by aclprofRangeStart and passed to aclprofRangeStop.
    aclprofRangeStart(stamp, &rangeId);
    ret = aclmdlExecute(modelId, input, output);
    aclprofRangeStop(rangeId);
    aclprofDestroyStamp(stamp);
    
    // 8. Process the model inference result.
    
    // 9. Destroy the model input and output descriptions, free memory, and unload the model.
    ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
    aclprofDestroyStepInfo(stepInfo);
    
    // 10. Stop profiling and destroy the configuration and related resources.
    aclprofStop(config);
    aclprofDestroyConfig(config);
    aclprofFinalize();
    
    // 11. Destroy runtime allocations.
    
    // 12. Call aclFinalize to deinitialize.
    //......
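
Example 4 starts and stops the range on the same thread. Because the range APIs can be used across threads, the following sketch shows one possible shape of the cross-thread case: the range is started on the calling thread and only the range ID is handed to a worker thread, which stops it. FakeProfiler and RunRangeAcrossThreads are hypothetical names introduced so that the sketch builds without the ACL headers; real code would call aclprofRangeStart(stamp, &rangeId) and aclprofRangeStop(rangeId) at the marked places.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in that records calls instead of profiling.
// RangeStart mirrors aclprofRangeStart, RangeStop mirrors aclprofRangeStop.
class FakeProfiler {
public:
    uint32_t RangeStart(const std::string &message) {
        std::lock_guard<std::mutex> lock(mutex_);
        log_.push_back("start:" + message);
        return nextId_++;
    }
    void RangeStop(uint32_t rangeId) {
        std::lock_guard<std::mutex> lock(mutex_);
        log_.push_back("stop:" + std::to_string(rangeId));
    }
    std::vector<std::string> Log() {
        std::lock_guard<std::mutex> lock(mutex_);
        return log_;
    }

private:
    std::mutex mutex_;
    std::vector<std::string> log_;
    uint32_t nextId_ = 1;
};

// Starts a range on the calling thread and stops it on a worker thread.
// Only the range ID crosses the thread boundary.
uint32_t RunRangeAcrossThreads(FakeProfiler &profiler) {
    const uint32_t rangeId = profiler.RangeStart("aclmdlExecute_duration");
    std::thread worker([&profiler, rangeId] {
        // The timed work would finish here before the range is closed.
        profiler.RangeStop(rangeId);
    });
    worker.join();
    return rangeId;
}
```

Passing the ID by value keeps the worker independent of the lifetime of any local variable on the starting thread.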
    

In Examples 1 to 4, the msproftx APIs are called within the main function.
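
Because aclprofPush and aclprofPop must be used in pairs on a single thread (Example 3), one way to keep them paired is a scope guard that pushes on construction and pops on destruction. The following is a minimal sketch of that idea, not part of the msproftx API: ScopedPush and TraceWithEarlyReturn are hypothetical names, and the two callables stand in for wrappers around aclprofPush(stamp) and aclprofPop().

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical scope guard: runs `push` on construction and `pop` on
// destruction, so each push is matched by exactly one pop on the same thread.
class ScopedPush {
public:
    ScopedPush(const std::function<void()> &push, std::function<void()> pop)
        : pop_(std::move(pop)) {
        push();
    }
    ~ScopedPush() { pop_(); }

    // Non-copyable, so the pop cannot run twice.
    ScopedPush(const ScopedPush &) = delete;
    ScopedPush &operator=(const ScopedPush &) = delete;

private:
    std::function<void()> pop_;
};

// Records the call order to show that the pop still runs on an early return.
std::vector<std::string> TraceWithEarlyReturn(bool failEarly) {
    std::vector<std::string> trace;
    auto body = [&trace, failEarly] {
        ScopedPush guard([&trace] { trace.push_back("push"); },
                         [&trace] { trace.push_back("pop"); });
        if (failEarly) {
            return;  // the guard's destructor still records "pop"
        }
        trace.push_back("work");
    };
    body();
    return trace;
}
```

With this shape, an early return or an exception between the push and the pop cannot leave a time span open.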