Collecting Profile Data with AscendCL APIs for Extension
To collect profile data for user programs and upper-layer framework programs, call the msproftx APIs (Profiling AscendCL APIs for Extension) in the program. These APIs trace the application by inserting marks (instrumentation points) at the locations you choose and output the corresponding profile data.
Profiling AscendCL APIs for Extension
| API | Description |
|---|---|
| aclprofCreateStamp | Creates an msproftx event stamp for an instantaneous event. |
| aclprofSetStampTraceMessage | Sets the description in an msproftx event stamp. The description is displayed in the parsed msprof_tx summary data of the Profiling tool. |
| aclprofMark | Marks an instantaneous event by msproftx. |
| aclprofMarkEx | Marks an instantaneous event on a specified stream, taking the message directly instead of a stamp (see Example 2). |
| aclprofPush | Records the start time of a time span by msproftx when an event occurs. It is used in pairs with aclprofPop and can be used only within a single thread. |
| aclprofPop | Records the end time of a time span by msproftx when an event occurs. It is used in pairs with aclprofPush and can be used only within a single thread. |
| aclprofRangeStart | Records the start time of a time span by msproftx when an event occurs. It is used in pairs with aclprofRangeStop and can be used across threads. |
| aclprofRangeStop | Records the end time of a time span by msproftx when an event occurs. It is used in pairs with aclprofRangeStart and can be used across threads. |
| aclprofDestroyStamp | Destroys an msproftx event stamp. |
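The stamp-based APIs in the table are used in a fixed order in Example 1: create a stamp, set its message, mark the event, then destroy the stamp. The following is a minimal sketch of a helper that wraps that sequence and destroys the stamp on every path. `StampApi` and `MarkInstantEvent` are hypothetical names, not part of AscendCL; the four operations are injected as callbacks so that the sketch builds without the SDK. It assumes that a return value of 0 means success and that a failed stamp creation yields a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical bundle of the four stamp operations used in Example 1.
// In a real program each member would forward to the matching AscendCL call
// (aclprofCreateStamp, aclprofSetStampTraceMessage, aclprofMark,
// aclprofDestroyStamp). Assumption: 0 means success.
struct StampApi {
    std::function<void *()> create;
    std::function<int(void *, const char *, std::size_t)> setMessage;
    std::function<int(void *)> mark;
    std::function<void(void *)> destroy;
};

// Marks one instantaneous event: create -> set message -> mark -> destroy.
// Returns 0 on success, -1 if no stamp could be created, or the first
// nonzero code returned by setMessage or mark.
int MarkInstantEvent(const StampApi &api, const std::string &message) {
    void *stamp = api.create();
    if (stamp == nullptr) {
        return -1;
    }
    int ret = api.setMessage(stamp, message.c_str(), message.size());
    if (ret == 0) {
        ret = api.mark(stamp);
    }
    api.destroy(stamp);  // Always release the stamp, even if marking failed.
    return ret;
}
```

With real AscendCL calls bound to the callbacks, each marked event in Example 1 would reduce to a single call such as `MarkInstantEvent(api, "model_load_mark")`.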
If only the msproftx function is enabled, leave deviceIdList of the aclprofCreateConfig API empty and set deviceNums to 0.
Calling Example of Profiling AscendCL APIs for Extension
The following sample code shows how to call the msproftx profiling APIs:
- Example 1 (aclprofMark)

      // 1. Initialize AscendCL.

      // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.

      // 3. Initialize profiling.
      // Set the data flush path.
      const char *aclProfPath = "...";
      aclprofInit(aclProfPath, strlen(aclProfPath));

      // 4. Configure profiling.
      uint32_t deviceIdList[1] = {0};  // Set this parameter based on the device ID in the actual environment.
      // Create a configuration struct.
      aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
          nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
      const char *memFreq = "15";
      aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
      aclprofStart(config);
      aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);  // stream_ is the stream created in step 2.

      // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.
      void *stamp = aclprofCreateStamp();
      aclprofSetStampTraceMessage(stamp, "model_load_mark", strlen("model_load_mark"));
      aclprofMark(stamp);  // Mark the model loading event.
      aclprofDestroyStamp(stamp);

      // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.

      // 7. Execute your model.
      stamp = aclprofCreateStamp();
      aclprofSetStampTraceMessage(stamp, "model_exec_mark", strlen("model_exec_mark"));
      aclprofMark(stamp);  // Mark the model execution event.
      aclprofDestroyStamp(stamp);
      ret = aclmdlExecute(modelId, input, output);

      // 8. Process the model inference result.

      // 9. Destroy the model input and output descriptions, free memory, and unload the model.
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
      aclprofDestroyStepInfo(stepInfo);

      // 10. Stop profiling and destroy the configuration and related resources.
      aclprofStop(config);
      aclprofDestroyConfig(config);
      aclprofFinalize();

      // 11. Destroy runtime allocations.

      // 12. Deinitialize AscendCL.
      // ......

- Example 2 (aclprofMarkEx, identifying the user funcA API)

      aclrtStream stream;
      aclrtCreateStream(&stream);
      aclError markRet;
      markRet = aclprofMarkEx("funcA", strlen("funcA"), stream);
      if (markRet != ACL_ERROR_NONE) {
          printf("mark execute start failed");
      }
      // User service API
      funcA();

- Example 3 (aclprofPush/aclprofPop, applicable to single-thread scenarios)

      // 1. Initialize AscendCL.

      // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.

      // 3. Initialize profiling.
      // Set the data flush path.
      const char *aclProfPath = "...";
      aclprofInit(aclProfPath, strlen(aclProfPath));

      // 4. Configure profiling.
      uint32_t deviceIdList[1] = {0};  // Set this parameter based on the device ID in the actual environment.
      // Create a configuration struct.
      aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
          nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
      const char *memFreq = "15";
      aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
      aclprofStart(config);
      aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);  // stream_ is the stream created in step 2.

      // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.

      // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.

      // 7. Execute the model (only in a single thread).
      void *stamp = aclprofCreateStamp();
      aclprofSetStampTraceMessage(stamp, "aclmdlExecute_duration", strlen("aclmdlExecute_duration"));
      aclprofPush(stamp);  // Record the start of the time span.
      ret = aclmdlExecute(modelId, input, output);
      aclprofPop();        // Record the end of the time span on the same thread.
      aclprofDestroyStamp(stamp);

      // 8. Process the model inference result.

      // 9. Destroy the model input and output descriptions, free memory, and unload the model.
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
      aclprofDestroyStepInfo(stepInfo);

      // 10. Stop profiling and destroy the configuration and related resources.
      aclprofStop(config);
      aclprofDestroyConfig(config);
      aclprofFinalize();

      // 11. Destroy runtime allocations.

      // 12. Deinitialize AscendCL.
      // ......

- Example 4 (aclprofRangeStart/aclprofRangeStop, applicable to single-thread or cross-thread scenarios)

      // 1. Initialize AscendCL.

      // 2. Allocate runtime resources, including setting the compute device and creating a context and a stream.

      // 3. Initialize profiling.
      // Set the data flush path.
      const char *aclProfPath = "...";
      aclprofInit(aclProfPath, strlen(aclProfPath));

      // 4. Configure profiling.
      uint32_t deviceIdList[1] = {0};  // Set this parameter based on the device ID in the actual environment.
      // Create a configuration struct.
      aclprofConfig *config = aclprofCreateConfig(deviceIdList, 1, ACL_AICORE_ARITHMETIC_UTILIZATION,
          nullptr, ACL_PROF_ACL_API | ACL_PROF_TASK_TIME | ACL_PROF_MSPROFTX);
      const char *memFreq = "15";
      aclError ret = aclprofSetConfig(ACL_PROF_SYS_HARDWARE_MEM_FREQ, memFreq, strlen(memFreq));
      aclprofStart(config);
      aclprofStepInfo *stepInfo = aclprofCreateStepInfo();
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_START, stream_);  // stream_ is the stream created in step 2.

      // 5. Load your model. After the model is successfully loaded, modelId that identifies the model is returned.

      // 6. Create data of type aclmdlDataset to describe the inputs and outputs of your model.

      // 7. Execute the model (across threads).
      void *stamp = aclprofCreateStamp();
      aclprofSetStampTraceMessage(stamp, "aclmdlExecute_duration", strlen("aclmdlExecute_duration"));
      uint32_t rangeId = 0;
      aclprofRangeStart(stamp, &rangeId);  // Record the start of the time span and obtain its ID.
      ret = aclmdlExecute(modelId, input, output);
      aclprofRangeStop(rangeId);           // Record the end of the time span; may run on another thread.
      aclprofDestroyStamp(stamp);

      // 8. Process the model inference result.

      // 9. Destroy the model input and output descriptions, free memory, and unload the model.
      ret = aclprofGetStepTimestamp(stepInfo, ACL_STEP_END, stream_);
      aclprofDestroyStepInfo(stepInfo);

      // 10. Stop profiling and destroy the configuration and related resources.
      aclprofStop(config);
      aclprofDestroyConfig(config);
      aclprofFinalize();

      // 11. Destroy runtime allocations.

      // 12. Deinitialize AscendCL.
      // ......

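Example 4 starts and stops the range on the same thread, although the aclprofRangeStart/aclprofRangeStop pair may also be split across threads. The sketch below illustrates that pattern with a stand-in recorder: one thread opens a range and hands the returned ID to a worker thread, which closes it. `RangeRecorder` and `StartHereStopOnWorker` are hypothetical names introduced only for illustration. They do not reflect how msproftx is implemented; they only show why an explicit range ID makes the cross-thread case possible.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for a range recorder (not the msproftx implementation).
class RangeRecorder {
public:
    using Clock = std::chrono::steady_clock;

    // Opens a range and returns its ID, playing the role of the rangeId
    // output parameter of aclprofRangeStart in Example 4.
    std::uint32_t Start(const std::string &label) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::uint32_t id = nextId_++;
        open_[id] = OpenRange{label, Clock::now()};
        return id;
    }

    // Closes the range with the given ID from any thread.
    // Returns false if the ID is unknown or already closed.
    bool Stop(std::uint32_t id) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = open_.find(id);
        if (it == open_.end()) {
            return false;
        }
        closed_[it->second.label] = Clock::now() - it->second.start;
        open_.erase(it);
        return true;
    }

    std::size_t OpenCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return open_.size();
    }

    bool IsClosed(const std::string &label) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return closed_.count(label) != 0;
    }

private:
    struct OpenRange {
        std::string label;
        Clock::time_point start;
    };
    mutable std::mutex mutex_;
    std::uint32_t nextId_ = 1;
    std::map<std::uint32_t, OpenRange> open_;
    std::map<std::string, Clock::duration> closed_;
};

// Opens a range on the calling thread and closes it on a worker thread.
bool StartHereStopOnWorker(RangeRecorder &recorder, const std::string &label) {
    const std::uint32_t id = recorder.Start(label);
    bool stopped = false;
    std::thread worker([&recorder, &stopped, id]() { stopped = recorder.Stop(id); });
    worker.join();  // join() makes the worker's write to `stopped` visible here.
    return stopped;
}
```

In a real program the ID written by aclprofRangeStart would be passed to the other thread in the same way, for example through a queue or a captured variable, before that thread calls aclprofRangeStop.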
Profiling AscendCL APIs for Extension are called within the main function.
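Because aclprofPush and aclprofPop must be paired on the same thread (Example 3), one way to keep the pair balanced when a function returns early is a scope guard. The following is a minimal sketch of that idea, not an AscendCL facility. `SpanHooks`, `ScopedSpan`, and `RunScopedSpanDemo` are hypothetical names, and the push and pop operations are injected as callbacks so that the sketch builds without the SDK.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the profiler calls. In a real program, push
// would wrap stamp creation plus aclprofPush, and pop would wrap aclprofPop
// plus aclprofDestroyStamp.
struct SpanHooks {
    std::function<void(const std::string &)> push;  // Begin a span with a label.
    std::function<void()> pop;                       // End the most recent span.
};

// Scope guard: pushes on construction and pops on destruction, so the pair
// stays balanced on every exit path of the enclosing scope.
class ScopedSpan {
public:
    ScopedSpan(const SpanHooks &hooks, const std::string &label) : hooks_(hooks) {
        assert(hooks_.push && hooks_.pop);
        hooks_.push(label);
    }
    ~ScopedSpan() { hooks_.pop(); }
    ScopedSpan(const ScopedSpan &) = delete;
    ScopedSpan &operator=(const ScopedSpan &) = delete;

private:
    const SpanHooks &hooks_;
};

// Demo: runs a piece of work inside a span and returns the recorded call log.
std::vector<std::string> RunScopedSpanDemo(bool returnEarly) {
    std::vector<std::string> log;
    SpanHooks hooks{
        [&log](const std::string &label) { log.push_back("push:" + label); },
        [&log]() { log.push_back("pop"); }};
    auto work = [&]() {
        ScopedSpan span(hooks, "aclmdlExecute_duration");
        if (returnEarly) {
            return;  // The guard still pops here.
        }
        log.push_back("work");
    };
    work();
    return log;
}
```

The guard must be created and destroyed on the same thread, which matches the single-thread restriction of aclprofPush and aclprofPop.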