Function: create_config

C Prototype

aclprofConfig *aclprofCreateConfig(uint32_t *deviceIdlist, uint32_t deviceNums, aclprofAicoreMetrics aicoreMetrics, aclprofAicoreEvents *aicoreEvents, uint64_t dataTypeConfig)

Python Function

prof_config = acl.prof.create_config(device_list, aicore_metrics, aicore_events, data_type_config)

Function Usage

Creates data of the aclprofConfig type as the profiling configuration.

Created aclprofConfig data can be reused in multiple calls. You are responsible for keeping the data consistent and accurate across those calls.

To destroy data of the aclprofConfig type, call Function: destroy_config.

Input Description

device_list: list of device IDs. Set this parameter based on the actual device IDs.

aicore_metrics: AI Core metrics to collect, specified as a value of the aclprofAicoreMetrics type.

aicore_events: AI Core events. Set this parameter to 0.

data_type_config:

Combine one or more of the following values with bitwise OR (for example, ACL_PROF_ACL_API | ACL_PROF_AICORE_METRICS) and pass the result as data_type_config. Each value enables the collection of one type of profile data.

  • ACL_PROF_ACL_API: collects profile data of pyACL APIs, including the synchronous/asynchronous memory copy latency between the host and device.
  • ACL_PROF_TASK_TIME: collects operator delivery and execution duration data, as well as basic operator information, to provide more comprehensive performance analysis data.
  • ACL_PROF_TASK_TIME_L0: collects operator delivery and execution duration data. Unlike ACL_PROF_TASK_TIME, it does not collect basic operator information, so the collection overhead is lower and the duration statistics are more accurate.
  • ACL_PROF_AICORE_METRICS: collects AI Core metrics. This value must be included in the OR combination for aicore_metrics to take effect.
  • ACL_PROF_TASK_MEMORY: collects the memory usage of CANN operators. Only data of GE operators is collected.
  • ACL_PROF_AICPU: collects traces of AI CPU tasks, including the start and end of each task.
  • ACL_PROF_L2CACHE: collects L2 cache data.
  • ACL_PROF_HCCL_TRACE: collects HCCL data.
  • ACL_PROF_MSPROFTX: collects profile data output by the user and upper-layer framework programs. You need to add Profiling pyACL APIs for Extension (Extension APIs) to the application script first.
  • ACL_PROF_TRAINING_TRACE: collects iteration traces.
  • ACL_PROF_RUNTIME_API: collects runtime API profile data.
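
The OR combination described above can be sketched as follows. This is a minimal illustration only: the ProfDataType class, the build_data_type_config helper, and the numeric bit values are assumptions made for this example and are not taken from this page. In a real script, use the constant values defined by your CANN version.

```python
from enum import IntFlag


class ProfDataType(IntFlag):
    """Hypothetical stand-ins for the ACL_PROF_* values (assumed bit values)."""
    ACL_PROF_ACL_API = 0x0001         # assumed value, for illustration only
    ACL_PROF_TASK_TIME = 0x0002       # assumed value, for illustration only
    ACL_PROF_AICORE_METRICS = 0x0004  # assumed value, for illustration only


def build_data_type_config(*flags):
    """Combine individual flags into a single data_type_config integer."""
    config = 0
    for flag in flags:
        config |= int(flag)  # bitwise OR adds one data type to the selection
    return config


data_type_config = build_data_type_config(
    ProfDataType.ACL_PROF_ACL_API,
    ProfDataType.ACL_PROF_AICORE_METRICS,
)

# aicore_metrics only takes effect when ACL_PROF_AICORE_METRICS is selected.
aicore_metrics_enabled = bool(
    data_type_config & ProfDataType.ACL_PROF_AICORE_METRICS
)
```

Because each value occupies its own bit, testing for a selected data type is a bitwise AND against the combined integer, as the last statement shows.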

Return Value

prof_config: int.

  • If a non-zero value is returned, the operation is successful. The return value is the pointer address of data of the aclprofConfig type.
  • If 0 is returned, the operation fails.
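
A caller might check the return value along these lines. This is a sketch under stated assumptions: create_prof_config is a hypothetical helper name, and the pyACL module is passed in as acl_module (normally the result of import acl) so the helper can be exercised without a device.

```python
def create_prof_config(acl_module, device_list, aicore_metrics, data_type_config):
    """Create a profiling configuration and raise if creation fails.

    acl_module is assumed to be the imported pyACL module (import acl).
    """
    # aicore_events is passed as 0, as the input description requires.
    prof_config = acl_module.prof.create_config(
        device_list, aicore_metrics, 0, data_type_config
    )
    # 0 signals failure; any non-zero value is the pointer address
    # of the created aclprofConfig data.
    if prof_config == 0:
        raise RuntimeError("acl.prof.create_config failed (returned 0)")
    return prof_config
```

Raising on 0 keeps an invalid handle from being passed to later profiling calls. Returning the handle unchanged lets the caller reuse it and destroy it when done.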

Restrictions

  • Use the acl.prof.destroy_config API to destroy data of the aclprofConfig type. If the data is not destroyed, its memory is not freed.
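
One way to make sure the configuration is always destroyed is to wrap creation and destruction in a context manager. The sketch below is an assumption-laden illustration, not part of pyACL: profiling_config is a hypothetical helper, the pyACL module is passed in as acl_module (normally the result of import acl), and the return value of destroy_config is ignored for brevity.

```python
from contextlib import contextmanager


@contextmanager
def profiling_config(acl_module, device_list, aicore_metrics, data_type_config):
    """Yield an aclprofConfig handle and destroy it on exit."""
    # aicore_events is passed as 0, as the input description requires.
    prof_config = acl_module.prof.create_config(
        device_list, aicore_metrics, 0, data_type_config
    )
    if prof_config == 0:
        raise RuntimeError("acl.prof.create_config failed (returned 0)")
    try:
        yield prof_config
    finally:
        # Runs even if the body raises, so the memory is always released.
        acl_module.prof.destroy_config(prof_config)
```

With this helper, the body of a with profiling_config(acl, [0], metrics, flags) as cfg: block can use cfg freely, and the configuration is destroyed once the block ends, whether or not an exception occurred.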