task_time (Task Scheduler Information)

Task Scheduler timeline data is displayed at the Ascend Hardware level in the msprof_*.json file, and summary data is written to the task_time_*.csv file. Together they help identify scheduling durations while an AI task is running.

Availability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Task Scheduler Data in msprof_*.json

The Task Scheduler data in the msprof_*.json file is displayed in each stream of Ascend Hardware. It records the execution time of each task on the different accelerators while an AI task is running, so scheduling durations can be read directly from the timeline.

The following is an example of the Task Scheduler data in the msprof_*.json file:

Figure 1 Ascend Hardware

See the following table for more details.

Table 1 Field description

| Field | Description |
| --- | --- |
| Title | API name of a component. |
| Start | Start time of the event on the timeline, automatically aligned with the Chrome trace timeline (ms). |
| Wall Duration | Time taken by an API call (ms). |
| Task Time(us) | Task execution duration of an AI CPU operator (μs). |
| Reduce Duration (μs) | Collective communication time of the ALL REDUCE operator (μs). |
| Model Id | Model ID. |
| Task Type | Type of the accelerator that executes a task, including AI Core, AI Vector Core, and AI CPU. |
| Stream Id | ID of the stream where a task is located. The stream ID under Ascend Hardware is the complete logical stream ID of the task, and the stream ID attribute of each API in the timeline on the right is the physical stream ID of the API. |
| Task Id | Task ID. |
| Subtask Id | Subtask ID. |
| Aicore Time(ms) | Theoretical execution time of a task on the AI Core (ms), assuming that all blocks are scheduled simultaneously and each block has an equal execution duration. Because the scheduling start time of each block typically differs slightly, this value is slightly less than the actual execution time of the task on the AI Core. The data is inaccurate in manual frequency modulation and dynamic frequency modulation (when the power consumption exceeds the default value) scenarios, so relying on it in those scenarios is not advised. |
| Total Cycle | Total number of execution cycles of a task on the AI Core, which is the sum of the execution cycles of all blocks. |
| Receive Time | Time when the device receives information about a memory copy task (μs). This field is displayed only for the MemcopyAsync API. |
| Start Time | Time when a memory copy task starts copying data (μs). This field is displayed only for the MemcopyAsync API. |
| End Time | Time when a memory copy task finishes copying data (μs). This field is displayed only for the MemcopyAsync API. |
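The per-stream view described above can also be reproduced offline. The following is a minimal sketch, assuming that msprof_*.json is a JSON array of Chrome trace events, that task entries are "complete" events (ph set to "X") with ts and dur in μs, and that the Table 1 fields appear under args with the key names shown. The sample events and the function name duration_by_stream_and_type are hypothetical; check them against a real file before relying on the result.

```python
import json
from collections import defaultdict

# Hypothetical sample shaped like Chrome trace "complete" events (ph == "X").
# The args keys mirror the Table 1 field names; real files may differ.
SAMPLE_TRACE = json.dumps([
    {"name": "MatMul", "ph": "X", "ts": 100.0, "dur": 40.0,
     "args": {"Task Type": "AI Core", "Stream Id": 3, "Task Id": 7}},
    {"name": "Cast", "ph": "X", "ts": 150.0, "dur": 10.0,
     "args": {"Task Type": "AI CPU", "Stream Id": 3, "Task Id": 8}},
    {"name": "Add", "ph": "X", "ts": 120.0, "dur": 25.0,
     "args": {"Task Type": "AI Core", "Stream Id": 5, "Task Id": 2}},
    {"name": "process_name", "ph": "M", "args": {"name": "Ascend Hardware"}},
])


def duration_by_stream_and_type(trace_text):
    """Sum event durations (us) per (stream ID, task type) pair."""
    totals = defaultdict(float)
    for event in json.loads(trace_text):
        if event.get("ph") != "X":  # skip metadata and other event kinds
            continue
        args = event.get("args", {})
        key = (args.get("Stream Id"), args.get("Task Type"))
        totals[key] += float(event.get("dur", 0.0))
    return dict(totals)


totals = duration_by_stream_and_type(SAMPLE_TRACE)
# Sort on the string form of the key so missing (None) fields do not break sorting.
for (stream_id, task_type), total_us in sorted(totals.items(), key=lambda kv: str(kv[0])):
    print(f"stream {stream_id} {task_type}: {total_us:.1f} us")
```

To use it on a real trace, pass the file contents instead of SAMPLE_TRACE and adjust the args key names to those that appear in the file.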

task_time_*.csv File (Atlas 200/300/500 Inference Product)

The file content is formatted as follows.

Figure 2 task_time_*.csv

By examining the task duration percentage, average duration, minimum duration, maximum duration, wait time, and execution duration, you can pinpoint the causes of prolonged task execution.

Table 2 Field description

| Field | Description |
| --- | --- |
| Device_id | Device ID. |
| Time(%) | Task duration percentage. |
| Time(us) | Total duration (μs). |
| Count | Number of times that a task is executed. |
| Avg/Min/Max | Average duration, minimum duration, and maximum duration (μs). |
| Waiting | Total wait time of a task (μs). |
| Running | Total run time of a task (μs). An abnormally large value indicates that the operator implementation needs to be improved. |
| Pending | Total pending time of a task (μs). |
| Type | Task type. |
| API | API name. |
| Task ID | Task ID. |
| OP Name | Operator name. |
| Stream ID | ID of the stream where a task is located. |
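The check on the Running column can be automated. The sketch below is illustrative only: the header spellings are assumed from Table 2, the sample rows are invented, and rank_by_running is a hypothetical helper. It ranks tasks by total run time and reports what share of the Waiting, Running, and Pending sum was spent running. Using that sum as the denominator is a choice made here, not something the file format prescribes.

```python
import csv
import io

# Hypothetical rows; the header spellings are assumed from Table 2 and
# should be adjusted to match the header line of a real task_time_*.csv.
SAMPLE_CSV = """Time(%),Time(us),Count,Waiting,Running,Pending,Type,OP Name,Stream ID
60.0,600.0,3,50.0,500.0,50.0,AI Core,MatMul,3
30.0,300.0,2,200.0,80.0,20.0,AI CPU,Cast,3
10.0,100.0,1,10.0,85.0,5.0,AI Core,Add,5
"""


def rank_by_running(csv_text):
    """Return tasks sorted by total run time (us), largest first."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        waiting = float(row["Waiting"])
        running = float(row["Running"])
        pending = float(row["Pending"])
        total = waiting + running + pending
        rows.append({
            "op": row["OP Name"],
            "running_us": running,
            # Fraction of waiting + running + pending spent running.
            "running_share": running / total if total else 0.0,
        })
    return sorted(rows, key=lambda r: r["running_us"], reverse=True)


ranked = rank_by_running(SAMPLE_CSV)
for entry in ranked:
    print(f'{entry["op"]}: {entry["running_us"]:.1f} us running, '
          f'{entry["running_share"]:.0%} of tracked time')
```

A task near the top of this ranking with a high running share is a candidate for operator optimization, while a low share suggests that the time is going to waiting or pending instead.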

task_time_*.csv File (Atlas Training Series Product)

The file content is formatted as follows.

Figure 3 task_time_*.csv

By examining the operator with the highest time consumption in a task, you can check its implementation to determine whether the operator is faulty.

Table 3 Field description

| Field | Description |
| --- | --- |
| Device_id | Device ID. |
| kernel_name | Kernel name. If N/A is displayed, the operator is a non-compute operator. |
| kernel_type | Kernel type, including KERNEL_AICORE, KERNEL_AICPU, and more. |
| stream_id | ID of the stream where a task is located. |
| task_id | Task ID. |
| task_time(us) | Task duration (μs), including scheduling time to the accelerator, execution time on the accelerator, and response end time. |
| task_start(us) | Task start time (μs). |
| task_stop(us) | Task end time (μs). |
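Finding the most time-consuming operator can be scripted. The following is a minimal sketch that assumes the CSV header matches the Table 3 field names exactly. The sample rows, the placeholder kernel type OTHER, and the helper top_kernels are hypothetical. It sums task_time(us) per kernel name and skips rows whose kernel name is N/A, since those are non-compute operators.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows; column names follow Table 3. "OTHER" is a placeholder
# kernel type for the non-compute row, not a value taken from a real file.
SAMPLE_CSV = """kernel_name,kernel_type,stream_id,task_id,task_time(us),task_start(us),task_stop(us)
MatMul,KERNEL_AICORE,3,1,120.5,1000.0,1120.5
N/A,OTHER,3,2,15.0,1125.0,1140.0
MatMul,KERNEL_AICORE,3,3,130.0,1150.0,1280.0
Relu,KERNEL_AICORE,5,1,20.0,1010.0,1030.0
"""


def top_kernels(csv_text, limit=3):
    """Return up to `limit` (kernel_name, total_us) pairs, largest first."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["kernel_name"] == "N/A":  # non-compute operator, skip
            continue
        totals[row["kernel_name"]] += float(row["task_time(us)"])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]


top = top_kernels(SAMPLE_CSV)
for name, total_us in top:
    print(f"{name}: {total_us:.1f} us")
```

Because task_time(us) includes scheduling and response time as well as execution time, a kernel that ranks high here is a starting point for inspection rather than proof of a slow implementation.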