PipeUtilization (Percentages of Time Taken by Compute Units and MTEs)
PipeUtilization.csv records the time taken by the compute units and MTEs, together with the share of total cycles that each one accounts for. Use this data to optimize the data transfer logic and improve bandwidth utilization. For details, see the field descriptions in the following tables.
| Field | Description |
|---|---|
| block_id | Number of running task blocks, which corresponds to the number of cores configured during task running. |
| sub_block_id | Name and sequence number of each block used for task running. |
| aic_time(us) | Execution time of each AI Core compute unit after the task is allocated to the unit, in μs. |
| aic_total_cycles | Total number of cycles executed on each AI Core compute unit after the task is allocated to the unit. |
| aiv_time(us) | Execution time of each AI Vector Core compute unit after the task is allocated to the unit, in μs. |
| aiv_total_cycles | Total number of cycles executed on each AI Vector Core compute unit after the task is allocated to the unit. |
| aiv_vec_time(us) | Time taken to execute Vector instructions, in μs. |
| aiv_vec_ratio | Ratio of cycles taken to execute Vector instructions to the total cycles. |
| aic_cube_time(us) | Time taken to execute Cube instructions (fp16 and s16), in μs. |
| aic_cube_ratio | Ratio of cycles taken to execute Cube instructions (fp16 and s16) to the total cycles. |
| ai*_scalar_time(us) | Time taken to execute Scalar instructions, in μs. |
| ai*_scalar_ratio | Ratio of cycles taken to execute Scalar instructions to the total cycles. |
| aic_fixpipe_time(us) | Time taken to execute fixpipe instructions (L0C-to-GM/L1 movement), in μs. |
| aic_fixpipe_ratio | Ratio of cycles taken to execute fixpipe instructions (L0C-to-GM/L1 movement) to the total cycles. |
| aic_mte1_time(us) | Time taken to execute MTE1 instructions (L1-to-L0A/L0B movement), excluding the movement wait time, in μs. |
| aic_mte1_ratio | Ratio of cycles taken to execute MTE1 instructions (L1-to-L0A/L0B movement) to the total cycles. |
| ai*_mte2_time(us) | Time taken to execute MTE2 instructions (GM-to-AI Core movement), in μs. |
| ai*_mte2_ratio | Ratio of cycles taken to execute MTE2 instructions (GM-to-AI Core movement) to the total cycles. |
| ai*_mte3_time(us) | Time taken to execute MTE3 instructions (AI Core-to-GM movement), in μs. |
| ai*_mte3_ratio | Ratio of cycles taken to execute MTE3 instructions (AI Core-to-GM movement) to the total cycles. |
| ai*_icache_miss_rate | iCache miss rate, that is, the rate at which instructions miss the L1 cache. The smaller the value, the better. |
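
The ratio fields above can be compared row by row to see which pipe occupies the largest share of cycles. The following is a minimal sketch, not part of the profiling tool: it assumes that the file is a plain comma-separated file with a header row, that ratio cells hold decimal numbers, and that blank or non-numeric cells (such as `N/A`) can be skipped. The helper name `dominant_pipes` and the sample rows are hypothetical.

```python
import csv
import io


def dominant_pipes(csv_text):
    """Return (block_id, column, value) for the largest *_ratio cell in each row."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ratios = {}
        for name, value in row.items():
            # Compare only the per-pipe cycle ratios; *_icache_miss_rate is a
            # different kind of metric and does not end in "_ratio".
            if name and name.endswith("_ratio"):
                try:
                    ratios[name] = float(value)
                except (TypeError, ValueError):
                    continue  # assumption: skip blank or non-numeric cells
        if ratios:
            top = max(ratios, key=ratios.get)
            results.append((row.get("block_id"), top, ratios[top]))
    return results


# Made-up sample rows for illustration only, not real profiler output.
SAMPLE = """block_id,aic_cube_ratio,aic_mte2_ratio,aiv_vec_ratio,aiv_mte2_ratio
0,0.62,0.21,0.10,0.05
1,0.15,0.70,0.08,N/A
"""

for block_id, pipe, ratio in dominant_pipes(SAMPLE):
    print(f"block {block_id}: {pipe} = {ratio:.2f}")
```

To run the same check on a real file, read PipeUtilization.csv into a string and pass it to the helper; the column names found in the header are used as they are.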

| Field | Description |
|---|---|
| aic_time(us) | Execution time of each AI Core compute unit after the task is allocated to the unit, in μs. |
| aic_total_cycles | Total number of cycles executed on each AI Core compute unit after the task is allocated to the unit. |
| aic_cube_time(us) | Time taken to execute Cube instructions (fp16 and s16), in μs. |
| aic_cube_ratio | Ratio of cycles taken to execute Cube instructions (fp16 and s16) to the total cycles. |
| aic_scalar_time(us) | Time taken to execute Scalar instructions, in μs. |
| aic_scalar_ratio | Ratio of cycles taken to execute Scalar instructions to the total cycles. |
| aic_mte1_time(us) | Time taken to execute MTE1 instructions (L1-to-L0A/L0B movement), excluding the movement wait time, in μs. |
| aic_mte1_ratio | Ratio of cycles taken to execute MTE1 instructions (L1-to-L0A/L0B movement) to the total cycles. |
| aic_mte2_time(us) | Time taken to execute MTE2 instructions (GM-to-AI Core movement), in μs. |
| aic_mte2_ratio | Ratio of cycles taken to execute MTE2 instructions (GM-to-AI Core movement) to the total cycles. |
| aic_mte3_time(us) | Time taken to execute MTE3 instructions (AI Core-to-GM movement), in μs. |
| aic_mte3_ratio | Ratio of cycles taken to execute MTE3 instructions (AI Core-to-GM movement) to the total cycles. |
| aic_icache_miss_rate | iCache miss rate, that is, the rate at which instructions miss the L1 cache. The smaller the value, the better. |
| aic_vec_time(us) | Time taken to execute Vector instructions, in μs. |
| aic_vec_ratio | Ratio of cycles taken to execute Vector instructions to the total cycles. |
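
Because the advice for this file is to optimize the data transfer logic, a quick way to find candidate rows is to compare the MTE ratios with the compute ratios. The sketch below uses the `aic_`-only column layout of the preceding table. The rule it applies (a row is flagged when its busiest MTE pipe has a higher ratio than its busiest compute pipe) is an illustrative heuristic, not a criterion defined by the profiler, and the helper names and sample rows are hypothetical.

```python
import csv
import io

# Column groups taken from the field table; the grouping itself is an assumption.
MTE_COLUMNS = ("aic_mte1_ratio", "aic_mte2_ratio", "aic_mte3_ratio")
COMPUTE_COLUMNS = ("aic_cube_ratio", "aic_vec_ratio", "aic_scalar_ratio")


def largest_ratio(row, columns):
    """Return the largest numeric value among the given columns, or 0.0 if none parse."""
    values = []
    for name in columns:
        try:
            values.append(float(row.get(name, "")))
        except (TypeError, ValueError):
            pass  # assumption: ignore missing or non-numeric cells
    return max(values, default=0.0)


def transfer_heavy_rows(csv_text):
    """Return zero-based indices of rows where an MTE pipe outweighs every compute pipe."""
    flagged = []
    for index, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        if largest_ratio(row, MTE_COLUMNS) > largest_ratio(row, COMPUTE_COLUMNS):
            flagged.append(index)
    return flagged


# Made-up sample rows for illustration only, not real profiler output.
SAMPLE_ROWS = """aic_cube_ratio,aic_vec_ratio,aic_scalar_ratio,aic_mte1_ratio,aic_mte2_ratio,aic_mte3_ratio
0.55,0.05,0.10,0.20,0.30,0.02
0.12,0.03,0.08,0.10,0.64,0.04
"""

print(transfer_heavy_rows(SAMPLE_ROWS))
```

A flagged row is only a starting point for investigation; whether a high MTE ratio is actually a problem depends on the operator being profiled.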