ai_vector_core_utilization (AI Vector Core Instruction Proportion)

The AI Vector Core instruction proportion data contains no timeline information. The summary data is written to the ai_vector_core_utilization_*.csv file.

Availability

Atlas 200/500 A2 Inference Product

Atlas A2 Training Series Product/Atlas 800I A2 Inference Product

Atlas A3 Training Series Product

ai_vector_core_utilization_*.csv File

The file content is formatted as follows.

Figure 1 ai_vector_core_utilization_*.csv

Table 1 Field description

Field

Description

vec_ratio

Ratio of cycles taken to execute Vector instructions to the total cycles. This field is not supported on the Atlas 200/500 A2 Inference Product, where its value defaults to N/A.

mac_ratio

Ratio of cycles taken to execute Cube instructions (fp16 and s16) to the total cycles.

scalar_ratio

Ratio of cycles taken to execute Scalar instructions to the total cycles.

mte1_ratio

Ratio of cycles taken to execute MTE1 instructions (L1-to-L0A/L0B transfer) to the total cycles.

mte2_ratio

Ratio of cycles taken to execute MTE2 instructions to the total cycles. On the Atlas 200/500 A2 Inference Product, MTE2 instructions perform DDR-to-AI Core transfers. On the Atlas A2 Training Series Product/Atlas 800I A2 Inference Product and the Atlas A3 Training Series Product, they perform on-chip memory-to-AI Core transfers.

mte3_ratio

Ratio of cycles taken to execute MTE3 instructions to the total cycles. On the Atlas 200/500 A2 Inference Product, MTE3 instructions perform AI Core-to-DDR transfers. On the Atlas A2 Training Series Product/Atlas 800I A2 Inference Product and the Atlas A3 Training Series Product, they perform AI Core-to-on-chip memory transfers.

icache_miss_rate

Instruction cache (iCache) miss rate, that is, the rate at which instruction fetches miss the L1 cache. A smaller value is better.

memory_bound

AI Core memory bound, calculated as mte2_ratio/max(mac_ratio, vec_ratio). A value less than 1 indicates that no memory bound exists. A value greater than 1 indicates that a memory bound exists, and the greater the value, the more severe the bound.

Table 1 uses the AI Vector Core metric PipeUtilization in sample-based mode as an example. For more details, see ai_core_utilization (AI Core Instruction Proportion).
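
The memory_bound formula in Table 1 can be reproduced from the other fields of a row. The following Python sketch is illustrative only and is not part of the profiling tool: the helper names parse_ratio and memory_bound are hypothetical, and two behaviors are assumptions rather than documented facts, namely that a vec_ratio of N/A is skipped so that only mac_ratio is used, and that the result is undefined when both compute ratios are 0.

```python
def parse_ratio(value):
    """Convert a CSV cell to float; treat 'N/A', empty, or missing cells as None."""
    if value is None:
        return None
    text = str(value).strip()
    if text == "" or text.upper() == "N/A":
        return None
    return float(text)


def memory_bound(row):
    """Return mte2_ratio / max(mac_ratio, vec_ratio) for one row, or None if undefined.

    `row` is a mapping from field name to cell text, for example one record
    produced by csv.DictReader.
    """
    mte2 = parse_ratio(row.get("mte2_ratio"))
    # Assumption: an unsupported (N/A) ratio is left out of the max().
    compute = [
        ratio
        for ratio in (parse_ratio(row.get("mac_ratio")), parse_ratio(row.get("vec_ratio")))
        if ratio is not None
    ]
    if mte2 is None or not compute:
        return None
    denominator = max(compute)
    if denominator == 0:
        # Assumption: no compute cycles recorded, so the ratio is undefined.
        return None
    return mte2 / denominator


# Made-up sample values, not real profiling output.
sample = {"vec_ratio": "0.25", "mac_ratio": "0.125", "mte2_ratio": "0.5"}
print(memory_bound(sample))  # 2.0, greater than 1, so a memory bound exists
```

Returning None instead of raising an error keeps rows with unsupported fields distinguishable from rows whose value is genuinely 0.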