set_flag

Function

Synchronizes instructions across different pipelines within a core. After the instructions on pipe_src are scheduled, pipe_dst is unblocked. Using set_flag together with wait_flag makes the instruction pipeline chart match the intended call order more closely.
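The blocking behavior described above can be illustrated with a toy model built on Python threads. This is only a sketch of the set/wait idea, not how mskpp implements it: the FlagBoard class, its methods, and the run_demo helper are hypothetical names introduced for illustration.

```python
import threading


class FlagBoard:
    """Toy model (hypothetical) of the event flags shared by pipelines in one core."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _event(self, pipe_src, pipe_dst, event_id):
        # One event object per (pipe_src, pipe_dst, event_id) dependency.
        key = (pipe_src, pipe_dst, event_id)
        with self._lock:
            return self._events.setdefault(key, threading.Event())

    def set_flag(self, pipe_src, pipe_dst, event_id):
        # Called by the source pipeline once its work has been issued.
        self._event(pipe_src, pipe_dst, event_id).set()

    def wait_flag(self, pipe_src, pipe_dst, event_id, timeout=5.0):
        # Blocks the destination pipeline until the matching flag is set.
        return self._event(pipe_src, pipe_dst, event_id).wait(timeout)


def run_demo():
    board = FlagBoard()
    log = []

    def mte2():
        log.append("MTE2: load GM -> L1")
        board.set_flag("PIPE-MTE2", "PIPE-MTE1", 1)

    def mte1():
        board.wait_flag("PIPE-MTE2", "PIPE-MTE1", 1)
        log.append("MTE1: load L1 -> L0A")

    # Start the consumer first to show that it really waits for the producer.
    consumer = threading.Thread(target=mte1)
    producer = threading.Thread(target=mte2)
    consumer.start()
    producer.start()
    consumer.join()
    producer.join()
    return log


if __name__ == "__main__":
    print(run_demo())
```

Even though the MTE1 thread starts first, its log entry always comes second, because it cannot proceed until the MTE2 thread sets the flag.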

Prototype

set_flag(pipe_src, pipe_dst, event_id)

Parameters

Parameter

Input/Output

Description

pipe_src

Input

Source pipeline. event_id is set after pipe_src is scheduled.

Format: aicore_PIPE, for example, aic0_PIPE-MTE1. For the valid values of aicore, see Core. Valid pipeline values: PIPE-MTE1, PIPE-MTE2, PIPE-MTE3, PIPE-FIX, PIPE-M, PIPE-V, and PIPE-S. If aicore is not specified, pass the pipeline value alone, for example, PIPE-MTE1.

Data type: string.

This parameter is required.

pipe_dst

Input

Destination pipeline. After pipe_src is scheduled, pipe_dst is unblocked.

Format: aicore_PIPE, for example, aic0_PIPE-MTE1. For the valid values of aicore, see Core. Valid pipeline values: PIPE-MTE1, PIPE-MTE2, PIPE-MTE3, PIPE-FIX, PIPE-M, PIPE-V, and PIPE-S. If aicore is not specified, pass the pipeline value alone, for example, PIPE-MTE1.

Data type: string.

This parameter is required.

event_id

Input

Unique identifier of the dependency between a pair of synchronization instructions (set_flag and wait_flag).

Value range: [0, 65535]

Data type: int.

This parameter is required.
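The parameter formats above can be checked before calling set_flag. The following is a minimal sketch, assuming the aicore prefix and the pipeline name are separated by the last underscore; parse_pipe and check_event_id are hypothetical helpers, not part of mskpp, and the aicore value itself is not validated because its range is defined in the Core topic.

```python
# Pipeline names listed in the parameter description.
PIPELINES = {
    "PIPE-MTE1", "PIPE-MTE2", "PIPE-MTE3",
    "PIPE-FIX", "PIPE-M", "PIPE-V", "PIPE-S",
}
EVENT_ID_MIN, EVENT_ID_MAX = 0, 65535


def parse_pipe(value):
    """Split 'aic0_PIPE-MTE1' into ('aic0', 'PIPE-MTE1').

    Returns (None, pipeline) when no aicore prefix is given.
    """
    if not isinstance(value, str):
        raise TypeError("pipe must be a string")
    core, sep, pipeline = value.rpartition("_")
    if pipeline not in PIPELINES:
        raise ValueError(f"unknown pipeline: {pipeline!r}")
    if sep and not core:
        raise ValueError("empty aicore prefix")
    return (core if sep else None, pipeline)


def check_event_id(event_id):
    """Return event_id if it is an int in [0, 65535], else raise."""
    if isinstance(event_id, bool) or not isinstance(event_id, int):
        raise TypeError("event_id must be an int")
    if not EVENT_ID_MIN <= event_id <= EVENT_ID_MAX:
        raise ValueError("event_id must be in [0, 65535]")
    return event_id


if __name__ == "__main__":
    print(parse_pipe("aic0_PIPE-MTE1"))
    print(parse_pipe("PIPE-V"))
    print(check_event_id(1))
```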

Constraints

Within the same core, the number of set_flag calls must equal the number of wait_flag calls.
A set_flag instruction cannot be duplicated within the same core.
Within the same core, set_flag and wait_flag pairs that use the same pipe_src and pipe_dst must each use a unique event_id.
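The constraints above can be sketched as a small pre-check over a recorded list of calls. This reflects one reading of the constraints (a repeated combination of pipe_src, pipe_dst, and event_id counts as a duplicate), assumes all calls belong to a single core, and uses a hypothetical check_flags helper that is not part of mskpp.

```python
from collections import Counter


def check_flags(calls):
    """Return a list of constraint violations for one core.

    calls: list of (kind, pipe_src, pipe_dst, event_id) tuples,
    where kind is "set" or "wait".
    """
    sets = Counter(c[1:] for c in calls if c[0] == "set")
    waits = Counter(c[1:] for c in calls if c[0] == "wait")
    problems = []

    # Constraint 1: the set_flag and wait_flag counts must match.
    n_set, n_wait = sum(sets.values()), sum(waits.values())
    if n_set != n_wait:
        problems.append(f"count mismatch: {n_set} set_flag vs {n_wait} wait_flag")

    # Constraints 2 and 3: the same (pipe_src, pipe_dst, event_id)
    # must not be reused by more than one set_flag or wait_flag.
    for name, counter in (("set_flag", sets), ("wait_flag", waits)):
        for key, count in counter.items():
            if count > 1:
                problems.append(f"duplicate {name}{key}")
    return problems


if __name__ == "__main__":
    ok = [
        ("set", "PIPE-MTE2", "PIPE-MTE1", 1),
        ("wait", "PIPE-MTE2", "PIPE-MTE1", 1),
    ]
    print(check_flags(ok))
    print(check_flags(ok + [("set", "PIPE-MTE2", "PIPE-MTE1", 1)]))
```

Reusing a pipeline pair is fine under this reading as long as each pair of calls gets its own event_id.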

Example

from mskpp import Tensor, Chip, set_flag, wait_flag
with Chip("Ascendxxyy") as chip:
    gm_weight = Tensor("GM", "FP16", [128, 256], format="ND")
    l1_weight = Tensor("L1", "FP16", [128, 256], format="ND")
    for conv_idx in range(4):  # Load data from GM to L1 in batches before it is loaded to L0A.
        gm_weight_part = gm_weight[:, 64]
        l1_weight_part = l1_weight[:, 64]
        l1_weight_part.load(gm_weight_part)
        if conv_idx == 3:
            set_flag("PIPE-MTE2", "PIPE-MTE1", 1)  # MTE1 can be executed only after MTE2 is complete.
    x = Tensor("L0A")
    # The MTE1 load below must wait until the MTE2 loads above are complete.
    l1_weight.set_valid()  # Manually enable L1.
    wait_flag("PIPE-MTE2", "PIPE-MTE1", 1)
    x.load(l1_weight)
