for_range

Description

Defines a TIK for loop. N buffers (multi-buffering) and multiple blocks can be enabled on the loop.

Prototype

for_range(begint, endt, name="i", thread_num=1, thread_type="whole", block_num=1, dtype="int32", for_type="serial")

Parameters

Table 1 Parameter description

Parameter

Input/Output

Description

begint

Input

Start value of the for_range loop's variable.

begint and endt are immediates (of type int or uint), Scalars (of type int or uint), or Exprs. If Exprs are passed, the simplified values must be integers and the following condition must be met: 0 <= begint <= endt <= 2147483647.

The loop variable takes each integer value in the range begint <= value < endt.

endt

Input

End value of the for_range loop's variable.

begint and endt are immediates (of type int or uint), Scalars (of type int or uint), or Exprs. If Exprs are passed, the simplified values must be integers and the following condition must be met: 0 <= begint <= endt <= 2147483647.

The loop variable takes each integer value in the range begint <= value < endt.

NOTE:

The performance deteriorates when begint and endt are Scalars.

name

Input

Name of a variable in the for_range loop.

NOTE:
  • You are advised to set this parameter and keep it unique so that problems are easier to locate during operator building.
  • If the operator implementation code contains multiple for_range statements and the name parameter is not set, the loop variables are named i, j, k, ... by default.

thread_num

Input

An immediate specifying whether to enable N buffers in the for_range loop.

  • 0: forcibly disables N buffers. In this case, even if N buffers are enabled for the outer for_range() loop, the current loop will not be affected and N buffers will not be enabled.
  • 1: disables N buffers. However, if N buffers are enabled for the outer for_range() loop, the current loop is affected and N buffers are enabled, provided that memory is sufficient.
  • 2: enables two buffers.
  • 3: enables three buffers.
  • 4: enables four buffers.
NOTE:

Use 3 or 4 only for data movement operators that require maximum performance. For common operators, enabling two buffers is recommended for performance optimization.

thread_type

Input

Thread type of the for_range loop. This parameter is reserved and has no impact on system running. Must be "whole".

block_num

Input

Number of blocks used in the for_range loop. An immediate or Scalar (int, uint). Must be in the range of [1, 65535].

  • If the number of blocks configured is greater than the number of available blocks, Runtime will perform scheduling in batches.
  • If the number of blocks configured is less than or equal to the number of available blocks, Runtime will perform scheduling as required. The number of running blocks may be less than or equal to the number of blocks configured.

dtype

Input

Variable type of the for_range loop. This parameter is reserved and has no impact on system running. Must be "int32".

for_type

Input

Type of the for_range loop. Must be "serial".
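
The value constraints in Table 1 can be summarized as a small pre-check. The following is a minimal pure-Python sketch and is not part of TIK: the helper name check_for_range_args is hypothetical, and it handles only immediates (Python ints), not Scalars or Exprs.

```python
# Hypothetical helper (not a TIK API): checks immediate arguments against Table 1.
INT32_MAX = 2147483647


def check_for_range_args(begint, endt, thread_num=1, thread_type="whole",
                         block_num=1, dtype="int32", for_type="serial"):
    """Return a list of constraint violations; an empty list means OK."""
    errors = []
    if not 0 <= begint <= endt <= INT32_MAX:
        errors.append("need 0 <= begint <= endt <= 2147483647")
    if thread_num not in (0, 1, 2, 3, 4):
        errors.append("thread_num must be 0, 1, 2, 3, or 4")
    if thread_type != "whole":
        errors.append('thread_type must be "whole"')
    if not 1 <= block_num <= 65535:
        errors.append("block_num must be in [1, 65535]")
    if dtype != "int32":
        errors.append('dtype must be "int32"')
    if for_type != "serial":
        errors.append('for_type must be "serial"')
    return errors


if __name__ == "__main__":
    print(check_for_range_args(0, 10, thread_num=2))    # no violations
    print(check_for_range_args(5, 3, block_num=70000))  # range and block_num violations
```

Returning a list rather than raising on the first problem is a choice made for this sketch, so that all violations can be reported at once.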

Applicability

Atlas 200/300/500 Inference Product

Atlas Training Series Product

Restrictions

  1. In a for_range loop, if there are nested loops, multiple blocks can be enabled only for the outermost loop.
  2. If multiple blocks are enabled, the start value of the multi-block loop must be 0, and the number of blocks must be of the same value and type (int or scalar) as the end value of the multi-block loop. If block_num is a scalar, thread_num must be the immediate 1.
  3. In a for_range loop, multiple blocks and N buffers are mutually exclusive. To enable them both, use multiple loops.
  4. If multiple blocks are enabled, every tensor that the loop uses must be defined inside the multi-block loop. If tensor memory allocation both inside and outside the multi-block loop starts at 0, address overlapping and data errors may occur.
  5. In an operator, for_range can be called only once to enable multiple blocks (block_num ≥ 2). Note that the number of blocks must be the same as the loop count.
  6. When N buffers are enabled, multiple buffers are allocated for tensors defined in the for_range loop.
  7. Do not change the value of endt in for_range in the loop body. Otherwise, the operator execution task will be suspended.
  8. When begint and endt are Scalars, the range check treats begint as 0 and uses endt minus begint as the end value. The following condition must still be met: 0 <= begint <= endt <= 2147483647.
  9. If 0 < endt - begint < thread_num, an error is reported at build time.
  10. When branching on a loop variable, use if_scope and else_scope instead of the Python if and else statements:
    # Use loop variables.
    with self.tik_instance.for_range(0, 10) as i:
        with self.tik_instance.if_scope(i == 0):    # Do not use a Python "if i == 0:".
            do_something
        with self.tik_instance.else_scope():    # Do not use a Python "else:".
            do_something
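
Several of the restrictions above are plain arithmetic conditions on the arguments. The following is a minimal pure-Python sketch of restrictions 2, 3, and 9 for immediates only. The helper check_restrictions is hypothetical and is not a TIK API; TIK performs its own checks at build time.

```python
# Hypothetical helper (not a TIK API): models restrictions 2, 3, and 9 for immediates.
def check_restrictions(begint, endt, thread_num=1, block_num=1):
    """Raise ValueError on the first violated restriction."""
    if block_num > 1:
        # Restriction 2: a multi-block loop starts at 0 and its end value equals block_num.
        if begint != 0 or endt != block_num:
            raise ValueError("multi-block loop needs begint == 0 and endt == block_num")
        # Restriction 3: multiple blocks and N buffers cannot share one loop.
        if thread_num > 1:
            raise ValueError("multiple blocks and N buffers are mutually exclusive")
    # Restriction 9: a non-empty loop with fewer iterations than thread_num is rejected.
    if 0 < endt - begint < thread_num:
        raise ValueError("loop count is smaller than thread_num")


if __name__ == "__main__":
    check_restrictions(0, 2, block_num=2)   # valid multi-block loop
    check_restrictions(0, 8, thread_num=2)  # valid double-buffered loop
    try:
        check_restrictions(0, 2, thread_num=2, block_num=2)
    except ValueError as err:
        print("rejected:", err)
```

To combine both features, restriction 3 points to using separate loops, for example a multi-block outer loop with an N-buffer inner loop.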

Returns

A TikWithScope object.

Example

# thread_num=1 (the default): this loop does not enable N buffers.
with self.tik_instance.for_range(0, 1, thread_num=1):
    do_something

# Enable double buffering. Note that two buffers are allocated only for tensors defined in for_range.
with self.tik_instance.for_range(0, 2, thread_num=2):
    # Tensor definition
    do_something

# Enable multiple blocks. The start value is 0 and the end value equals block_num.
with self.tik_instance.for_range(0, 2, block_num=2):
    do_something
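
Because a multi-block loop runs one iteration per block, its loop variable can serve as a block index for dividing the work. The following pure-Python model makes no TIK calls and only illustrates that arithmetic. The function block_slices, the even division, and the contiguous slice layout are assumptions made for this sketch and are not prescribed by for_range.

```python
# Pure-Python model (no TIK calls): one loop iteration per block, each with its own slice.
def block_slices(total_elems, block_num):
    """Return the (start, end) element range handled by each block index."""
    if total_elems % block_num != 0:
        raise ValueError("this sketch assumes total_elems is divisible by block_num")
    per_block = total_elems // block_num
    # The loop variable of for_range(0, block_num, block_num=block_num)
    # takes the values 0 .. block_num - 1.
    return [(i * per_block, (i + 1) * per_block) for i in range(block_num)]


if __name__ == "__main__":
    for block_idx, (start, end) in enumerate(block_slices(256, 2)):
        print(f"block {block_idx}: elements [{start}, {end})")
```

In a real operator the same offset expression would be built from the for_range loop variable inside the multi-block loop.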