for_range
Description
Defines a for loop in TIK. N buffers and multiple blocks can be enabled on the loop.
Prototype
for_range(begint, endt, name="i", thread_num=1, thread_type="whole", block_num=1, dtype="int32", for_type="serial")
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| begint | Input | Start value of the for_range loop's variable. begint and endt are immediates (of type int or uint), Scalars (of type int or uint), or Exprs. If Exprs are passed, the simplified values must be integers and the following condition must be met: 0 <= begint <= endt <= 2147483647. The loop variable satisfies begint <= variable < endt. |
| endt | Input | End value of the for_range loop's variable. begint and endt are immediates (of type int or uint), Scalars (of type int or uint), or Exprs. If Exprs are passed, the simplified values must be integers and the following condition must be met: 0 <= begint <= endt <= 2147483647. The loop variable satisfies begint <= variable < endt. NOTE: The performance deteriorates when begint and endt are Scalars. |
| name | Input | Name of a variable in the for_range loop. |
| thread_num | Input | An immediate specifying whether to enable N buffers in the for_range loop. NOTE: 3 and 4 are used only when the operator is for data movement and ultimate performance is required. For common operators, enabling two buffers is recommended for performance optimization. |
| thread_type | Input | Thread type of the for_range loop. This parameter is reserved and has no impact on system running. Must be whole. |
| block_num | Input | Number of blocks used in the for_range loop. An immediate or Scalar (int, uint). Must be in the range of [1, 65535]. |
| dtype | Input | Variable type of the for_range loop. This parameter is reserved and has no impact on system running. Must be int32. |
| for_type | Input | Type of the for_range loop. Must be serial. |
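The ranges in the parameter table can be checked before a loop is built. The following is a minimal sketch in plain Python, assuming all arguments are immediates; `check_for_range_args` and `INT32_MAX` are hypothetical names introduced for illustration and are not part of TIK.

```python
INT32_MAX = 2147483647  # upper bound stated for begint and endt


def check_for_range_args(begint, endt, thread_type="whole", block_num=1,
                         dtype="int32", for_type="serial"):
    """Hypothetical pre-check (not a TIK API) for immediate arguments.

    Returns a list of problems; an empty list means every range stated
    in the parameter table is satisfied.
    """
    problems = []
    if not (0 <= begint <= endt <= INT32_MAX):
        problems.append("need 0 <= begint <= endt <= 2147483647")
    if thread_type != "whole":
        problems.append("thread_type must be 'whole'")
    if not (1 <= block_num <= 65535):
        problems.append("block_num must be in [1, 65535]")
    if dtype != "int32":
        problems.append("dtype must be 'int32'")
    if for_type != "serial":
        problems.append("for_type must be 'serial'")
    return problems


print(check_for_range_args(0, 10))              # prints []
print(check_for_range_args(5, 3, block_num=0))  # reports two problems
```

Returning a list instead of raising on the first failure lets a caller report every violated range at once.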
Applicability
Restrictions
- In a for_range loop, if there are nested loops, multiple blocks can be enabled only for the outermost loop.
- If multiple blocks are enabled, the start value of the multi-block loop must be 0, and the number of blocks must have the same value and type (int or Scalar) as the end value of the multi-block loop. If block_num is a Scalar, thread_num must be the immediate 1.
- In a for_range loop, multiple blocks and N buffers are mutually exclusive. To enable them both, use multiple loops.
- If multiple blocks are enabled, any tensor the loop requires must be defined inside the multi-block loop. If tensor memory allocation both inside and outside the multi-block loop starts at 0, address overlapping and data errors may occur.
- In an operator, for_range can be called only once to enable multiple blocks (block_num ≥ 2). Note that the number of blocks must be the same as the loop count.
- When N buffers are enabled, multiple buffers are allocated for tensors defined in the for_range loop.
- Do not change the value of endt in for_range in the loop body. Otherwise, the operator execution task will be suspended.
- When begint and endt are Scalars, the range check automatically counts begint from 0, and the final endt value is endt minus begint. The following condition must ultimately be met: 0 <= begint <= endt <= 2147483647.
- When 0 < endt - begint < thread_num, an error is reported at build time.
- When using loop variables, pay attention to the following:
    # Use loop variables.
    with self.tik_instance.for_range(0, 10) as i:
        with self.tik_instance.if_scope(i == 0):  # Do not use if i==0.
            do_something
        with self.tik_instance.else_scope():  # Do not use else.
            do_something
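Several of the restrictions above involve only immediates and can be expressed as a simple pre-check. The sketch below is plain Python and does not call TIK; `check_loop_restrictions` is a hypothetical helper, and it assumes that thread_num greater than 1 means N buffers are enabled and block_num of 2 or more means multiple blocks are enabled.

```python
def check_loop_restrictions(begint, endt, thread_num=1, block_num=1):
    """Hypothetical check (not a TIK API) of restrictions on immediates."""
    problems = []
    if block_num >= 2:  # multiple blocks enabled
        if thread_num != 1:
            # Assumption: thread_num > 1 means N buffers are enabled.
            problems.append("multiple blocks and N buffers are mutually exclusive")
        if begint != 0:
            problems.append("a multi-block loop must start at 0")
        if block_num != endt:
            problems.append("block_num must equal the end value of the loop")
    if 0 < endt - begint < thread_num:
        problems.append("endt - begint is smaller than thread_num")
    return problems


print(check_loop_restrictions(0, 2, block_num=2))   # prints []
print(check_loop_restrictions(0, 1, thread_num=2))  # reports one problem
```

Restrictions that depend on Scalars, on loop nesting, or on where tensors are defined are outside the scope of this sketch.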
Returns
A TikWithScope object.
Example
    with self.tik_instance.for_range(0, 1, thread_num=1):
        do_something
    # Enable double buffering. Note that two buffers are allocated only for tensors defined in for_range.
    with self.tik_instance.for_range(0, 2, thread_num=2):
        # Define tensors here.
        do_something
    # Enable multiple blocks.
    with self.tik_instance.for_range(0, 2, block_num=2):
        do_something
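Because multiple blocks and N buffers cannot be enabled on the same loop, one way to combine them is to nest a buffered loop inside a multi-block loop. The sketch below shows only the nesting pattern. `StubTik` is a stand-in written for this example so the snippet runs without the TIK toolchain; it is not the real library, it records calls instead of generating code, and the inner loop bounds are arbitrary.

```python
from contextlib import contextmanager


class StubTik:
    """Stand-in for a TIK instance (NOT the real library).

    It only records how for_range calls nest so the pattern can be run
    and inspected without the TIK toolchain.
    """

    def __init__(self):
        self.loops = []  # (depth, begint, endt, thread_num, block_num)
        self._depth = 0

    @contextmanager
    def for_range(self, begint, endt, name="i", thread_num=1, block_num=1):
        if block_num > 1 and self._depth > 0:
            raise ValueError("multiple blocks are allowed only on the outermost loop")
        self.loops.append((self._depth, begint, endt, thread_num, block_num))
        self._depth += 1
        try:
            yield name  # placeholder for the loop variable
        finally:
            self._depth -= 1


tik_instance = StubTik()

# Outer loop: multiple blocks, so begint is 0 and block_num equals endt.
with tik_instance.for_range(0, 2, name="block_idx", block_num=2) as block_idx:
    # Inner loop: double buffering; tensors would be defined here.
    with tik_instance.for_range(0, 8, thread_num=2) as i:
        pass

print(tik_instance.loops)  # prints [(0, 0, 2, 1, 2), (1, 0, 8, 2, 1)]
```

With a real TIK instance the two for_range calls would be written the same way, with the operator logic replacing `pass`.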