autotune
Function
Iterates over the search space, runs the decorated function with each parameter combination, and reports the running time of every combination together with the best-performing one.
Function Prototype
def autotune(configs: List[Dict], warmup: int = 300, repeat: int = 1, device_ids = [0]):
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| configs | Input | Search space definition. Data type: list[dict]. This parameter is required. |
| warmup | Input | Warm-up time before performance data is collected. A longer warm-up typically yields more stable operator performance. Unit: µs. This parameter is optional. The default value is 1000. The value is an integer ranging from 1 to 100000. |
| repeat | Input | Number of measured runs. The average running time over these runs is used as the operator duration. This parameter is optional. The default value is 1. The value is an integer ranging from 1 to 10000. |
| device_ids | Input | Device ID list. Currently, only single-device mode is supported. If multiple device IDs are passed, only the first one takes effect. This parameter is optional. The default value is [0]. |
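To make the roles of warmup, repeat, and device_ids concrete, the following pure-Python sketch mimics the behavior described in the table. It is not the mskpp implementation: the helper name sketch_autotune, the run callback, and the choice to warm up by calling the workload until the warm-up time has elapsed are all assumptions made for illustration. Unlike autotune, which returns None, the sketch returns the best result so that it can be inspected.

```python
import time
from typing import Any, Callable, Dict, List, Sequence, Tuple


def sketch_autotune(
    configs: List[Dict[str, Any]],
    run: Callable[[Dict[str, Any]], None],  # hypothetical: executes one config
    warmup_us: int,
    repeat: int,
    device_ids: Sequence[int],
) -> Tuple[Dict[str, Any], float]:
    """Time every config and return (best_config, average_time_in_us)."""
    # Value ranges taken from the parameter table.
    if not configs:
        raise ValueError("configs is required and must not be empty")
    if not 1 <= warmup_us <= 100_000:
        raise ValueError("warmup must be an integer from 1 to 100000")
    if not 1 <= repeat <= 10_000:
        raise ValueError("repeat must be an integer from 1 to 10000")

    # Single-device mode: only the first device ID takes effect.
    device_id = device_ids[0]

    results = []
    for config in configs:
        # Warm-up phase (assumed strategy): run until warmup_us has elapsed.
        deadline = time.perf_counter() + warmup_us / 1e6
        while time.perf_counter() < deadline:
            run(config)

        # Measurement phase: average the running time over `repeat` runs.
        start = time.perf_counter()
        for _ in range(repeat):
            run(config)
        avg_us = (time.perf_counter() - start) / repeat * 1e6

        print(f"device {device_id} | {config} | {avg_us:.1f} us")
        results.append((config, avg_us))

    best = min(results, key=lambda item: item[1])
    print(f"best: {best[0]} ({best[1]:.1f} us)")
    return best


if __name__ == "__main__":
    # Stand-in workload; a real run would launch the operator instead.
    def fake_kernel(config: Dict[str, Any]) -> None:
        sum(range(config["size"]))

    sketch_autotune(
        [{"size": 1_000}, {"size": 10_000}],
        fake_kernel,
        warmup_us=1000,
        repeat=5,
        device_ids=[0],
    )
```

A larger repeat smooths out run-to-run noise at the cost of a longer tuning session, which matters because the total cost grows with the number of entries in configs.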
Return Value
None.
Example
import mskpp

# Each dict in configs is one parameter combination to test.
@mskpp.autotune(configs=[
    {'L1TileShape': 'MatmulShape<64, 64, 64>', 'L0TileShape': 'MatmulShape<128, 256, 64>'},
    {'L1TileShape': 'MatmulShape<64, 64, 128>', 'L0TileShape': 'MatmulShape<128, 256, 64>'},
    {'L1TileShape': 'MatmulShape<64, 128, 128>', 'L0TileShape': 'MatmulShape<128, 256, 64>'},
    {'L1TileShape': 'MatmulShape<64, 128, 128>', 'L0TileShape': 'MatmulShape<64, 256, 64>'},
    {'L1TileShape': 'MatmulShape<128, 128, 128>', 'L0TileShape': 'MatmulShape<128, 256, 64>'},
], warmup=500, repeat=10, device_ids=[0])
def basic_matmul(problem_shape, a, layout_a, b, layout_b, c, layout_c):
    # get_kernel() is defined elsewhere and returns the compiled kernel.
    kernel = get_kernel()
    blockdim = 20
    return kernel[blockdim](problem_shape, a, layout_a, b, layout_b, c, layout_c)
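Because configs is an ordinary list of dicts, a larger search space can be generated rather than typed by hand. The sketch below builds entries in the same format as the example from candidate tile shapes. The helpers matmul_shape and build_configs are hypothetical names and are not part of mskpp. Note that a full Cartesian product may include combinations that are invalid for a given kernel, so the result may need filtering before being passed to autotune.

```python
from itertools import product
from typing import Dict, Iterable, List, Tuple

Shape = Tuple[int, int, int]


def matmul_shape(m: int, n: int, k: int) -> str:
    """Format a tile shape the way the example writes it."""
    return f"MatmulShape<{m}, {n}, {k}>"


def build_configs(
    l1_shapes: Iterable[Shape], l0_shapes: Iterable[Shape]
) -> List[Dict[str, str]]:
    """Return one config dict per (L1 tile shape, L0 tile shape) pair."""
    return [
        {"L1TileShape": matmul_shape(*l1), "L0TileShape": matmul_shape(*l0)}
        for l1, l0 in product(l1_shapes, l0_shapes)
    ]


configs = build_configs(
    l1_shapes=[(64, 64, 64), (64, 64, 128)],
    l0_shapes=[(128, 256, 64)],
)
for config in configs:
    print(config)
```

The resulting list can be passed as the configs argument in place of a hand-written literal.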