autotune_v2
Function
Traverses the search space, runs each parameter combination, and reports the running time of every combination together with the fastest (optimal) one.
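As a rough illustration of the traversal described above, the following sketch warms up and then times a callable once for each entry in a search space, and reports the fastest entry. It is not the mskpp implementation: `sweep_configs`, its `run` callback, and the `clock` parameter are hypothetical names introduced here, and host-side wall-clock timing is assumed in place of whatever on-device measurement the tool actually performs.

```python
import time
from typing import Callable, Dict, List, Tuple


def sweep_configs(
    run: Callable[[Dict], None],
    configs: List[Dict],
    warmup_times: int = 5,
    clock: Callable[[], float] = time.perf_counter,  # hypothetical hook, eases testing
) -> Tuple[Dict, List[float]]:
    """Warm up and time run(config) for every config; return the fastest config and all timings."""
    timings = []
    for config in configs:
        for _ in range(warmup_times):
            run(config)  # warm-up runs are executed but not timed
        start = clock()
        run(config)  # single timed run
        elapsed = clock() - start
        timings.append(elapsed)
        print(f"{config}: {elapsed:.6f} s")
    # Index of the smallest measured time identifies the best combination.
    best_index = min(range(len(timings)), key=timings.__getitem__)
    print(f"best: {configs[best_index]}")
    return configs[best_index], timings
```

A single timed run per combination keeps the sketch short; a real tuner would more likely average several runs to reduce noise.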
Function Prototype
def autotune_v2(configs: List[Dict], warmup_times = 5)
Parameters
| Parameter | Input/Output | Description |
|---|---|---|
| configs | Input | Search space definition. Required. Data type: list[dict]. |
| warmup_times | Input | Number of warm-up runs performed on the device before performance data is collected. Optional. The value is an integer from 1 to 500; the default value is 5. |
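The constraints in the table can be checked before tuning starts. The helper below is a minimal sketch under stated assumptions: `check_autotune_args` is a hypothetical name and not part of mskpp, and how mskpp itself reacts to out-of-range arguments is not specified here.

```python
from typing import Dict, List


def check_autotune_args(configs: List[Dict], warmup_times: int = 5) -> None:
    """Raise if the arguments violate the documented constraints (hypothetical helper)."""
    # configs is documented as list[dict].
    if not isinstance(configs, list) or not all(isinstance(c, dict) for c in configs):
        raise TypeError("configs must be a list of dicts")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(warmup_times, bool) or not isinstance(warmup_times, int):
        raise TypeError("warmup_times must be an integer")
    # warmup_times is documented as an integer from 1 to 500.
    if not 1 <= warmup_times <= 500:
        raise ValueError("warmup_times must be in the range 1 to 500")
```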
Return Value
None.
Example
import mskpp

# Each dict is one candidate combination of tile shapes to try.
@mskpp.autotune_v2(configs=[
    {'L1TileShape': 'GemmShape<128, 256, 256>', 'L0TileShape': 'GemmShape<128, 256, 64>'},
    {'L1TileShape': 'GemmShape<256, 128, 256>', 'L0TileShape': 'GemmShape<256, 128, 64>'},
    {'L1TileShape': 'GemmShape<128, 128, 256>', 'L0TileShape': 'GemmShape<128, 128, 64>'},
    {'L1TileShape': 'GemmShape<128, 128, 512>', 'L0TileShape': 'GemmShape<128, 128, 64>'},
    {'L1TileShape': 'GemmShape<64, 256, 128>', 'L0TileShape': 'GemmShape<64, 256, 64>'},
], warmup_times=10)
def run_executable(m, n, k, device_id):
    src_file = "./basic_matmul.cpp"
    build_script = "./jit_build_executable.sh"  # script that compiles the executable
    executable = mskpp.compile_executable(build_script=build_script, src_file=src_file, use_cache=False)
    return executable(m, n, k, device_id)