Operator Binary Configuration

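The PyTorch framework provides a binary configuration parameter for operator compilation. It controls whether online compilation is preferred when the model is compiled, and can be used to optimize training performance. The parameter is set with the code below and accepts True or False. If the code is not added, the default is True.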

torch_npu.npu.set_compile_mode(jit_compile=True)

After training the model, choose the setting according to whether training uses fixed shapes or dynamic shapes:

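- Fixed-shape scenario: keeping the default value True is recommended. Based on the operator information currently available, operators are fused and optimized, and operators with better runtime performance are compiled online. If the value is set to False, fewer compile-time optimizations are applied and performance drops.
- Dynamic-shape scenario: setting the value to False is recommended. The framework first looks for an already compiled operator binary configuration file. If one exists, the operator is not compiled online; if none exists, the operator is compiled online. Although fewer compile-time optimizations are applied in this mode, compilation time is avoided, so training performance is very likely higher than with True.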

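If the switch is set to False, the binary operator package must be installed. For installation, see the section "Common Operations > Installing, Upgrading, and Uninstalling the Binary Operator Package" in the CANN Software Installation Guide.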
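The lookup order described for the two settings can be pictured with a small conceptual model. This is only a sketch of the decision logic as stated in this section, not the framework's implementation; `resolve_operator`, `binaries`, and `compile_online` are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict


def resolve_operator(op_key: str,
                     binaries: Dict[str, str],
                     jit_compile: bool,
                     compile_online: Callable[[str], str]) -> str:
    """Return the kernel to run for op_key under the given jit_compile value.

    Conceptual model only: `binaries` stands in for the installed binary
    operator package, `compile_online` for online compilation.
    """
    if jit_compile:
        # True: compile online from the operator information available now.
        return compile_online(op_key)
    # False: prefer a precompiled binary; compile online only if none exists.
    if op_key in binaries:
        return binaries[op_key]
    return compile_online(op_key)


# Hypothetical data to exercise the model.
binaries = {"add": "add.bin"}
compiled = []


def compile_online(op_key: str) -> str:
    compiled.append(op_key)  # record every online compilation
    return f"{op_key}.jit"
```

With `jit_compile=False`, only operators missing from `binaries` reach `compile_online`, which is why the binary operator package needs to be installed for this setting to save compilation time.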

In dynamic-shape scenarios, configure the process-level switch in the main_worker function of the model script and set it to False.
PyTorch 1.8.1/1.11.0:
```
torch_npu.npu.set_compile_mode(jit_compile=False)
```
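Where the call goes depends on how training is launched. The following uses PyTorch 1.8.1 as an example to show where to enable the setting.
- Single-device training. With a normal launch, place the call at the start of the main entry point, before main() runs. With an mp.spawn launch, place it inside the main_worker function so that the setting takes effect in every process that is started.
  - Normal launch: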
```
if __name__ == '__main__':
    torch_npu.npu.set_compile_mode(jit_compile=False)
    main()
```
  - mp.spawn launch:
```
    mp.spawn(main_worker,...)
...
def main_worker():
    torch_npu.npu.set_compile_mode(jit_compile=False)
```
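- Multi-device training. When training is launched from a shell script or from Python, the configuration is the same as for a normal single-device launch. With an mp.spawn launch, place the call inside the main_worker function so that the setting takes effect in every process that is started.
  - Shell script or Python launch: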
```
if __name__ == '__main__':
    torch_npu.npu.set_compile_mode(jit_compile=False)
    main()
```
  - mp.spawn launch:
```
    mp.spawn(main_worker,...)
...
def main_worker():
    torch_npu.npu.set_compile_mode(jit_compile=False)
```
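One way to keep the placement consistent across launch modes is to wrap the call in a small helper that each worker invokes first. The following is a minimal sketch under stated assumptions: `configure_compile_mode`, the `dynamic_shape` flag, and the `set_mode` parameter are hypothetical and exist only so the logic can be exercised without an NPU environment. Only `torch_npu.npu.set_compile_mode(jit_compile=...)` comes from this section.

```python
from typing import Callable, Optional


def configure_compile_mode(dynamic_shape: bool,
                           set_mode: Optional[Callable[..., None]] = None) -> bool:
    """Choose jit_compile from the shape scenario and apply it in this process."""
    # Fixed shape -> True (online compilation); dynamic shape -> False (prefer binaries).
    jit_compile = not dynamic_shape
    if set_mode is None:
        # Lazy import: only needed when no stand-in setter is supplied.
        import torch_npu
        set_mode = torch_npu.npu.set_compile_mode
    set_mode(jit_compile=jit_compile)
    return jit_compile


def main_worker(rank: int,
                dynamic_shape: bool = True,
                set_mode: Optional[Callable[..., None]] = None) -> bool:
    # torch.multiprocessing.spawn passes the process index as the first argument.
    # Apply the switch before any other work so each spawned process configures itself.
    jit_compile = configure_compile_mode(dynamic_shape, set_mode)
    # Model construction and the training loop would follow here.
    return jit_compile
```

In a normal launch the same helper could be called once under `if __name__ == '__main__':` before `main()`, matching the placement shown above.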
