Data Copy
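MindIE Torch registers a device named "npu" with native Torch, so a Tensor can be copied synchronously or asynchronously through Torch's Tensor.to interface.

The "npu" device supports only data copies between Host and Device; no other operations are supported. A Tensor whose device is "npu" must be copied to the CPU before it can be used in computation or printed. So that the "npu" device resources allocated during a data copy are released cleanly, we recommend wrapping the code in try/catch to capture exceptions and make sure the program exits normally.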
Synchronous Copy
- Synchronous copy, C++ pseudocode:
    auto tensorCpu = at::randn({ 10, 10, 10 }, torch::kFloat);
    auto tensorNpu = tensorCpu.to("npu:0");  // copy data from cpu to npu:0
    auto tensorCpuNew = tensorNpu.to("cpu"); // copy data from npu:0 to cpu
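- Synchronous copy, Python pseudocode:
    import torch
    import mindietorch  # MindIE Torch; provides the "npu" device

    tensor_cpu = torch.randn((10, 10, 10), dtype=torch.float)
    tensor_npu = tensor_cpu.to("npu:0")     # copy data from cpu to npu:0
    tensor_cpu_new = tensor_npu.to("cpu")   # copy data from npu:0 to cpu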
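The recommendation to catch exceptions so that the program exits normally can be combined with the synchronous round trip roughly as follows. This is a minimal sketch, not the documented MindIE Torch pattern: roundtrip_copy and main are hypothetical helper names, the broad except clause is used only because the exact exception types are not listed here, and it is assumed that importing mindietorch is what makes the "npu" device available.

```python
import sys


def roundtrip_copy(tensor, device="npu:0"):
    """Copy `tensor` to `device` and back to the CPU.

    Returns the CPU copy, or None if either copy raised, so the caller
    can finish normally instead of dying on an unhandled exception.
    """
    try:
        on_device = tensor.to(device)  # Host -> Device
        return on_device.to("cpu")     # Device -> Host
    except Exception as err:  # broad on purpose: error types are an unknown here
        print(f"copy via {device} failed: {err}", file=sys.stderr)
        return None


def main():
    """Hypothetical entry point: returns a process exit code."""
    try:
        import torch
        import mindietorch  # noqa: F401  (assumed to provide the "npu" device)
    except ImportError as err:
        print(f"required package missing: {err}", file=sys.stderr)
        return 1
    result = roundtrip_copy(torch.randn(10, 10, 10))
    return 0 if result is not None else 1
```

A script would typically end with sys.exit(main()), so that a failed copy is reported through the exit code after normal interpreter shutdown rather than through an uncaught exception.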
Asynchronous Copy
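For an asynchronous data copy, the CPU Tensor must be allocated in pinned memory (pinned_memory(true) in C++, pin_memory=True in Python); otherwise the copy does not actually run asynchronously.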
- Asynchronous copy, C++ pseudocode:
    auto optionCpu = torch::TensorOptions().device(at::Device("cpu")).layout(torch::kStrided).pinned_memory(true);
    auto tensorCpu = at::randn({ 100, 1024, 1024 }, optionCpu);
    auto tensorCpuNew = at::empty({ 100, 1024, 1024 }, optionCpu);
    auto npu = at::Device("npu:0");
    // create stream
    c10::Stream stream = c10::Stream(c10::Stream::DEFAULT, npu);
    c10::StreamGuard streamGuard(stream); // set stream
    // copy data from cpu to npu:0
    auto tensorDevice = tensorCpu.to(npu, /*non_blocking=*/true);
    stream.synchronize();
    // copy data from npu:0 to cpu
    tensorCpuNew.copy_(tensorDevice, /*non_blocking=*/true);
    stream.synchronize();
- Asynchronous copy, Python pseudocode:
    import torch
    import mindietorch

    input_cpu = torch.rand((1, 100, 1024, 1024), pin_memory=True)
    output_cpu = torch.empty((1, 100, 1024, 1024), pin_memory=True)
    # create stream
    stream = mindietorch.npu.Stream("npu:0")
    # copy data from cpu to npu:0
    with mindietorch.npu.stream(stream):
        output_npu = input_cpu.to("npu:0", non_blocking=True)
    stream.synchronize()
    # copy data from npu:0 to cpu
    with mindietorch.npu.stream(stream):
        output_cpu.copy_(output_npu, non_blocking=True)
    stream.synchronize()
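Because an unpinned CPU Tensor silently loses the asynchronous behaviour, it can help to check the pinned-memory requirement before issuing a non-blocking copy. The sketch below is one possible guard, not part of MindIE Torch: warn_if_not_pinned is a hypothetical helper, and it relies only on the standard Tensor.is_pinned() method.

```python
import warnings


def warn_if_not_pinned(tensor, name="tensor"):
    """Return True if `tensor` is in pinned memory; otherwise warn and return False."""
    if tensor.is_pinned():
        return True
    warnings.warn(
        f"{name} is not in pinned memory; a non_blocking copy "
        "will not behave asynchronously",
        stacklevel=2,
    )
    return False
```

It could be called on both the source and the destination CPU Tensor (for example input_cpu and output_cpu) just before the non-blocking to and copy_ calls.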