Data Copy

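MindIE Torch registers a device named "npu" with native Torch, so a Tensor can be copied synchronously or asynchronously through Torch's to() interface.

The "npu" device supports only data copies between Host and Device; no other operations are supported. A Tensor whose device is "npu" must be copied to the CPU before it can be used in computation or printed. To make sure the "npu" device resources allocated for a copy are released cleanly, catch exceptions in your code (try/catch in C++, try/except in Python) and let the program exit normally.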

Synchronous Copy

  • Synchronous copy, C++ pseudocode:
    auto tensorCpu = at::randn({ 10, 10, 10 }, torch::kFloat);
    auto tensorNpu = tensorCpu.to("npu:0"); // copy data from cpu to npu:0
    auto tensorCpuNew = tensorNpu.to("cpu"); // copy data from npu:0 to cpu
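  • Synchronous copy, Python pseudocode:
    import torch
    import mindietorch  # assumed here to make the "npu" device available

    tensor_cpu = torch.randn((10, 10, 10), dtype=torch.float)
    tensor_npu = tensor_cpu.to("npu:0")    # copy data from cpu to npu:0
    tensor_cpu_new = tensor_npu.to("cpu")  # copy data from npu:0 to cpu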
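
The two recommendations from the introduction (move an "npu" Tensor to the CPU before computing on it or printing it, and catch exceptions so the program exits normally) can be combined in one small script. This is a minimal sketch, not an official pattern: to_host and main are hypothetical helper names, and it assumes that importing mindietorch is what makes the "npu" device available. Any failure, whether a missing package or a device-side error, is reported and turned into a non-zero status instead of an unhandled exception.

```python
import sys


def to_host(tensor):
    """Return `tensor` on the CPU, copying it from the device only if needed."""
    if tensor.device.type == "cpu":
        return tensor            # already on the host, nothing to copy
    return tensor.to("cpu")      # synchronous Device -> Host copy


def main(device="npu:0"):
    """Round-trip one tensor; return 0 on success, 1 on any failure."""
    try:
        import torch
        import mindietorch  # noqa: F401  # assumption: this import registers "npu"

        tensor_npu = torch.randn(10, 10, 10).to(device)  # Host -> Device
        print(to_host(tensor_npu).sum().item())          # compute on the host only
        return 0
    except Exception as exc:  # e.g. ImportError, or RuntimeError from the device
        print(f"data copy failed: {exc}", file=sys.stderr)
        return 1


status = main()
```

In a standalone script, passing status to sys.exit() would hand the result to the calling shell.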

Asynchronous Copy

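For an asynchronous copy, the CPU Tensor must be allocated in pinned memory (pinned_memory(true) on the C++ TensorOptions, pin_memory=True in Python); otherwise the copy does not actually run asynchronously.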
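
Whether a host Tensor qualifies for an asynchronous copy can be checked before the copy is issued. The sketch below relies on the standard Tensor.is_pinned() method; check_async_ready is a hypothetical helper name, and warning instead of raising is only a choice made for this illustration.

```python
import warnings


def check_async_ready(tensor):
    """Return True if `tensor` is a pinned host tensor, warning otherwise."""
    ready = tensor.device.type == "cpu" and tensor.is_pinned()
    if not ready:
        warnings.warn(
            "tensor is not a pinned host tensor; "
            "a non_blocking copy will not actually be asynchronous"
        )
    return ready


try:
    import torch
except ImportError:  # keep the sketch importable without PyTorch
    torch = None

if torch is not None:
    pageable = torch.empty(4, 4)        # default allocation, not pinned
    print(check_async_ready(pageable))  # also emits the warning above
```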

  • Asynchronous copy, C++ pseudocode:
    auto optionCpu = torch::TensorOptions().device(at::Device("cpu")).layout(torch::kStrided).pinned_memory(true);
    auto tensorCpu = at::randn({ 100, 1024, 1024 }, optionCpu);
    auto tensorCpuNew = at::empty({ 100, 1024, 1024 }, optionCpu);
    auto npu = at::Device("npu:0");
    
    // create stream
    c10::Stream stream = c10::Stream(c10::Stream::DEFAULT, npu);
    c10::StreamGuard streamGuard(stream); // set stream
    
    // copy data from cpu to npu:0
    auto tensorDevice = tensorCpu.to(npu, /*non_blocking=*/true);
    stream.synchronize();
    
    // copy data from npu:0 to cpu
    tensorCpuNew.copy_(tensorDevice, /*non_blocking=*/true);
    stream.synchronize();
  • Asynchronous copy, Python pseudocode:
    import torch
    import mindietorch

    # host buffers must be pinned for the copy to be asynchronous
    input_cpu = torch.rand((1, 100, 1024, 1024), pin_memory=True)
    output_cpu = torch.empty((1, 100, 1024, 1024), pin_memory=True)

    # create stream
    stream = mindietorch.npu.Stream("npu:0")

    # copy data from cpu to npu:0
    with mindietorch.npu.stream(stream):
        output_npu = input_cpu.to("npu:0", non_blocking=True)
        stream.synchronize()

    # copy data from npu:0 to cpu
    with mindietorch.npu.stream(stream):
        output_cpu.copy_(output_npu, non_blocking=True)
        stream.synchronize()