Inter-Device Data Transfer

Note:

  • Call acl.rt.device_can_access_peer to check whether data exchange between two devices is supported. If it is, call acl.rt.device_enable_peer_access twice: once to enable data exchange from device 0 to device 1, and once to enable it from device 1 to device 0. Then call acl.rt.memcpy (synchronous mode) or acl.rt.memcpy_async (asynchronous mode) to transfer the data by memory copy.
  • Only data exchange between devices in the same PCIe switch is supported.
  • Data exchange is supported only within a single process, whether the devices are used by the same thread or by different threads.
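
The query-then-enable sequence in the first note can be sketched as a small helper. This is a minimal sketch, not part of pyACL: the function name enable_bidirectional_peer_access is hypothetical, and rt stands for the acl.rt module, passed in as an argument so that the logic can be exercised without hardware. The call shapes follow the snippet in this section. The sketch assumes that a return code of 0 means success and that a can_access_peer value of 1 means data exchange is supported; it also queries each direction separately, which is a choice made here rather than a documented requirement.

```python
def enable_bidirectional_peer_access(rt, dev_a, dev_b):
    """Enable peer access dev_a -> dev_b and dev_b -> dev_a.

    `rt` is expected to be the acl.rt module (hypothetical helper, not a pyACL API).
    Returns True if both directions were enabled, False if a direction is unsupported.
    """
    for local, peer in ((dev_a, dev_b), (dev_b, dev_a)):
        # Make `local` the current device before querying and enabling access.
        ret = rt.set_device(local)
        if ret != 0:  # assumption: 0 is the success code
            raise RuntimeError(f"set_device({local}) failed, ret={ret}")

        can_access, ret = rt.device_can_access_peer(local, peer)
        if ret != 0:
            raise RuntimeError(f"device_can_access_peer({local}, {peer}) failed, ret={ret}")
        if can_access != 1:  # assumption: 1 means data exchange is supported
            return False

        # The second argument is passed as 0, as in the snippet in this section.
        ret = rt.device_enable_peer_access(peer, 0)
        if ret != 0:
            raise RuntimeError(f"device_enable_peer_access({peer}) failed, ret={ret}")
    return True
```

With pyACL installed, the helper would be called as enable_bidirectional_peer_access(acl.rt, 0, 1) after acl.init. Note that it leaves dev_b as the current device.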

After each API call, add an exception-handling branch and record error and warning logs. The following code snippet shows the key steps only and is not ready to be built or run.

import acl

ACL_MEM_MALLOC_NORMAL_ONLY_P2P = 5 # Allocate P2P memory.
ACL_MEMCPY_DEVICE_TO_DEVICE = 3    # Data exchange within a device or between devices.

dev0, dev1 = 0, 1

ret = acl.init("")

ret = acl.rt.set_device(dev0)
# Query whether data exchange is supported between device 0 and device 1.
can_access_peer, ret = acl.rt.device_can_access_peer(dev0, dev1)
# Enable data exchange between device 0 and device 1.
ret = acl.rt.device_enable_peer_access(dev1, 0)
size = 1024
# Allocate P2P memory:
dev0_mem, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_NORMAL_ONLY_P2P)

ret = acl.rt.set_device(dev1)
# Enable data exchange between device 1 and device 0.
ret = acl.rt.device_enable_peer_access(dev0, 0)
dev1_mem, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_NORMAL_ONLY_P2P)

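# Synchronously copy `size` bytes from device 1 memory to device 0 memory.
ret = acl.rt.memcpy(dev0_mem, size, dev1_mem, size, ACL_MEMCPY_DEVICE_TO_DEVICE)
# Synchronously copy `size` bytes from device 0 memory back to device 1 memory.
ret = acl.rt.memcpy(dev1_mem, size, dev0_mem, size, ACL_MEMCPY_DEVICE_TO_DEVICE)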

ret = acl.rt.free(dev1_mem)
ret = acl.rt.free(dev0_mem)

# Reset device 1 (the current device), then switch back to device 0 and reset it.
ret = acl.rt.reset_device(dev1)
ret = acl.rt.set_device(dev0)
ret = acl.rt.reset_device(dev0)

ret = acl.finalize()
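
The exception-handling branch recommended above could take the form of a small return-code check applied after each call. This is a minimal sketch rather than a pyACL facility: check_ret and the logger name are hypothetical, and the sketch assumes that a return code of 0 indicates success.

```python
import logging

logger = logging.getLogger("p2p_sample")  # hypothetical logger name

ACL_SUCCESS = 0  # assumption: a return code of 0 means the call succeeded


def check_ret(api_name, ret):
    """Log an error and raise if `ret` is not the success code (hypothetical helper)."""
    if ret != ACL_SUCCESS:
        logger.error("%s failed, ret=%d", api_name, ret)
        raise RuntimeError(f"{api_name} failed, ret={ret}")
```

In the snippet above, each call would then be followed by a check, for example check_ret("acl.rt.set_device", ret) after ret = acl.rt.set_device(dev0).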